gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2021-07-12	[syserror] Update syserror to linuxerr for more errors.	Zach Koopmans
	Update the following from syserror to the linuxerr equivalent: EEXIST EFAULT ENOTDIR ENOTTY EOPNOTSUPP ERANGE ESRCH PiperOrigin-RevId: 384329869
2021-07-01	[syserror] Update several syserror errors to linuxerr equivalents.	Zach Koopmans
	Update/remove most syserror errors to linuxerr equivalents. For list of removed errors, see //pkg/syserror/syserror.go. PiperOrigin-RevId: 382574582
2021-06-30	[syserror] Update syserror to linuxerr for EACCES, EBADF, and EPERM.	Zach Koopmans
	Update all instances of the above errors to the faster linuxerr implementation. With the temporary linuxerr.Equals(), no logical changes are made. PiperOrigin-RevId: 382306655
2021-06-29	[syserror] Change syserror to linuxerr for E2BIG, EADDRINUSE, and EINVAL	Zach Koopmans
	Remove three syserror entries duplicated in linuxerr. Because of the linuxerr.Equals method, this is a mere change of return values from syserror to linuxerr definitions. Done with only these three errnos as CLs removing all grow to a significantly large size. PiperOrigin-RevId: 382173835
2021-06-22	[syserror] Add conversions to linuxerr with temporary Equals method.	Zach Koopmans
	Add Equals method to compare syserror and unix.Errno errors to linuxerr errors. This will facilitate removal of syserror definitions in a followup, and finding needed conversions from unix.Errno to linuxerr. PiperOrigin-RevId: 380909667
2021-06-10	Minor VFS2 xattr changes.	Jamie Liu
	- Allow the gofer client to use most xattr namespaces. As documented by the updated comment, this is consistent with e.g. Linux's FUSE client, and allows gofers to provide extended attributes from FUSE filesystems. - Make tmpfs' listxattr omit xattrs in the "trusted" namespace for non-privileged users. PiperOrigin-RevId: 378778854
2021-06-07	Implement RENAME_NOREPLACE for all VFS2 filesystem implementations.	Jamie Liu
	PiperOrigin-RevId: 377966969
2021-05-14	Resolve remaining O_PATH TODOs.	Dean Deng
	O_PATH is now implemented in vfs2. Fixes #2782. PiperOrigin-RevId: 373861410
2021-04-02	Internal change.	gVisor bot
	PiperOrigin-RevId: 366462448
2021-03-29	[syserror] Split usermem package	Zach Koopmans
	Split usermem package to help remove syserror dependency in go_marshal. New hostarch package contains code not dependent on syserror. PiperOrigin-RevId: 365651233
2021-03-11	Report filesystem-specific mount options.	Rahat Mahmood
	PiperOrigin-RevId: 362406813
2021-02-10	Support setgid directories in tmpfs and kernfs	Kevin Krakauer
	PiperOrigin-RevId: 356868412
2021-02-09	pipe: writeLocked has to return ErrWouldBlock if the pipe is full	Andrei Vagin
	PiperOrigin-RevId: 356450303
2021-02-03	[vfs] Make sticky bit check consistent with Linux.	Ayush Ranjan
	Our implementation of vfs.CheckDeleteSticky was not consistent with Linux, specifically not consistent with fs/linux.h:check_sticky(). One of the biggest differences was that the vfs implementation did not allow the owner of the sticky directory to delete files inside it that belonged to other users. This change makes our implementation consistent with Linux. Also adds an integration test to check for this. This bug is also present in VFS1. Updates #3027 PiperOrigin-RevId: 355557425
2021-01-22	Implement F_GETLK fcntl.	Dean Deng
	Fixes #5113. PiperOrigin-RevId: 353313374
2021-01-20	Move Lock/UnlockPOSIX into LockFD util.	Dean Deng
	PiperOrigin-RevId: 352904728
2021-01-14	Check for existence before permissions	Fabricio Voznika
	Return EEXIST when overwritting a file as long as the caller has exec permission on the parent directory, even if the caller doesn't have write permission. Also reordered the mount write check, which happens before permission is checked. Closes #5164 PiperOrigin-RevId: 351868123
2020-11-18	Port filesystem metrics to VFS2.	Jamie Liu
	PiperOrigin-RevId: 343196927
2020-11-17	tmpfs: make sure that a dentry will not be destroyed before the open() call	Andrei Vagin
	If we don't hold a reference, the dentry can be destroyed by another thread. Reported-by: syzbot+f2132e50060c41f6d41f@syzkaller.appspotmail.com PiperOrigin-RevId: 342951940
2020-11-13	fs/tmpfs: change regularFile.size atomically	Andrei Vagin
	PiperOrigin-RevId: 342221309
2020-11-13	fs/tmpfs: use atomic operations to access inode.mode	Andrei Vagin
	PiperOrigin-RevId: 342214859
2020-11-11	Read fsimpl/tmpfs timestamps atomically.	Jamie Liu
	PiperOrigin-RevId: 341982672
2020-11-09	Initialize references with a value of 1.	Dean Deng
	This lets us avoid treating a value of 0 as one reference. All references using the refsvfs2 template must call InitRefs() before the reference is incremented/decremented, or else a panic will occur. Therefore, it should be pretty easy to identify missing InitRef calls during testing. Updates #1486. PiperOrigin-RevId: 341411151
2020-11-03	Make pipe min/max sizes match linux.	Nicolas Lacasse
	The default pipe size already matched linux, and is unchanged. Furthermore `atomicIOBytes` is made a proper constant (as it is in Linux). We were plumbing usermem.PageSize everywhere, so this is no functional change. PiperOrigin-RevId: 340497006
2020-10-23	Support VFS2 save/restore.	Jamie Liu
	Inode number consistency checks are now skipped in save/restore tests for reasons described in greatest detail in StatTest.StateDoesntChangeAfterRename. They pass in VFS1 due to the bug described in new test case SimpleStatTest.DifferentFilesHaveDifferentDeviceInodeNumberPairs. Fixes #1663 PiperOrigin-RevId: 338776148
2020-10-23	Rewrite reference leak checker without finalizers.	Dean Deng
	Our current reference leak checker uses finalizers to verify whether an object has reached zero references before it is garbage collected. There are multiple problems with this mechanism, so a rewrite is in order. With finalizers, there is no way to guarantee that a finalizer will run before the program exits. When an unreachable object with a finalizer is garbage collected, its finalizer will be added to a queue and run asynchronously. The best we can do is run garbage collection upon sandbox exit to make sure that all finalizers are enqueued. Furthermore, if there is a chain of finalized objects, e.g. A points to B points to C, garbage collection needs to run multiple times before all of the finalizers are enqueued. The first GC run will register the finalizer for A but not free it. It takes another GC run to free A, at which point B's finalizer can be registered. As a result, we need to run GC as many times as the length of the longest such chain to have a somewhat reliable leak checker. Finally, a cyclical chain of structs pointing to one another will never be garbage collected if a finalizer is set. This is a well-known issue with Go finalizers (https://github.com/golang/go/issues/7358). Using leak checking on filesystem objects that produce cycles will not work and even result in memory leaks. The new leak checker stores reference counted objects in a global map when leak check is enabled and removes them once they are destroyed. At sandbox exit, any remaining objects in the map are considered as leaked. This provides a deterministic way of detecting leaks without relying on the complexities of finalizers and garbage collection. This approach has several benefits over the former, including: - Always detects leaks of objects that should be destroyed very close to sandbox exit. The old checker very rarely detected these leaks, because it relied on garbage collection to be run in a short window of time. - Panics if we forgot to enable leak check on a ref-counted object (we will try to remove it from the map when it is destroyed, but it will never have been added). - Can store extra logging information in the map values without adding to the size of the ref count struct itself. With the size of just an int64, the ref count object remains compact, meaning frequent operations like IncRef/DecRef are more cache-efficient. - Can aggregate leak results in a single report after the sandbox exits. Instead of having warnings littered in the log, which were non-deterministically triggered by garbage collection, we can print all warning messages at once. Note that this could also be a limitation--the sandbox must exit properly for leaks to be detected. Some basic benchmarking indicates that this change does not significantly affect performance when leak checking is enabled, which is understandable since registering/unregistering is only done once for each filesystem object. Updates #1486. PiperOrigin-RevId: 338685972
2020-10-13	Don't read beyond EOF when inserting into sentry page cache.	Jamie Liu
	The sentry page cache stores file contents at page granularity; this is necessary for memory mappings. Thus file offset ranges passed to fsutil.FileRangeSet.Fill() must be page-aligned. If the read callback passed to Fill() returns (partial read, nil error) when reading up to EOF (which is the case for p9.ClientFile.ReadAt() since 9P's Rread cannot convey both a partial read and EOF), Fill() will re-invoke the read callback to try to read from EOF to the end of the containing page, which is harmless but needlessly expensive. Fix this by handling file size explicitly in fsutil.FileRangeSet.Fill(). PiperOrigin-RevId: 336934075
2020-10-13	[vfs2] Don't take reference in Task.MountNamespaceVFS2 and MountNamespace.Root.	Dean Deng
	This fixes reference leaks related to accidentally forgetting to DecRef() after calling one or the other. PiperOrigin-RevId: 336918922
2020-10-13	[vfs2] Destroy all tmpfs files when the filesystem is released.	Dean Deng
	In addition to fixing reference leaks, this change also releases memory used by regular tmpfs files once the containing filesystem is released. PiperOrigin-RevId: 336833111
2020-10-13	[vfs2] Add FilesystemType.Release to avoid reference leaks.	Dean Deng
	Singleton filesystem like devpts and devtmpfs have a single filesystem shared among all mounts, so they acquire a "self-reference" when initialized that must be released when the entire virtual filesystem is released at sandbox exit. PiperOrigin-RevId: 336828852
2020-09-28	Support inotify in overlayfs.	Dean Deng
	Fixes #1479, #317. PiperOrigin-RevId: 334258052
2020-09-24	Add basic stateify annotations.	Adin Scannell
	Updates #1663 PiperOrigin-RevId: 333539293
2020-09-18	Use a tmpfs file for shared anonymous and /dev/zero mmap on VFS2.	Jamie Liu
	This is more consistent with Linux (see comment on MM.NewSharedAnonMappable()). We don't do the same thing on VFS1 for reasons documented by the updated comment. PiperOrigin-RevId: 332514849
2020-09-17	fsimpl: improve the "implements" comments	Tiwei Bie
	As noticed by @ayushr2, the "implements" comments are not consistent, e.g. // IterDirents implements kernfs.inodeDynamicLookup. // Generate implements vfs.DynamicBytesSource.Generate. This patch improves this by making the comments like this consistently include the package name (when the interface and struct are not in the same package) and method name. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-09-08	Honor readonly flag for root mount	Fabricio Voznika
	Updates #1487 PiperOrigin-RevId: 330580699
2020-09-08	[vfs] Capitalize x in the {Get/Set/Remove/List}xattr functions.	Ayush Ranjan
	PiperOrigin-RevId: 330554450
2020-09-02	[vfs] Implement xattr for overlayfs.	Ayush Ranjan
	PiperOrigin-RevId: 329825497
2020-08-27	unix: return ECONNREFUSE if a socket file exists but a socket isn't bound to it	Andrei Vagin
	PiperOrigin-RevId: 328843560
2020-08-26	tmpfs: Allow xattrs in the trusted namespace if creds has CAP_SYS_ADMIN.	Nicolas Lacasse
	This is needed to support the overlay opaque attribute. PiperOrigin-RevId: 328552985
2020-08-25	Return non-zero size for tmpfs statfs(2).	Jamie Liu
	This does not implement accepting or enforcing any size limit, which will be more complex and has performance implications; it just returns a fixed non-zero size. Updates #1936 PiperOrigin-RevId: 328428588
2020-08-21	Make mounts ReadWrite first, then later change to ReadOnly.	Nicolas Lacasse
	This lets us create "synthetic" mountpoint directories in ReadOnly mounts during VFS setup. Also add context.WithMountNamespace, as some filesystems (like overlay) require a MountNamespace on ctx to handle vfs.Filesystem Operations. PiperOrigin-RevId: 327874971
2020-08-20	Consistent precondition formatting	Michael Pratt
	Our "Preconditions:" blocks are very useful to determine the input invariants, but they are bit inconsistent throughout the codebase, which makes them harder to read (particularly cases with 5+ conditions in a single paragraph). I've reformatted all of the cases to fit in simple rules: 1. Cases with a single condition are placed on a single line. 2. Cases with multiple conditions are placed in a bulleted list. This format has been added to the style guide. I've also mentioned "Postconditions:", though those are much less frequently used, and all uses already match this style. PiperOrigin-RevId: 327687465
2020-08-18	Avoid holding locks when opening files in VFS2.	Jamie Liu
	Fixes #3243, #3521 PiperOrigin-RevId: 327308890
2020-08-17	[vfs] Do O_DIRECTORY check after resolving symlinks.	Ayush Ranjan
	Fixes python runtime test test_glob. Updates #3515 We were checking is the to-be-opened dentry is a dir or not before resolving symlinks. We should check that after resolving symlinks. This was preventing us from opening a symlink which pointed to a directory with O_DIRECTORY. Also added this check in tmpfs and removed a duplicate check. PiperOrigin-RevId: 327085895
2020-08-12	Add reference leak checking to vfs2 tmpfs.inode.	Dean Deng
	Updates #1486. PiperOrigin-RevId: 326354750
2020-08-12	[vfs2][gofer] Return appropriate errors when opening and creating files.	Ayush Ranjan
	Fixes php test ext/standard/tests/file/touch_variation5.phpt on vfs2. Updates #3516 Also spotted a bug with O_EXCL, where we did not return EEXIST when we tried to open the root of the filesystem with O_EXCL \| O_CREAT. Added some more tests for open() corner cases. PiperOrigin-RevId: 326346863
2020-08-05	Correctly decrement link counts in tmpfs rename operations.	Dean Deng
	When a directory is replaced by a rename operation, its link count should reach zero. We were missing the link from `dir/.` PiperOrigin-RevId: 325141730
2020-08-05	Add missing case in tmpfs.inode.direntType.	Dean Deng
	This was discovered by syzkaller. PiperOrigin-RevId: 325025193
2020-08-04	Automated rollback of changelist 324906582	Dean Deng
	PiperOrigin-RevId: 324931854
2020-08-04	Add reference counting utility to VFS2.	Dean Deng
	The utility has several differences from the VFS1 equivalent: - There are no weak references, which have a significant overhead - In order to print useful debug messages with the type of the reference- counted object, we use a generic Refs object with the owner type as a template parameter. In vfs1, this was accomplished by storing a type name and caller stack directly in the ref count (as in vfs1), which increases the struct size by 6x. (Note that the caller stack was needed because fs types like Dirent were shared by all fs implementations; in vfs2, each impl has its own data structures, so this is no longer necessary.) As an example, the utility is added to tmpfs.inode. Updates #1486. PiperOrigin-RevId: 324906582