path: root/pkg/sentry
Age, Commit message, Author
2020-09-09Add function to get error from a tcpip.EndpointGhanan Gowripalan
In an upcoming CL, socket option types are made to implement a marker interface with pointer receivers. Since this results in calling methods of an interface with a pointer, we incur an allocation when attempting to get an Endpoint's last error with the current implementation.
When calling the method of an interface, the compiler is unable to determine what the interface implementation does with the pointer (calling a method on an interface uses virtual dispatch at runtime, so the compiler does not know what the interface method will do), so it allocates on the heap to be safe in case an implementation continues to hold the pointer after the function returns (the reference escapes the scope of the object).
In the example below, the compiler does not know what b.foo does with the reference to a, so it allocates a on the heap, as the reference to a may escape the scope of a.
```
var a int
var b someInterface
b.foo(&a)
```
This change removes the opportunity for that allocation.
RELNOTES: n/a PiperOrigin-RevId: 328796559
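As a rough illustration of the two call shapes discussed in this message, the sketch below contrasts passing a pointer through an interface method with returning the value directly. All names (`errorOption`, `getter`, `endpoint`, `viaPointer`) are hypothetical and are not taken from the tcpip package; whether `opt` is actually moved to the heap depends on the compiler version.

```go
package main

import "fmt"

// errorOption is a hypothetical option type; it is not part of tcpip.
type errorOption struct{ code int }

// getter is a hypothetical interface whose method receives a pointer.
// The call is dispatched at runtime, so the compiler generally cannot
// prove that an implementation does not retain opt.
type getter interface {
	getOption(opt *errorOption)
}

// endpoint is a hypothetical implementation of getter.
type endpoint struct{ lastErr int }

func (e *endpoint) getOption(opt *errorOption) { opt.code = e.lastErr }

// lastError returns the value directly, so no pointer crosses an
// interface boundary.
func (e *endpoint) lastError() int { return e.lastErr }

// viaPointer passes &opt through the interface; opt may therefore be
// placed on the heap.
func viaPointer(g getter) int {
	var opt errorOption
	g.getOption(&opt)
	return opt.code
}

func main() {
	ep := &endpoint{lastErr: 7}
	fmt.Println(viaPointer(ep), ep.lastError())
}
```

Building with `go build -gcflags=-m` prints the compiler's escape-analysis decisions, which is one way to check how a given toolchain treats `opt`.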
2020-09-09ip6tables: (de)serialize ip6tables structsKevin Krakauer
More implementation+testing to follow. #3549. PiperOrigin-RevId: 328770160
2020-09-09Make flag propagation automaticFabricio Voznika
Use reflection and tags to provide automatic conversion from Config to flags. This makes adding new flags less error-prone, skips flags using default values (easier to read), and makes tests correctly use default flag values for test Configs. Updates #3494 PiperOrigin-RevId: 328662070
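A minimal sketch of the general technique, assuming a struct tag named `flag` and a `--name=value` argument format; neither detail is confirmed by this message, and the `Config` fields shown are invented for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Config is a hypothetical configuration struct. The `flag` tag names the
// command-line flag that each field maps to.
type Config struct {
	Debug    bool   `flag:"debug"`
	Network  string `flag:"network"`
	Workers  int    `flag:"workers"`
	Internal string // untagged fields are not propagated
}

// toFlags converts cfg into "--name=value" arguments, skipping any field
// whose value equals the corresponding field in defaults.
func toFlags(cfg, defaults Config) []string {
	var args []string
	v := reflect.ValueOf(cfg)
	d := reflect.ValueOf(defaults)
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		name, ok := t.Field(i).Tag.Lookup("flag")
		if !ok {
			continue
		}
		val := v.Field(i).Interface()
		if reflect.DeepEqual(val, d.Field(i).Interface()) {
			continue // default value: leave the flag out
		}
		args = append(args, fmt.Sprintf("--%s=%v", name, val))
	}
	return args
}

func main() {
	defaults := Config{Network: "sandbox", Workers: 4}
	cfg := defaults
	cfg.Debug = true
	cfg.Workers = 8
	fmt.Println(toFlags(cfg, defaults))
}
```

Because the flag list is derived from the struct itself, adding a field with a tag is enough to have it propagated, which is the property the message describes as less error-prone.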
2020-09-09Device major number greater than 2 digits in /proc/self/maps on arm64 N1 machineBin Lu
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-09Support stdlib analyzers with nogo.Adin Scannell
This immediately revealed an escape analysis violation (!), where the sync.Map was being used in a context where escapes were not allowed. This is a relatively minor fix and is included. PiperOrigin-RevId: 328611237
2020-09-09Remove spurious fd.IncRef().Nicolas Lacasse
PiperOrigin-RevId: 328583461
2020-09-09tmpfs: Allow xattrs in the trusted namespace if creds has CAP_SYS_ADMIN.Nicolas Lacasse
This is needed to support the overlay opaque attribute. PiperOrigin-RevId: 328552985
2020-09-09Use new reference count utility throughout gvisor.Dean Deng
This uses the refs_vfs2 template in vfs2 as well as objects common to vfs1 and vfs2. Note that vfs1-only refcounts are not replaced, since vfs1 will be deleted soon anyway. The following structs now use the new tool, with leak check enabled: devpts:rootInode fuse:inode kernfs:Dentry kernfs:dir kernfs:readonlyDir kernfs:StaticDirectory proc:fdDirInode proc:fdInfoDirInode proc:subtasksInode proc:taskInode proc:tasksInode vfs:FileDescription vfs:MountNamespace vfs:Filesystem sys:dir kernel:FSContext kernel:ProcessGroup kernel:Session shm:Shm mm:aioMappable mm:SpecialMappable transport:queue And the following use the template, but because they currently are not leak checked, a TODO is left instead of enabling leak check in this patch: kernel:FDTable tun:tunEndpoint Updates #1486. PiperOrigin-RevId: 328460377
2020-09-09Return non-zero size for tmpfs statfs(2).Jamie Liu
This does not implement accepting or enforcing any size limit, which will be more complex and has performance implications; it just returns a fixed non-zero size. Updates #1936 PiperOrigin-RevId: 328428588
2020-09-09Expose basic coverage information to userspace through kcov interface.Dean Deng
In Linux, a kernel configuration is set that compiles the kernel with a custom function that is called at the beginning of every basic block, which updates the memory-mapped coverage information. The Go coverage tool does not allow us to inject arbitrary instructions into basic blocks, but it does provide data that we can convert to a kcov-like format and transfer to userspace through a memory mapping. Note that this is not a strict implementation of kcov, which is especially tricky to do because we do not have the same coverage tools available in Go that are available for the actual Linux kernel. In Linux, a kernel configuration is set that compiles the kernel with a custom function that is called at the beginning of every basic block to write program counters to the kcov memory mapping. In Go, however, coverage tools only give us a count of basic blocks as they are executed. Every time we return to userspace, we collect the coverage information and write out PCs for each block that was executed, providing userspace with the illusion that the kcov data is always up to date. For convenience, we also generate a unique synthetic PC for each block instead of using actual PCs. Finally, we do not provide thread-specific coverage data (in Linux, each kcov instance only contains PCs executed by the thread owning it); instead, we will supply data for any file specified by --instrumentation_filter. Also, fix issue in nogo that was causing pkg/coverage:coverage_nogo compilation to fail. PiperOrigin-RevId: 328426526
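The conversion from per-block counters to a list of PCs can be sketched roughly as follows. This is an illustration only: the `collector` type, the `syntheticPCBase` constant, and the snapshot-comparison approach are assumptions, not the code in pkg/coverage.

```go
package main

import "fmt"

// syntheticPCBase is an arbitrary, hypothetical base address used to make
// up one unique PC per basic block.
const syntheticPCBase uint64 = 0xffffffff00000000

// collector remembers the counter value last seen for each block.
type collector struct {
	seen []uint32
}

// collect compares the current counters with the previous snapshot and
// returns one synthetic PC for every block that ran since the last call.
func (c *collector) collect(counters []uint32) []uint64 {
	for len(c.seen) < len(counters) {
		c.seen = append(c.seen, 0)
	}
	var pcs []uint64
	for i, n := range counters {
		if n != c.seen[i] {
			pcs = append(pcs, syntheticPCBase+uint64(i))
			c.seen[i] = n
		}
	}
	return pcs
}

func main() {
	var c collector
	counters := []uint32{0, 2, 0, 1}
	fmt.Printf("%#x\n", c.collect(counters)) // blocks 1 and 3 ran
	counters[0]++
	fmt.Printf("%#x\n", c.collect(counters)) // only block 0 ran since then
}
```

In this sketch `collect` would be called on each return to userspace, and the resulting PCs appended to the memory mapping shared with the kcov consumer.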
2020-09-09[go-marshal] Enable auto-marshalling for host tty.Ayush Ranjan
PiperOrigin-RevId: 328415633
2020-09-09overlay: clonePrivateMount must pass a Dentry reference to MakeVirtualDentry.Nicolas Lacasse
PiperOrigin-RevId: 328410065
2020-09-09remove iptables sockopt special casesKevin Krakauer
iptables sockopts were kludged into an unnecessary check, this properly relegates them to the {get,set}SockOptIP functions. PiperOrigin-RevId: 328395135
2020-09-09Change "Fd" member to "FD" according to conventiongVisor bot
PiperOrigin-RevId: 328374775
2020-09-09Support SO_LINGER socket option.Nayana Bidari
When the SO_LINGER option is enabled, close will not return until all queued messages for the socket have been sent and acknowledged or the linger timeout is reached. If the option is not set, close will return immediately. This option is mainly supported for connection-oriented protocols such as TCP. PiperOrigin-RevId: 328350576
2020-09-09Fix TCP_LINGER2 behavior to match linux.Bhasker Hariharan
We still deviate a bit from linux in how long we will actually wait in FIN-WAIT-2. Linux seems to cap it with TIME_WAIT_LEN and it's not completely obvious as to why it's done that way. For now I think we can ignore that and fix it if it really is an issue. PiperOrigin-RevId: 328324922
2020-09-09Fix deadlock in gofer direct IO.Dean Deng
Fixes several java runtime tests: java/nio/channels/FileChannel/directio/ReadDirect.java java/nio/channels/FileChannel/directio/PreadDirect.java Updates #3576. PiperOrigin-RevId: 328281849
2020-09-09Flush in fsimpl/gofer.regularFileFD.OnClose() if there are no dirty pages.Jamie Liu
This is closer to indistinguishable from VFS1 behavior. PiperOrigin-RevId: 328256068
2020-09-09Bump build constraints to 1.17Michael Pratt
This enables pre-release testing with 1.16. The intention is to replace these with a nogo check before the next release. PiperOrigin-RevId: 328193911
2020-09-09Update inotify documentation for gofer filesystem.Dean Deng
We now allow hard links to be created within gofer fs (see github.com/google/gvisor/commit/f20e63e31b56784c596897e86f03441f9d05f567). Update the inotify documentation accordingly. PiperOrigin-RevId: 328177485
2020-09-09Implement GetFilesystem for verity fsgVisor bot
verity GetFilesystem is implemented by mounting the underlying file system, saving the mount, and storing both the underlying root dentry and root Merkle file dentry in verity's root dentry. PiperOrigin-RevId: 327959334
2020-09-09[vfs] Allow mountpoint to be an existing non-directory.Ayush Ranjan
Unlike linux mount(2), OCI spec allows mounting on top of an existing non-directory file. PiperOrigin-RevId: 327914342
2020-09-09Provide fdReader/Writer for FileDescriptiongVisor bot
fdReader/Writer implements io.Reader/Writer so that they can be passed to Merkle tree library. PiperOrigin-RevId: 327901376
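One common way to build such an adapter is to wrap positional reads and track the offset. The sketch below does this over the standard `io.ReaderAt` interface, which stands in for a file description here; `offsetReader` and `readViaAdapter` are hypothetical names, and the real fdReader may be structured differently.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// offsetReader adapts an io.ReaderAt (standing in for a file description
// that supports positional reads) to io.Reader by tracking the offset.
type offsetReader struct {
	src io.ReaderAt
	off int64
}

// Read implements io.Reader.
func (r *offsetReader) Read(p []byte) (int, error) {
	n, err := r.src.ReadAt(p, r.off)
	r.off += int64(n)
	return n, err
}

// readViaAdapter reads all of s through the adapter, as a library that
// only accepts an io.Reader would.
func readViaAdapter(s string) (string, error) {
	r := &offsetReader{src: strings.NewReader(s)}
	data, err := io.ReadAll(r)
	return string(data), err
}

func main() {
	out, err := readViaAdapter("merkle tree input")
	fmt.Println(out, err)
}
```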
2020-09-09Internal change.gVisor bot
PiperOrigin-RevId: 327892274
2020-09-09Clarify seek behaviour for kernfs.GenericDirectoryFD.Rahat Mahmood
- Remove comment about GenericDirectoryFD not being compatible with dynamic directories. It is currently being used to implement dynamic directories.
- Try to handle SEEK_END better than setting the offset to infinity. SEEK_END is poorly defined for dynamic directories anyway, so at least try to make it work correctly for the static entries.
Updates #1193. PiperOrigin-RevId: 327890128
2020-09-09Pass overlay credentials via context in copy up.Nicolas Lacasse
Some VFS operations (those which operate on FDs) get their credentials via the context instead of via an explicit creds param. For these cases, we must pass the overlay credentials on the context. PiperOrigin-RevId: 327881259
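The pattern can be sketched with the standard library's `context` package as an analogy; gVisor uses its own context type, and every name below (`Credentials`, `withCredentials`, `fdOperation`, `copyUpAs`) is hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// credsKey is an unexported key type, so other packages cannot collide
// with it.
type credsKey struct{}

// Credentials is a hypothetical stand-in for a credentials object.
type Credentials struct{ UID uint32 }

// withCredentials returns a child context carrying c.
func withCredentials(ctx context.Context, c *Credentials) context.Context {
	return context.WithValue(ctx, credsKey{}, c)
}

// fdOperation stands in for an FD-based operation that has no explicit
// creds parameter and must read the credentials from the context.
func fdOperation(ctx context.Context) (uint32, bool) {
	c, ok := ctx.Value(credsKey{}).(*Credentials)
	if !ok {
		return 0, false
	}
	return c.UID, true
}

// copyUpAs runs fdOperation under the overlay's credentials rather than
// the caller's by overriding the value carried on the context.
func copyUpAs(caller, overlay *Credentials) uint32 {
	ctx := withCredentials(context.Background(), caller)
	ctx = withCredentials(ctx, overlay)
	uid, _ := fdOperation(ctx)
	return uid
}

func main() {
	fmt.Println(copyUpAs(&Credentials{UID: 1000}, &Credentials{UID: 0}))
}
```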
2020-09-09Make mounts ReadWrite first, then later change to ReadOnly.Nicolas Lacasse
This lets us create "synthetic" mountpoint directories in ReadOnly mounts during VFS setup. Also add context.WithMountNamespace, as some filesystems (like overlay) require a MountNamespace on ctx to handle vfs.Filesystem Operations. PiperOrigin-RevId: 327874971
2020-09-09Fix parent directory creation in CreateDeviceFile.Nicolas Lacasse
It was not properly creating recursive directories. Added tests for this case. Updates #1196 PiperOrigin-RevId: 327850811
2020-09-09[vfs] Create recursive dir creation util.Ayush Ranjan
Refactored the recursive dir creation util in runsc/boot/vfs.go to be more flexible. PiperOrigin-RevId: 327719100
2020-09-09Add reference count checking to the fsimpl/host package.Dean Deng
Includes a minor refactor for inode construction. Updates #1486. PiperOrigin-RevId: 327694933
2020-09-09Consistent precondition formattingMichael Pratt
Our "Preconditions:" blocks are very useful to determine the input invariants, but they are a bit inconsistent throughout the codebase, which makes them harder to read (particularly cases with 5+ conditions in a single paragraph). I've reformatted all of the cases to fit in simple rules:
1. Cases with a single condition are placed on a single line.
2. Cases with multiple conditions are placed in a bulleted list.
This format has been added to the style guide. I've also mentioned "Postconditions:", though those are much less frequently used, and all uses already match this style. PiperOrigin-RevId: 327687465
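The two rules might look like this in a doc comment. The functions are invented for illustration, and the `*` bullet character is an assumption about the style guide rather than something stated in this message.

```go
package main

import "fmt"

// clamp limits v to the range [lo, hi].
//
// Preconditions: lo <= hi.
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// copyRange copies src[start:end] into dst and returns the number of
// bytes copied.
//
// Preconditions:
// * 0 <= start <= end <= len(src).
// * len(dst) >= end-start.
func copyRange(dst, src []byte, start, end int) int {
	return copy(dst, src[start:end])
}

func main() {
	fmt.Println(clamp(5, 0, 3))
	dst := make([]byte, 3)
	n := copyRange(dst, []byte("abcdef"), 1, 4)
	fmt.Println(n, string(dst))
}
```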
2020-09-09Skip listening TCP ports when trying to bind a free port.Bhasker Hariharan
PiperOrigin-RevId: 327686558
2020-09-09Fix tabs in lock-ordering doc.Nicolas Lacasse
PiperOrigin-RevId: 327654207
2020-09-09Remove path walk from localFile.MknodFabricio Voznika
Replace mknod call with mknodat equivalent to protect against symlink attacks. Also added Mknod tests. Remove goferfs reliance on gofer to check for file existence before creating a synthetic entry. Updates #2923 PiperOrigin-RevId: 327544516
2020-09-09ip6tables: move ipv4-specific logic into its own fileKevin Krakauer
A later change will introduce the equivalent IPv6 logic. #3549 PiperOrigin-RevId: 327499064
2020-08-19Return appropriate errors when file locking is unsuccessful.Dean Deng
test_eintr now passes in the Python runtime tests. Updates #3515. PiperOrigin-RevId: 327441081
2020-08-19[vfs] Allow offsets for special files other than regular files.Ayush Ranjan
Some character and block devices can be seekable. So allow their FD to maintain file offset. PiperOrigin-RevId: 327370684
2020-08-19Get rid of kernfs.Inode.Destroy.Dean Deng
This interface method is unneeded. PiperOrigin-RevId: 327370325
2020-08-19Move ERESTART* error definitions to syserror package.Dean Deng
This is needed to avoid circular dependencies between the vfs and kernel packages. PiperOrigin-RevId: 327355524
2020-08-19Don't set atime if mount is readonlyFabricio Voznika
Updates #1035 PiperOrigin-RevId: 327351475
2020-08-19Add more information to panic when device ID don't matchFabricio Voznika
PiperOrigin-RevId: 327351357
2020-08-19Avoid holding locks when opening files in VFS2.Jamie Liu
Fixes #3243, #3521 PiperOrigin-RevId: 327308890
2020-08-19[vfs2] Implement /proc/sys/net/ipv4/tcp_rmem and /proc/sys/net/ipv4/tcp_wmem.Ayush Ranjan
Updates #1035 PiperOrigin-RevId: 327253907
2020-08-19Add a skeleton for verity file systemgVisor bot
PiperOrigin-RevId: 327123477
2020-08-19Stop masking the IO error in handleIOError.Nicolas Lacasse
PiperOrigin-RevId: 327123331
2020-08-19[vfs] Do O_DIRECTORY check after resolving symlinks.Ayush Ranjan
Fixes python runtime test test_glob. Updates #3515. We were checking whether the to-be-opened dentry is a directory before resolving symlinks. We should check that after resolving symlinks. This was preventing us from opening, with O_DIRECTORY, a symlink that pointed to a directory. Also added this check in tmpfs and removed a duplicate check. PiperOrigin-RevId: 327085895
2020-08-19Remove weak references from unix sockets.Dean Deng
The abstract socket namespace no longer holds any references on sockets. Instead, TryIncRef() is used when a socket is being retrieved in BoundEndpoint(). Abstract sockets are now responsible for removing themselves from the namespace they are in, when they are destroyed. Updates #1486. PiperOrigin-RevId: 327064173
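A minimal sketch of this arrangement, assuming a compare-and-swap based `TryIncRef`: the namespace map holds no reference of its own, lookups succeed only while the count is still positive, and the object removes itself from the map when it is destroyed. The `socket` and `namespace` types are hypothetical simplifications, not the gVisor implementation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// namespace maps names to sockets without holding a reference on them.
type namespace struct {
	mu      sync.Mutex
	sockets map[string]*socket
}

// socket is a hypothetical refcounted object registered in a namespace.
type socket struct {
	name string
	ns   *namespace
	refs int32 // accessed atomically; starts at 1 for the creator
}

func newNamespace() *namespace {
	return &namespace{sockets: make(map[string]*socket)}
}

// bind creates a socket and registers it; the map entry is not a reference.
func (ns *namespace) bind(name string) *socket {
	s := &socket{name: name, ns: ns, refs: 1}
	ns.mu.Lock()
	ns.sockets[name] = s
	ns.mu.Unlock()
	return s
}

// TryIncRef takes a reference only if the socket is not already dead.
func (s *socket) TryIncRef() bool {
	for {
		n := atomic.LoadInt32(&s.refs)
		if n <= 0 {
			return false
		}
		if atomic.CompareAndSwapInt32(&s.refs, n, n+1) {
			return true
		}
	}
}

// DecRef drops a reference; the last holder removes the socket from its
// namespace.
func (s *socket) DecRef() {
	if atomic.AddInt32(&s.refs, -1) != 0 {
		return
	}
	s.ns.mu.Lock()
	if s.ns.sockets[s.name] == s {
		delete(s.ns.sockets, s.name)
	}
	s.ns.mu.Unlock()
}

// lookup returns a referenced socket, or nil if none is bound or it is
// being destroyed.
func (ns *namespace) lookup(name string) *socket {
	ns.mu.Lock()
	defer ns.mu.Unlock()
	s, ok := ns.sockets[name]
	if !ok || !s.TryIncRef() {
		return nil
	}
	return s
}

func main() {
	ns := newNamespace()
	s := ns.bind("demo")
	peer := ns.lookup("demo")
	fmt.Println(peer == s)
	peer.DecRef()
	s.DecRef()
	fmt.Println(ns.lookup("demo") == nil)
}
```

The `TryIncRef` check matters in the window between the count reaching zero and the entry being deleted: a concurrent lookup in that window sees the entry but cannot revive the object.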
2020-08-19[vfs] Return EIO when opening /dev/tty.Ayush Ranjan
This is in compliance with VFS1. See pkg/sentry/fs/dev/tty.go in the struct ttyInodeOperations. Fixes the failure of python runtime test_ioctl. Updates #3515 PiperOrigin-RevId: 327042758
2020-08-13[vfs2][gofer] Fix file creation flags sent to gofer.Ayush Ranjan
Fixes php runtime test ext/standard/tests/file/readfile_basic.phpt
Fixes #3516
fsgofers only want the access mode in the OpenFlags passed to Create(). If more flags are supplied (like O_APPEND in this case), read/write from that fd will fail with EBADF. See runsc/fsgofer/fsgofer.go:WriteAt()
VFS2 was providing more than just access modes. So filtering the flags using p9.OpenFlagsModeMask == linux.O_ACCMODE fixes the issue.
Gofer in VFS1 also only extracts the access mode flags while making the create RPC. See pkg/sentry/fs/gofer/path.go:Create()
Even in VFS2, when we open a handle, we extract out only the access mode flags + O_TRUNC. See third_party/gvisor/pkg/sentry/fsimpl/gofer/handle.go:openHandle()
Added a test for this.
PiperOrigin-RevId: 326574829
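The masking step itself is a single bitwise AND. The sketch below copies the Linux open(2) flag values into local constants so that it is self-contained; `accessModeOnly` is a hypothetical helper name, not the function changed by this commit.

```go
package main

import "fmt"

// Linux open(2) flag values, copied here so the sketch does not depend on
// a platform-specific package.
const (
	oRDONLY  = 0x0
	oWRONLY  = 0x1
	oRDWR    = 0x2
	oACCMODE = 0x3
	oAPPEND  = 0x400
)

// accessModeOnly strips every flag except the access mode bits.
func accessModeOnly(flags uint32) uint32 {
	return flags & oACCMODE
}

func main() {
	flags := uint32(oWRONLY | oAPPEND)
	fmt.Printf("%#x -> %#x\n", flags, accessModeOnly(flags))
}
```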
2020-08-13Migrate to PacketHeader API for PacketBuffer.Ting-Yu Wang
Formerly, when a packet was constructed or parsed, all headers were set by the client code. This almost always involved prepending to the pk.Header buffer or trimming the pk.Data portion. This is known to be prone to bugs, due to the complexity and number of the invariants assumed across netstack.
In the new PacketHeader API, the client calls the Push()/Consume() method to construct/parse an outgoing/incoming packet. All invariants, such as slicing and trimming, are maintained by the API itself.
NewPacketBuffer() is introduced to create a new PacketBuffer. The zero value is no longer valid.
PacketBuffer now assumes the packet is a concatenation of the following portions:
* LinkHeader
* NetworkHeader
* TransportHeader
* Data
Any of them could be empty, or zero-length.
PiperOrigin-RevId: 326507688
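To illustrate the push/consume idea in isolation, here is a toy buffer that reserves headroom for headers and keeps the slicing arithmetic in one place. It is not the PacketBuffer API: the `packet` type and its `push`, `consume`, and `bytes` methods are hypothetical, and the real API tracks each header portion separately.

```go
package main

import "fmt"

// packet is a toy buffer laid out as [unused headroom | headers | data].
type packet struct {
	buf  []byte
	head int // index of the first used byte
}

// newPacket reserves headroom for headers and copies in the payload.
func newPacket(reserve int, data []byte) *packet {
	buf := make([]byte, reserve+len(data))
	copy(buf[reserve:], data)
	return &packet{buf: buf, head: reserve}
}

// push claims n bytes in front of the current contents and returns them
// for the caller to fill in (outgoing path).
func (p *packet) push(n int) []byte {
	if n > p.head {
		panic("not enough headroom reserved")
	}
	p.head -= n
	return p.buf[p.head : p.head+n]
}

// consume removes n bytes from the front and returns them (incoming
// path). It reports false if fewer than n bytes remain.
func (p *packet) consume(n int) ([]byte, bool) {
	if n > len(p.buf)-p.head {
		return nil, false
	}
	h := p.buf[p.head : p.head+n]
	p.head += n
	return h, true
}

// bytes returns the headers pushed so far followed by the data.
func (p *packet) bytes() []byte { return p.buf[p.head:] }

func main() {
	p := newPacket(4, []byte("hi"))
	copy(p.push(2), "T:") // transport header is pushed first
	copy(p.push(2), "N:") // network header ends up in front of it
	fmt.Println(string(p.bytes()))
	h, _ := p.consume(2)
	fmt.Println(string(h), string(p.bytes()))
}
```

Callers never compute offsets themselves, which is the property the message attributes to the new API.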