Age Commit message Author
2020-06-25conntrack refactor, no behavior changesKevin Krakauer
- Split connTrackForPacket into 2 functions instead of switching on flag
- Replace hash with struct keys.
- Remove prefixes where possible
- Remove unused connStatus, timeout
- Flatten ConnTrack struct a bit: some intermediate structs had no meaning outside of the context of their parent.
- Protect conn.tcb with a mutex
- Remove redundant error checking (e.g. when is pkt.NetworkHeader valid)
- Clarify that HandlePacket and CreateConnFor are the expected entrypoints for ConnTrack
PiperOrigin-RevId: 318407168
2020-06-25Avoid an allocation in epollTamir Duberstein
PiperOrigin-RevId: 318346153
2020-06-25Drop unused markdown links.Adin Scannell
PiperOrigin-RevId: 318284693
2020-06-24Fix procfs bugs in vfs2.Dean Deng
- Support writing on proc/[pid]/{uid,gid}map
- Return EIO for writing to static files.
Updates #2923.
PiperOrigin-RevId: 318188503
2020-06-24Internal change.gVisor bot
PiperOrigin-RevId: 318180382
2020-06-24Port /dev/net/tun device to VFS2.Nicolas Lacasse
Updates #2912 #1035 PiperOrigin-RevId: 318162565
2020-06-24Remove waiter.Entry.ContextTamir Duberstein
This field is redundant since state can be stored in the callback. PiperOrigin-RevId: 318134855
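The idea that per-waiter state can live in the callback rather than in a separate context field can be sketched with a Go closure. This is only an illustration: `entry` and `newCountingEntry` are hypothetical names, not the actual waiter package API.

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for a wait-queue entry.
// It carries only a callback; there is no separate context field.
type entry struct {
	callback func(mask uint64)
}

// newCountingEntry returns an entry whose callback keeps its own state
// (a counter) in a closure, plus a function that reads the counter back.
func newCountingEntry() (*entry, func() int) {
	count := 0
	e := &entry{callback: func(mask uint64) {
		// Count only notifications that carry at least one event bit.
		if mask != 0 {
			count++
		}
	}}
	return e, func() int { return count }
}

func main() {
	e, fired := newCountingEntry()
	e.callback(1)
	e.callback(0)
	e.callback(4)
	fmt.Println("non-empty notifications:", fired())
}
```

Because the counter is captured by the closure, the entry itself needs no extra field to carry it.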
2020-06-24Add support for Stack level options.Bhasker Hariharan
Linux controls socket send/receive buffers using a few sysctl variables:
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_max
- net.core.wmem_default
- net.ipv4.tcp_rmem
- net.ipv4.tcp_wmem
The first 4 control the default socket buffer sizes for all sockets (raw/packet/tcp/udp) and also the maximum permitted socket buffer that can be specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...). The last two control the TCP auto-tuning limits and override the default specified in rmem_default/wmem_default as well as the max limits.
Netstack today only implements tcp_rmem/tcp_wmem and incorrectly uses them to limit the maximum size in setsockopt() as well as for raw/udp sockets. This changelist introduces the other 4 and updates the udp/raw sockets to use the newly introduced variables. The values for min/max match the current tcp_rmem/wmem values, and the default buffer size for UDP/RAW sockets is updated to match the Linux value of 212 KiB, up from the really low current value of 32 KiB.
Updates #3043 Fixes #3043
PiperOrigin-RevId: 318089805
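The relationship between a stack-wide min/default/max triple and a size requested through setsockopt() can be sketched as a simple clamp. This is a minimal illustration, not the netstack implementation: `bufferSizeRange` and `clamp` are hypothetical names, and the Min and Max values are placeholders. Only the 212 KiB default comes from the commit message.

```go
package main

import "fmt"

// bufferSizeRange is a hypothetical type for this sketch; it is not the
// netstack option type introduced by the change.
type bufferSizeRange struct {
	Min, Default, Max int
}

// clamp limits a size requested via setsockopt to the range [Min, Max].
func (r bufferSizeRange) clamp(requested int) int {
	switch {
	case requested < r.Min:
		return r.Min
	case requested > r.Max:
		return r.Max
	default:
		return requested
	}
}

func main() {
	// Min and Max are illustrative placeholders. Default is the 212 KiB
	// figure quoted in the commit message for UDP/RAW sockets.
	stackWide := bufferSizeRange{Min: 4 << 10, Default: 212 << 10, Max: 4 << 20}
	for _, req := range []int{1, 64 << 10, 64 << 20} {
		fmt.Printf("requested %d -> effective %d\n", req, stackWide.clamp(req))
	}
	fmt.Println("default:", stackWide.Default)
}
```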
2020-06-23Support for saving pointers to fields in the state package.Adin Scannell
Previously, it was not possible to encode/decode an object graph which contained a pointer to a field within another type. This was because the encoder was unable to disambiguate a pointer to an object from a pointer within the object.
This CL remedies this by constructing an address map that tracks the full memory range each object occupies. The encoded Refvalue message has been extended to allow references to children objects within another object. Because the encoding process may learn about object structure over time, we cannot encode any objects until the entire graph has been generated.
This CL also updates the state package to use standard interfaces instead of reflection-based dispatch in order to improve performance overall. This includes a custom wire protocol to significantly reduce the number of allocations and take advantage of structure packing.
As part of these changes, there are a small number of minor changes in other places of the code base:
* The lists used during encoding are changed to use intrusive lists with the objectEncodeState directly, which required that the ilist Len() method be updated to work properly with the ElementMapper mechanism.
* A bug is fixed in the list code wherein Remove() called on an element that is already removed can corrupt the list (removing the element if there's only a single element). Now the behavior is correct.
* Standard error wrapping is introduced.
* Compressio was updated to implement the new wire.Reader and wire.Writer interface methods directly. The lack of a ReadByte and WriteByte caused issues not due to interface dispatch, but because underlying slices for a Read or Write call through an interface would always escape to the heap!
* Statify has been updated to support the new APIs.
See README.md for a description of how the new mechanism works.
PiperOrigin-RevId: 318010298
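The address-map idea, telling a pointer to an object apart from a pointer into the middle of one, can be sketched with a sorted list of non-overlapping address ranges. This is an assumption-laden illustration, not the state package's data structure: `addrRange`, `addrMap`, `add`, and `lookup` are hypothetical names, and the addresses are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// addrRange records the half-open memory range [start, end) occupied by
// one object, identified here by a small integer id.
type addrRange struct {
	start, end uintptr
	id         int
}

// addrMap holds non-overlapping ranges sorted by start address.
type addrMap struct {
	ranges []addrRange
}

// add registers an object occupying size bytes starting at start.
// The caller is assumed not to register overlapping objects.
func (m *addrMap) add(start, size uintptr, id int) {
	r := addrRange{start: start, end: start + size, id: id}
	i := sort.Search(len(m.ranges), func(i int) bool {
		return m.ranges[i].start >= r.start
	})
	m.ranges = append(m.ranges, addrRange{})
	copy(m.ranges[i+1:], m.ranges[i:])
	m.ranges[i] = r
}

// lookup resolves an address to the containing object and the offset
// within it. An offset of zero means the pointer refers to the object
// itself; a non-zero offset means it points at something inside it.
func (m *addrMap) lookup(addr uintptr) (id int, offset uintptr, ok bool) {
	i := sort.Search(len(m.ranges), func(i int) bool {
		return m.ranges[i].end > addr
	})
	if i == len(m.ranges) || m.ranges[i].start > addr {
		return 0, 0, false
	}
	return m.ranges[i].id, addr - m.ranges[i].start, true
}

func main() {
	var m addrMap
	m.add(0x2000, 32, 2)
	m.add(0x1000, 16, 1)

	for _, addr := range []uintptr{0x1000, 0x1008, 0x1800, 0x201f} {
		id, off, ok := m.lookup(addr)
		fmt.Printf("addr %#x: id=%d offset=%d found=%v\n", addr, id, off, ok)
	}
}
```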
2020-06-23Resolve remaining inotify TODOs.Dean Deng
Also refactor HandleDeletion(). Updates #1479. PiperOrigin-RevId: 317989000
2020-06-23Clean up hostfs TODOs.Dean Deng
This CL does a handful of things:
- Support O_DSYNC, O_SYNC
- Support O_APPEND and document an unavoidable race condition
- Ignore O_DIRECT; we probably don't want to allow applications to set O_DIRECT on the host fd itself.
- Leave a TODO for supporting O_NONBLOCK, which is a simple fix once RWF_NOWAIT is supported.
- Get rid of caching TODO; force_page_cache is not configurable for host fs in vfs1 or vfs2 after whitelist fs was removed.
- For the remaining TODOs, link to more specific bugs.
Fixes #1672.
PiperOrigin-RevId: 317985269
2020-06-23Add support for SO_REUSEADDR to TCP sockets/endpoints.Ian Gudger
For TCP sockets, SO_REUSEADDR relaxes the rules for binding addresses. gVisor/netstack already supported a behavior similar to SO_REUSEADDR, but did not allow disabling it. This change brings the SO_REUSEADDR behavior closer to the behavior implemented by Linux and adds a new SO_REUSEADDR disabled behavior. Like Linux, SO_REUSEADDR is now disabled by default. PiperOrigin-RevId: 317984380
2020-06-23Port /dev/tty device to VFS2.Nicolas Lacasse
Support is limited to the functionality that exists in VFS1. Updates #2923 #1035 PiperOrigin-RevId: 317981417
2020-06-23Complete inotify IN_EXCL_UNLINK implementation in VFS2.Dean Deng
Events were only skipped on parent directories after their children were unlinked; events on the unlinked file itself need to be skipped as well. As a result, all Watches.Notify() calls need to know whether the dentry where the call came from was unlinked. Updates #1479. PiperOrigin-RevId: 317979476
2020-06-23Nit fix: Create and use a std::string object for `const char*`.Ting-Yu Wang
PiperOrigin-RevId: 317973144
2020-06-23Support inotify in vfs2 gofer fs.Dean Deng
Because there is no inode structure stored in the sandbox, inotify watches must be held on the dentry. This would be an issue in the presence of hard links, where multiple dentries would need to share the same set of watches, but in VFS2, we do not support the internal creation of hard links on gofer fs. As a result, we make the assumption that every dentry corresponds to a unique inode.
Furthermore, dentries can be cached and then evicted, even if the underlying file has not been deleted. We must prevent this from occurring if there are any watches that would be lost. Note that if the dentry was deleted or invalidated (d.vfsd.IsDead()), we should still destroy it along with its watches. Additionally, when a dentry’s last watch is removed, we cache it if it also has zero references. This way, the dentry can eventually be evicted from memory if it is no longer needed. This is accomplished with a new dentry method, OnZeroWatches(), which is called by Inotify.RmWatch and Inotify.Release. Note that it must be called after all inotify locks are released to avoid violating lock order.
Stress tests are added to make sure that inotify operations don't deadlock with gofer.OnZeroWatches.
Updates #1479.
PiperOrigin-RevId: 317958034
2020-06-23Deflake proc test: Don't fail on DT_UNKNOWN.Ting-Yu Wang
Per manual page: "All applications must properly handle a return of DT_UNKNOWN." PiperOrigin-RevId: 317957013
2020-06-23Port readahead to VFS2.Nicolas Lacasse
It preserves the same functionality (almost none) as in VFS1. Updates #2923 #1035 PiperOrigin-RevId: 317943522
2020-06-23Internal change.gVisor bot
PiperOrigin-RevId: 317941748
2020-06-23Merge pull request #2272 from lubinszARM:pr_serr_injectiongVisor bot
PiperOrigin-RevId: 317933650
2020-06-22Only allow regular files, sockets, pipes, and char devices to be imported.Dean Deng
PiperOrigin-RevId: 317796028
2020-06-22Fix the way PR build clones gVisor.Ayush Ranjan
Copybara force-pushes to the PR immediately before merging, which triggers a PR build. Since the PR is merged, the refspec +refs/pull/{pr_num}/merge is not available and the build fails, causing all master commit CI builds to show a failure. This change removes the clone step from travis and clones manually in a way that always succeeds. We fetch +refs/pull/{pr_num}/head and cherry-pick that onto the target branch. I have tested this in https://github.com/ayushr2/gvisor/pull/1 and https://github.com/ayushr2/gvisor/pull/2. PiperOrigin-RevId: 317759891
2020-06-22Check for invalid trailing / when traversing path in gofer OpenAt.Dean Deng
Updates #2923. PiperOrigin-RevId: 317700049
2020-06-22Extract common nested LinkEndpoint patternBruno Dal Bo
... and unify logic for detached nested endpoints. sniffer.go caused crashes if a packet delivery was attempted when the dispatcher is nil. Extracted the endpoint nesting logic into a common composable type so it can be used by the Fuchsia Netstack (the pattern is widespread there). PiperOrigin-RevId: 317682842
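The crash described above, delivering a packet while no dispatcher is attached, can be avoided by a wrapper that checks for a nil dispatcher before forwarding. The following is a minimal sketch under that assumption. `dispatcher`, `nestedEndpoint`, and `countingDispatcher` are hypothetical names and do not reproduce the type extracted by this change.

```go
package main

import (
	"fmt"
	"sync"
)

// dispatcher is a hypothetical, reduced version of a network dispatcher.
type dispatcher interface {
	DeliverPacket(pkt []byte)
}

// nestedEndpoint wraps an upper-layer dispatcher that may be detached
// (nil) at any time.
type nestedEndpoint struct {
	mu   sync.RWMutex
	disp dispatcher // guarded by mu; nil when detached
}

// Attach sets the dispatcher; passing nil detaches the endpoint.
func (n *nestedEndpoint) Attach(d dispatcher) {
	n.mu.Lock()
	n.disp = d
	n.mu.Unlock()
}

// DeliverPacket forwards pkt if a dispatcher is attached and reports
// whether the packet was delivered. A detached endpoint drops the packet
// instead of dereferencing a nil dispatcher.
func (n *nestedEndpoint) DeliverPacket(pkt []byte) bool {
	n.mu.RLock()
	d := n.disp
	n.mu.RUnlock()
	if d == nil {
		return false
	}
	d.DeliverPacket(pkt)
	return true
}

// countingDispatcher counts delivered packets for the demonstration.
type countingDispatcher struct{ packets int }

func (c *countingDispatcher) DeliverPacket(pkt []byte) { c.packets++ }

func main() {
	ep := &nestedEndpoint{}
	fmt.Println("delivered while detached:", ep.DeliverPacket([]byte{1}))

	sink := &countingDispatcher{}
	ep.Attach(sink)
	fmt.Println("delivered while attached:", ep.DeliverPacket([]byte{2}), "count:", sink.packets)

	ep.Attach(nil)
	fmt.Println("delivered after detach:", ep.DeliverPacket([]byte{3}), "count:", sink.packets)
}
```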
2020-06-22Allow readdir(/proc/[tid]/net) to return EINVAL on a zombie task.Nicolas Lacasse
Despite what the man page says, Linux will return EINVAL when calling getdents() on a /proc/[tid]/net file corresponding to a zombie task. This causes readdir() to return a null pointer AND errno=EINVAL. See fs/proc/proc_net.c:proc_tgid_net_readdir() for where this occurs. We have tests that recursively read /proc, and are likely to hit this when running natively, so we must catch and handle this case. PiperOrigin-RevId: 317674168
2020-06-21Fix vfs2 extended attributes.Dean Deng
Correct behavior when given zero size arguments and trying to set user.* xattrs on files other than regular files or directories. Updates #2923. PiperOrigin-RevId: 317590409
2020-06-19Enable passing vfs2 tests.Dean Deng
I forgot to update getdents earlier. Several thousand runs of the fsync and proc_net_unix tests all passed as well. Updates #2923. PiperOrigin-RevId: 317415488
2020-06-19Fix bugs in vfs2 to make symlink tests pass.Dean Deng
- Return ENOENT if target path is empty.
- Make sure open(2) with O_CREAT|O_EXCL returns EEXIST when necessary.
- Correctly update atime in tmpfs using touchATime().
Updates #2923.
PiperOrigin-RevId: 317382655
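The O_CREAT|O_EXCL contract mentioned above can be observed from a Go program: the first exclusive create succeeds and the second fails with an "already exists" error. This is a small sketch using only the standard library. `createExclusive` and `demo` are hypothetical helper names, and the temporary directory is created only for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// createExclusive creates path with O_CREAT|O_EXCL, so the call fails if
// the file already exists.
func createExclusive(path string) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	return f.Close()
}

// demo creates the same file twice in a fresh temporary directory and
// reports whether the first call succeeded and whether the second call
// failed with an "already exists" error (EEXIST on Linux).
func demo() (firstOK, secondExists bool, err error) {
	dir, err := os.MkdirTemp("", "excl-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "file")
	firstOK = createExclusive(path) == nil
	secondExists = errors.Is(createExclusive(path), os.ErrExist)
	return firstOK, secondExists, nil
}

func main() {
	firstOK, secondExists, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("first create succeeded:", firstOK)
	fmt.Println("second create reported EEXIST:", secondExists)
}
```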
2020-06-19Use internal tmpfs in test runner, even when running with overlay.Nicolas Lacasse
PiperOrigin-RevId: 317377571
2020-06-19Fix vfs2 proc/self/fd dirent iteration.Dean Deng
Make proc/self/fd iteration work properly. Also, the comment on kernfs.Inode.IterDirents did not accurately reflect how parameters should be used/were used in kernfs.Inode impls other than fdDir. Updates #2923. PiperOrigin-RevId: 317370325
2020-06-19Port fadvise64 to vfs2.Dean Deng
Like vfs1, we have a trivial implementation that ignores all valid advice. Updates #2923. PiperOrigin-RevId: 317349505
2020-06-19Implement UDP checksum verification.gVisor bot
Test:
- TestIncrementChecksumErrors
Fixes #2943
PiperOrigin-RevId: 317348158
2020-06-19Fix vfs2 handling of preadv2/pwritev2 flags.Dean Deng
Check for unsupported flags, and silently support RWF_HIPRI by doing nothing. From pkg/abi/linux/file.go: "gVisor does not implement the RWF_HIPRI feature, but the flag is accepted as a valid flag argument for preadv2/pwritev2." Updates #2923. PiperOrigin-RevId: 317330631
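The flag check described above can be sketched as a mask test: reject any bit outside the accepted set, and accept RWF_HIPRI without acting on it. The flag values below are the ones Linux defines in include/uapi/linux/fs.h. The set of accepted flags and the names `supportedFlags`, `errUnsupportedFlags`, and `checkFlags` are assumptions made for this illustration, not the code from this change.

```go
package main

import (
	"errors"
	"fmt"
)

// preadv2/pwritev2 flag values as defined by Linux. The names mirror the
// C constants rather than Go naming conventions.
const (
	RWF_HIPRI  = 0x00000001
	RWF_DSYNC  = 0x00000002
	RWF_SYNC   = 0x00000004
	RWF_NOWAIT = 0x00000008
	RWF_APPEND = 0x00000010
)

// supportedFlags is an assumption for this sketch: only RWF_HIPRI is
// accepted, and it is accepted by doing nothing.
const supportedFlags = RWF_HIPRI

// errUnsupportedFlags is a hypothetical error value for the sketch.
var errUnsupportedFlags = errors.New("unsupported preadv2/pwritev2 flags")

// checkFlags returns an error if flags contains any bit outside
// supportedFlags.
func checkFlags(flags uint32) error {
	if flags&^supportedFlags != 0 {
		return errUnsupportedFlags
	}
	return nil
}

func main() {
	fmt.Println("no flags:", checkFlags(0))
	fmt.Println("RWF_HIPRI:", checkFlags(RWF_HIPRI))
	fmt.Println("RWF_HIPRI|RWF_NOWAIT:", checkFlags(RWF_HIPRI|RWF_NOWAIT))
}
```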
2020-06-19Don't adjust parent link count if we replace a child dir with another.Dean Deng
Updates #2923. PiperOrigin-RevId: 317314460
2020-06-19Support all seek options in gofer specialFileFD.Seek.Dean Deng
Updates #2923. PiperOrigin-RevId: 317298186
2020-06-19Fix synthetic file bugs in gofer fs.Dean Deng
Always check if a synthetic file already exists at a location before creating a file there, and do not try to delete synthetic gofer files from the remote fs. This fixes runsc_ptrace socket tests that create/unlink synthetic, named socket files. Updates #2923. PiperOrigin-RevId: 317293648
2020-06-18Fix vfs2 tmpfs link permission checks.Dean Deng
Updates #2923. PiperOrigin-RevId: 317246916
2020-06-18socket/unix: (*connectionedEndpoint).State() has to take the endpoint lockAndrei Vagin
It accesses e.receiver which is protected by the endpoint lock.
WARNING: DATA RACE
Write at 0x00c0006aa2b8 by goroutine 189:
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect.func1()
      pkg/sentry/socket/unix/transport/connectioned.go:359 +0x50
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).BidirectionalConnect()
      pkg/sentry/socket/unix/transport/connectioned.go:327 +0xa3c
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect()
      pkg/sentry/socket/unix/transport/connectioned.go:363 +0xca
  pkg/sentry/socket/unix.(*socketOpsCommon).Connect()
      pkg/sentry/socket/unix/unix.go:420 +0x13a
  pkg/sentry/socket/unix.(*SocketOperations).Connect()
      <autogenerated>:1 +0x78
  pkg/sentry/syscalls/linux.Connect()
      pkg/sentry/syscalls/linux/sys_socket.go:286 +0x251
Previous read at 0x00c0006aa2b8 by goroutine 270:
  pkg/sentry/socket/unix/transport.(*baseEndpoint).Connected()
      pkg/sentry/socket/unix/transport/unix.go:789 +0x42
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).State()
      pkg/sentry/socket/unix/transport/connectioned.go:479 +0x2f
  pkg/sentry/socket/unix.(*socketOpsCommon).State()
      pkg/sentry/socket/unix/unix.go:714 +0xc3e
  pkg/sentry/socket/unix.(*socketOpsCommon).SendMsg()
      pkg/sentry/socket/unix/unix.go:466 +0xc44
  pkg/sentry/socket/unix.(*SocketOperations).SendMsg()
      <autogenerated>:1 +0x173
  pkg/sentry/syscalls/linux.sendTo()
      pkg/sentry/syscalls/linux/sys_socket.go:1121 +0x4c5
  pkg/sentry/syscalls/linux.SendTo()
      pkg/sentry/syscalls/linux/sys_socket.go:1134 +0x87
Reported-by: syzbot+c2be37eedc672ed59a86@syzkaller.appspotmail.com
PiperOrigin-RevId: 317236996
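The shape of the fix, taking the endpoint lock in the reader as well as in the writer, can be sketched as follows. This is a reduced illustration rather than the transport package code: `endpoint`, its `connected` field, and the state strings are hypothetical stand-ins for the real endpoint and its receiver field.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical, reduced stand-in for a connection-oriented
// endpoint.
type endpoint struct {
	mu        sync.Mutex
	connected bool // guarded by mu
}

// Connect marks the endpoint as connected while holding the lock.
func (e *endpoint) Connect() {
	e.mu.Lock()
	e.connected = true
	e.mu.Unlock()
}

// State reads the guarded field under the same lock. Reading it without
// the lock while Connect runs concurrently is the kind of access the Go
// race detector reports.
func (e *endpoint) State() string {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.connected {
		return "connected"
	}
	return "unconnected"
}

func main() {
	e := &endpoint{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); e.Connect() }()
	go func() { defer wg.Done(); _ = e.State() }()
	wg.Wait()
	fmt.Println(e.State())
}
```

Running a program like this with `go run -race` is one way to check that the locked version no longer triggers a report.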
2020-06-18iptables: skip iptables if no rules are setKevin Krakauer
Users that never set iptables rules shouldn't incur the iptables performance cost. Suggested by Ian (@iangudger). PiperOrigin-RevId: 317232921
2020-06-18iptables: remove metadata structKevin Krakauer
Metadata was useful for debugging and safety, but enough tests exist that we should see failures when (de)serialization is broken. It made stack initialization more cumbersome and it's also getting in the way of ip6tables. PiperOrigin-RevId: 317210653
2020-06-18Enable more VFS2 syscall testsFabricio Voznika
Updates #2923 PiperOrigin-RevId: 317185798
2020-06-18Acquire lock when accessing MultiDevice's cache in String().Ting-Yu Wang
PiperOrigin-RevId: 317180925
2020-06-18Ensure ip6tables module installedKevin Krakauer
This module isn't always loaded automatically. PiperOrigin-RevId: 317164471
2020-06-18Remove various uses of 'whitelist'Michael Pratt
Updates #2972 PiperOrigin-RevId: 317113059
2020-06-18Support setsockopt SO_SNDBUF/SO_RCVBUF for raw/udp sockets.Bhasker Hariharan
Updates #173,#6 Fixes #2888 PiperOrigin-RevId: 317087652
2020-06-18Cleanup tcp.timer and tcpip.RouteGhanan Gowripalan
When a tcp.timer or tcpip.Route is no longer used, clean up its resources so that unused memory may be released. PiperOrigin-RevId: 317046582
2020-06-17Implement Sync() to directoriesFabricio Voznika
Updates #1035, #1199 PiperOrigin-RevId: 317028108
2020-06-17Add TempTmpMount testFabricio Voznika
This currently doesn't work with VFS2. Add test to ensure it's not missed. Updates #1487 PiperOrigin-RevId: 317013792
2020-06-17Move mount configuration to RunOptsFabricio Voznika
Separate mount configuration from links and move it to RunOpts, like the other options. PiperOrigin-RevId: 317010158
2020-06-17Increase timeouts for NDP testsGhanan Gowripalan
... to help reduce flakes. When waiting for an event to occur, use a timeout of 10s. When waiting for an event to not occur, use a timeout of 1s. Test: Ran test locally w/ run count of 1000 with and without gotsan. PiperOrigin-RevId: 316998128
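The asymmetric timeouts described above, a long wait when an event is expected and a short wait when it must not occur, can be sketched with a single select-based helper. The 10s and 1s values come from the commit message. `eventOccurs`, `positiveTimeout`, and `negativeTimeout` are hypothetical names, not the ones used in the NDP tests.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// positiveTimeout bounds the wait for an event that should occur.
	positiveTimeout = 10 * time.Second
	// negativeTimeout bounds the wait for an event that should not occur.
	negativeTimeout = 1 * time.Second
)

// eventOccurs reports whether a value arrives on ch before timeout elapses.
func eventOccurs(ch <-chan struct{}, timeout time.Duration) bool {
	select {
	case <-ch:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	events := make(chan struct{}, 1)

	// Expected event: returns as soon as the event arrives, so the long
	// timeout only costs time when the test is about to fail.
	go func() { events <- struct{}{} }()
	fmt.Println("expected event seen:", eventOccurs(events, positiveTimeout))

	// Unexpected event: the full timeout always elapses on success, which
	// is why it is kept short.
	fmt.Println("no unexpected event:", !eventOccurs(events, negativeTimeout))
}
```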