gvisor - Container Runtime Sandbox

Age	Commit message	Author
2019-11-07	Add support for TIME_WAIT timeout.	Bhasker Hariharan
	This change adds explicit support for honoring the 2MSL timeout for sockets in TIME_WAIT state. It also adds support for the TCP_LINGER2 option, which allows modification of the FIN_WAIT2 state timeout duration for a given socket. It also adds an option to modify the Stack-wide TIME_WAIT timeout, but this is only for testing; on Linux this is fixed at 60s.
	Further, we now correctly process RSTs in CLOSE_WAIT and close the socket, as Linux does, without moving it to an error state. We also now handle a SYN in ESTABLISHED state as per RFC5961#section-4.1. Earlier we would just drop these SYNs, which could cause some tests that pass on Linux to fail on gVisor.
	Netstack now honors TIME_WAIT correctly and handles the following cases correctly:
	- TCP RSTs in TIME_WAIT are ignored.
	- A duplicate TCP FIN during TIME_WAIT extends the TIME_WAIT, and a dup ACK is sent in response to the FIN, as the dup FIN indicates potential loss of the original final ACK.
	- An out-of-order segment during TIME_WAIT generates a dup ACK.
	- A new SYN with a sequence number greater than the highest sequence number in the previous connection closes the TIME_WAIT early and opens a new connection.
	Further, to make the SYN case work correctly, the ISN (Initial Sequence Number) generation for Netstack has been updated to follow the RFC. It is no longer a pure random number and follows the recommendation in https://tools.ietf.org/html/rfc6528#page-3. The current hash used is not a cryptographically secure hash function. A separate change will update the hash function to SipHash, similar to what is used in Linux.
	PiperOrigin-RevId: 279106406
2019-11-04	Add NETLINK_KOBJECT_UEVENT socket support	Michael Pratt
	NETLINK_KOBJECT_UEVENT sockets send udev-style messages for device events. gVisor doesn't have any device events, so our sockets don't need to do anything once created. systemd's device manager needs to be able to create one of these sockets. It also wants to install a BPF filter on the socket. Since we'll never send any messages, the filter would never be invoked, thus we just fake it out. Fixes #1117 Updates #1119 PiperOrigin-RevId: 278405893
2019-10-18	Cleanup host UDS support	Michael Pratt
	This change fixes several issues with the fsgofer host UDS support. Notably, it adds support for SOCK_SEQPACKET and SOCK_DGRAM sockets [1]. It also fixes unsafe use of unet.Socket, which could cause a panic if Socket.FD is called when err != nil, and calls to Socket.FD with nothing to prevent the garbage collector from destroying and closing the socket.
	A set of tests is added to exercise host UDS access. This required extracting most of the syscall test runner into a library that can be used by custom tests.
	Updates #235
	Updates #1003
	[1] N.B. SOCK_DGRAM sockets are likely not particularly useful, as a server can only reply to a client that binds first. We don't allow bind, so these are unlikely to be used.
	PiperOrigin-RevId: 275558502
2019-10-07	Implement IP_TTL.	Ian Gudger
	Also change the default TTL to 64 to match Linux. PiperOrigin-RevId: 273430341
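	As an illustration of the socket option this commit implements, the following Go sketch sets IP_TTL on a UDP socket and reads it back using the standard syscall package. The helper name setAndGetTTL is hypothetical, and the sketch assumes a Linux-like environment where creating an AF_INET datagram socket is permitted.

```go
package main

import (
	"fmt"
	"syscall"
)

// setAndGetTTL opens a UDP socket, sets its IP_TTL option to ttl,
// and returns the value reported back by getsockopt.
// Hypothetical helper written for this example.
func setAndGetTTL(ttl int) (int, error) {
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_DGRAM, 0)
	if err != nil {
		return 0, err
	}
	defer syscall.Close(fd)

	if err := syscall.SetsockoptInt(fd, syscall.IPPROTO_IP, syscall.IP_TTL, ttl); err != nil {
		return 0, err
	}
	return syscall.GetsockoptInt(fd, syscall.IPPROTO_IP, syscall.IP_TTL)
}

func main() {
	got, err := setAndGetTTL(32)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("IP_TTL is now", got)
}
```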
2019-10-02	Increase itimer test timeout	Michael Pratt
	https://github.com/google/gvisor/commit/dd69b49ed1103bab82a6b2ac95221b89b46f3376 makes this test take longer. PiperOrigin-RevId: 272535892
2019-09-24	Stub out readahead implementation.	Adin Scannell
	Closes #261 PiperOrigin-RevId: 270973347
2019-09-19	Job control: controlling TTYs and foreground process groups.	Kevin Krakauer
	Addresses a deadlock in the rolled-back change: https://github.com/google/gvisor/commit/b6a5b950d28e0b474fdad160b88bc15314cf9259
	Creating a session from an orphaned process group was causing a lock to be acquired twice by a single goroutine. This behavior is addressed, and a test (OrphanRegression) has been added to pty.cc.
	Implemented the following ioctls:
	- TIOCSCTTY - set controlling TTY
	- TIOCNOTTY - remove controlling TTY, maybe signal some other processes
	- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
	- TIOCSPGRP - set foreground process group. Also enables tcsetpgrp().
	Next steps are to actually turn terminal-generated control characters (e.g. ^C) into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when appropriate.
	PiperOrigin-RevId: 270088599
2019-09-04	Run proc_net tests.	Ian Gudger
	PiperOrigin-RevId: 267280086
2019-09-03	Impose order on test scripts.	Adin Scannell
	The simple test script has gotten out of control. Shard this script into different pieces and attempt to impose order on overall test structure. This change helps lay some of the foundations for future improvements.
	* The runsc/test directories are moved into just test/.
	* The runsc/test/testutil package is split into logical pieces.
	* The scripts/ directory contains new top-level targets.
	* Each test is now responsible for building targets it requires.
	* The install functionality is moved into `runsc` itself for simplicity.
	* The existing kokoro run_tests.sh file now just calls all (can be split).
	After this change is merged, I will create multiple distinct workflows for Kokoro, one for each of the scripts currently targeted by `run_tests.sh`, which should dramatically reduce the time-to-run for the Kokoro tests and provide a better foundation for further improvements to the infrastructure.
	PiperOrigin-RevId: 267081397
2019-08-30	Automated rollback of changelist 261387276	Bhasker Hariharan
	PiperOrigin-RevId: 266491264
2019-08-29	Implement /proc/net/udp.	Rahat Mahmood
	PiperOrigin-RevId: 266229756
2019-08-20	Add tests for raw AF_PACKET sockets.	Kevin Krakauer
	PiperOrigin-RevId: 264494359
2019-08-20	tests: syscall_test_runner should not run tests in parallel	Andrei Vagin
	bazel runs a few instances of syscall_test_runner in parallel, and then syscall_test_runner runs test cases in parallel. This might be one reason why test hosts are overloaded and sandboxes start slowly. It should be better to control how many tests run in parallel from one place, so let's try disabling this feature in syscall_test_runner. PiperOrigin-RevId: 264434674
2019-08-19	Read iptables via sockopts.	Kevin Krakauer
	PiperOrigin-RevId: 264180125
2019-08-15	Add tests for "cooked" AF_PACKET sockets.	Kevin Krakauer
	PiperOrigin-RevId: 263666789
2019-08-06	Fix for a panic due to writing to a closed accept channel.	Bhasker Hariharan
	This can happen because endpoint.Close() closes the accept channel first and then drains/resets any accepted but not delivered connections. However, there can be connections that are connected but were not delivered to the channel because the channel was full. Closing the channel can cause the pending writes for these connections to fail with a write to a closed channel. The correct solution is to abort any connections in SYN-RCVD state and drain/abort all completed connections before closing the accept channel. PiperOrigin-RevId: 261951132
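	The ordering described above (stop new deliveries, drain and abort what is queued, and only then close the channel) can be sketched in Go as follows. This is an illustrative model, not gVisor's endpoint code: the conn and listener types and the deliver and shutdown methods are hypothetical, and a full queue is handled by aborting the connection rather than blocking, which is a simplification.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a stand-in for a completed connection (hypothetical).
type conn struct {
	id      int
	aborted bool
}

// listener models an endpoint with a bounded accept queue (hypothetical).
type listener struct {
	mu       sync.Mutex
	closed   bool
	acceptCh chan *conn
}

// deliver queues a completed connection. It never sends on a closed
// channel because the closed flag is checked under the same lock that
// shutdown holds while closing.
func (l *listener) deliver(c *conn) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.closed {
		c.aborted = true
		return false
	}
	select {
	case l.acceptCh <- c:
		return true
	default:
		c.aborted = true // queue full: abort instead of blocking (simplification)
		return false
	}
}

// shutdown marks the listener closed, aborts every queued connection,
// and only then closes the accept channel.
func (l *listener) shutdown() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.closed {
		return
	}
	l.closed = true
	for {
		select {
		case c := <-l.acceptCh:
			c.aborted = true
		default:
			close(l.acceptCh)
			return
		}
	}
}

func main() {
	l := &listener{acceptCh: make(chan *conn, 1)}
	c1, c2 := &conn{id: 1}, &conn{id: 2}
	fmt.Println(l.deliver(c1)) // true: queued
	l.shutdown()               // drains c1, then closes the channel
	fmt.Println(l.deliver(c2)) // false: rejected without panicking
	fmt.Println(c1.aborted, c2.aborted)
}
```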
2019-08-02	Job control: controlling TTYs and foreground process groups.	Kevin Krakauer
	(Don't worry, this is mostly tests.)
	Implemented the following ioctls:
	- TIOCSCTTY - set controlling TTY
	- TIOCNOTTY - remove controlling TTY, maybe signal some other processes
	- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
	- TIOCSPGRP - set foreground process group. Also enables tcsetpgrp().
	Next steps are to actually turn terminal-generated control characters (e.g. ^C) into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when appropriate.
	PiperOrigin-RevId: 261387276
2019-07-25	Automated rollback of changelist 255679453	Fabricio Voznika
	PiperOrigin-RevId: 260047477
2019-07-12	Add IPPROTO_RAW, which allows raw sockets to write IP headers.	Kevin Krakauer
	iptables also relies on IPPROTO_RAW in a way. It opens such a socket to manipulate the kernel's tables, but it doesn't actually use any of the functionality. Blegh. PiperOrigin-RevId: 257903078
2019-06-28	Automated rollback of changelist 255263686	Nicolas Lacasse
	PiperOrigin-RevId: 255679453
2019-06-27	Complete pipe support on overlayfs	Fabricio Voznika
	Get/Set pipe size and ioctl support were missing from overlayfs. It required moving the pipe.Sizer interface to fs so that overlay could get access. Fixes #318 PiperOrigin-RevId: 255511125
2019-06-26	Preserve permissions when checking lower	Fabricio Voznika
	The code was wrongly assuming that only read access was required from the lower overlay when checking for permissions. This allowed non-writable files to be writable in the overlay. Fixes #316 PiperOrigin-RevId: 255263686
2019-06-21	Fix the logic for sending zero window updates.	Bhasker Hariharan
	Today the logic is split between two places: endpoint Read() and the worker goroutine, which actually sends a zero window. With this change, when a zero window ACK is sent we set a flag in the endpoint, which the endpoint can read to decide whether it should notify the worker to send a nonZeroWindow update. The worker no longer repeats the check; instead it sends an ACK and flips the flag right away. Similarly, today when SO_RCVBUF is set, the SetSockOpt call has logic to decide whether a zero window update is required. Rather than do that, we move the logic to the worker goroutine, which can check the zeroWindow flag and send an update if required. PiperOrigin-RevId: 254505447
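	The flag-based handoff described above can be modeled with a small single-threaded Go sketch. All names here (endpoint, sendACK, receive, read, handleNotify) are hypothetical and chosen for this example; real code would need locking between the reader and the worker goroutine.

```go
package main

import "fmt"

// endpoint is a toy model of a TCP endpoint's receive side (hypothetical).
type endpoint struct {
	rcvBufSize  int
	rcvBufUsed  int
	zeroWindow  bool          // set when a zero window ACK has been sent
	notifyCh    chan struct{} // wakes the worker
	sentWindows []int         // windows advertised so far, for inspection
}

// sendACK advertises the current window and records whether it was zero.
func (e *endpoint) sendACK() {
	w := e.rcvBufSize - e.rcvBufUsed
	e.sentWindows = append(e.sentWindows, w)
	if w == 0 {
		e.zeroWindow = true
	}
}

// receive models the worker accepting n bytes of data and ACKing them.
func (e *endpoint) receive(n int) {
	e.rcvBufUsed += n
	e.sendACK()
}

// read models the application draining n bytes. It only consults the
// flag to decide whether the worker needs to be notified.
func (e *endpoint) read(n int) {
	if n > e.rcvBufUsed {
		n = e.rcvBufUsed
	}
	e.rcvBufUsed -= n
	if e.zeroWindow {
		select {
		case e.notifyCh <- struct{}{}:
		default: // a notification is already pending
		}
	}
}

// handleNotify is the worker side: it does not re-check the window,
// it flips the flag and sends an ACK carrying the new window.
func (e *endpoint) handleNotify() {
	select {
	case <-e.notifyCh:
		e.zeroWindow = false
		e.sendACK()
	default:
	}
}

func main() {
	e := &endpoint{rcvBufSize: 4, notifyCh: make(chan struct{}, 1)}
	e.receive(4)    // buffer full: zero window ACK, flag set
	e.read(2)       // flag is set, so the worker is notified
	e.handleNotify() // worker sends the window update
	fmt.Println(e.sentWindows, e.zeroWindow)
}
```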
2019-06-19	Mark tcp_socket test flaky (for real)	Michael Pratt
	The tag on the binary has no effect. It must be on the test. PiperOrigin-RevId: 254103480
2019-06-06	Remove tmpfs restriction from test	Fabricio Voznika
	runsc supports UDS over gofer mounts and tmpfs is not needed for this test. PiperOrigin-RevId: 251944870
2019-06-06	Add overlay dimension to FS related syscall tests	Fabricio Voznika
	PiperOrigin-RevId: 251929314
2019-06-03	Remove duplicate socket tests	Michael Pratt
	socket_unix_abstract.cc: Subset of socket_abstract.cc
	socket_unix_filesystem.cc: Subset of socket_filesystem.cc
	PiperOrigin-RevId: 251297117
2019-05-30	gvisor: socket() returns EPROTONOSUPPORT if protocol is not supported	Andrei Vagin
	PiperOrigin-RevId: 250426407
2019-05-21	Clean up pipe internals and add fcntl support	Adin Scannell
	Pipe internals are made more efficient by avoiding garbage collection. A pool is now used that can be shared by all pipes, and buffers are chained via an intrusive list. The documentation for pipe structures and methods is also simplified and clarified. The pipe tests are now parameterized, so that they are run on all different variants (named pipes, small buffers, default buffers). The pipe buffer sizes are exposed by fcntl, which is now supported by this change. A size change test has been added to the suite. These new tests uncovered a bug regarding the semantics of open named pipes with O_NONBLOCK, which is also fixed by this CL. This fix also addresses the lack of the O_LARGEFILE flag for named pipes. PiperOrigin-RevId: 249375888 Change-Id: I48e61e9c868aedb0cadda2dff33f09a560dee773
2019-05-21	Add basic plumbing for splice and stub implementation.	Adin Scannell
	This does not actually implement an efficient splice or sendfile. Rather, it adds generic plumbing to the file internals so that this can be added. All file implementations use the stub fileutil.NoSplice implementation, which causes sendfile and splice to fall back to an internal copy. A basic splice system call interface is added, along with a test. PiperOrigin-RevId: 249335960 Change-Id: Ic5568be2af0a505c19e7aec66d5af2480ab0939b
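	The stub-plus-fallback pattern described above can be sketched in Go as follows. This is an analogy only: the file interface, errNoSplice, bufFile, and sendfile are hypothetical names for this example and do not reflect the actual fileutil.NoSplice signature in gVisor.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// errNoSplice signals that a file has no efficient splice path (hypothetical).
var errNoSplice = errors.New("splice not supported")

// file is a minimal file abstraction with an optional splice hook (hypothetical).
type file interface {
	io.Reader
	io.Writer
	// spliceTo moves up to n bytes to dst without an intermediate copy,
	// or returns errNoSplice if the implementation cannot do so.
	spliceTo(dst file, n int64) (int64, error)
}

// bufFile is an in-memory file whose splice hook is a stub.
type bufFile struct {
	bytes.Buffer
}

// spliceTo is the stub: it always reports that splice is unavailable.
func (f *bufFile) spliceTo(dst file, n int64) (int64, error) {
	return 0, errNoSplice
}

// sendfile tries the splice hook first and falls back to a plain copy
// of at most n bytes when the hook is a stub.
func sendfile(dst, src file, n int64) (int64, error) {
	moved, err := src.spliceTo(dst, n)
	if errors.Is(err, errNoSplice) {
		return io.Copy(dst, io.LimitReader(src, n))
	}
	return moved, err
}

func main() {
	src, dst := &bufFile{}, &bufFile{}
	src.WriteString("hello world")
	n, err := sendfile(dst, src, 5)
	fmt.Println(n, err, dst.String())
}
```

	With this shape, an efficient implementation can later be added per file type by replacing the stub, while callers of sendfile stay unchanged.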
2019-05-16	Add test for duplicate proc entries.	Ian Gudger
	The issue with duplicate /proc/sys entries seems to have been fixed in:
	PiperOrigin-RevId 229305982
	Git hash dc8450b5676d4c4ac9bcfa23cabd862e0060527d
	Fixes google/gvisor#125
	PiperOrigin-RevId: 248571903 Change-Id: I76ff3b525c93dafb92da6e5cf56e440187f14579
2019-04-29	Allow and document bug ids in gVisor codebase.	Nicolas Lacasse
	PiperOrigin-RevId: 245818639 Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-02	Add build rule for raw socket tests so they are runnable via:	Kevin Krakauer
	bazel test test/syscalls:raw_socket_ipv4_test_{native,runsc_ptrace,runsc_kvm} PiperOrigin-RevId: 241640049 Change-Id: Iac4dbdd7fd1827399a472059ac7d85fb6b506577
2019-03-08	Fix tests which fail in kokoro	Andrei Vagin
	* open_create_test_runsc_ptrace_shared doesn't expect write access to /
	* exec_test_runsc_ptrace_shared could not find /usr/share/zoneinfo/
	* clock_gettime_test_runsc_ptrace_shared didn't expect that a thread's CPU time can be zero.
	* affinity_test_runsc_ptrace_shared expected a minimum of 3 CPUs
	PiperOrigin-RevId: 237509429 Change-Id: I477937e5d2cdf3f8720836bfa972abd35d8220a3
2019-03-06	Increase ipv4_udp_unbound_loopback size to medium	Michael Pratt
	Now that tests aren't running in parallel, this test occasionally takes too long and times out. PiperOrigin-RevId: 237106971 Change-Id: I195a4b77315c9f5511c9e8ffadddb7aaa78beafd
2019-03-04	Deflake socket_ipv4_udp_unbound_loopback.	Ian Gudger
	When run in parallel, multicast packets can be received by the wrong test. The tests in the target are run in an isolated network namespace, but if parallelism is enabled, multiple tests from the same target will run in parallel within the target's network namespace. Disabling parallelism only allows one test to run in the network namespace at a time, which prevents interaction. PiperOrigin-RevId: 236709160 Change-Id: If828db44f0ae4002af36de6097866137c8d9da5c
2019-03-01	Mark socket_ipv4_udp_unbound_loopback flaky	Michael Pratt
	To do so, we must add the ability to add tags to the syscall tests. PiperOrigin-RevId: 236380371 Change-Id: I76d15feb2700f20115b27aab362a88cebe8c7a6a
2019-02-19	Break /proc/[pid]/{uid,gid}_map's dependence on seqfile.	Jamie Liu
	In addition to simplifying the implementation, this fixes two bugs:
	- seqfile.NewSeqFile unconditionally creates an inode with mode 0444, but {uid,gid}_map have mode 0644.
	- idMapSeqFile.Write implements fs.FileOperations.Write ... but it doesn't implement any other fs.FileOperations methods and is never used as fs.FileOperations. idMapSeqFile.GetFile() => seqfile.SeqFile.GetFile() uses seqfile.seqFileOperations instead, which rejects all writes.
	PiperOrigin-RevId: 234638212 Change-Id: I4568f741ab07929273a009d7e468c8205a8541bc
2019-02-11	gvisor: Run syscall tests in kokoro on the rbe cluster	Andrei Vagin
	PiperOrigin-RevId: 233458853 Change-Id: I92c734b8075aa31e040fe7b4770bcf608e271e7a
2019-02-07	Plumb IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP to netstack.	Ian Gudger
	Also includes a few fixes for IPv4 multicast support. IPv6 support is coming in a followup CL. PiperOrigin-RevId: 233008638 Change-Id: If7dae6222fef43fda48033f0292af77832d95e82
2019-02-07	Implement /proc/net/unix.	Rahat Mahmood
	PiperOrigin-RevId: 232948478 Change-Id: Ib830121e5e79afaf5d38d17aeef5a1ef97913d23
2019-01-31	Remove license comments	Michael Pratt
	Nothing reads them and they can simply get stale.
	Generated with:
	$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD
	PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-24	Increase gofer coverage in tests	Fabricio Voznika
	Lots of tests use /tmp for the tests. Force /tmp to be mounted over fsgofer instead of tmpfs. PiperOrigin-RevId: 230788985 Change-Id: Id6597ed88133232d15e808c48126bf77cb32673e
2019-01-16	Prevent internal tmpfs mount to override files in /tmp	Fabricio Voznika
	Runsc wants to mount /tmp using the internal tmpfs implementation for performance. However, doing so risks hiding files that may exist under /tmp if it is present in the container. Now, it only mounts over /tmp iff:
	- /tmp was not explicitly asked to be mounted
	- /tmp is empty
	If either condition does not hold, then /tmp maps to the container image's /tmp.
	Note: checkpoint doesn't have sentry FS mounted to check if /tmp is empty. It simply looks for explicit mounts right now.
	PiperOrigin-RevId: 229607856 Change-Id: I10b6dae7ac157ef578efc4dfceb089f3b94cde06
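	The two-part condition above can be written as a small predicate. The Go sketch below is illustrative only: the mount type and the useInternalTmpfs function are hypothetical and are not runsc's actual API. It takes the list of explicitly requested mounts and the names of entries found under the image's /tmp.

```go
package main

import (
	"fmt"
	"path"
)

// mount is a minimal stand-in for a container mount entry (hypothetical).
type mount struct {
	destination string
}

// useInternalTmpfs reports whether the internal tmpfs may be mounted
// over /tmp: only if /tmp was not explicitly requested as a mount and
// the image's /tmp contains no entries.
func useInternalTmpfs(mounts []mount, tmpEntries []string) bool {
	for _, m := range mounts {
		if path.Clean(m.destination) == "/tmp" {
			return false // /tmp was explicitly asked to be mounted
		}
	}
	return len(tmpEntries) == 0 // must not hide existing files
}

func main() {
	fmt.Println(useInternalTmpfs(nil, nil))                        // true
	fmt.Println(useInternalTmpfs([]mount{{"/tmp/"}}, nil))         // false
	fmt.Println(useInternalTmpfs(nil, []string{"app.sock"}))       // false
	fmt.Println(useInternalTmpfs([]mount{{"/var/tmp"}}, nil))      // true
}
```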
2019-01-14	Automated rollback of changelist 228945914	Nicolas Lacasse
	PiperOrigin-RevId: 229214698 Change-Id: Ib4ea2e330e61ee34bf913938d6120a52ecc38ce1
2019-01-11	Make syscall_test_runner binary testonly.	Nicolas Lacasse
	PiperOrigin-RevId: 228945914 Change-Id: Idfa0a3c27434655b5f9ac241f1726e0bc9ef0392
2019-01-09	Allow to specify a custom path to runsc for syscall-test-runner	Andrei Vagin
	PiperOrigin-RevId: 228574092 Change-Id: Id93abcca1ce964eb595907df9355702d469bc33b
2018-12-19	Implement pwritev2.	Zach Koopmans
	Implement pwritev2 and associated unit tests. Clean up preadv2 unit tests. Tag RWF_ flags in both preadv2 and pwritev2 with associated bug tickets. PiperOrigin-RevId: 226222119 Change-Id: Ieb22672418812894ba114bbc88e67f1dd50de620
2018-12-14	Add blocking recv tests	Ian Gudger
	PiperOrigin-RevId: 225646045 Change-Id: Ic712ebc627587ef4a9486f0b39fe8c96100f10ff
2018-12-14	Shard the syscall tests.	Nicolas Lacasse
	PiperOrigin-RevId: 225574278 Change-Id: If5060a37e8a9b0120bec2b5de4037354f0eaba16