gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2019-09-18	Shard the runtime tests.	Nicolas Lacasse
	Default of 20 shards was arbitrary and will need fine-tuning in later CLs. PiperOrigin-RevId: 269922871
2019-09-18	Signalfd support	Adin Scannell
	Note that the exact semantics for these signalfds are slightly different from Linux. These signalfds are bound to the process at creation time. Reads, polls, etc. are all associated with signals directed at that task. In Linux, all signalfd operations are associated with current, regardless of where the signalfd originated. In practice, this should not be an issue given how signalfds are used. In order to fix this however, we will need to plumb the context through all the event APIs. This gets complicated really quickly, because the waiter APIs are all netstack-specific, and not generally exposed to the context. Probably not worthwhile fixing immediately. PiperOrigin-RevId: 269901749
2019-09-17	Follow-up fixes for image tests.	Nicolas Lacasse
	- Fix ARG syntax in Dockerfiles. - Fix curl commands in Dockerfiles. - Fix some paths in proctor binaries. - Check error from Walk in search helper. PiperOrigin-RevId: 269641686
2019-09-16	Refactor and clean up image tests.	Nicolas Lacasse
	* Use multi-stage builds in Dockerfiles. * Combine all proctor binaries into a single binary. * Change the TestRunner interface to reduce code duplication. PiperOrigin-RevId: 269462101
2019-09-16	Migrate from gflags to absl flags	Michael Pratt
	absl flags are more modern and we can easily depend on them directly. The repo now successfully builds with --incompatible_load_cc_rules_from_bzl. PiperOrigin-RevId: 269387081
2019-09-16	Bring back to life features lost in recent refactor	Fabricio Voznika
	- Sandbox logs are generated when running tests - Kokoro uploads the sandbox logs - Supports multiple parallel runs - Revive script to install locally built runsc with docker PiperOrigin-RevId: 269337274
2019-09-13	gvisor: return ENOTDIR from the unlink syscall	Andrei Vagin
	ENOTDIR has to be returned when a component used as a directory in pathname is not, in fact, a directory. PiperOrigin-RevId: 269037893
2019-09-12	Implement splice methods for pipes and sockets.	Adin Scannell
	This also allows the tee(2) implementation to be enabled, since dup can now be properly supported via WriteTo. Note that this change necessitated some minor restructoring with the fs.FileOperations splice methods. If the *fs.File is passed through directly, then only public API methods are accessible, which will deadlock immediately since the locking is already done by fs.Splice. Instead, we pass through an abstract io.Reader or io.Writer, which elide locks and use the underlying fs.FileOperations directly. PiperOrigin-RevId: 268805207
2019-09-10	Fix minor Kokoro issues.	Adin Scannell
	A recent Kokoro change pointed to go_tests.cfg (in line with the other configurations), which unfortunately broke the presubmits. This change also enabled the KVM tests, which were still using a remote execution strategy. This fixes both of these issues and allows presubmits to pass. One additional test was caught with this case, which seems to have been broken. It's unclear why this was not being caught. PiperOrigin-RevId: 268166291
2019-09-06	Load C++ rules from @rules_cc	Michael Pratt
	See https://github.com/bazelbuild/bazel/issues/8743. This will be required in Bazel 1.0. Protobuf was updated in https://github.com/protocolbuffers/protobuf/commit/bf0c69e1302fe9568fbe310cc54b37d20a9d16a3#diff-96239ee297e0a92ac6ff96a6bc434ef0. GoogleTest was updated in https://github.com/google/googletest/commit/6fd262ecf787d0dc2a91696fd4bf1d3ee1ebfa14. gflags has not yet been updated, so the repo still won't build with --incompatible_load_cc_rules_from_bzl. Tested with buildifier -warnings=native-cc -lint=warn **/BUILD. PiperOrigin-RevId: 267638515
2019-09-05	Ignore the root container when calculating oom_score_adj for the sandbox.	Ian Lewis
	This is done because the root container for CRI is the infrastructure (pause) container and always gets a low oom_score_adj. We do this to ensure that only the oom_score_adj of user containers is used to calculated the sandbox oom_score_adj. Implemented in runsc rather than the containerd shim as it's a bit cleaner to implement here (in the shim it would require overwriting the oomScoreAdj and re-writing out the config.json again). This processing is Kubernetes(CRI) specific but we are currently only supporting CRI for multi-container support anyway. PiperOrigin-RevId: 267507706
2019-09-05	Fix bug in proc_test.	Bhasker Hariharan
	TestNoDuplicates is racy as it tries to read the /proc file system while the test is running. But it's possible that from the time a directory entries are read and each entry processed something could change and in some cases the entry being processed could have been deleted. In such cases we should not fail the test but just ignore the error and move on. PiperOrigin-RevId: 267483094
2019-09-05	Deflake aio_test.	Jamie Liu
	- Most AIO tests call io_setup(nr_events = 128). sizeof(struct io_event) (12832 = 4096). However, the actual size of the mapping created by io_setup() is determined by: (from fs/aio.c:ioctx_alloc()) / * We keep track of the number of available ringbuffer slots, to prevent * overflow (reqs_available), and we also use percpu counters for this. * * So since up to half the slots might be on other cpu's percpu counters * and unavailable, double nr_events so userspace sees what they * expected: additionally, we move req_batch slots to/from percpu * counters at a time, so make sure that isn't 0: / nr_events = max(nr_events, num_possible_cpus() 4); nr_events = 2; (from fs/aio.c:aio_setup_ring()) / Compensate for the ring buffer's head/tail overlap entry / nr_events += 2; / 1 is required, 2 for good luck / size = sizeof(struct aio_ring); size += sizeof(struct io_event) nr_events; nr_pages = PFN_UP(size); When we mremap() only the first page of a multi-page AIO ring buffer mapping, fs/aio.c:aio_ring_mremap() updates struct kioctx::mmap_base - but struct kioctx::mmap_size is untouched, so sys_io_destroy() => kill_ioctx() vm_unmaps() the mremapped page, plus some number of pages after it. Just get the actual size of the mapping from /proc/self/maps. - Delete test case MremapOver; while it is correct that Linux will not complain if you overwrite the AIO ring buffer with another mapping, it won't actually work in the sense that AIO events will not be written to the new mapping, because Linux stores the struct pages of the ring buffer in struct kioctx::ring_pages and writes to those through kmap() rather than using userspace addresses. - Don't munmap() after mremap(MREMAP_FIXED) returns EFAULT; see new comment in factored-out test case MremapExpansion. PiperOrigin-RevId: 267482903
2019-09-04	Run proc_net tests.	Ian Gudger
	PiperOrigin-RevId: 267280086
2019-09-03	Impose order on test scripts.	Adin Scannell
	The simple test script has gotten out of control. Shard this script into different pieces and attempt to impose order on overall test structure. This change helps lay some of the foundations for future improvements. * The runsc/test directories are moved into just test/. * The runsc/test/testutil package is split into logical pieces. * The scripts/ directory contains new top-level targets. * Each test is now responsible for building targets it requires. * The install functionality is moved into `runsc` itself for simplicity. * The existing kokoro run_tests.sh file now just calls all (can be split). After this change is merged, I will create multiple distinct workflows for Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today, which should dramatically reduce the time-to-run for the Kokoro tests, and provides a better foundation for further improvements to the infrastructure. PiperOrigin-RevId: 267081397
2019-08-30	Automated rollback of changelist 261387276	Bhasker Hariharan
	PiperOrigin-RevId: 266491264
2019-08-30	Fix async-signal-unsafety in MlockallTest_Future.	Jamie Liu
	PiperOrigin-RevId: 266491246
2019-08-30	Return correct buffer size for ioctl(socket, FIONREAD)	Fabricio Voznika
	Ioctl was returning just the buffer size from epsocket.endpoint and it was not considering data from epsocket.SocketOperations that was read from the endpoint, but not yet sent to the caller. PiperOrigin-RevId: 266485461
2019-08-30	Add C++ toolchain and fix compile issues.	Adin Scannell
	This was accidentally introduced in 31f05d5d4f62c4cd4fe3b95b333d0130aae4b2c1. Fixes #788. PiperOrigin-RevId: 266462843
2019-08-29	Handle new representation of abstract UDS paths.	Rahat Mahmood
	When abstract unix domain socket paths are displayed in /proc/net/unix, Linux historically emitted null bytes as padding at the end of the path. Newer versions of Linux (v4.9, e7947ea770d0de434d38a0f823e660d3fd4bebb5) display these as '@' characters. Update proc_net_unix test to handle both version of the padding. PiperOrigin-RevId: 266230200
2019-08-29	Implement /proc/net/udp.	Rahat Mahmood
	PiperOrigin-RevId: 266229756
2019-08-29	Compile procter binaries during image creation.	Nicolas Lacasse
	Using "go run ..." in the ENTRYPOINT causes the go compiler to run each time the container is started. We can just compile the binary once as part of the image. PiperOrigin-RevId: 266212462
2019-08-29	Internal change.	gVisor bot
	PiperOrigin-RevId: 266199211
2019-08-27	Fix pwritev2 flaky test.	Zach Koopmans
	Fix a uninitialized memory bug in pwritev2 test. PiperOrigin-RevId: 265772176
2019-08-27	Fix sendfile(2) error code	Fabricio Voznika
	When output file is in append mode, sendfile(2) should fail with EINVAL and not EBADF. Closes #721 PiperOrigin-RevId: 265718958
2019-08-26	Internal change.	gVisor bot
	PiperOrigin-RevId: 265535438
2019-08-23	Second try at flaky futex test.	Zach Koopmans
	The flake had the call to futex_unlock_pi() returning EINVAL with the FUTEX_OWNER_DIED set. In this case, userspace has to clean up stale state. So instead of calling FUTEX_UNLOCK_PI outright, we'll use the advised atomic compare_exchange as advised in the man page. PiperOrigin-RevId: 265163920
2019-08-22	test: set shard_count to 5 by default	Andrei Vagin
	In cl/264434674 and cl/264498919, we stop running test cases in parallel to not overload test hosts. But now tests requires more time to run, so we need to increase a default number of shards or a default test timeout. Let's start with increasing the number of shards and see how it will works. PiperOrigin-RevId: 264917055
2019-08-22	Remove ASSERT from fork child	Michael Pratt
	The gunit macros are not safe to use in the child. PiperOrigin-RevId: 264904348
2019-08-21	Support binding to multicast and broadcast addresses	Chris Kuiper
	This fixes the issue of not being able to bind to either a multicast or broadcast address as well as to send and receive data from it. The way to solve this is to treat these addresses similar to the ANY address and register their transport endpoint ID with the global stack's demuxer rather than the NIC's. That way there is no need to require an endpoint with that multicast or broadcast address. The stack's demuxer is in fact the only correct one to use, because neither broadcast- nor multicast-bound sockets care which NIC a packet was received on (for multicast a join is still needed to receive packets on a NIC). I also took the liberty of refactoring udp_test.go to consolidate a lot of duplicate code and make it easier to create repetitive tests that test the same feature for a variety of packet and socket types. For this purpose I created a "flowType" that represents two things: 1) the type of packet being sent or received and 2) the type of socket used for the test. E.g., a "multicastV4in6" flow represents a V4-mapped multicast packet run through a V6-dual socket. This allows writing significantly simpler tests. A nice example is testTTL(). PiperOrigin-RevId: 264766909
2019-08-21	tests: retry connect if it fails with EINTR	Andrei Vagin
	test/syscalls/linux/proc_net_tcp.cc:252: Failure Value of: connect(client->get(), &addr, addrlen) Expected: not -1 (success) Actual: -1 (of type int), with errno PosixError(errno=4 Interrupted system call) PiperOrigin-RevId: 264743815
2019-08-20	test: reset a signal handler before closing a signal channel	Andrei Vagin
	goroutine 5 [running]: os/signal.process(0x10e21c0, 0xc00050c280) third_party/go/gc/src/os/signal/signal.go:227 +0x164 os/signal.loop() third_party/go/gc/src/os/signal/signal_unix.go:23 +0x3e created by os/signal.init.0 third_party/go/gc/src/os/signal/signal_unix.go:29 +0x41 PiperOrigin-RevId: 264518530
2019-08-20	Don't run runtime tests in parallel.	Nicolas Lacasse
	We need real sharding, and will let Bazel handle the parallelization. That is coming soon. Until then, remove this call to t.Parallel() so that we can run the tests without eating all CPU. PiperOrigin-RevId: 264498919
2019-08-20	Add tests for raw AF_PACKET sockets.	Kevin Krakauer
	PiperOrigin-RevId: 264494359
2019-08-20	Fix flaky futex test.	Zach Koopmans
	The test is long running (175128 ms or so) which causes timeouts. The test simply makes sure that private futexes can acquire locks concurrently. Dropping current threads and increasing the number of locks each thread tests the same concurrency concerns but drops execution time to ~1411 ms. PiperOrigin-RevId: 264476144
2019-08-20	tests: syscall_test_runner should not run tests in parallel	Andrei Vagin
	bazel runs a few instances of syscall_test_runner in parallel and then syscall_test_runner runs test cases in parallel. It might be a reason why we see that test hosts are overloaded and sandboxes start slowly. It should be better to control how many tests are running in parallel from one place, so let's try to disable this feature in syscall_test_runner. PiperOrigin-RevId: 264434674
2019-08-19	Read iptables via sockopts.	Kevin Krakauer
	PiperOrigin-RevId: 264180125
2019-08-16	netstack: disconnect an unix socket only if the address family is AF_UNSPEC	Andrei Vagin
	Linux allows to call connect for ANY and the zero port. PiperOrigin-RevId: 263892534
2019-08-15	Add tests for "cooked" AF_PACKET sockets.	Kevin Krakauer
	PiperOrigin-RevId: 263666789
2019-08-14	Improve SendMsg performance.	Bhasker Hariharan
	SendMsg before this change would copy all the data over into a new slice even if the underlying socket could only accept a small amount of data. This is really inefficient with non-blocking sockets and under high throughput where large writes could get ErrWouldBlock or if there was say a timeout associated with the sendmsg() syscall. With this change we delay copying bytes in till they are needed and only copy what can be potentially sent/held in the socket buffer. Reducing the need to repeatedly copy data over. Also a minor fix to change state FIN-WAIT-1 when shutdown(..., SHUT_WR) is called instead of when we transmit the actual FIN. Otherwise the socket could remain in CONNECTED state even though the user has called shutdown() on the socket. Updates #627 PiperOrigin-RevId: 263430505
2019-08-13	tests: print stack traces if test failed by timeout	Andrei Vagin
	PiperOrigin-RevId: 263184083
2019-08-13	Bump Bazel to v0.28.0	Nicolas Lacasse
	The new version has a change in behavior when using a custom platform: * Old behavior: rules that don't require a toolchain used host_platform, no matter what execution platforms are specified. * New behavior: rules that don't require a toolchain use standard platform resolution that starts with execution platforms. As part of this change, we cannot use the "extra_exectution_platforms" flag provided by the default bazelrc. I got rid of the default bazelrc file, and made our custom .bazelrc as minimal as possible. PiperOrigin-RevId: 263176802
2019-08-12	Compute size of struct tcp_info instead of hardcoding it.	Rahat Mahmood
	PiperOrigin-RevId: 263040624
2019-08-09	netlink: return an error in nlmsgerr	Andrei Vagin
	Now if a process sends an unsupported netlink requests, an error is returned from the send system call. The linux kernel works differently in this case. It returns errors in the nlmsgerr netlink message. Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com PiperOrigin-RevId: 262690453
2019-08-09	Create tests for common.Search().	Brett Landau
	Using the path_test.go file built by the Golang devs as a base, tests have been created to verify the functionality of common.Search(). A mock file system is created and fake test files are generated to see if they get picked up by common.Search(). Also included in this CL is a bug fix for proctor-nodejs that was discovered using this test. proctor-nodejs used to allow multiple "-" in its test name filter. The regex has been updated to prevent this. PiperOrigin-RevId: 262647263
2019-08-06	Fix for a panic due to writing to a closed accept channel.	Bhasker Hariharan
	This can happen because endpoint.Close() closes the accept channel first and then drains/resets any accepted but not delivered connections. But there can be connections that are connected but not delivered to the channel as the channel was full. But closing the channel can cause these writes to fail with a write to a closed channel. The correct solution is to abort any connections in SYN-RCVD state and drain/abort all completed connections before closing the accept channel. PiperOrigin-RevId: 261951132
2019-08-06	Require pread/pwrite for splice file offsets	Michael Pratt
	If there is an offset, the file must support pread/pwrite. See fs/splice.c:do_splice. PiperOrigin-RevId: 261944932
2019-08-05	Alter Dockerfiles to include common.go and use a prebuilt JDK.	Samantha Sample
	After the refactoring of the proctor binaries, the Dockerfiles for each language must be altered to copy the common folder into their image. Additionally, Java has been changed to use the pre-built version of JDK-11 from Ubuntu, instead of building it from the source. This allows for a smaller image and faster test execution within the container. PiperOrigin-RevId: 261805158
2019-08-05	Expand runtimes test suite to include Go, Java, PHP, and Python.	Samantha Sample
	This change adds functionality for running more languages using the runtimes test suite. It divides the languages into separate test functions, which each call the helper testLang function in the runtimes_test.go file. This allows them to be run individually or as a group. PiperOrigin-RevId: 261791935
2019-08-02	Job control: controlling TTYs and foreground process groups.	Kevin Krakauer
	(Don't worry, this is mostly tests.) Implemented the following ioctls: - TIOCSCTTY - set controlling TTY - TIOCNOTTY - remove controlling tty, maybe signal some other processes - TIOCGPGRP - get foreground process group. Also enables tcgetpgrp(). - TIOCSPGRP - set foreground process group. Also enabled tcsetpgrp(). Next steps are to actually turn terminal-generated control characters (e.g. C^c) into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when appropriate. PiperOrigin-RevId: 261387276