gvisor - Container Runtime Sandbox

Age	Commit message	Author
2020-05-13	Enable overlayfs_stale_read by default for runsc.	Jamie Liu
	Linux 4.18 and later make reads and writes coherent between pre-copy-up and post-copy-up FDs representing the same file on an overlay filesystem. However, memory mappings remain incoherent:

	- Documentation/filesystems/overlayfs.rst, "Non-standard behavior": "If a file residing on a lower layer is opened for read-only and then memory mapped with MAP_SHARED, then subsequent changes to the file are not reflected in the memory mapping."
	- fs/overlayfs/file.c:ovl_mmap() passes through to the underlying FD without any management of coherence in the overlay.
	- Experimentally on Linux 5.2:

	```
	$ cat mmap_cat_page.c
	#include <err.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(int argc, char **argv) {
	  if (argc < 2) {
	    errx(1, "syntax: %s [FILE]", argv[0]);
	  }
	  const int fd = open(argv[1], O_RDONLY);
	  if (fd < 0) {
	    err(1, "open(%s)", argv[1]);
	  }
	  /* Map the first page of the file read-only and shared. */
	  const size_t page_size = sysconf(_SC_PAGE_SIZE);
	  void *page = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
	  if (page == MAP_FAILED) {
	    err(1, "mmap");
	  }
	  /* Print the mapped page each time a character arrives on stdin. */
	  for (;;) {
	    write(1, page, strnlen(page, page_size));
	    if (getc(stdin) == EOF) {
	      break;
	    }
	  }
	  return 0;
	}
	$ gcc -O2 -o mmap_cat_page mmap_cat_page.c
	$ mkdir lowerdir upperdir workdir overlaydir
	$ echo old > lowerdir/file
	$ sudo mount -t overlay -o "lowerdir=lowerdir,upperdir=upperdir,workdir=workdir" none overlaydir
	$ ./mmap_cat_page overlaydir/file
	old
	^Z
	[1]+ Stopped ./mmap_cat_page overlaydir/file
	$ echo new > overlaydir/file
	$ cat overlaydir/file
	new
	$ fg
	./mmap_cat_page overlaydir/file
	old
	```

	Therefore, while the VFS1 gofer client's behavior of reopening read FDs is only necessary pre-4.18, replacing existing memory mappings (in both sentry and application address spaces) with mappings of the new FD is required regardless of kernel version, and this latter behavior is common to both VFS1 and VFS2. Re-document accordingly, and change the runsc flag to enabled by default.

	New test:
	- Before this CL: https://source.cloud.google.com/results/invocations/5b222d2c-e918-4bae-afc4-407f5bac509b
	- After this CL: https://source.cloud.google.com/results/invocations/f28c747e-d89c-4d8c-a461-602b33e71aab

	PiperOrigin-RevId: 311361267
2020-05-04	Enable TestRunNonRoot on VFS2	Fabricio Voznika
	Also added back the default test dimension, which was dropped in a previous refactor. PiperOrigin-RevId: 309797327
2020-05-04	Add TTY support on VFS2 to runsc	Fabricio Voznika
	Updates #1623, #1487 PiperOrigin-RevId: 309777922
2020-04-29	Merge pull request #2487 from moricho:fix/bindmount	gVisor bot
	PiperOrigin-RevId: 309082540
2020-04-28	Merge pull request #2558 from prattmic:forward_signal	gVisor bot
	PiperOrigin-RevId: 308829800
2020-04-27	container: use sighandling package	Michael Pratt
	Use the sighandling package for Container.ForwardSignals, for consistency with other signal forwarding. Fixes #2546
2020-04-27	Update container.go	kevin.xu
	typo, should be `start` in comments
2020-04-26	refactor and add test for bindmount	moricho
	Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-25	Add container tests passing with VFS2	Zach Koopmans
	Several tests are passing after getting TestAppExitStatus (run /bin/true) changes. Make versions that run via VFS2 so that we know what is and isn't working. In addition, fix bug in VFSFile ReadFull. For the TestExePath test in container_test.go, the case "unmasked" will return 0 bytes read with no EOF err, causing the ReadFull call to spin. PiperOrigin-RevId: 308428126
2020-04-23	Simplify Docker test infrastructure.	Adin Scannell
	This change adds a layer of abstraction around the internal Docker APIs, and eliminates all direct dependencies on Dockerfiles in the infrastructure. A subsequent change will automate the generation of local images (with efficient caching). Note that this change drops the use of bazel container rules, as that experiment does not seem to be viable. PiperOrigin-RevId: 308095430
2020-04-17	Add test name to boot and gofer log files	Fabricio Voznika
	This makes it easier to find the corresponding logs when a test fails. PiperOrigin-RevId: 307104283
2020-04-17	Get /bin/true to run on VFS2	Zach Koopmans
	Included:
	- loader_test.go RunTest and TestStartSignal VFS2
	- container_test.go TestAppExitStatus on VFS2
	- experimental flag added to runsc to turn on VFS2
	Note: shared mounts are not yet supported.
	PiperOrigin-RevId: 307070753
2020-04-08	Fix all printf formatting errors.	Adin Scannell
	Updates #2243
2020-04-07	Update TODO to #238	Ian Lewis
	Move TODO to #238 so that proper synchronization of operations is handled when we create the urpc client. Issue #238 Fixes #512 PiperOrigin-RevId: 305383924
2020-03-12	Kill sandbox process when parent process terminates	Fabricio Voznika
	When the sandbox runs in attached mode, e.g. runsc do, runsc run, the sandbox lifetime is controlled by the parent process. This wasn't working in all cases because the parent-death signal set with PR_SET_PDEATHSIG doesn't propagate through execve when the process changes uid/gid. So it was getting dropped when the sandbox execve's to change to user nobody. PiperOrigin-RevId: 300601247
2020-03-05	tests: Don't print log messages on stdout	Andrei Vagin
	A parser of test results doesn't expect to see any extra messages. PiperOrigin-RevId: 299174138
2020-03-04	tests: Don't print log messages on stdout	Andrei Vagin
	A parser of test results doesn't expect to see any extra messages. PiperOrigin-RevId: 298966577
2020-02-27	Log oom_score_adj value on error	Fabricio Voznika
	Updates #1873 PiperOrigin-RevId: 297695241
2020-02-25	Add log during process wait in tests	Fabricio Voznika
	TestMultiContainerKillAll timed out under --race. Without logging, we cannot tell if the process list is still increasing, but slowly, or is stuck. PiperOrigin-RevId: 297158834
2020-02-10	Add flag package to limit visibility.	Adin Scannell
	PiperOrigin-RevId: 294297004
2020-02-06	Fix TestPauseResume in container test failed with connection refused.	Ting-Yu Wang
	Sometimes we get this error under TSAN: """ error getting process data from container: connecting to control server at PID XXXX: connection refused """ The theory is that the top "sleep 20" was too short for TSAN, and the container had already exited, so we got connection refused. This commit changes the test so that the container signals that it is running by touching a file repeatedly for the duration of the test. PiperOrigin-RevId: 293710957
2020-02-05	Add notes to relevant tests.	Adin Scannell
	These were out-of-band notes that can help provide additional context and simplify automated imports. PiperOrigin-RevId: 293525915
2020-02-04	Increase container_test size.	Kevin Krakauer
	container_test was flaking because a small percentage of runs timed out. Tested this fix with --runs_per_test=100. PiperOrigin-RevId: 293240102
2020-01-27	Standardize on tools directory.	Adin Scannell
	PiperOrigin-RevId: 291745021
2020-01-09	New sync package.	Ian Gudger
	* Rename syncutil to sync.
	* Add aliases to sync types.
	* Replace existing usage of standard library sync package.
	This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations.
	Updates #1472
	PiperOrigin-RevId: 289033387
2019-12-18	Increase waitForProcessList timeout	Fabricio Voznika
	It can take more than 10 seconds when running under --race. PiperOrigin-RevId: 286296060
2019-12-11	runsc/debug: add an option to list all processes	Andrei Vagin
	runsc debug --ps lists all processes with all threads. This option is added to the debug command but not to the ps command, because it is going to be used for debugging purposes and we want to be able to add any useful information without worrying about backward compatibility. This will help to investigate syzkaller issues. PiperOrigin-RevId: 285013668
2019-12-06	Implement TTY field in control.Processes().	Nicolas Lacasse
	Threadgroups already know their TTY (if they have one), which now contains the TTY Index, and is returned in the Processes() call. PiperOrigin-RevId: 284263850
2019-12-06	Make annotations OCI compliant	Fabricio Voznika
	Changed annotation to follow the standard defined here: https://github.com/opencontainers/image-spec/blob/master/annotations.md PiperOrigin-RevId: 284254847
2019-10-30	Fix container locking	Fabricio Voznika
	Sandbox root dir was not being saved with the Container state, so it would point to the wrong directory location when attempting to lock the sandbox. This led to race conditions when saving and loading container state. Fixing it led to multiple deadlocks. I've moved the saving and locking logic to a separate struct and moved the lock file inside the RootDir (instead of container root dir), which allows the lock to be taken inside Destroy, and removes the need to lock the sandbox. PiperOrigin-RevId: 277599612
2019-10-24	Fix early deletion of rootDir	Fabricio Voznika
	container.startContainers() cannot be called twice in a test (e.g. TestMultiContainerLoadSandbox) because the cleanup function deletes the rootDir, together with information from all other containers that may exist. PiperOrigin-RevId: 276591806
2019-10-20	Add runsc OCI annotations to support CRI-O.	Tom Lanyon
	Obligatory https://xkcd.com/927 Fixes #626
2019-10-16	Fix problem with open FD when copy up is triggered in overlayfs	Fabricio Voznika
	Linux kernels before 4.19 don't implement a feature that updates open FDs after a file is opened for write (and is copied to the upper layer). Already open FDs will continue to read the old file content until they are reopened. This is especially problematic for gVisor because it caches open files. A flag was added to force read-only files to be reopened when the same file is opened for write. This is only needed if using kernels prior to 4.19. Closes #1006 It's difficult to really test this because we never run tests on older kernels. I'm adding a test in GKE which uses kernels with the overlayfs problem for 1.14 and lower. PiperOrigin-RevId: 275115289
2019-10-08	Ignore mount options that are not supported in shared mounts	Fabricio Voznika
	Options that do not change mount behavior inside the Sentry are irrelevant and should not be used when looking for possible incompatibilities between master and slave mounts. PiperOrigin-RevId: 273593486
2019-10-01	Prevent CAP_NET_RAW from appearing in exec	Fabricio Voznika
	'docker exec' was getting CAP_NET_RAW even when --net-raw=false because it was not filtered out when copying the container's capabilities. PiperOrigin-RevId: 272260451
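A minimal sketch of the filtering step described above, assuming capabilities are carried as plain string names. `dropCapability` and `execCapabilities` are hypothetical helpers for illustration, not functions from runsc:

```go
package main

import "fmt"

// dropCapability returns a copy of caps without the named capability.
func dropCapability(caps []string, name string) []string {
	out := make([]string, 0, len(caps))
	for _, c := range caps {
		if c != name {
			out = append(out, c)
		}
	}
	return out
}

// execCapabilities derives the capability list for an exec'd process from the
// container's list. CAP_NET_RAW is removed unless netRaw is true (netRaw
// stands in for the --net-raw setting in this sketch).
func execCapabilities(containerCaps []string, netRaw bool) []string {
	if netRaw {
		return append([]string(nil), containerCaps...)
	}
	return dropCapability(containerCaps, "CAP_NET_RAW")
}

func main() {
	caps := []string{"CAP_CHOWN", "CAP_NET_RAW", "CAP_KILL"}
	fmt.Println(execCapabilities(caps, false)) // prints [CAP_CHOWN CAP_KILL]
}
```

Returning a copy in both branches keeps the container's own list unchanged, so later exec calls start from the same input.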
2019-09-16	Bring back to life features lost in recent refactor	Fabricio Voznika
	- Sandbox logs are generated when running tests
	- Kokoro uploads the sandbox logs
	- Supports multiple parallel runs
	- Revive script to install locally built runsc with docker
	PiperOrigin-RevId: 269337274
2019-09-05	Ignore the root container when calculating oom_score_adj for the sandbox.	Ian Lewis
	This is done because the root container for CRI is the infrastructure (pause) container and always gets a low oom_score_adj. We do this to ensure that only the oom_score_adj of user containers is used to calculate the sandbox oom_score_adj. Implemented in runsc rather than the containerd shim as it's a bit cleaner to implement here (in the shim it would require overwriting the oomScoreAdj and re-writing out the config.json again). This processing is Kubernetes (CRI) specific but we are currently only supporting CRI for multi-container support anyway. PiperOrigin-RevId: 267507706
2019-09-04	Resolve flakes with TestMultiContainerDestroy	Fabricio Voznika
	Some processes are reparented to the root container depending on the kill order and the root container would not reap in time. So some zombie processes were still present when the test checked. Fix it by running the second container inside a PID namespace. PiperOrigin-RevId: 267278591
2019-09-03	Impose order on test scripts.	Adin Scannell
	The simple test script has gotten out of control. Shard this script into different pieces and attempt to impose order on overall test structure. This change helps lay some of the foundations for future improvements.
	* The runsc/test directories are moved into just test/.
	* The runsc/test/testutil package is split into logical pieces.
	* The scripts/ directory contains new top-level targets.
	* Each test is now responsible for building targets it requires.
	* The install functionality is moved into `runsc` itself for simplicity.
	* The existing kokoro run_tests.sh file now just calls all (can be split).
	After this change is merged, I will create multiple distinct workflows for Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today, which should dramatically reduce the time-to-run for the Kokoro tests and provide a better foundation for further improvements to the infrastructure.
	PiperOrigin-RevId: 267081397
2019-08-27	Mount volumes as super user	Fabricio Voznika
	This used to be the case, but regressed after a recent change. Also made a few fixes around it and cleaned up the code a bit. Closes #720 PiperOrigin-RevId: 265717496
2019-08-07	Set gofer's OOM score adjustment	Fabricio Voznika
	Updates #512 PiperOrigin-RevId: 262195448
2019-08-06	Make loading container in a sandbox more robust	Fabricio Voznika
	PiperOrigin-RevId: 262071646
2019-08-02	Stops container if gofer is killed	Fabricio Voznika
	Each gofer now has a goroutine that polls on the FDs used to communicate with the sandbox. The respective gofer is destroyed if any of the FDs is closed. Closes #601 PiperOrigin-RevId: 261383725
2019-08-01	Set sandbox oom_score_adj	Ian Lewis
	Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle. When new containers are added to the sandbox, oom_score_adj for the sandbox and all other gofers is adjusted so that it equals the lowest oom_score_adj of all containers in the sandbox. Fixes #512 PiperOrigin-RevId: 261242725
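The selection rule above (use the lowest value among all containers) can be sketched as a small helper. `lowestOOMScoreAdj` is a hypothetical name, and the input is assumed to be the oomScoreAdj values already collected from each container's spec:

```go
package main

import (
	"errors"
	"fmt"
)

// lowestOOMScoreAdj returns the smallest oom_score_adj among the given
// per-container values, or an error if there are no containers.
func lowestOOMScoreAdj(scores []int) (int, error) {
	if len(scores) == 0 {
		return 0, errors.New("no containers")
	}
	lowest := scores[0]
	for _, s := range scores[1:] {
		if s < lowest {
			lowest = s
		}
	}
	return lowest, nil
}

func main() {
	adj, err := lowestOOMScoreAdj([]int{100, -500, 0})
	if err != nil {
		panic(err)
	}
	fmt.Println(adj) // prints -500
}
```

The resulting value would then be written to the oom_score_adj file of the sandbox and gofer processes; that step is omitted here.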
2019-07-24	Use different pidns among different containers	chris.zn
	Previously, all containers in a sandbox shared a single PID namespace, so a container could see the processes of another container in the same sandbox. This patch uses a different PID namespace for each container. Signed-off-by: chris.zn <chris.zn@antfin.com>
2019-07-23	Give each container a distinct MountNamespace.	Nicolas Lacasse
	This keeps all container filesystems completely separate from each other (including from the root container filesystem), and allows us to get rid of the "__runsc_containers__" directory. It also simplifies container startup/teardown as we don't have to muck around in the root container's filesystem. PiperOrigin-RevId: 259613346
2019-07-08	Don't try to execute a file that is not regular.	Nicolas Lacasse
	PiperOrigin-RevId: 257037608
2019-07-03	Avoid importing platforms from many source files	Andrei Vagin
	PiperOrigin-RevId: 256494243
2019-06-27	Fix various spelling issues in the documentation	Michael Pratt
	Addresses obvious typos, in the documentation only. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65 PiperOrigin-RevId: 255477779
2019-06-18	Kill sandbox process when 'runsc do' exits	Fabricio Voznika
	PiperOrigin-RevId: 253882115