gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2019-04-09	runsc: set UID and GID if gofer is executed in a new user namespace	Andrei Vagin
	Otherwise, we will not have capabilities in the user namespace. And this patch adds the noexec option for mounts. https://github.com/google/gvisor/issues/145 PiperOrigin-RevId: 242706519 Change-Id: I1b78b77d6969bd18038c71616e8eb7111b71207c
2019-03-29	Set container.CreatedAt in Create().	Nicolas Lacasse
	PiperOrigin-RevId: 241056805 Change-Id: I13ea8f5dbfb01ca02a3b0ab887b8c3bdf4d556a6
2019-03-18	Add support for mount propagation	Fabricio Voznika
	Properly handle propagation options for root and mounts. Now usage of mount options shared, rshared, and noexec cause error to start. shared/ rshared breaks sandbox=>host isolation. slave however can be supported because changes propagate from host to sandbox. Root FS setup moved inside the gofer. Apart from simplifying the code, it keeps all mounts inside the namespace. And they are torn down when the namespace is destroyed (DestroyFS is no longer needed). PiperOrigin-RevId: 239037661 Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0
2019-02-25	Fix cgroup when path is relative	Fabricio Voznika
	This can happen when 'docker run --cgroup-parent=' flag is set. PiperOrigin-RevId: 235645559 Change-Id: Ieea3ae66939abadab621053551bf7d62d412e7ee
2019-01-31	gvisor/gofer: Use pivot_root instead of chroot	Andrei Vagin
	PiperOrigin-RevId: 231864273 Change-Id: I8545b72b615f5c2945df374b801b80be64ec3e13
2019-01-28	runsc: Only uninstall cgroup for sandbox stop.	Lantao Liu
	PiperOrigin-RevId: 231263114 Change-Id: I57467a34fe94e395fdd3685462c4fe9776d040a3
2019-01-25	Fix a nil pointer dereference bug in Container.Destroy()	ShiruRen
	In Container.Destroy(), we call c.stop() before calling executeHooksBestEffort(), therefore, when we call executeHooksBestEffort(c.Spec.Hooks.Poststop, c.State()) to execute the poststop hook, it results in a nil pointer dereference since it reads c.Sandbox.Pid in c.State() after the sandbox has been destroyed. To fix this bug, we can change container's status to "stopped" before executing the poststop hook. Signed-off-by: ShiruRen <renshiru2000@gmail.com> Change-Id: I4d835e430066fab7e599e188f945291adfc521ef PiperOrigin-RevId: 230975505
2019-01-22	Don't bind-mount runsc into a sandbox mntns	Andrei Vagin
	PiperOrigin-RevId: 230437407 Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b
2019-01-18	Scrub runsc error messages	Fabricio Voznika
	Removed "error" and "failed to" prefix that don't add value from messages. Adjusted a few other messages. In particular, when the container fail to start, the message returned is easier for humans to read: $ docker run --rm --runtime=runsc alpine foobar docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory Closes #77 PiperOrigin-RevId: 230022798 Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
2019-01-11	runsc: Collect zombies of sandbox and gofer processes	Andrei Vagin
	And we need to wait a gofer process before cgroup.Uninstall, because it is running in the sandbox cgroups. PiperOrigin-RevId: 228904020 Change-Id: Iaf8826d5b9626db32d4057a1c505a8d7daaeb8f9
2019-01-09	Restore to original cgroup after sandbox and gofer processes are created	Fabricio Voznika
	The original code assumed that it was safe to join and not restore cgroup, but Container.Run will not exit after calling start, making cgroup cleanup fail because there were still processes inside the cgroup. PiperOrigin-RevId: 228529199 Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51
2018-12-13	container.Destroy should clean up container metadata even if other cleanups fail	Nicolas Lacasse
	If the sandbox process is dead (because of a panic or some other problem), container.Destroy will never remove the container metadata file, since it will always fail when calling container.stop(). This CL changes container.Destroy() to always perform the three necessary cleanup operations: * Stop the sandbox and gofer processes. * Remove the container fs on the host. * Delete the container metadata directory. Errors from these three operations will be concatenated and returned from Destroy(). PiperOrigin-RevId: 225448164 Change-Id: I99c6311b2e4fe5f6e2ca991424edf1ebeae9df32
2018-11-15	Allow sandbox.Wait to be called after the sandbox has exited.	Nicolas Lacasse
	sandbox.Wait is racey, as the sandbox may have exited before it is called, or even during. We already had code to handle the case that the sandbox exits during the Wait call, but we were not properly handling the case where the sandbox has exited before the call. The best we can do in such cases is return the sandbox exit code as the application exit code. PiperOrigin-RevId: 221702517 Change-Id: I290d0333cc094c7c1c3b4ce0f17f61a3e908d787
2018-11-05	Fix race between start and destroy	Fabricio Voznika
	Before this change, a container starting up could race with destroy (aka delete) and leave processes behind. Now, whenever a container is created, Loader.processes gets a new entry. Start now expects the entry to be there, and if it's not it means that the container was deleted. I've also fixed Loader.waitPID to search for the process using the init process's PID namespace. We could use a few more tests for signal and wait. I'll send them in another cl. PiperOrigin-RevId: 220224290 Change-Id: I15146079f69904dc07d43c3b66cc343a2dab4cc4
2018-11-01	Use spec with clean paths for gofer	Fabricio Voznika
	Otherwise the gofer's attach point may be different from sandbox when there symlinks in the path. PiperOrigin-RevId: 219730492 Change-Id: Ia9c4c2d16228c6a1a9e790e0cb673fd881003fe1
2018-10-21	Updated cleanup code to be more explicit about ignoring errors.	Ian Lewis
	Errors are shown as being ignored by assigning to the blank identifier. PiperOrigin-RevId: 218103819 Change-Id: I7cc7b9d8ac503a03de5504ebdeb99ed30a531cf2
2018-10-19	Use correct company name in copyright header	Ian Gudger
	PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-18	Resolve mount paths while setting up root fs mount	Fabricio Voznika
	It's hard to resolve symlinks inside the sandbox because rootfs and mounts may be read-only, forcing us to create mount points inside lower layer of an overlay, before the volumes are mounted. Since the destination must already be resolved outside the sandbox when creating mounts, take this opportunity to rewrite the spec with paths resolved. "runsc boot" will use the "resolved" spec to load mounts. In addition, symlink traversals were disabled while mounting containers inside the sandbox. It haven't been able to write a good test for it. So I'm relying on manual tests for now. PiperOrigin-RevId: 217749904 Change-Id: I7ac434d5befd230db1488446cda03300cc0751a9
2018-10-17	runsc: Add --pid flag to runsc kill.	Kevin Krakauer
	--pid allows specific processes to be signalled rather than the container root process or all processes in the container. containerd needs to SIGKILL exec'd processes that timeout and check whether processes are still alive. PiperOrigin-RevId: 217547636 Change-Id: I2058ebb548b51c8eb748f5884fb88bad0b532e45
2018-10-16	Bump sandbox start and stop timeouts.	Nicolas Lacasse
	PiperOrigin-RevId: 217433699 Change-Id: Icef08285728c23ee7dd650706aaf18da51c25dff
2018-10-11	Make the gofer process enter namespaces	Fabricio Voznika
	This is done to further isolate the gofer from the host. PiperOrigin-RevId: 216790991 Change-Id: Ia265b77e4e50f815d08f743a05669f9d75ad7a6f
2018-10-11	Add bare bones unsupported syscall logging	Fabricio Voznika
	This change introduces a new flags to create/run called --user-log. Logs to this files are visible to users and are meant to help debugging problems with their images and containers. For now only unsupported syscalls are sent to this log, and only minimum support was added. We can build more infrastructure around it as needed. PiperOrigin-RevId: 216735977 Change-Id: I54427ca194604991c407d49943ab3680470de2d0
2018-10-10	Add sandbox to cgroup	Fabricio Voznika
	Sandbox creation uses the limits and reservations configured in the OCI spec and set cgroup options accordinly. Then it puts both the sandbox and gofer processes inside the cgroup. It also allows the cgroup to be pre-configured by the caller. If the cgroup already exists, sandbox and gofer processes will join the cgroup but it will not modify the cgroup with spec limits. PiperOrigin-RevId: 216538209 Change-Id: If2c65ffedf55820baab743a0edcfb091b89c1019
2018-10-09	Add tests to verify gofer is chroot'ed	Fabricio Voznika
	PiperOrigin-RevId: 216472439 Change-Id: Ic4cb86c8e0a9cb022d3ceed9dc5615266c307cf9
2018-10-03	runsc: Allow state transition from Creating to Stopped.	Nicolas Lacasse
	This can happen if an error is encountered during Create() which causes the container to be destroyed and set to state Stopped. Without this transition, errors during Create get hidden by the later panic. PiperOrigin-RevId: 215599193 Change-Id: Icd3f42e12c685cbf042f46b3929bccdf30ad55b0
2018-10-01	runsc: Support job control signals in "exec -it".	Nicolas Lacasse
	Terminal support in runsc relies on host tty file descriptors that are imported into the sandbox. Application tty ioctls are sent directly to the host fd. However, those host tty ioctls are associated in the host kernel with a host process (in this case runsc), and the host kernel intercepts job control characters like ^C and send signals to the host process. Thus, typing ^C into a "runsc exec" shell will send a SIGINT to the runsc process. This change makes "runsc exec" handle all signals, and forward them into the sandbox via the "ContainerSignal" urpc method. Since the "runsc exec" is associated with a particular container process in the sandbox, the signal must be associated with the same container process. One big difficulty is that the signal should not necessarily be sent to the sandbox process started by "exec", but instead must be sent to the foreground process group for the tty. For example, we may exec "bash", and from bash call "sleep 100". A ^C at this point should SIGINT sleep, not bash. To handle this, tty files inside the sandbox must keep track of their foreground process group, which is set/get via ioctls. When an incoming ContainerSignal urpc comes in, we look up the foreground process group via the tty file. Unfortunately, this means we have to expose and cache the tty file in the Loader. Note that "runsc exec" now handles signals properly, but "runs run" does not. That will come in a later CL, as this one is complex enough already. Example: root@:/usr/local/apache2# sleep 100 ^C root@:/usr/local/apache2# sleep 100 ^Z [1]+ Stopped sleep 100 root@:/usr/local/apache2# fg sleep 100 ^C root@:/usr/local/apache2# PiperOrigin-RevId: 215334554 Change-Id: I53cdce39653027908510a5ba8d08c49f9cf24f39
2018-10-01	Make multi-container the default mode for runsc	Fabricio Voznika
	And remove multicontainer option. PiperOrigin-RevId: 215236981 Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
2018-09-30	Removed duplicate/stale TODOs	Fabricio Voznika
	PiperOrigin-RevId: 215162121 Change-Id: I35f06ac3235cf31c9e8a158dcf6261a7ded6c4c4
2018-09-28	runsc: allow `kill --all` when container is in stopped state.	Lantao Liu
	PiperOrigin-RevId: 215009105 Change-Id: I1ab12eddf7694c4db98f6dafca9dae352a33f7c4
2018-09-28	Make runsc kill and delete more conformant to the "spec"	Fabricio Voznika
	PiperOrigin-RevId: 214976251 Change-Id: I631348c3886f41f63d0e77e7c4f21b3ede2ab521
2018-09-27	Implement 'runsc kill --all'	Fabricio Voznika
	In order to implement kill --all correctly, the Sentry needs to track all tasks that belong to a given container. This change introduces ContainerID to the task, that gets inherited by all children. 'kill --all' then iterates over all tasks comparing the ContainerID field to find all processes that need to be signalled. PiperOrigin-RevId: 214841768 Change-Id: I693b2374be8692d88cc441ef13a0ae34abf73ac6
2018-09-21	The "action" in container.Signal should be "signal".	Nicolas Lacasse
	PiperOrigin-RevId: 214038776 Change-Id: I4ad212540ec4ef4fb5ab5fdcb7f0865c4f746895
2018-09-21	runsc: Synchronize container metadata changes with a file lock.	Nicolas Lacasse
	Each container has associated metadata (particularly the container status) that is manipulated by various runsc commands. This metadata is stored in a file identified by the container id. Different runsc processes may manipulate the same container metadata, and each will read/write to the metadata file. This CL adds a file lock per container which must be held when reading the container metadata file, and when modifying and writing the container metadata. PiperOrigin-RevId: 214019179 Change-Id: Ice4390ad233bc7f216c9a9a6cf05fb456c9ec0ad
2018-09-20	Set Sandbox.Chroot so it gets cleaned up upon destruction	Fabricio Voznika
	I've made several attempts to create a test, but the lack of permission from the test user makes it nearly impossible to test anything useful. PiperOrigin-RevId: 213922174 Change-Id: I5b502ca70cb7a6645f8836f028fb203354b4c625
2018-09-20	runsc: Fix a bug that `runsc wait` doesn't work after container exits.	Lantao Liu
	PiperOrigin-RevId: 213849165 Change-Id: I5120b2f568850c0c42a08e8706e7f8653ef1bd94
2018-09-19	Add container.Destroy urpc method.	Nicolas Lacasse
	This method will: 1. Stop the container process if it is still running. 2. Unmount all sanadbox-internal mounts for the container. 3. Delete the contaner root directory inside the sandbox. Destroy is idempotent, and safe to call concurrantly. This fixes a bug where after stopping a container, we cannot unmount the container root directory on the host. This bug occured because the sandbox dirent cache was holding a dirent with a host fd corresponding to a file inside the container root on the host. The dirent cache did not know that the container had exited, and kept the FD open, preventing us from unmounting on the host. Now that we unmount (and flush) all container mounts inside the sandbox, any host FDs donated by the gofer will be closed, and we can unmount the container root on the host. PiperOrigin-RevId: 213737693 Change-Id: I28c0ff4cd19a08014cdd72fec5154497e92aacc9
2018-09-18	Added state machine checks for Container.Status	Fabricio Voznika
	For my own sanitity when thinking about possible transitions and state. PiperOrigin-RevId: 213559482 Change-Id: I25588c86cf6098be4eda01f4e7321c102ceef33c
2018-09-18	Handle children processes better in tests	Fabricio Voznika
	Reap children more systematically in container tests. Previously, container_test was taking ~5 mins to run because constainer.Destroy() would timeout waiting for the sandbox process to exit. Now the test running in less than a minute. Also made the contract around Container and Sandbox destroy clearer. PiperOrigin-RevId: 213527471 Change-Id: Icca84ee1212bbdcb62bdfc9cc7b71b12c6d1688d
2018-09-17	runsc: Enable waiting on exited processes.	Kevin Krakauer
	This makes `runsc wait` behave more like waitpid()/wait4() in that: - Once a process has run to completion, you can wait on it and get its exit code. - Processes not waited on will consume memory (like a zombie process) PiperOrigin-RevId: 213358916 Change-Id: I5b5eca41ce71eea68e447380df8c38361a4d1558
2018-09-13	runsc: Support container signal/wait.	Lantao Liu
	This CL: 1) Fix `runsc wait`, it now also works after the container exits; 2) Generate correct container state in Load; 2) Make sure `Destory` cleanup everything before successfully return. PiperOrigin-RevId: 212900107 Change-Id: Ie129cbb9d74f8151a18364f1fc0b2603eac4109a
2018-09-12	runsc: Add exec flag that specifies where to save the sandbox-internal pid.	Kevin Krakauer
	This is different from the existing -pid-file flag, which saves a host pid. PiperOrigin-RevId: 212713968 Change-Id: I2c486de8dd5cfd9b923fb0970165ef7c5fc597f0
2018-09-07	Remove '--file-access=direct' option	Fabricio Voznika
	It was used before gofer was implemented and it's not supported anymore. BREAKING CHANGE: proxy-shared and proxy-exclusive options are now: shared and exclusive. PiperOrigin-RevId: 212017643 Change-Id: If029d4073fe60583e5ca25f98abb2953de0d78fd
2018-09-05	runsc: Support runsc kill multi-container.	Kevin Krakauer
	Now, we can kill individual containers rather than the entire sandbox. PiperOrigin-RevId: 211748106 Change-Id: Ic97e91db33d53782f838338c4a6d0aab7a313ead
2018-09-05	Enabled bind mounts in sub-containers	Fabricio Voznika
	With multi-gofers, bind mounts in sub-containers should just work. Removed restrictions and added test. There are also a few cleanups along the way, e.g. retry unmounting in case cleanup races with gofer teardown. PiperOrigin-RevId: 211699569 Change-Id: Ic0a69c29d7c31cd7e038909cc686c6ac98703374
2018-09-05	Running container should have a valid sandbox	Fabricio Voznika
	PiperOrigin-RevId: 211693868 Change-Id: Iea340dd78bf26ae6409c310b63c17cc611c2055f
2018-08-28	runsc: unmount volume mounts when destroy container.	Lantao Liu
	PiperOrigin-RevId: 210579178 Change-Id: Iae20639c5186b1a976cbff6d05bda134cd00d0da
2018-08-27	Put fsgofer inside chroot	Fabricio Voznika
	Now each container gets its own dedicated gofer that is chroot'd to the rootfs path. This is done to add an extra layer of security in case the gofer gets compromised. PiperOrigin-RevId: 210396476 Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0
2018-08-21	Initial change for multi-gofer support	Fabricio Voznika
	PiperOrigin-RevId: 209647293 Change-Id: I980fca1257ea3fcce796388a049c353b0303a8a5
2018-08-15	runsc fsgofer: Support dynamic serving of filesystems.	Kevin Krakauer
	When multiple containers run inside a sentry, each container has its own root filesystem and set of mounts. Containers are also added after sentry boot rather than all configured and known at boot time. The fsgofer needs to be able to serve the root filesystem of each container. Thus, it must be possible to add filesystems after the fsgofer has already started. This change: * Creates a URPC endpoint within the gofer process that listens for requests to serve new content. * Enables the sentry, when starting a new container, to add the new container's filesystem. * Mounts those new filesystems at separate roots within the sentry. PiperOrigin-RevId: 208903248 Change-Id: Ifa91ec9c8caf5f2f0a9eead83c4a57090ce92068
2018-07-18	Moved restore code out of create and made to be called after create.	Justine Olshan
	Docker expects containers to be created before they are restored. However, gVisor restoring requires specificactions regarding the kernel and the file system. These actions were originally in booting the sandbox. Now setting up the file system is deferred until a call to a call to runsc start. In the restore case, the kernel is destroyed and a new kernel is created in the same process, as we need the same process for Docker. These changes required careful execution of concurrent processes which required the use of a channel. Full docker integration still needs the ability to restore into the same container. PiperOrigin-RevId: 205161441 Change-Id: Ie1d2304ead7e06855319d5dc310678f701bd099f