gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2018-10-15	Never send boot process stdio to application stdio.	Nicolas Lacasse
	We treat handle the boot process stdio separately from the application stdio (which gets passed via flags), but we were still sending both to same place. As a result, some logs that are written directly to os.Stderr by the boot process were ending up in the application logs. This CL starts sendind boot process stdio to the null device (since we don't have any better options). The boot process is already configured to send all logs (and panics) to the log file, so we won't miss anything important. PiperOrigin-RevId: 217173020 Change-Id: I5ab980da037f34620e7861a3736ba09c18d73794
2018-10-12	Added spec command to create OCI spec config.json	Ian Lewis
	The spec command is analygous to the 'runc spec' command and allows for the convenient creation of a config.json file for users that don't have runc handy. Change-Id: Ifdfec37e023048ea461c32da1a9042a45b37d856 PiperOrigin-RevId: 216907826
2018-10-11	Make the gofer process enter namespaces	Fabricio Voznika
	This is done to further isolate the gofer from the host. PiperOrigin-RevId: 216790991 Change-Id: Ia265b77e4e50f815d08f743a05669f9d75ad7a6f
2018-10-11	Fix reference leak in tests.	Nicolas Lacasse
	PiperOrigin-RevId: 216780438 Change-Id: Ide637fe36f8d2a61fea9e5b16d1b3401f2540416
2018-10-11	Make Wait() return the sandbox exit status if the sandbox has exited.	Nicolas Lacasse
	It's possible for Start() and Wait() calls to race, if the sandboxed application is short-lived. If the application finishes before (or during) the Wait RPC, then Wait will fail. In practice this looks like "connection refused" or "EOF" errors when waiting for an RPC response. This race is especially bad in tests, where we often run "true" inside a sandbox. This CL does a best-effort fix, by returning the sandbox exit status as the container exit status. In most cases, these are the same. This fixes the remaining flakes in runsc/container:container_test. PiperOrigin-RevId: 216777793 Change-Id: I9dfc6e6ec885b106a736055bc7a75b2008dfff7a
2018-10-11	Make debug log file name configurable	Fabricio Voznika
	This is a breaking change if you're using --debug-log-dir. The fix is to replace it with --debug-log and add a '/' at the end: --debug-log-dir=/tmp/runsc ==> --debug-log=/tmp/runsc/ PiperOrigin-RevId: 216761212 Change-Id: I244270a0a522298c48115719fa08dad55e34ade1
2018-10-11	Sandbox cgroup tests	Fabricio Voznika
	Verify that cgroup is being properly set. PiperOrigin-RevId: 216736137 Change-Id: I0e27fd604eca67e7dd2e3548dc372ca9cc416309
2018-10-11	Add bare bones unsupported syscall logging	Fabricio Voznika
	This change introduces a new flags to create/run called --user-log. Logs to this files are visible to users and are meant to help debugging problems with their images and containers. For now only unsupported syscalls are sent to this log, and only minimum support was added. We can build more infrastructure around it as needed. PiperOrigin-RevId: 216735977 Change-Id: I54427ca194604991c407d49943ab3680470de2d0
2018-10-10	Removes irrelevant TODO.	Kevin Krakauer
	PiperOrigin-RevId: 216616873 Change-Id: I4d974ab968058eadd01542081e18a987ef08f50a
2018-10-10	runsc: Pass controlling TTY by FD in the new process, not current process.	Nicolas Lacasse
	When setting Cmd.SysProcAttr.Ctty, the FD must be the FD of the controlling TTY in the new process, not the current process. The ioctl call is made after duping all FDs in Cmd.ExtraFiles, which may stomp on the old TTY FD. This fixes the "bad address" flakes in runsc/container:container_test, although some other flakes remain. PiperOrigin-RevId: 216594394 Change-Id: Idfd1677abb866aa82ad7e8be776f0c9087256862
2018-10-10	Support for older Linux kernels without getrandom	Jonathan Giannuzzi
	Change-Id: I1fb9f5b47a264a7617912f6f56f995f3c4c5e578 PiperOrigin-RevId: 216591484
2018-10-10	Enforce message size limits and avoid host calls with too many iovecs	Michael Pratt
	Currently, in the face of FileMem fragmentation and a large sendmsg or recvmsg call, host sockets may pass > 1024 iovecs to the host, which will immediately cause the host to return EMSGSIZE. When we detect this case, use a single intermediate buffer to pass to the kernel, copying to/from the src/dst buffer. To avoid creating unbounded intermediate buffers, enforce message size checks and truncation w.r.t. the send buffer size. The same functionality is added to netstack unix sockets for feature parity. PiperOrigin-RevId: 216590198 Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
2018-10-10	Add sandbox to cgroup	Fabricio Voznika
	Sandbox creation uses the limits and reservations configured in the OCI spec and set cgroup options accordinly. Then it puts both the sandbox and gofer processes inside the cgroup. It also allows the cgroup to be pre-configured by the caller. If the cgroup already exists, sandbox and gofer processes will join the cgroup but it will not modify the cgroup with spec limits. PiperOrigin-RevId: 216538209 Change-Id: If2c65ffedf55820baab743a0edcfb091b89c1019
2018-10-09	Add tests to verify gofer is chroot'ed	Fabricio Voznika
	PiperOrigin-RevId: 216472439 Change-Id: Ic4cb86c8e0a9cb022d3ceed9dc5615266c307cf9
2018-10-09	Add new netstack metrics to the sentry	Ian Gudger
	PiperOrigin-RevId: 216431260 Change-Id: Ia6e5c8d506940148d10ff2884cf4440f470e5820
2018-10-08	Job control signals must be sent to all processes in the FG process group.	Nicolas Lacasse
	We were previously only sending to the originator of the process group. Integration test was changed to test this behavior. It fails without the corresponding code change. PiperOrigin-RevId: 216297263 Change-Id: I7e41cfd6bdd067f4b9dc215e28f555fb5088916f
2018-10-08	Uncapitalize error	Michael Pratt
	PiperOrigin-RevId: 216281263 Change-Id: Ie0c189e7f5934b77c6302336723bc1181fd2866c
2018-10-04	Capture boot panics in debug log.	Nicolas Lacasse
	Docker and Containerd both eat the boot processes stderr, making it difficult to track down panics (which are always written to stderr). This CL makes the boot process dup its debug log FD to stderr, so that panics will be captured in the debug log, which is better than nothing. This is the 3rd try at this CL. Previous attempts were foiled because Docker expects the 'create' command to pass its stdio directly to the container, so duping stderr in 'create' caused the applications stderr to go to the log file, which breaks many applications (including our mysql test). I added a new image_test that makes sure stdout and stderr are handled correctly. PiperOrigin-RevId: 215767328 Change-Id: Icebac5a5dcf39b623b79d7a0e2f968e059130059
2018-10-03	Fix sandbox chroot	Fabricio Voznika
	Sandbox was setting chroot, but was not chaging the working dir. Added test to ensure this doesn't happen in the future. PiperOrigin-RevId: 215676270 Change-Id: I14352d3de64a4dcb90e50948119dc8328c9c15e1
2018-10-03	Automated rollback of changelist 215585559	Nicolas Lacasse
	PiperOrigin-RevId: 215633475 Change-Id: I7bc471e3b9a2c725fb5e15b3bbcba2ee1ea574b1
2018-10-03	runsc: Allow state transition from Creating to Stopped.	Nicolas Lacasse
	This can happen if an error is encountered during Create() which causes the container to be destroyed and set to state Stopped. Without this transition, errors during Create get hidden by the later panic. PiperOrigin-RevId: 215599193 Change-Id: Icd3f42e12c685cbf042f46b3929bccdf30ad55b0
2018-10-03	Fix arithmetic error in multi_container_test.	Nicolas Lacasse
	We add an additional (2^3)-1=7 processes, but the code was only waiting for 3. I switched back to Math.Pow format to make the arithmetic easier to inspect. PiperOrigin-RevId: 215588140 Change-Id: Iccad4d6f977c1bfc5c4b08d3493afe553fe25733
2018-10-03	runsc: Dup debug log file to stderr, so sentry panics don't get lost.	Nicolas Lacasse
	Docker and containerd do not expose runsc's stderr, so tracking down sentry panics can be painful. If we have a debug log file, we should send panics (and all stderr data) to the log file. PiperOrigin-RevId: 215585559 Change-Id: I3844259ed0cd26e26422bcdb40dded302740b8b6
2018-10-03	runsc: Pass root container's stdio via FD.	Nicolas Lacasse
	We were previously using the sandbox process's stdio as the root container's stdio. This makes it difficult/impossible to distinguish output application output from sandbox output, such as panics, which are always written to stderr. Also close the console socket when we are done with it. PiperOrigin-RevId: 215585180 Change-Id: I980b8c69bd61a8b8e0a496fd7bc90a06446764e0
2018-10-03	Add TIOCINQ to allowed seccomp when hostinet is used	Fabricio Voznika
	PiperOrigin-RevId: 215574070 Change-Id: Ib36e804adebaf756adb9cbc2752be9789691530b
2018-10-02	Bump some timeouts in the image tests.	Nicolas Lacasse
	PiperOrigin-RevId: 215489101 Change-Id: Iaf96aa8edb1101b70548030c62995841215237d9
2018-10-02	Fix compilation bug.	Nicolas Lacasse
	Docker.Run only returns a single argument. PiperOrigin-RevId: 215427309 Change-Id: I1eebbc628853ca57f79d25e18d4f04dfa5a2a003
2018-10-01	runsc: Support job control signals in "exec -it".	Nicolas Lacasse
	Terminal support in runsc relies on host tty file descriptors that are imported into the sandbox. Application tty ioctls are sent directly to the host fd. However, those host tty ioctls are associated in the host kernel with a host process (in this case runsc), and the host kernel intercepts job control characters like ^C and send signals to the host process. Thus, typing ^C into a "runsc exec" shell will send a SIGINT to the runsc process. This change makes "runsc exec" handle all signals, and forward them into the sandbox via the "ContainerSignal" urpc method. Since the "runsc exec" is associated with a particular container process in the sandbox, the signal must be associated with the same container process. One big difficulty is that the signal should not necessarily be sent to the sandbox process started by "exec", but instead must be sent to the foreground process group for the tty. For example, we may exec "bash", and from bash call "sleep 100". A ^C at this point should SIGINT sleep, not bash. To handle this, tty files inside the sandbox must keep track of their foreground process group, which is set/get via ioctls. When an incoming ContainerSignal urpc comes in, we look up the foreground process group via the tty file. Unfortunately, this means we have to expose and cache the tty file in the Loader. Note that "runsc exec" now handles signals properly, but "runs run" does not. That will come in a later CL, as this one is complex enough already. Example: root@:/usr/local/apache2# sleep 100 ^C root@:/usr/local/apache2# sleep 100 ^Z [1]+ Stopped sleep 100 root@:/usr/local/apache2# fg sleep 100 ^C root@:/usr/local/apache2# PiperOrigin-RevId: 215334554 Change-Id: I53cdce39653027908510a5ba8d08c49f9cf24f39
2018-10-01	Fix ruby image tests.	Nicolas Lacasse
	PiperOrigin-RevId: 215274663 Change-Id: I051721f459084db3aa608432831170cd47ae7df0
2018-10-01	Make multi-container the default mode for runsc	Fabricio Voznika
	And remove multicontainer option. PiperOrigin-RevId: 215236981 Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
2018-09-30	Don't fail if Root is readonly and is not a mount point	Fabricio Voznika
	This makes runsc more friendly to run without docker or K8s. PiperOrigin-RevId: 215165586 Change-Id: Id45a9fc24a3c09b1645f60dbaf70e64711a7a4cd
2018-09-30	Removed duplicate/stale TODOs	Fabricio Voznika
	PiperOrigin-RevId: 215162121 Change-Id: I35f06ac3235cf31c9e8a158dcf6261a7ded6c4c4
2018-09-28	Add test for 'signall --all' with stopped container	Fabricio Voznika
	PiperOrigin-RevId: 215025517 Change-Id: I04b9d8022b3d9dfe279e466ddb91310b9860b9af
2018-09-28	Made a few changes to make testutil.Docker easier to use	Fabricio Voznika
	PiperOrigin-RevId: 215023376 Change-Id: I139569bd15c013e5dd0f60d0c98a64eaa0ba9e8e
2018-09-28	runsc: allow `kill --all` when container is in stopped state.	Lantao Liu
	PiperOrigin-RevId: 215009105 Change-Id: I1ab12eddf7694c4db98f6dafca9dae352a33f7c4
2018-09-28	Add ruby image tests	Fabricio Voznika
	PiperOrigin-RevId: 215009066 Change-Id: I54ab920fa649cf4d0817f7cb8ea76f9126523330
2018-09-28	Make runsc kill and delete more conformant to the "spec"	Fabricio Voznika
	PiperOrigin-RevId: 214976251 Change-Id: I631348c3886f41f63d0e77e7c4f21b3ede2ab521
2018-09-28	Change tcpip.Route.Mask to tcpip.AddressMask.	Googler
	PiperOrigin-RevId: 214975659 Change-Id: I7bd31a2c54f03ff52203109da312e4206701c44c
2018-09-28	Switch to root in userns when CAP_SYS_CHROOT is also missing	Fabricio Voznika
	Some tests check current capabilities and re-run the tests as root inside userns if required capabibilities are missing. It was checking for CAP_SYS_ADMIN only, CAP_SYS_CHROOT is also required now. PiperOrigin-RevId: 214949226 Change-Id: Ic81363969fa76c04da408fae8ea7520653266312
2018-09-27	Merge Loader.containerRootTGs and execProcess into a single map	Fabricio Voznika
	It's easier to manage a single map with processes that we're interested to track. This will make the next change to clean up the map on destroy easier. PiperOrigin-RevId: 214894210 Change-Id: I099247323a0487cd0767120df47ba786fac0926d
2018-09-27	Move common test code to function	Fabricio Voznika
	PiperOrigin-RevId: 214890335 Change-Id: I42743f0ce46a5a42834133bce2f32d187194fc87
2018-09-27	Forward ioctl(TCSETSF) calls on host ttys to the host kernel.	Nicolas Lacasse
	We already forward TCSETS and TCSETSW. TCSETSF is roughly equivalent but discards pending input. The filters were relaxed to allow host ioctls with TCSETSF argument. This fixes programs like "passwd" that prevent user input from being displayed on the terminal. Before: root@b8a0240fc836:/# passwd Enter new UNIX password: 123 Retype new UNIX password: 123 passwd: password updated successfully After: root@ae6f5dabe402:/# passwd Enter new UNIX password: Retype new UNIX password: passwd: password updated successfully PiperOrigin-RevId: 214869788 Change-Id: I31b4d1373c1388f7b51d0f2f45ce40aa8e8b0b58
2018-09-27	Implement 'runsc kill --all'	Fabricio Voznika
	In order to implement kill --all correctly, the Sentry needs to track all tasks that belong to a given container. This change introduces ContainerID to the task, that gets inherited by all children. 'kill --all' then iterates over all tasks comparing the ContainerID field to find all processes that need to be signalled. PiperOrigin-RevId: 214841768 Change-Id: I693b2374be8692d88cc441ef13a0ae34abf73ac6
2018-09-27	Refactor 'runsc boot' to take container ID as argument	Fabricio Voznika
	This makes the flow slightly simpler (no need to call Loader.SetRootContainer). And this is required change to tag tasks with container ID inside the Sentry. PiperOrigin-RevId: 214795210 Change-Id: I6ff4af12e73bb07157f7058bb15fd5bb88760884
2018-09-27	Move uds_test_app to common test_app	Fabricio Voznika
	This was done so it's easier to add more functionality to this file for other tests. PiperOrigin-RevId: 214782043 Change-Id: I1f38b9ee1219b3ce7b789044ada8e52bdc1e6279
2018-09-26	runsc: fix pid file race condition in exec detach mode.	Lantao Liu
	PiperOrigin-RevId: 214700295 Change-Id: I73d8490572eebe5da584af91914650d1953aeb91
2018-09-24	runsc: All non-root bind mounts should be shared.	Nicolas Lacasse
	This CL changes the semantics of the "--file-access" flag so that it only affects the root filesystem. The default remains "exclusive" which is the common use case, as neither Docker nor K8s supports sharing the root. Keeping the root fs as "exclusive" means that the fs-intensive work done during application startup will mostly be cacheable, and thus faster. Non-root bind mounts will always be shared. This CL also removes some redundant FSAccessType validations. We validate this flag in main(), so we can assume it is valid afterwards. PiperOrigin-RevId: 214359936 Change-Id: I7e75d7bf52dbd7fa834d0aacd4034868314f3b51
2018-09-21	Run gofmt -s on everything	Ian Gudger
	PiperOrigin-RevId: 214040901 Change-Id: I74d79497a053da3624921ad2b7c5193ca4a87942
2018-09-21	The "action" in container.Signal should be "signal".	Nicolas Lacasse
	PiperOrigin-RevId: 214038776 Change-Id: I4ad212540ec4ef4fb5ab5fdcb7f0865c4f746895
2018-09-21	runsc: Synchronize container metadata changes with a file lock.	Nicolas Lacasse
	Each container has associated metadata (particularly the container status) that is manipulated by various runsc commands. This metadata is stored in a file identified by the container id. Different runsc processes may manipulate the same container metadata, and each will read/write to the metadata file. This CL adds a file lock per container which must be held when reading the container metadata file, and when modifying and writing the container metadata. PiperOrigin-RevId: 214019179 Change-Id: Ice4390ad233bc7f216c9a9a6cf05fb456c9ec0ad