summaryrefslogtreecommitdiffhomepage
path: root/runsc/container
AgeCommit message (Collapse)Author
2018-10-17runsc: Add --pid flag to runsc kill.Kevin Krakauer
--pid allows specific processes to be signalled rather than the container root process or all processes in the container. containerd needs to SIGKILL exec'd processes that timeout and check whether processes are still alive. PiperOrigin-RevId: 217547636 Change-Id: I2058ebb548b51c8eb748f5884fb88bad0b532e45
2018-10-16Bump sandbox start and stop timeouts.Nicolas Lacasse
PiperOrigin-RevId: 217433699 Change-Id: Icef08285728c23ee7dd650706aaf18da51c25dff
2018-10-11Make the gofer process enter namespacesFabricio Voznika
This is done to further isolate the gofer from the host. PiperOrigin-RevId: 216790991 Change-Id: Ia265b77e4e50f815d08f743a05669f9d75ad7a6f
2018-10-11Make Wait() return the sandbox exit status if the sandbox has exited.Nicolas Lacasse
It's possible for Start() and Wait() calls to race, if the sandboxed application is short-lived. If the application finishes before (or during) the Wait RPC, then Wait will fail. In practice this looks like "connection refused" or "EOF" errors when waiting for an RPC response. This race is especially bad in tests, where we often run "true" inside a sandbox. This CL does a best-effort fix, by returning the sandbox exit status as the container exit status. In most cases, these are the same. This fixes the remaining flakes in runsc/container:container_test. PiperOrigin-RevId: 216777793 Change-Id: I9dfc6e6ec885b106a736055bc7a75b2008dfff7a
2018-10-11Add bare bones unsupported syscall loggingFabricio Voznika
This change introduces a new flags to create/run called --user-log. Logs to this files are visible to users and are meant to help debugging problems with their images and containers. For now only unsupported syscalls are sent to this log, and only minimum support was added. We can build more infrastructure around it as needed. PiperOrigin-RevId: 216735977 Change-Id: I54427ca194604991c407d49943ab3680470de2d0
2018-10-10Add sandbox to cgroupFabricio Voznika
Sandbox creation uses the limits and reservations configured in the OCI spec and set cgroup options accordinly. Then it puts both the sandbox and gofer processes inside the cgroup. It also allows the cgroup to be pre-configured by the caller. If the cgroup already exists, sandbox and gofer processes will join the cgroup but it will not modify the cgroup with spec limits. PiperOrigin-RevId: 216538209 Change-Id: If2c65ffedf55820baab743a0edcfb091b89c1019
2018-10-09Add tests to verify gofer is chroot'edFabricio Voznika
PiperOrigin-RevId: 216472439 Change-Id: Ic4cb86c8e0a9cb022d3ceed9dc5615266c307cf9
2018-10-03runsc: Allow state transition from Creating to Stopped.Nicolas Lacasse
This can happen if an error is encountered during Create() which causes the container to be destroyed and set to state Stopped. Without this transition, errors during Create get hidden by the later panic. PiperOrigin-RevId: 215599193 Change-Id: Icd3f42e12c685cbf042f46b3929bccdf30ad55b0
2018-10-03Fix arithmetic error in multi_container_test.Nicolas Lacasse
We add an additional (2^3)-1=7 processes, but the code was only waiting for 3. I switched back to Math.Pow format to make the arithmetic easier to inspect. PiperOrigin-RevId: 215588140 Change-Id: Iccad4d6f977c1bfc5c4b08d3493afe553fe25733
2018-10-01runsc: Support job control signals in "exec -it".Nicolas Lacasse
Terminal support in runsc relies on host tty file descriptors that are imported into the sandbox. Application tty ioctls are sent directly to the host fd. However, those host tty ioctls are associated in the host kernel with a host process (in this case runsc), and the host kernel intercepts job control characters like ^C and send signals to the host process. Thus, typing ^C into a "runsc exec" shell will send a SIGINT to the runsc process. This change makes "runsc exec" handle all signals, and forward them into the sandbox via the "ContainerSignal" urpc method. Since the "runsc exec" is associated with a particular container process in the sandbox, the signal must be associated with the same container process. One big difficulty is that the signal should not necessarily be sent to the sandbox process started by "exec", but instead must be sent to the foreground process group for the tty. For example, we may exec "bash", and from bash call "sleep 100". A ^C at this point should SIGINT sleep, not bash. To handle this, tty files inside the sandbox must keep track of their foreground process group, which is set/get via ioctls. When an incoming ContainerSignal urpc comes in, we look up the foreground process group via the tty file. Unfortunately, this means we have to expose and cache the tty file in the Loader. Note that "runsc exec" now handles signals properly, but "runs run" does not. That will come in a later CL, as this one is complex enough already. Example: root@:/usr/local/apache2# sleep 100 ^C root@:/usr/local/apache2# sleep 100 ^Z [1]+ Stopped sleep 100 root@:/usr/local/apache2# fg sleep 100 ^C root@:/usr/local/apache2# PiperOrigin-RevId: 215334554 Change-Id: I53cdce39653027908510a5ba8d08c49f9cf24f39
2018-10-01Make multi-container the default mode for runscFabricio Voznika
And remove multicontainer option. PiperOrigin-RevId: 215236981 Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
2018-09-30Don't fail if Root is readonly and is not a mount pointFabricio Voznika
This makes runsc more friendly to run without docker or K8s. PiperOrigin-RevId: 215165586 Change-Id: Id45a9fc24a3c09b1645f60dbaf70e64711a7a4cd
2018-09-30Removed duplicate/stale TODOsFabricio Voznika
PiperOrigin-RevId: 215162121 Change-Id: I35f06ac3235cf31c9e8a158dcf6261a7ded6c4c4
2018-09-28Add test for 'signall --all' with stopped containerFabricio Voznika
PiperOrigin-RevId: 215025517 Change-Id: I04b9d8022b3d9dfe279e466ddb91310b9860b9af
2018-09-28runsc: allow `kill --all` when container is in stopped state.Lantao Liu
PiperOrigin-RevId: 215009105 Change-Id: I1ab12eddf7694c4db98f6dafca9dae352a33f7c4
2018-09-28Make runsc kill and delete more conformant to the "spec"Fabricio Voznika
PiperOrigin-RevId: 214976251 Change-Id: I631348c3886f41f63d0e77e7c4f21b3ede2ab521
2018-09-27Move common test code to functionFabricio Voznika
PiperOrigin-RevId: 214890335 Change-Id: I42743f0ce46a5a42834133bce2f32d187194fc87
2018-09-27Implement 'runsc kill --all'Fabricio Voznika
In order to implement kill --all correctly, the Sentry needs to track all tasks that belong to a given container. This change introduces ContainerID to the task, that gets inherited by all children. 'kill --all' then iterates over all tasks comparing the ContainerID field to find all processes that need to be signalled. PiperOrigin-RevId: 214841768 Change-Id: I693b2374be8692d88cc441ef13a0ae34abf73ac6
2018-09-27Move uds_test_app to common test_appFabricio Voznika
This was done so it's easier to add more functionality to this file for other tests. PiperOrigin-RevId: 214782043 Change-Id: I1f38b9ee1219b3ce7b789044ada8e52bdc1e6279
2018-09-21Run gofmt -s on everythingIan Gudger
PiperOrigin-RevId: 214040901 Change-Id: I74d79497a053da3624921ad2b7c5193ca4a87942
2018-09-21The "action" in container.Signal should be "signal".Nicolas Lacasse
PiperOrigin-RevId: 214038776 Change-Id: I4ad212540ec4ef4fb5ab5fdcb7f0865c4f746895
2018-09-21runsc: Synchronize container metadata changes with a file lock.Nicolas Lacasse
Each container has associated metadata (particularly the container status) that is manipulated by various runsc commands. This metadata is stored in a file identified by the container id. Different runsc processes may manipulate the same container metadata, and each will read/write to the metadata file. This CL adds a file lock per container which must be held when reading the container metadata file, and when modifying and writing the container metadata. PiperOrigin-RevId: 214019179 Change-Id: Ice4390ad233bc7f216c9a9a6cf05fb456c9ec0ad
2018-09-20Set Sandbox.Chroot so it gets cleaned up upon destructionFabricio Voznika
I've made several attempts to create a test, but the lack of permission from the test user makes it nearly impossible to test anything useful. PiperOrigin-RevId: 213922174 Change-Id: I5b502ca70cb7a6645f8836f028fb203354b4c625
2018-09-20runsc: allow `runsc wait` on a container for multiple times.Lantao Liu
PiperOrigin-RevId: 213908919 Change-Id: I74eff99a5360bb03511b946f4cb5658bb5fc40c7
2018-09-20runsc: Fix a bug that `runsc wait` doesn't work after container exits.Lantao Liu
PiperOrigin-RevId: 213849165 Change-Id: I5120b2f568850c0c42a08e8706e7f8653ef1bd94
2018-09-19Add container.Destroy urpc method.Nicolas Lacasse
This method will: 1. Stop the container process if it is still running. 2. Unmount all sanadbox-internal mounts for the container. 3. Delete the contaner root directory inside the sandbox. Destroy is idempotent, and safe to call concurrantly. This fixes a bug where after stopping a container, we cannot unmount the container root directory on the host. This bug occured because the sandbox dirent cache was holding a dirent with a host fd corresponding to a file inside the container root on the host. The dirent cache did not know that the container had exited, and kept the FD open, preventing us from unmounting on the host. Now that we unmount (and flush) all container mounts inside the sandbox, any host FDs donated by the gofer will be closed, and we can unmount the container root on the host. PiperOrigin-RevId: 213737693 Change-Id: I28c0ff4cd19a08014cdd72fec5154497e92aacc9
2018-09-19runsc: Mark container_test flaky.Kevin Krakauer
PiperOrigin-RevId: 213732520 Change-Id: Ife292987ec8b1de4c2e7e3b7d4452b00c1582e91
2018-09-18Added state machine checks for Container.StatusFabricio Voznika
For my own sanitity when thinking about possible transitions and state. PiperOrigin-RevId: 213559482 Change-Id: I25588c86cf6098be4eda01f4e7321c102ceef33c
2018-09-18Handle children processes better in testsFabricio Voznika
Reap children more systematically in container tests. Previously, container_test was taking ~5 mins to run because constainer.Destroy() would timeout waiting for the sandbox process to exit. Now the test running in less than a minute. Also made the contract around Container and Sandbox destroy clearer. PiperOrigin-RevId: 213527471 Change-Id: Icca84ee1212bbdcb62bdfc9cc7b71b12c6d1688d
2018-09-17Rename container in testFabricio Voznika
's' used to stand for sandbox, before container exited. PiperOrigin-RevId: 213390641 Change-Id: I7bda94a50398c46721baa92227e32a7a1d817412
2018-09-17runsc: Enable waiting on exited processes.Kevin Krakauer
This makes `runsc wait` behave more like waitpid()/wait4() in that: - Once a process has run to completion, you can wait on it and get its exit code. - Processes not waited on will consume memory (like a zombie process) PiperOrigin-RevId: 213358916 Change-Id: I5b5eca41ce71eea68e447380df8c38361a4d1558
2018-09-13runsc: Support container signal/wait.Lantao Liu
This CL: 1) Fix `runsc wait`, it now also works after the container exits; 2) Generate correct container state in Load; 2) Make sure `Destory` cleanup everything before successfully return. PiperOrigin-RevId: 212900107 Change-Id: Ie129cbb9d74f8151a18364f1fc0b2603eac4109a
2018-09-12runsc: Add exec flag that specifies where to save the sandbox-internal pid.Kevin Krakauer
This is different from the existing -pid-file flag, which saves a host pid. PiperOrigin-RevId: 212713968 Change-Id: I2c486de8dd5cfd9b923fb0970165ef7c5fc597f0
2018-09-07Remove '--file-access=direct' optionFabricio Voznika
It was used before gofer was implemented and it's not supported anymore. BREAKING CHANGE: proxy-shared and proxy-exclusive options are now: shared and exclusive. PiperOrigin-RevId: 212017643 Change-Id: If029d4073fe60583e5ca25f98abb2953de0d78fd
2018-09-07runsc: Run sandbox process inside minimal chroot.Nicolas Lacasse
We construct a dir with the executable bind-mounted at /exe, and proc mounted at /proc. Runsc now executes the sandbox process inside this chroot, thus limiting access to the host filesystem. The mounts and chroot dir are removed when the sandbox is destroyed. Because this requires bind-mounts, we can only do the chroot if we have CAP_SYS_ADMIN. PiperOrigin-RevId: 211994001 Change-Id: Ia71c515e26085e0b69b833e71691830148bc70d1
2018-09-06runsc testing: Move TestMultiContainerSignal to multi_container_test.Kevin Krakauer
PiperOrigin-RevId: 211831396 Change-Id: Id67f182cb43dccb696180ec967f5b96176f252e0
2018-09-05runsc: Support runsc kill multi-container.Kevin Krakauer
Now, we can kill individual containers rather than the entire sandbox. PiperOrigin-RevId: 211748106 Change-Id: Ic97e91db33d53782f838338c4a6d0aab7a313ead
2018-09-05Enabled bind mounts in sub-containersFabricio Voznika
With multi-gofers, bind mounts in sub-containers should just work. Removed restrictions and added test. There are also a few cleanups along the way, e.g. retry unmounting in case cleanup races with gofer teardown. PiperOrigin-RevId: 211699569 Change-Id: Ic0a69c29d7c31cd7e038909cc686c6ac98703374
2018-09-05Running container should have a valid sandboxFabricio Voznika
PiperOrigin-RevId: 211693868 Change-Id: Iea340dd78bf26ae6409c310b63c17cc611c2055f
2018-09-05Move multi-container test to a single fileFabricio Voznika
PiperOrigin-RevId: 211685288 Change-Id: I7872f2a83fcaaa54f385e6e567af6e72320c5aa0
2018-09-05runsc: Promote getExecutablePathInternal to getExecutablePath.Nicolas Lacasse
Remove GetExecutablePath (the non-internal version). This makes path handling more consistent between exec, root, and child containers. The new getExecutablePath now uses MountNamespace.FindInode, which is more robust than Walking the Dirent tree ourselves. This also removes the last use of lstat(2) in the sentry, so that can be removed from the filters. PiperOrigin-RevId: 211683110 Change-Id: Ic8ec960fc1c267aa7d310b8efe6e900c88a9207a
2018-09-04runsc: fix container rootfs path.Lantao Liu
PiperOrigin-RevId: 211515350 Change-Id: Ia495af57447c799909aa97bb873a50b87bee2625
2018-08-31Mounting over '/tmp' may failFabricio Voznika
PiperOrigin-RevId: 211160120 Change-Id: Ie5f280bdac17afd01cb16562ffff6222b3184c34
2018-08-31runsc: Set volume mount rslave.Lantao Liu
PiperOrigin-RevId: 211111376 Change-Id: I27b8cb4e070d476fa4781ed6ecfa0cf1dcaf85f5
2018-08-28runsc: unmount volume mounts when destroy container.Lantao Liu
PiperOrigin-RevId: 210579178 Change-Id: Iae20639c5186b1a976cbff6d05bda134cd00d0da
2018-08-27runsc: Fix readonly filesystem causing failure to create containers.Kevin Krakauer
For readonly filesystems specified via relative path, we were forgetting to mount relative to the container's bundle directory. PiperOrigin-RevId: 210483388 Change-Id: I84809fce4b1f2056d0e225547cb611add5f74177
2018-08-27fs: Fix remote-revalidate cache policy.Nicolas Lacasse
When revalidating a Dirent, if the inode id is the same, then we don't need to throw away the entire Dirent. We can just update the unstable attributes in place. If the inode id has changed, then the remote file has been deleted or moved, and we have no choice but to throw away the dirent we have a look up another. In this case, we may still end up losing a mounted dirent that is a child of the revalidated dirent. However, that seems appropriate here because the entire mount point has been pulled out from underneath us. Because gVisor's overlay is at the Inode level rather than the Dirent level, we must pass the parent Inode and name along with the Inode that is being revalidated. PiperOrigin-RevId: 210431270 Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42
2018-08-27Put fsgofer inside chrootFabricio Voznika
Now each container gets its own dedicated gofer that is chroot'd to the rootfs path. This is done to add an extra layer of security in case the gofer gets compromised. PiperOrigin-RevId: 210396476 Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0
2018-08-22runsc: De-flakes container_test TestMultiContainerSanity.Kevin Krakauer
The bug was caused by os.File's finalizer, which closes the file. Because fsgofer.serve() was passed a file descriptor as an int rather than a os.File, callers would pass os.File.Fd(), and the os.File would go out of scope. Thus, the file would get GC'd and finalized nondeterministically, causing failures when the file was used. PiperOrigin-RevId: 209861834 Change-Id: Idf24d5c1f04c9b28659e62c97202ab3b4d72e994
2018-08-21Fix TestUnixDomainSockets failure when path is too largeFabricio Voznika
UDS has a lower size limit than regular files. When running under bazel this limit is exceeded. Test was changed to always mount /tmp and use it for the test. PiperOrigin-RevId: 209717830 Change-Id: I1dbe19fe2051ffdddbaa32b188a9167f446ed193