summaryrefslogtreecommitdiffhomepage
path: root/runsc
AgeCommit message (Collapse)Author
2019-02-14Don't allow writing or reading to TTY unless process group is in foreground.Nicolas Lacasse
If a background process tries to read from a TTY, linux sends it a SIGTTIN unless the signal is blocked or ignored, or the process group is an orphan, in which case the syscall returns EIO. See drivers/tty/n_tty.c:n_tty_read()=>job_control(). If a background process tries to write a TTY, set the termios, or set the foreground process group, linux then sends a SIGTTOU. If the signal is ignored or blocked, linux allows the write. If the process group is an orphan, the syscall returns EIO. See drivers/tty/tty_io.c:tty_check_change(). PiperOrigin-RevId: 234044367 Change-Id: I009461352ac4f3f11c5d42c43ac36bb0caa580f9
2019-02-13Add support for using PACKET_RX_RING to receive packets.Bhasker Hariharan
PACKET_RX_RING allows the use of an mmapped buffer to receive packets from the kernel. This should cut down the number of host syscalls that need to be made to receive packets when the underlying fd is a socket of the AF_PACKET type. PiperOrigin-RevId: 233834998 Change-Id: I8060025c6ced206986e94cc46b8f382b81bfa47f
2019-02-01Factor the subtargets method into a helper method with tests.Nicolas Lacasse
PiperOrigin-RevId: 232047515 Change-Id: I00f036816e320356219be7b2f2e6d5fe57583a60
2019-01-31gvisor/gofer: Use pivot_root instead of chrootAndrei Vagin
PiperOrigin-RevId: 231864273 Change-Id: I8545b72b615f5c2945df374b801b80be64ec3e13
2019-01-31Remove license commentsMichael Pratt
Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-31runsc: check whether a container is deleted or not before setupContainerFSAndrei Vagin
PiperOrigin-RevId: 231811387 Change-Id: Ib143fb9a4d0fa1f105d1a3a3bd533dfc44e792af
2019-01-29runsc: reap a sandbox process only in sandbox.Wait()Andrei Vagin
PiperOrigin-RevId: 231504064 Change-Id: I585b769aef04a3ad7e7936027958910a6eed9c8d
2019-01-29Use recvmmsg() instead of readv() to read packets from NIC.Bhasker Hariharan
This should reduce the number of syscalls required to process packets significantly and improve throughputs. PiperOrigin-RevId: 231366886 Change-Id: I8b38077262bf9c53176bc4a94b530188d3d7c0ca
2019-01-28check isRootNS by ns inodeShijiang Wei
Signed-off-by: Shijiang Wei <mountkin@gmail.com> Change-Id: I032f834edae5c716fb2d3538285eec07aa11a902 PiperOrigin-RevId: 231318438
2019-01-28runsc: Only uninstall cgroup for sandbox stop.Lantao Liu
PiperOrigin-RevId: 231263114 Change-Id: I57467a34fe94e395fdd3685462c4fe9776d040a3
2019-01-25Make cacheRemoteRevalidating detect changes to file sizeFabricio Voznika
When file size changes outside the sandbox, page cache was not refreshing file size which is required for cacheRemoteRevalidating. In fact, cacheRemoteRevalidating should be skipping the cache completely since it's not really benefiting from it. The cache is cache is already bypassed for unstable attributes (see cachePolicy.cacheUAttrs). And althought the cache is called to map pages, they will always miss the cache and map directly from the host. Created a HostMappable struct that maps directly to the host and use it for files with cacheRemoteRevalidating. Closes #124 PiperOrigin-RevId: 230998440 Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc
2019-01-25Fix a nil pointer dereference bug in Container.Destroy()ShiruRen
In Container.Destroy(), we call c.stop() before calling executeHooksBestEffort(), therefore, when we call executeHooksBestEffort(c.Spec.Hooks.Poststop, c.State()) to execute the poststop hook, it results in a nil pointer dereference since it reads c.Sandbox.Pid in c.State() after the sandbox has been destroyed. To fix this bug, we can change container's status to "stopped" before executing the poststop hook. Signed-off-by: ShiruRen <renshiru2000@gmail.com> Change-Id: I4d835e430066fab7e599e188f945291adfc521ef PiperOrigin-RevId: 230975505
2019-01-25Execute statically linked binaryFabricio Voznika
Mounting lib and lib64 are not necessary anymore and simplifies the test. PiperOrigin-RevId: 230971195 Change-Id: Ib91a3ffcec4b322cd3687c337eedbde9641685ed
2019-01-22Don't bind-mount runsc into a sandbox mntnsAndrei Vagin
PiperOrigin-RevId: 230437407 Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b
2019-01-18Scrub runsc error messagesFabricio Voznika
Removed "error" and "failed to" prefix that don't add value from messages. Adjusted a few other messages. In particular, when the container fail to start, the message returned is easier for humans to read: $ docker run --rm --runtime=runsc alpine foobar docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory Closes #77 PiperOrigin-RevId: 230022798 Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
2019-01-18Start a sandbox process in a new userns only if CAP_SETUID is setAndrei Vagin
In addition, it fixes a race condition in TestMultiContainerGoferStop. There are two scripts copy the same set of files into the same directory and sometime one of this command fails with EXIST. PiperOrigin-RevId: 230011247 Change-Id: I9289f72e65dc407cdcd0e6cd632a509e01f43e9c
2019-01-18runsc: create a new proc mount if the sandbox process is running in a new pidnsAndrei Vagin
PiperOrigin-RevId: 229971902 Change-Id: Ief4fac731e839ef092175908de9375d725eaa3aa
2019-01-16Prevent internal tmpfs mount to override files in /tmpFabricio Voznika
Runsc wants to mount /tmp using internal tmpfs implementation for performance. However, it risks hiding files that may exist under /tmp in case it's present in the container. Now, it only mounts over /tmp iff: - /tmp was not explicitly asked to be mounted - /tmp is empty If any of this is not true, then /tmp maps to the container's image /tmp. Note: checkpoint doesn't have sentry FS mounted to check if /tmp is empty. It simply looks for explicit mounts right now. PiperOrigin-RevId: 229607856 Change-Id: I10b6dae7ac157ef578efc4dfceb089f3b94cde06
2019-01-15Create working directory if it doesn't yet existFabricio Voznika
PiperOrigin-RevId: 229438125 Change-Id: I58eb0d10178d1adfc709d7b859189d1acbcb2f22
2019-01-14Remove fs.Handle, ramfs.Entry, and all the DeprecatedFileOperations.Nicolas Lacasse
More helper structs have been added to the fsutil package to make it easier to implement fs.InodeOperations and fs.FileOperations. PiperOrigin-RevId: 229305982 Change-Id: Ib6f8d3862f4216745116857913dbfa351530223b
2019-01-14runsc: set up a minimal chroot from the sandbox processAndrei Vagin
In this case, new mounts are not created in the host mount namspaces, so tearDownChroot isn't needed, because chroot will be destroyed with a sandbox mount namespace. In additional, pivot_root can't be called instead of chroot. PiperOrigin-RevId: 229250871 Change-Id: I765bdb587d0b8287a6a8efda8747639d37c7e7b6
2019-01-11runsc: Collect zombies of sandbox and gofer processesAndrei Vagin
And we need to wait a gofer process before cgroup.Uninstall, because it is running in the sandbox cgroups. PiperOrigin-RevId: 228904020 Change-Id: Iaf8826d5b9626db32d4057a1c505a8d7daaeb8f9
2019-01-09Restore to original cgroup after sandbox and gofer processes are createdFabricio Voznika
The original code assumed that it was safe to join and not restore cgroup, but Container.Run will not exit after calling start, making cgroup cleanup fail because there were still processes inside the cgroup. PiperOrigin-RevId: 228529199 Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51
2019-01-07Undo changes in case of failure to create file/dir/symlinkFabricio Voznika
File/dir/symlink creation is multi-step and may leave state behind in case of failure in one of the steps. Added best effort attempt to clean up. PiperOrigin-RevId: 228286612 Change-Id: Ib03c27cd3d3e4f44d0352edc6ee212a53412d7f1
2019-01-03Apply chroot for --network=host tooFabricio Voznika
PiperOrigin-RevId: 227747566 Change-Id: Ide9df4ac1391adcd1c56e08d6570e0d149d85bc4
2019-01-02Automated rollback of changelist 225089593Michael Pratt
PiperOrigin-RevId: 227595007 Change-Id: If14cc5aab869c5fd7a4ebd95929c887ab690e94c
2018-12-28Simplify synchronization between runsc and sandbox processFabricio Voznika
Make 'runsc create' join cgroup before creating sandbox process. This removes the need to synchronize platform creation and ensure that sandbox process is charged to the right cgroup from the start. PiperOrigin-RevId: 227166451 Change-Id: Ieb4b18e6ca0daf7b331dc897699ca419bc5ee3a2
2018-12-20Rename limits.MemoryPagesLocked to limits.MemoryLocked.Jamie Liu
"RLIMIT_MEMLOCK: This is the maximum number of bytes of memory that may be locked into RAM." - getrlimit(2) PiperOrigin-RevId: 226384346 Change-Id: Iefac4a1bb69f7714dc813b5b871226a8344dc800
2018-12-19Automated rollback of changelist 225861605Googler
PiperOrigin-RevId: 226224230 Change-Id: Id24c7d3733722fd41d5fe74ef64e0ce8c68f0b12
2018-12-17Expose internal testing flagMichael Pratt
Never to used outside of runsc tests! PiperOrigin-RevId: 225919013 Change-Id: Ib3b14aa2a2564b5246fb3f8933d95e01027ed186
2018-12-17Implement mlock(), kind of.Jamie Liu
Currently mlock() and friends do nothing whatsoever. However, mlocking is directly application-visible in a number of ways; for example, madvise(MADV_DONTNEED) and msync(MS_INVALIDATE) both fail on mlocked regions. We handle this inconsistently: MADV_DONTNEED is too important to not work, but MS_INVALIDATE is rejected. Change MM to track mlocked regions in a manner consistent with Linux. It still will not actually pin pages into host physical memory, but: - mlock() will now cause sentry memory management to precommit mlocked pages. - MADV_DONTNEED and MS_INVALIDATE will interact with mlocked pages as described above. PiperOrigin-RevId: 225861605 Change-Id: Iee187204979ac9a4d15d0e037c152c0902c8d0ee
2018-12-13container.Destroy should clean up container metadata even if other cleanups failNicolas Lacasse
If the sandbox process is dead (because of a panic or some other problem), container.Destroy will never remove the container metadata file, since it will always fail when calling container.stop(). This CL changes container.Destroy() to always perform the three necessary cleanup operations: * Stop the sandbox and gofer processes. * Remove the container fs on the host. * Delete the container metadata directory. Errors from these three operations will be concatenated and returned from Destroy(). PiperOrigin-RevId: 225448164 Change-Id: I99c6311b2e4fe5f6e2ca991424edf1ebeae9df32
2018-12-11Add "trace signal" optionMichael Pratt
This option is effectively equivalent to -panic-signal, except that the sandbox does not die after logging the traceback. PiperOrigin-RevId: 225089593 Change-Id: Ifb1c411210110b6104613f404334bd02175e484e
2018-12-10Open source system call tests.Brian Geffon
PiperOrigin-RevId: 224886231 Change-Id: I0fccb4d994601739d8b16b1d4e6b31f40297fb22
2018-12-10Internal change.Nicolas Lacasse
PiperOrigin-RevId: 224865061 Change-Id: I6aa31f880931980ad2fc4c4b3cc4c532aacb31f4
2018-12-07sentry: turn "dynamically-created" procfs files into static creation.Zhaozhong Ni
PiperOrigin-RevId: 224600982 Change-Id: I547253528e24fb0bb318fc9d2632cb80504acb34
2018-12-06A sandbox process should wait until it has not been moved into cgroupsAndrei Vagin
PiperOrigin-RevId: 224418900 Change-Id: I53cf4d7c1c70117875b6920f8fd3d58a3b1497e9
2018-12-04Max link traversals should be for an entire path.Brian Geffon
The number of symbolic links that are allowed to be followed are for a full path and not just a chain of symbolic links. PiperOrigin-RevId: 224047321 Change-Id: I5e3c4caf66a93c17eeddcc7f046d1e8bb9434a40
2018-12-03Internal change.Googler
PiperOrigin-RevId: 223893409 Change-Id: I58869c7fb0012f6c3f7612a96cb649348b56335f
2018-11-28Internal change.Googler
PiperOrigin-RevId: 223231273 Change-Id: I8fb97ea91f7507b4918f7ce6562890611513fc30
2018-11-28Fix crictl tests.Kevin Krakauer
gvisor-containerd-shim moved. It now has a stable URL that run_tests.sh always uses. PiperOrigin-RevId: 223188822 Change-Id: I5687c78289404da27becd8d5949371e580fdb360
2018-11-27Disable crictl testsMichael Pratt
gvisor-containerd-shim installation is currently broken. PiperOrigin-RevId: 223002877 Change-Id: I2b890c5bf602a96c475c3805f24852ead8593a35
2018-11-20Use RET_KILL_PROCESS if available in kernelFabricio Voznika
RET_KILL_THREAD doesn't work well for Go because it will kill only the offending thread and leave the process hanging. RET_TRAP can be masked out and it's not guaranteed to kill the process. RET_KILL_PROCESS is available since 4.14. For older kernel, continue to use RET_TRAP as this is the best option (likely to kill process, easy to debug). PiperOrigin-RevId: 222357867 Change-Id: Icc1d7d731274b16c2125b7a1ba4f7883fbdb2cbd
2018-11-20Use math.Rand to generate a random test container id.Nicolas Lacasse
We were relying on time.UnixNano, but that was causing collisions. Now we generate 20 bytes of entropy from rand.Read, and base32-encode it to get a valid container id. PiperOrigin-RevId: 222313867 Change-Id: Iaeea9b9582d36de55f9f02f55de6a5de3f739371
2018-11-20Internal change.Nicolas Lacasse
PiperOrigin-RevId: 222170431 Change-Id: I26a6d6ad5d6910a94bb8b0a05fc2d12e23098399
2018-11-20Add unsupported syscall events for get/setsockoptFabricio Voznika
PiperOrigin-RevId: 222148953 Change-Id: I21500a9f08939c45314a6414e0824490a973e5aa
2018-11-20Don't fail when destroyContainerFS is called more than onceFabricio Voznika
This can happen when destroy is called multiple times or when destroy failed previously and is being called again. PiperOrigin-RevId: 221882034 Change-Id: I8d069af19cf66c4e2419bdf0d4b789c5def8d19e
2018-11-20Internal change.Nicolas Lacasse
PiperOrigin-RevId: 221848471 Change-Id: I882fbe5ce7737048b2e1f668848e9c14ed355665
2018-11-15Allow sandbox.Wait to be called after the sandbox has exited.Nicolas Lacasse
sandbox.Wait is racey, as the sandbox may have exited before it is called, or even during. We already had code to handle the case that the sandbox exits during the Wait call, but we were not properly handling the case where the sandbox has exited before the call. The best we can do in such cases is return the sandbox exit code as the application exit code. PiperOrigin-RevId: 221702517 Change-Id: I290d0333cc094c7c1c3b4ce0f17f61a3e908d787
2018-11-14Internal change.Nicolas Lacasse
PiperOrigin-RevId: 221462069 Change-Id: Id469ed21fe12e582c78340189b932989afa13c67