summaryrefslogtreecommitdiffhomepage
path: root/runsc/container
AgeCommit message (Collapse)Author
2021-03-11Report filesystem-specific mount options.Rahat Mahmood
PiperOrigin-RevId: 362406813
2021-03-09Update flock to v0.8.0Fabricio Voznika
PiperOrigin-RevId: 361962416
2021-03-06[op] Replace syscall package usage with golang.org/x/sys/unix in runsc/.Ayush Ranjan
The syscall package has been deprecated in favor of golang.org/x/sys. Note that syscall is still used in some places because the following don't seem to have an equivalent in unix package: - syscall.SysProcIDMap - syscall.Credential Updates #214 PiperOrigin-RevId: 361381490
2021-02-24return root pids with runsc psDaniel Dao
`runsc ps` currently return pid for a task's immediate pid namespace, which is confusing when there're multiple pid namespaces. We should return only pids in the root namespace. Before: ``` 1000 1 0 0 ? 02:24 250ms chrome 1000 1 0 0 ? 02:24 40ms dumb-init 1000 1 0 0 ? 02:24 240ms chrome 1000 2 1 0 ? 02:24 2.78s node ``` After: ``` UID PID PPID C TTY STIME TIME CMD 1000 1 0 0 ? 12:35 0s dumb-init 1000 2 1 7 ? 12:35 240ms node 1000 13 2 21 ? 12:35 2.33s chrome 1000 27 13 3 ? 12:35 260ms chrome ``` Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
2021-02-22Return nicer error message when cgroups v1 isn't availableFabricio Voznika
Updates #3481 Closes #5430 PiperOrigin-RevId: 358923208
2021-02-22Fix `runsc kill --pid`Fabricio Voznika
Previously, loader.signalProcess was inconsitently using both root and container's PID namespace to find the process. It used root namespace for the exec'd process and container's PID namespace for other processes. This fixes the code to use the root PID namespace across the board, which is the same PID reported in `runsc ps` (or soon will after https://github.com/google/gvisor/pull/5519). PiperOrigin-RevId: 358836297
2021-02-05Replace TaskFromContext(ctx).Kernel() with KernelFromContext(ctx)Ting-Yu Wang
Panic seen at some code path like control.ExecAsync where ctx does not have a Task. Reported-by: syzbot+55ce727161cf94a7b7d6@syzkaller.appspotmail.com PiperOrigin-RevId: 355960596
2021-02-02Stub out basic `runsc events --stat` CPU functionalityKevin Krakauer
Because we lack gVisor-internal cgroups, we take the CPU usage of the entire pod and divide it proportionally according to sentry-internal usage stats. This fixes `kubectl top pods`, which gets a pod's CPU usage by summing the usage of its containers. Addresses #172. PiperOrigin-RevId: 355229833
2021-02-01Enable container checkpoint/restore tests with VFS2Fabricio Voznika
Updates #1663 PiperOrigin-RevId: 355077816
2021-01-22Fix TestDuplicateEnvVariable flakynessFabricio Voznika
Updates #5226 PiperOrigin-RevId: 353262133
2021-01-12Fix simple mistakes identified by goreportcard.Adin Scannell
These are primarily simplification and lint mistakes. However, minor fixes are also included and tests added where appropriate. PiperOrigin-RevId: 351425971
2021-01-11OCI spec may contain duplicate environment variablesFabricio Voznika
Closes #5226 PiperOrigin-RevId: 351259576
2020-12-17Add sandbox ID to state file nameFabricio Voznika
This allows to find all containers inside a sandbox more efficiently. This operation is required every time a container starts and stops, and previously required loading *all* container state files to check whether the container belonged to the sandbox. Apert from being inneficient, it has caused problems when state files are stale or corrupt, causing inavalability to create any container. Also adjust commands `list` and `debug` to skip over files that fail to load. Resolves #5052 PiperOrigin-RevId: 348050637
2020-12-03Support partitions for other tests.Adin Scannell
PiperOrigin-RevId: 345399936
2020-11-18Fix race condition in multi-container wait testFabricio Voznika
Container is not thread-safe, locking must be done in the caller. The test was calling Container.Wait() from multiple threads with no synchronization. Also removed Container.WaitPID from test because the process might have already existed when wait is called. PiperOrigin-RevId: 343176280
2020-11-17Add support for TTY in multi-containerFabricio Voznika
Fixes #2714 PiperOrigin-RevId: 342950412
2020-11-05Re-add start/stop container testsFabricio Voznika
Due to a type doDestroyNotStartedTest was being tested 2x instead of doDestroyStartingTest. PiperOrigin-RevId: 340969797
2020-11-05Return failure when `runsc events` queries a stopped containerFabricio Voznika
This was causing gvisor-containerd-shim to crash because the command suceeded, but there was no stat present. PiperOrigin-RevId: 340964921
2020-11-05Fix failure setting OOM score adjustmentFabricio Voznika
When OOM score adjustment needs to be set, all the containers need to be loaded to find all containers that belong to the sandbox. However, each load signals the container to ensure it is still alive. OOM score adjustment is set during creation and deletion of every container, generating a flood of signals to all containers. The fix removes the signal check when it's not needed. There is also a race fetching OOM score adjustment value from the parent when the sandbox exits at the same time (the time it took to signal containers above made this window quite large). The fix is to store the original value in the sandbox state file and use it when the value needs to be restored. Also add more logging and made the existing ones more consistent to help with debugging. PiperOrigin-RevId: 340940799
2020-10-21Merge pull request #3957 from workato:auto-cgroupgVisor bot
PiperOrigin-RevId: 338372736
2020-10-20Do not even try forcing cgroups in testsKonstantin Baranov
2020-10-19Fixes to cgroupsFabricio Voznika
There were a few problems with cgroups: - cleanup loop what breaking too early - parse of /proc/[pid]/cgroups was skipping "name=systemd" because "name=" was not being removed from name. - When no limits are specified, fillFromAncestor was not being called, causing a failure to set cpuset.mems Updates #4536 PiperOrigin-RevId: 337947356
2020-10-06Ignore errors in rootless and test modesKonstantin Baranov
2020-10-05Fix gofer monitor prematurely destroying containerFabricio Voznika
When all container tasks finish, they release the mount which in turn will close the 9P session to the gofer. The gofer exits when the connection closes, triggering the gofer monitor. The gofer monitor will _think_ that the gofer died prematurely and destroy the container. Then when the caller attempts to wait for the container, e.g. to get the exit code, wait fails saying the container doesn't exist. Gofer monitor now just SIGKILLs the container, and let the normal teardown process to happen, which will evetually destroy the container at the right time. Also, fixed an issue with exec racing with container's init process exiting. Closes #1487 PiperOrigin-RevId: 335537350
2020-10-05Enable more VFS2 testsFabricio Voznika
Updates #1487 PiperOrigin-RevId: 335516732
2020-10-02Treat absent "linux" section is empty "cgroupsPath" tooKonstantin Baranov
2020-09-25fix TestUserLog for multi-archHoward Zhang
based on arch, apply different syscall number for sched_rr_get_interval Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-09-17Remove option to panic goferFabricio Voznika
Gofer panics are suppressed by p9 server and an error is returned to the caller, making it effectively the same as returning EROFS. PiperOrigin-RevId: 332282959
2020-09-17Add VFS2 overlay support in runscFabricio Voznika
All tests under runsc are passing with overlay enabled. Updates #1487, #1199 PiperOrigin-RevId: 332181267
2020-09-16Refactor removed default test dimensionFabricio Voznika
ptrace was always selected as a dimension before, but not anymore. Some tests were specifying "overlay" expecting that to be in addition to the default. PiperOrigin-RevId: 332004111
2020-09-15Use container ID as cgroup name if not providedKonstantin Baranov
Useful when you want to run multiple containers with the same config. And runc does that too.
2020-09-08Honor readonly flag for root mountFabricio Voznika
Updates #1487 PiperOrigin-RevId: 330580699
2020-09-04Simplify FD handling for container start/execFabricio Voznika
VFS1 and VFS2 host FDs have different dupping behavior, making error prone to code for both. Change the contract so that FDs are released as they are used, so the caller can simple defer a block that closes all remaining files. This also addresses handling of partial failures. With this fix, more VFS2 tests can be enabled. Updates #1487 PiperOrigin-RevId: 330112266
2020-09-01Refactor tty codebase to use master-replica terminology.Ayush Ranjan
Updates #2972 PiperOrigin-RevId: 329584905
2020-08-19Move boot.Config to its own packageFabricio Voznika
Updates #3494 PiperOrigin-RevId: 327548511
2020-07-27Clean-up bazel wrapper.Adin Scannell
The bazel server was being started as the wrong user, leading to issues where the container would suddenly exit during a build. We can also simplify the waiting logic by starting the container in two separate steps: those that must complete first, then the asynchronous bit. PiperOrigin-RevId: 323391161
2020-07-15Merge pull request #3022 from prattmic:runsc_do_pdeathsiggVisor bot
PiperOrigin-RevId: 321449877
2020-07-15Apply pdeathsig to gofer for runsc run/doMichael Pratt
Much like the boot process, apply pdeathsig to the gofer for cases where the sandbox lifecycle is attached to the parent (runsc run/do). This isn't strictly necessary, as the gofer normally exits once the sentry disappears, but this makes that extra reliable.
2020-07-14Prepare boot.Loader to support multi-container TTYFabricio Voznika
- Combine process creation code that is shared between root and subcontainer processes - Move root container information into a struct for clarity Updates #2714 PiperOrigin-RevId: 321204798
2020-07-13Merge pull request #2672 from amscanne:shim-integratedgVisor bot
PiperOrigin-RevId: 321053634
2020-07-08Add shared mount hints to VFS2Fabricio Voznika
Container restart test is disabled for VFS2 for now. Updates #1487 PiperOrigin-RevId: 320296401
2020-06-11Set the HOME environment variable for sub-containers.Ian Lewis
Fixes #701 PiperOrigin-RevId: 316025635
2020-06-08Combine executable lookup codeFabricio Voznika
Run vs. exec, VFS1 vs. VFS2 were executable lookup were slightly different from each other. Combine them all into the same logic. PiperOrigin-RevId: 315426443
2020-06-01More runsc changes for VFS2Fabricio Voznika
- Add /tmp handling - Apply mount options - Enable more container_test tests - Forward signals to child process when test respaws process to run as root inside namespace. Updates #1487 PiperOrigin-RevId: 314263281
2020-05-28Move Cleanup to its own packageFabricio Voznika
PiperOrigin-RevId: 313663382
2020-05-28Automated rollback of changelist 309082540Fabricio Voznika
PiperOrigin-RevId: 313636920
2020-05-18Improve unsupported syscall messageFabricio Voznika
PiperOrigin-RevId: 312104899
2020-05-13Enable overlayfs_stale_read by default for runsc.Jamie Liu
Linux 4.18 and later make reads and writes coherent between pre-copy-up and post-copy-up FDs representing the same file on an overlay filesystem. However, memory mappings remain incoherent: - Documentation/filesystems/overlayfs.rst, "Non-standard behavior": "If a file residing on a lower layer is opened for read-only and then memory mapped with MAP_SHARED, then subsequent changes to the file are not reflected in the memory mapping." - fs/overlay/file.c:ovl_mmap() passes through to the underlying FD without any management of coherence in the overlay. - Experimentally on Linux 5.2: ``` $ cat mmap_cat_page.c #include <err.h> #include <fcntl.h> #include <stdio.h> #include <string.h> #include <sys/mman.h> #include <unistd.h> int main(int argc, char **argv) { if (argc < 2) { errx(1, "syntax: %s [FILE]", argv[0]); } const int fd = open(argv[1], O_RDONLY); if (fd < 0) { err(1, "open(%s)", argv[1]); } const size_t page_size = sysconf(_SC_PAGE_SIZE); void* page = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0); if (page == MAP_FAILED) { err(1, "mmap"); } for (;;) { write(1, page, strnlen(page, page_size)); if (getc(stdin) == EOF) { break; } } return 0; } $ gcc -O2 -o mmap_cat_page mmap_cat_page.c $ mkdir lowerdir upperdir workdir overlaydir $ echo old > lowerdir/file $ sudo mount -t overlay -o "lowerdir=lowerdir,upperdir=upperdir,workdir=workdir" none overlaydir $ ./mmap_cat_page overlaydir/file old ^Z [1]+ Stopped ./mmap_cat_page overlaydir/file $ echo new > overlaydir/file $ cat overlaydir/file new $ fg ./mmap_cat_page overlaydir/file old ``` Therefore, while the VFS1 gofer client's behavior of reopening read FDs is only necessary pre-4.18, replacing existing memory mappings (in both sentry and application address spaces) with mappings of the new FD is required regardless of kernel version, and this latter behavior is common to both VFS1 and VFS2. Re-document accordingly, and change the runsc flag to enabled by default. New test: - Before this CL: https://source.cloud.google.com/results/invocations/5b222d2c-e918-4bae-afc4-407f5bac509b - After this CL: https://source.cloud.google.com/results/invocations/f28c747e-d89c-4d8c-a461-602b33e71aab PiperOrigin-RevId: 311361267
2020-05-04Enable TestRunNonRoot on VFS2Fabricio Voznika
Also added back the default test dimension back which was dropped in a previous refactor. PiperOrigin-RevId: 309797327
2020-05-04Add TTY support on VFS2 to runscFabricio Voznika
Updates #1623, #1487 PiperOrigin-RevId: 309777922