summaryrefslogtreecommitdiffhomepage
path: root/runsc
AgeCommit message (Collapse)Author
2021-03-02Add reverse flag to mitigate.Zach Koopmans
Add reverse operation to mitigate that just enables all CPUs. PiperOrigin-RevId: 360511215
2021-02-24Merge pull request #5519 from dqminh:runsc-ps-pidsgVisor bot
PiperOrigin-RevId: 359334029
2021-02-24runsc/filters: permit clock_nanosleep for raceAndrei Vagin
Syzkaller hosts contains many audit messages that runsc tries to call the clock_nanosleep syscall. PiperOrigin-RevId: 359331413
2021-02-24return root pids with runsc psDaniel Dao
`runsc ps` currently return pid for a task's immediate pid namespace, which is confusing when there're multiple pid namespaces. We should return only pids in the root namespace. Before: ``` 1000 1 0 0 ? 02:24 250ms chrome 1000 1 0 0 ? 02:24 40ms dumb-init 1000 1 0 0 ? 02:24 240ms chrome 1000 2 1 0 ? 02:24 2.78s node ``` After: ``` UID PID PPID C TTY STIME TIME CMD 1000 1 0 0 ? 12:35 0s dumb-init 1000 2 1 7 ? 12:35 240ms node 1000 13 2 21 ? 12:35 2.33s chrome 1000 27 13 3 ? 12:35 260ms chrome ``` Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
2021-02-22Only detect mds for mitigate.Zach Koopmans
Only detect and mitigate on mds for the mitigate command. PiperOrigin-RevId: 358924466
2021-02-22Return nicer error message when cgroups v1 isn't availableFabricio Voznika
Updates #3481 Closes #5430 PiperOrigin-RevId: 358923208
2021-02-22Fix `runsc kill --pid`Fabricio Voznika
Previously, loader.signalProcess was inconsitently using both root and container's PID namespace to find the process. It used root namespace for the exec'd process and container's PID namespace for other processes. This fixes the code to use the root PID namespace across the board, which is the same PID reported in `runsc ps` (or soon will after https://github.com/google/gvisor/pull/5519). PiperOrigin-RevId: 358836297
2021-02-12Stop the control server only once.Adin Scannell
Operations are now shut down automatically by the main Stop command, and it is not necessary to call Stop during Destroy. Fixes #5454 PiperOrigin-RevId: 357295930
2021-02-11Allow rt_sigaction in gofer seccompFabricio Voznika
rt_sigaction may be called by Go runtime when trying to panic: https://cs.opensource.google/go/go/+/master:src/runtime/signal_unix.go;drc=ed3e4afa12d655a0c5606bcf3dd4e1cdadcb1476;bpv=1;bpt=1;l=780?q=rt_sigaction&ss=go Updates #5038 PiperOrigin-RevId: 357013186
2021-02-10Add mitigate command to runscZach Koopmans
PiperOrigin-RevId: 356772367
2021-02-05Replace TaskFromContext(ctx).Kernel() with KernelFromContext(ctx)Ting-Yu Wang
Panic seen at some code path like control.ExecAsync where ctx does not have a Task. Reported-by: syzbot+55ce727161cf94a7b7d6@syzkaller.appspotmail.com PiperOrigin-RevId: 355960596
2021-02-04Move getcpu() to core filter listMichael Pratt
Some versions of the Go runtime call getcpu(), so add it for compatibility. The hostcpu package already uses getcpu() on arm64. PiperOrigin-RevId: 355717757
2021-02-02Add CPUSet for runsc mitigate.Zach Koopmans
PiperOrigin-RevId: 355242055
2021-02-02Stub out basic `runsc events --stat` CPU functionalityKevin Krakauer
Because we lack gVisor-internal cgroups, we take the CPU usage of the entire pod and divide it proportionally according to sentry-internal usage stats. This fixes `kubectl top pods`, which gets a pod's CPU usage by summing the usage of its containers. Addresses #172. PiperOrigin-RevId: 355229833
2021-02-01Enable container checkpoint/restore tests with VFS2Fabricio Voznika
Updates #1663 PiperOrigin-RevId: 355077816
2021-01-29Merge pull request #4503 from dqminh:nested-cgroupgVisor bot
PiperOrigin-RevId: 354568091
2021-01-27Internal changeZach Koopmans
PiperOrigin-RevId: 354170726
2021-01-27Clean cgroupt mountinfo and add more test casesDaniel Dao
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-01-26runsc: check for nested cgroup when generating croup pathsDaniel Dao
in nested container, we see paths from host in /proc/self/cgroup, so we need to re-process that path to get a relative path to be used inside the container. Without it, runsc generates ugly paths that may trip other cgroup watchers that expect clean paths. An example of ugly path is: ``` /sys/fs/cgroup/memory/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93/cgroupPath ``` Notice duplication of `docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93` `/proc/1/cgroup` looks like ``` 12:perf_event:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 11:blkio:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 10:freezer:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 9:hugetlb:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 8:devices:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 7:rdma:/ 6:pids:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 5:cpuset:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 4:cpu,cpuacct:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 3:memory:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 2:net_cls,net_prio:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 1:name=systemd:/docker/e383892b29290ae8005d535f2dadc4a583bb354d5bb1ba8c10bf900d92c4db93 0::/system.slice/containerd.service ``` This is not necessary when the parent container was created with cgroup namespace, but that setup is not very common right now. Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-01-22Add initial mitigate code and cpu parsing.Zach Koopmans
PiperOrigin-RevId: 353274135
2021-01-22Remove dependency to abi/linuxFabricio Voznika
abi package is to be used by the Sentry to implement the Linux ABI. Code dealing with the host should use x/sys/unix. PiperOrigin-RevId: 353272679
2021-01-22Fix TestDuplicateEnvVariable flakynessFabricio Voznika
Updates #5226 PiperOrigin-RevId: 353262133
2021-01-21Fix ownership change logicFabricio Voznika
Previously fsgofer was skipping chown call if the uid and gid were the same as the current user/group. However, when setgid is set, the group may not be the same as the caller. Instead, compare the actual uid/gid of the file after it has been created and change ownership only if needed. Updates #180 PiperOrigin-RevId: 353118733
2021-01-13Switch uses of os.Getenv that check for empty string to os.LookupEnv.Dean Deng
Whether the variable was found is already returned by syscall.Getenv. os.Getenv drops this value while os.Lookupenv passes it along. PiperOrigin-RevId: 351674032
2021-01-12Fix simple mistakes identified by goreportcard.Adin Scannell
These are primarily simplification and lint mistakes. However, minor fixes are also included and tests added where appropriate. PiperOrigin-RevId: 351425971
2021-01-11OCI spec may contain duplicate environment variablesFabricio Voznika
Closes #5226 PiperOrigin-RevId: 351259576
2021-01-05Add benchmarks targets to BuildKite.Adin Scannell
This includes minor fix-ups: * Handle SIGTERM in runsc debug, to exit gracefully. * Fix cmd.debug.go opening all profiles as RDONLY. * Fix the test name in fio_test.go, and encode the block size in the test. PiperOrigin-RevId: 350205718
2021-01-05Internal changes.Andrei Vagin
PiperOrigin-RevId: 350159657
2020-12-30Fix condition checking in `runsc debug`Fabricio Voznika
Closes #5052 PiperOrigin-RevId: 349579814
2020-12-29Make profiling commands synchronous.Adin Scannell
This allows for a model of profiling when you can start collection, and it will terminate when the sandbox terminates. Without this synchronous call, it is effectively impossible to collect length blocking and mutex profiles. PiperOrigin-RevId: 349483418
2020-12-17Typo fix.Etienne Perot
PiperOrigin-RevId: 348106699
2020-12-17[netstack] Implement IP(V6)_RECVERR socket option.Ayush Ranjan
PiperOrigin-RevId: 348055514
2020-12-17Add sandbox ID to state file nameFabricio Voznika
This allows to find all containers inside a sandbox more efficiently. This operation is required every time a container starts and stops, and previously required loading *all* container state files to check whether the container belonged to the sandbox. Apert from being inneficient, it has caused problems when state files are stale or corrupt, causing inavalability to create any container. Also adjust commands `list` and `debug` to skip over files that fail to load. Resolves #5052 PiperOrigin-RevId: 348050637
2020-12-17Set max memory not minFabricio Voznika
Closes #5048 PiperOrigin-RevId: 348050472
2020-12-15fsgofer optimizationsFabricio Voznika
- Skip chown call in case owner change is not needed - Skip filepath.Clean() calls when joining paths - Pass unix.Stat_t by value to reduce runtime.duffcopy calls. This change allows for better inlining in localFile.walk(). Change Baseline Improvement BenchmarkWalkOne-6 2912 ns/op 3082 ns/op 5.5% BenchmarkCreate-6 15915 ns/op 19126 ns/op 16.8% BenchmarkCreateDiffOwner-6 18795 ns/op 19741 ns/op 4.8% PiperOrigin-RevId: 347667833
2020-12-14[netstack] Update raw socket and hostinet control message parsing.Ayush Ranjan
There are surprisingly few syscall tests that run with hostinet. For example running the following command only returns two results: `bazel query test/syscalls:all | grep hostnet` I think as a result, as our control messages evolved, hostinet was left behind. Update it to support all control messages netstack supports. This change also updates sentry's control message parsing logic to make it up to date with all the control messages we support. PiperOrigin-RevId: 347508892
2020-12-11Add runsc symbolize command.Dean Deng
This command takes instruction pointers from stdin and converts them into their corresponding file names and line/column numbers in the runsc source code. The inputs are not interpreted as actual addresses, but as synthetic values that are exposed through /sys/kernel/debug/kcov. One can extract coverage information from kcov and translate those values into locations in the source code by running symbolize on the same runsc binary. This will allow us to generate syzkaller coverage reports. PiperOrigin-RevId: 347089624
2020-12-11Remove existing nogo exceptions.Adin Scannell
PiperOrigin-RevId: 347047550
2020-12-10Disable host reassembly for fragments.Bhasker Hariharan
fdbased endpoint was enabling fragment reassembly on the host AF_PACKET socket to ensure that fragments are delivered inorder to the right dispatcher. But this prevents fragments from being delivered to gvisor at all and makes testing of gvisor's fragment reassembly code impossible. The potential impact from this is minimal since IP Fragmentation is not really that prevelant and in cases where we do get fragments we may deliver the fragment out of order to the TCP layer as multiple network dispatchers may process the fragments and deliver a reassembled fragment after the next packet has been delivered to the TCP endpoint. While not desirable I believe the impact from this is minimal due to low prevalence of fragmentation. Also removed PktType and Hatype fields when binding the socket as these are not used when binding. Its just confusing to have them specified. See: https://man7.org/linux/man-pages/man7/packet.7.html "Fields used for binding are sll_family (should be AF_PACKET), sll_protocol, and sll_ifindex." Fixes #5055 PiperOrigin-RevId: 346919439
2020-12-09Tweak aarch64 support.Adin Scannell
A few images were broken with respect to aarch64. We should now be able to run push-all-images with ARCH=aarch64 as part of the regular continuous integration builds, and add aarch64 smoke tests (via user emulation for now) to the regular test suite (future). PiperOrigin-RevId: 346685462
2020-12-07Support icmpv6 transport protocolPeter Johnston
PiperOrigin-RevId: 346101076
2020-12-04Overlay runsc regular file mounts with regular files.Jamie Liu
Fixes #4991 PiperOrigin-RevId: 345800333
2020-12-03Surface usage message for `runsc do`.Dean Deng
c.Usage() only returns a string; f.Usage() will print the usage message. PiperOrigin-RevId: 345500123
2020-12-03Support partitions for other tests.Adin Scannell
PiperOrigin-RevId: 345399936
2020-11-19Propagate IP address prefix from host to netstackFabricio Voznika
Closes #4022 PiperOrigin-RevId: 343378647
2020-11-18runsc: check whether cgroup exists or not for each controllerAndrei Vagin
We have seen a case when a memory cgroup exists but a perf_event one doesn't. Reported-by: syzbot+f31468b61d1a27e629dc@syzkaller.appspotmail.com Reported-by: syzbot+1f163ec0321768f1497e@syzkaller.appspotmail.com PiperOrigin-RevId: 343200070
2020-11-18Fix race condition in multi-container wait testFabricio Voznika
Container is not thread-safe, locking must be done in the caller. The test was calling Container.Wait() from multiple threads with no synchronization. Also removed Container.WaitPID from test because the process might have already existed when wait is called. PiperOrigin-RevId: 343176280
2020-11-17Add support for TTY in multi-containerFabricio Voznika
Fixes #2714 PiperOrigin-RevId: 342950412
2020-11-16Remove ARP address workaroundGhanan Gowripalan
- Make AddressableEndpoint optional for NetworkEndpoint. Not all NetworkEndpoints need to support addressing (e.g. ARP), so AddressableEndpoint should only be implemented for protocols that support addressing such as IPv4 and IPv6. With this change, tcpip.ErrNotSupported will be returned by the stack when attempting to modify addresses on a network endpoint that does not support addressing. Now that packets are fully handled at the network layer, and (with this change) addresses are optional for network endpoints, we no longer need the workaround for ARP where a fake ARP address was added to each NIC that performs ARP so that packets would be delivered to the ARP layer. PiperOrigin-RevId: 342722547
2020-11-12Remove TESTONLY tag from vfs2 flagFabricio Voznika
Updates #1035 PiperOrigin-RevId: 342168926