gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2019-08-01	Set sandbox oom_score_adj	Ian Lewis
	Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle. When new containers are added to the sandbox oom_score_adj for the sandbox and all other gofers are adjusted so that oom_score_adj is equal to the lowest oom_score_adj of all containers in the sandbox. Fixes #512 PiperOrigin-RevId: 261242725
2019-07-30	Remove unused const variables	Ian Lewis
	PiperOrigin-RevId: 260824989
2019-07-03	Avoid importing platforms from many source files	Andrei Vagin
	PiperOrigin-RevId: 256494243
2019-06-26	Always set SysProcAttr.Ctty to an FD in the child's FD table.	Nicolas Lacasse
	Go was going to change the behavior of SysProcAttr.Ctty such that it must be an FD in the parent FD table: https://go-review.googlesource.com/c/go/+/178919/ However, after some debate, it was decided that this change was too backwards-incompatible, and so it was reverted. https://github.com/golang/go/issues/29458 The behavior going forward is unchanged: the Ctty FD must be an FD in the child FD table. PiperOrigin-RevId: 255228476
2019-06-25	Use different Ctty FDs based on the go version.	Nicolas Lacasse
	An upcoming change in Go 1.13 [1] changes the semantics of the SysProcAttr.Ctty field. Prior to the change, the FD must be an FD in the child process's FD table (aka "post-shuffle"). After the change, the FD must be an FD in the current process's FD table (aka "pre-shuffle"). To be compatible with both versions this CL introduces a new boolean "CttyFdIsPostShuffle" which indicates whether a pre- or post-shuffle FD should be provided. We use build tags to chose the correct one. 1: https://go-review.googlesource.com/c/go/+/178919/ PiperOrigin-RevId: 255015303
2019-06-24	Allow to change logging options using 'runsc debug'	Fabricio Voznika
	New options are: runsc debug --strace=off\|all\|function1,function2 runsc debug --log-level=warning\|info\|debug runsc debug --log-packets=true\|false Updates #407 PiperOrigin-RevId: 254843128
2019-06-21	Delete dangling comment line.	Nicolas Lacasse
	This was from an old comment, which was superseded by the existing comment which is correct. PiperOrigin-RevId: 254434535
2019-06-20	Drop extra character	Michael Pratt
	PiperOrigin-RevId: 254237530
2019-06-18	Kill sandbox process when 'runsc do' exits	Fabricio Voznika
	PiperOrigin-RevId: 253882115
2019-06-18	Add Container/Sandbox args struct for creation	Fabricio Voznika
	There were 3 string arguments that could be easily misplaced and it makes it easier to add new arguments, especially for Container that has dozens of callers. PiperOrigin-RevId: 253872074
2019-06-13	Update canonical repository.	Adin Scannell
	This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620
2019-06-12	Allow 'runsc do' to run without root	Fabricio Voznika
	'--rootless' flag lets a non-root user execute 'runsc do'. The drawback is that the sandbox and gofer processes will run as root inside a user namespace that is mapped to the caller's user, intead of nobody. And network is defaulted to '--network=host' inside the root network namespace. On the bright side, it's very convenient for testing: runsc --rootless do ls runsc --rootless do curl www.google.com PiperOrigin-RevId: 252840970
2019-06-11	Use net.HardwareAddr for FDBasedLink.LinkAddress	Fabricio Voznika
	It prints formatted to the log. PiperOrigin-RevId: 252699551
2019-06-06	Add multi-fd support to fdbased endpoint.	Bhasker Hariharan
	This allows an fdbased endpoint to have multiple underlying fd's from which packets can be read and dispatched/written to. This should allow for higher throughput as well as better scalability of the network stack as number of connections increases. Updates #231 PiperOrigin-RevId: 251852825
2019-06-03	Remove 'clearStatus' option from container.Wait*PID()	Fabricio Voznika
	clearStatus was added to allow detached execution to wait on the exec'd process and retrieve its exit status. However, it's not currently used. Both docker and gvisor-containerd-shim wait on the "shim" process and retrieve the exit status from there. We could change gvisor-containerd-shim to use waits, but it will end up also consuming a process for the wait, which is similar to having the shim process. Closes #234 PiperOrigin-RevId: 251349490
2019-05-30	Add support for collecting execution trace to runsc.	Bhasker Hariharan
	Updates #220 PiperOrigin-RevId: 250532302
2019-05-15	gvisor/runsc: use a veth link address instead of generating a new one	Andrei Vagin
	PiperOrigin-RevId: 248367340 Change-Id: Id792afcfff9c9d2cfd62cae21048316267b4a924
2019-05-02	runsc: don't create an empty network namespace if NetworkHost is set	Andrei Vagin
	With this change, we will be able to run runsc do in a host network namespace. PiperOrigin-RevId: 246436660 Change-Id: I8ea18b1053c88fe2feed74239b915fe7a151ce34
2019-05-02	Add [simple] network support to 'runsc do'	Fabricio Voznika
	Sandbox always runsc with IP 192.168.10.2 and the peer network adds 1 to the address (192.168.10.3). Sandbox IP can be changed using --ip flag. Here a few examples: sudo runsc do curl www.google.com sudo runsc do --ip=10.10.10.2 bash -c "echo 123 \| netcat -l -p 8080" PiperOrigin-RevId: 246421277 Change-Id: I7b3dce4af46a57300350dab41cb27e04e4b6e9da
2019-04-29	Change copyright notice to "The gVisor Authors"	Michael Pratt
	Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" \| xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29	Allow and document bug ids in gVisor codebase.	Nicolas Lacasse
	PiperOrigin-RevId: 245818639 Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-26	Bump the AF_PACKET socket rcv buf size to 4MB by default.	Bhasker Hariharan
	Packet socket receive buffers default to the sysctl value of net.core.rmem_default and are capped by net.core.rmem_max both which are usually set to 208KB on most systems. Since we can't expect every gVisor user to bump these we use SO_RCVBUFFORCE to exceed the limit. This is possible as runsc runs with CAP_NET_ADMIN outside the sandbox and can do this before the FD is passed to the sentry inside the sandbox. Updates #211 iperf output w/ 4MB buffer. iperf3 -c 172.17.0.2 -t 100 Connecting to host 172.17.0.2, port 5201 [ 4] local 172.17.0.1 port 40378 connected to 172.17.0.2 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 1.15 GBytes 9.89 Gbits/sec 0 1.02 MBytes [ 4] 1.00-2.00 sec 1.18 GBytes 10.2 Gbits/sec 0 1.02 MBytes [ 4] 2.00-3.00 sec 965 MBytes 8.09 Gbits/sec 0 1.02 MBytes [ 4] 3.00-4.00 sec 942 MBytes 7.90 Gbits/sec 0 1.02 MBytes [ 4] 4.00-5.00 sec 952 MBytes 7.99 Gbits/sec 0 1.02 MBytes [ 4] 5.00-6.00 sec 1.14 GBytes 9.81 Gbits/sec 0 1.02 MBytes [ 4] 6.00-7.00 sec 1.13 GBytes 9.68 Gbits/sec 0 1.02 MBytes [ 4] 7.00-8.00 sec 930 MBytes 7.80 Gbits/sec 0 1.02 MBytes [ 4] 8.00-9.00 sec 1.15 GBytes 9.91 Gbits/sec 0 1.02 MBytes [ 4] 9.00-10.00 sec 938 MBytes 7.87 Gbits/sec 0 1.02 MBytes [ 4] 10.00-11.00 sec 737 MBytes 6.18 Gbits/sec 0 1.02 MBytes [ 4] 11.00-12.00 sec 1.16 GBytes 9.93 Gbits/sec 0 1.02 MBytes [ 4] 12.00-13.00 sec 917 MBytes 7.69 Gbits/sec 0 1.02 MBytes [ 4] 13.00-14.00 sec 1.19 GBytes 10.2 Gbits/sec 0 1.02 MBytes [ 4] 14.00-15.00 sec 1.01 GBytes 8.70 Gbits/sec 0 1.02 MBytes [ 4] 15.00-16.00 sec 1.20 GBytes 10.3 Gbits/sec 0 1.02 MBytes [ 4] 16.00-17.00 sec 1.14 GBytes 9.80 Gbits/sec 0 1.02 MBytes ^C[ 4] 17.00-17.60 sec 718 MBytes 10.1 Gbits/sec 0 1.02 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-17.60 sec 18.4 GBytes 8.98 Gbits/sec 0 sender [ 4] 0.00-17.60 sec 0.00 Bytes 0.00 bits/sec receiver PiperOrigin-RevId: 245470590 Change-Id: I1c08c5ee8345de6ac070513656a4703312dc3c00
2019-04-23	Replace os.File with fd.FD in fsgofer	Fabricio Voznika
	os.NewFile() accounts for 38% of CPU time in localFile.Walk(). This change switchs to use fd.FD which is much cheaper to create. Now, fd.New() in localFile.Walk() accounts for only 4%. PiperOrigin-RevId: 244944983 Change-Id: Ic892df96cf2633e78ad379227a213cb93ee0ca46
2019-04-11	Add 'runsc do' command	Fabricio Voznika
	It provides an easy way to run commands to quickly test gVisor. By default it maps the host root as the container root with a writable overlay on top (so the host root is not modified). Example: sudo runsc do ls -lh --color sudo runsc do ~/src/test/my-test.sh PiperOrigin-RevId: 243178711 Change-Id: I05f3d6ce253fe4b5f1362f4a07b5387f6ddb5dd9
2019-03-29	gvisor/runsc: enable generic segmentation offload (GSO)	Andrei Vagin
	The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 241072403 Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
2019-03-18	Add support for mount propagation	Fabricio Voznika
	Properly handle propagation options for root and mounts. Now usage of mount options shared, rshared, and noexec cause error to start. shared/ rshared breaks sandbox=>host isolation. slave however can be supported because changes propagate from host to sandbox. Root FS setup moved inside the gofer. Apart from simplifying the code, it keeps all mounts inside the namespace. And they are torn down when the namespace is destroyed (DestroyFS is no longer needed). PiperOrigin-RevId: 239037661 Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0
2019-03-11	Add profiling commands to runsc	Fabricio Voznika
	Example: runsc debug --root=<dir> \ --profile-heap=/tmp/heap.prof \ --profile-cpu=/tmp/cpu.prod --profile-delay=30 \ <container ID> PiperOrigin-RevId: 237848456 Change-Id: Icff3f20c1b157a84d0922599eaea327320dad773
2019-01-31	Remove license comments	Michael Pratt
	Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses($.$)./licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-29	runsc: reap a sandbox process only in sandbox.Wait()	Andrei Vagin
	PiperOrigin-RevId: 231504064 Change-Id: I585b769aef04a3ad7e7936027958910a6eed9c8d
2019-01-28	check isRootNS by ns inode	Shijiang Wei
	Signed-off-by: Shijiang Wei <mountkin@gmail.com> Change-Id: I032f834edae5c716fb2d3538285eec07aa11a902 PiperOrigin-RevId: 231318438
2019-01-22	Don't bind-mount runsc into a sandbox mntns	Andrei Vagin
	PiperOrigin-RevId: 230437407 Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b
2019-01-18	Scrub runsc error messages	Fabricio Voznika
	Removed "error" and "failed to" prefix that don't add value from messages. Adjusted a few other messages. In particular, when the container fail to start, the message returned is easier for humans to read: $ docker run --rm --runtime=runsc alpine foobar docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory Closes #77 PiperOrigin-RevId: 230022798 Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
2019-01-18	Start a sandbox process in a new userns only if CAP_SETUID is set	Andrei Vagin
	In addition, it fixes a race condition in TestMultiContainerGoferStop. There are two scripts copy the same set of files into the same directory and sometime one of this command fails with EXIST. PiperOrigin-RevId: 230011247 Change-Id: I9289f72e65dc407cdcd0e6cd632a509e01f43e9c
2019-01-18	runsc: create a new proc mount if the sandbox process is running in a new pidns	Andrei Vagin
	PiperOrigin-RevId: 229971902 Change-Id: Ief4fac731e839ef092175908de9375d725eaa3aa
2019-01-14	runsc: set up a minimal chroot from the sandbox process	Andrei Vagin
	In this case, new mounts are not created in the host mount namspaces, so tearDownChroot isn't needed, because chroot will be destroyed with a sandbox mount namespace. In additional, pivot_root can't be called instead of chroot. PiperOrigin-RevId: 229250871 Change-Id: I765bdb587d0b8287a6a8efda8747639d37c7e7b6
2019-01-11	runsc: Collect zombies of sandbox and gofer processes	Andrei Vagin
	And we need to wait a gofer process before cgroup.Uninstall, because it is running in the sandbox cgroups. PiperOrigin-RevId: 228904020 Change-Id: Iaf8826d5b9626db32d4057a1c505a8d7daaeb8f9
2019-01-09	Restore to original cgroup after sandbox and gofer processes are created	Fabricio Voznika
	The original code assumed that it was safe to join and not restore cgroup, but Container.Run will not exit after calling start, making cgroup cleanup fail because there were still processes inside the cgroup. PiperOrigin-RevId: 228529199 Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51
2019-01-03	Apply chroot for --network=host too	Fabricio Voznika
	PiperOrigin-RevId: 227747566 Change-Id: Ide9df4ac1391adcd1c56e08d6570e0d149d85bc4
2018-12-28	Simplify synchronization between runsc and sandbox process	Fabricio Voznika
	Make 'runsc create' join cgroup before creating sandbox process. This removes the need to synchronize platform creation and ensure that sandbox process is charged to the right cgroup from the start. PiperOrigin-RevId: 227166451 Change-Id: Ieb4b18e6ca0daf7b331dc897699ca419bc5ee3a2
2018-12-06	A sandbox process should wait until it has not been moved into cgroups	Andrei Vagin
	PiperOrigin-RevId: 224418900 Change-Id: I53cf4d7c1c70117875b6920f8fd3d58a3b1497e9
2018-11-15	Allow sandbox.Wait to be called after the sandbox has exited.	Nicolas Lacasse
	sandbox.Wait is racey, as the sandbox may have exited before it is called, or even during. We already had code to handle the case that the sandbox exits during the Wait call, but we were not properly handling the case where the sandbox has exited before the call. The best we can do in such cases is return the sandbox exit code as the application exit code. PiperOrigin-RevId: 221702517 Change-Id: I290d0333cc094c7c1c3b4ce0f17f61a3e908d787
2018-11-05	Fix race between start and destroy	Fabricio Voznika
	Before this change, a container starting up could race with destroy (aka delete) and leave processes behind. Now, whenever a container is created, Loader.processes gets a new entry. Start now expects the entry to be there, and if it's not it means that the container was deleted. I've also fixed Loader.waitPID to search for the process using the init process's PID namespace. We could use a few more tests for signal and wait. I'll send them in another cl. PiperOrigin-RevId: 220224290 Change-Id: I15146079f69904dc07d43c3b66cc343a2dab4cc4
2018-11-01	Make error messages a bit more user friendly.	Ian Lewis
	Updated error messages so that it doesn't print full Go struct representations when running a new container in a sandbox. For example, this occurs frequently when commands are not found when doing a 'kubectl exec'. PiperOrigin-RevId: 219729141 Change-Id: Ic3a7bc84cd7b2167f495d48a1da241d621d3ca09
2018-10-23	Track paths and provide a rename hook.	Adin Scannell
	This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-21	Updated cleanup code to be more explicit about ignoring errors.	Ian Lewis
	Errors are shown as being ignored by assigning to the blank identifier. PiperOrigin-RevId: 218103819 Change-Id: I7cc7b9d8ac503a03de5504ebdeb99ed30a531cf2
2018-10-19	Use correct company name in copyright header	Ian Gudger
	PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-18	Resolve mount paths while setting up root fs mount	Fabricio Voznika
	It's hard to resolve symlinks inside the sandbox because rootfs and mounts may be read-only, forcing us to create mount points inside lower layer of an overlay, before the volumes are mounted. Since the destination must already be resolved outside the sandbox when creating mounts, take this opportunity to rewrite the spec with paths resolved. "runsc boot" will use the "resolved" spec to load mounts. In addition, symlink traversals were disabled while mounting containers inside the sandbox. It haven't been able to write a good test for it. So I'm relying on manual tests for now. PiperOrigin-RevId: 217749904 Change-Id: I7ac434d5befd230db1488446cda03300cc0751a9
2018-10-17	runsc: Support job control signals for the root container.	Nicolas Lacasse
	Now containers run with "docker run -it" support control characters like ^C and ^Z. This required refactoring our signal handling a bit. Signals delivered to the "runsc boot" process are turned into loader.Signal calls with the appropriate delivery mode. Previously they were always sent directly to PID 1. PiperOrigin-RevId: 217566770 Change-Id: I5b7220d9a0f2b591a56335479454a200c6de8732
2018-10-16	Bump sandbox start and stop timeouts.	Nicolas Lacasse
	PiperOrigin-RevId: 217433699 Change-Id: Icef08285728c23ee7dd650706aaf18da51c25dff
2018-10-15	Never send boot process stdio to application stdio.	Nicolas Lacasse
	We treat handle the boot process stdio separately from the application stdio (which gets passed via flags), but we were still sending both to same place. As a result, some logs that are written directly to os.Stderr by the boot process were ending up in the application logs. This CL starts sendind boot process stdio to the null device (since we don't have any better options). The boot process is already configured to send all logs (and panics) to the log file, so we won't miss anything important. PiperOrigin-RevId: 217173020 Change-Id: I5ab980da037f34620e7861a3736ba09c18d73794