Age | Commit message (Collapse) | Author |
|
Starting with go1.13, we can specify ambient capabilities when we execute a new
process with os/exe.Cmd.
PiperOrigin-RevId: 305366706
|
|
PiperOrigin-RevId: 304221302
|
|
In the case of other signals (preemption), inject a normal bounce and
defer the signal until the vCPU has been returned from guest mode.
PiperOrigin-RevId: 303799678
|
|
When the sandbox runs in attached more, e.g. runsc do, runsc run, the
sandbox lifetime is controlled by the parent process. This wasn't working
in all cases because PR_GET_PDEATHSIG doesn't propagate through execve
when the process changes uid/gid. So it was getting dropped when the
sandbox execve's to change to user nobody.
PiperOrigin-RevId: 300601247
|
|
The asynchronous goroutine preemption is a new feature of Go 1.14.
When we switched to go 1.14 (cl/297915917) in the bazel config,
the kokoro syscall-kvm job started permanently failing. Lets
temporary set asyncpreemptoff for the kvm platform to unblock tests.
PiperOrigin-RevId: 300372387
|
|
PiperOrigin-RevId: 299233818
|
|
GO's runtime calls the write system call twice to print "panic:"
and "the reason of this panic", so here is a race window when
other threads can print something to the log and we will see
something like this:
panic: log messages from another thread
The reason of the panic.
This confuses the syzkaller blacklist and dedup detection.
It also makes the logs generally difficult to read. e.g.,
data races often have one side of the race, followed by
a large "diagnosis" dump, finally followed by the other
side of the race.
PiperOrigin-RevId: 297887895
|
|
|
|
TCP/IP will work with netstack networking. hostinet doesn't work, and sockets
will have the same behavior as it is now.
Before the userspace is able to create device, the default loopback device can
be used to test.
/proc/net and /sys/net will still be connected to the root network stack; this
is the same behavior now.
Issue #1833
PiperOrigin-RevId: 296309389
|
|
PiperOrigin-RevId: 294500858
|
|
PiperOrigin-RevId: 291745021
|
|
Updates #231
PiperOrigin-RevId: 289897881
|
|
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.
This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.
Updates #1472
PiperOrigin-RevId: 289033387
|
|
Remove introduced CPUNumMin config and hard-code it as 2.
|
|
* Add `--cpu-num-min` flag to control minimum CPUs
* Only lower CPU count
* Fix comments
|
|
When application is not cgroups-aware, it can spawn excessive threads
which often defaults to CPU number.
Introduce a opt-in flag that will set CPU number accordingly to CPU
quota (if available).
Fixes #1391
|
|
Fixes #1341
PiperOrigin-RevId: 285108973
|
|
When the sandbox is destroyed, making URPC calls to destroy the
container will fail. The code was checking if the sandbox was
running before attempting to make the URPC call, but that is racy.
PiperOrigin-RevId: 284093764
|
|
Right now, we send each tcp packet separately, we call one system
call per-packet. This patch allows to generate multiple tcp packets
and send them by sendmmsg.
The arguable part of this CL is a way how to handle multiple headers.
This CL adds the next field to the Prepandable buffer.
Nginx test results:
Server Software: nginx/1.15.9
Server Hostname: 10.138.0.2
Server Port: 8080
Document Path: /10m.txt
Document Length: 10485760 bytes
w/o gso:
Concurrency Level: 5
Time taken for tests: 5.491 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 18.21 [#/sec] (mean)
Time per request: 274.525 [ms] (mean)
Time per request: 54.905 [ms] (mean, across all concurrent requests)
Transfer rate: 186508.03 [Kbytes/sec] received
sw-gso:
Concurrency Level: 5
Time taken for tests: 3.852 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 1048600200 bytes
HTML transferred: 1048576000 bytes
Requests per second: 25.96 [#/sec] (mean)
Time per request: 192.576 [ms] (mean)
Time per request: 38.515 [ms] (mean, across all concurrent requests)
Transfer rate: 265874.92 [Kbytes/sec] received
w/o gso:
$ ./tcp_benchmark --client --duration 15 --ideal
[SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec
software gso:
$ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso
[SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec
PiperOrigin-RevId: 276112677
|
|
PiperOrigin-RevId: 271207152
|
|
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker
PiperOrigin-RevId: 269337274
|
|
This is the first step in replacing some of the redundant types with the
standard library equivalents.
PiperOrigin-RevId: 264706552
|
|
PiperOrigin-RevId: 263184083
|
|
Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle.
When new containers are added to the sandbox oom_score_adj for the sandbox and
all other gofers are adjusted so that oom_score_adj is equal to the lowest
oom_score_adj of all containers in the sandbox.
Fixes #512
PiperOrigin-RevId: 261242725
|
|
PiperOrigin-RevId: 260824989
|
|
PiperOrigin-RevId: 256494243
|
|
Go was going to change the behavior of SysProcAttr.Ctty such that it must be an
FD in the *parent* FD table:
https://go-review.googlesource.com/c/go/+/178919/
However, after some debate, it was decided that this change was too
backwards-incompatible, and so it was reverted.
https://github.com/golang/go/issues/29458
The behavior going forward is unchanged: the Ctty FD must be an FD in the
*child* FD table.
PiperOrigin-RevId: 255228476
|
|
An upcoming change in Go 1.13 [1] changes the semantics of the SysProcAttr.Ctty
field. Prior to the change, the FD must be an FD in the child process's FD
table (aka "post-shuffle"). After the change, the FD must be an FD in the
current process's FD table (aka "pre-shuffle").
To be compatible with both versions this CL introduces a new boolean
"CttyFdIsPostShuffle" which indicates whether a pre- or post-shuffle FD should
be provided. We use build tags to chose the correct one.
1: https://go-review.googlesource.com/c/go/+/178919/
PiperOrigin-RevId: 255015303
|
|
New options are:
runsc debug --strace=off|all|function1,function2
runsc debug --log-level=warning|info|debug
runsc debug --log-packets=true|false
Updates #407
PiperOrigin-RevId: 254843128
|
|
This was from an old comment, which was superseded by the
existing comment which is correct.
PiperOrigin-RevId: 254434535
|
|
PiperOrigin-RevId: 254237530
|
|
PiperOrigin-RevId: 253882115
|
|
There were 3 string arguments that could be easily misplaced
and it makes it easier to add new arguments, especially for
Container that has dozens of callers.
PiperOrigin-RevId: 253872074
|
|
This can be merged after:
https://github.com/google/gvisor-website/pull/77
or
https://github.com/google/gvisor-website/pull/78
PiperOrigin-RevId: 253132620
|
|
'--rootless' flag lets a non-root user execute 'runsc do'.
The drawback is that the sandbox and gofer processes will
run as root inside a user namespace that is mapped to the
caller's user, intead of nobody. And network is defaulted
to '--network=host' inside the root network namespace. On
the bright side, it's very convenient for testing:
runsc --rootless do ls
runsc --rootless do curl www.google.com
PiperOrigin-RevId: 252840970
|
|
It prints formatted to the log.
PiperOrigin-RevId: 252699551
|
|
This allows an fdbased endpoint to have multiple underlying fd's from which
packets can be read and dispatched/written to.
This should allow for higher throughput as well as better scalability of the
network stack as number of connections increases.
Updates #231
PiperOrigin-RevId: 251852825
|
|
clearStatus was added to allow detached execution to wait
on the exec'd process and retrieve its exit status. However,
it's not currently used. Both docker and gvisor-containerd-shim
wait on the "shim" process and retrieve the exit status from
there. We could change gvisor-containerd-shim to use waits, but
it will end up also consuming a process for the wait, which is
similar to having the shim process.
Closes #234
PiperOrigin-RevId: 251349490
|
|
Updates #220
PiperOrigin-RevId: 250532302
|
|
PiperOrigin-RevId: 248367340
Change-Id: Id792afcfff9c9d2cfd62cae21048316267b4a924
|
|
With this change, we will be able to run runsc do in a host network namespace.
PiperOrigin-RevId: 246436660
Change-Id: I8ea18b1053c88fe2feed74239b915fe7a151ce34
|
|
Sandbox always runsc with IP 192.168.10.2 and the peer
network adds 1 to the address (192.168.10.3). Sandbox
IP can be changed using --ip flag.
Here a few examples:
sudo runsc do curl www.google.com
sudo runsc do --ip=10.10.10.2 bash -c "echo 123 | netcat -l -p 8080"
PiperOrigin-RevId: 246421277
Change-Id: I7b3dce4af46a57300350dab41cb27e04e4b6e9da
|
|
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.
1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.
Fixes #209
PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
|
|
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
|
|
Packet socket receive buffers default to the sysctl value of
net.core.rmem_default and are capped by net.core.rmem_max both
which are usually set to 208KB on most systems.
Since we can't expect every gVisor user to bump these we use
SO_RCVBUFFORCE to exceed the limit. This is possible as runsc runs
with CAP_NET_ADMIN outside the sandbox and can do this before
the FD is passed to the sentry inside the sandbox.
Updates #211
iperf output w/ 4MB buffer.
iperf3 -c 172.17.0.2 -t 100
Connecting to host 172.17.0.2, port 5201
[ 4] local 172.17.0.1 port 40378 connected to 172.17.0.2 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 1.15 GBytes 9.89 Gbits/sec 0 1.02 MBytes
[ 4] 1.00-2.00 sec 1.18 GBytes 10.2 Gbits/sec 0 1.02 MBytes
[ 4] 2.00-3.00 sec 965 MBytes 8.09 Gbits/sec 0 1.02 MBytes
[ 4] 3.00-4.00 sec 942 MBytes 7.90 Gbits/sec 0 1.02 MBytes
[ 4] 4.00-5.00 sec 952 MBytes 7.99 Gbits/sec 0 1.02 MBytes
[ 4] 5.00-6.00 sec 1.14 GBytes 9.81 Gbits/sec 0 1.02 MBytes
[ 4] 6.00-7.00 sec 1.13 GBytes 9.68 Gbits/sec 0 1.02 MBytes
[ 4] 7.00-8.00 sec 930 MBytes 7.80 Gbits/sec 0 1.02 MBytes
[ 4] 8.00-9.00 sec 1.15 GBytes 9.91 Gbits/sec 0 1.02 MBytes
[ 4] 9.00-10.00 sec 938 MBytes 7.87 Gbits/sec 0 1.02 MBytes
[ 4] 10.00-11.00 sec 737 MBytes 6.18 Gbits/sec 0 1.02 MBytes
[ 4] 11.00-12.00 sec 1.16 GBytes 9.93 Gbits/sec 0 1.02 MBytes
[ 4] 12.00-13.00 sec 917 MBytes 7.69 Gbits/sec 0 1.02 MBytes
[ 4] 13.00-14.00 sec 1.19 GBytes 10.2 Gbits/sec 0 1.02 MBytes
[ 4] 14.00-15.00 sec 1.01 GBytes 8.70 Gbits/sec 0 1.02 MBytes
[ 4] 15.00-16.00 sec 1.20 GBytes 10.3 Gbits/sec 0 1.02 MBytes
[ 4] 16.00-17.00 sec 1.14 GBytes 9.80 Gbits/sec 0 1.02 MBytes
^C[ 4] 17.00-17.60 sec 718 MBytes 10.1 Gbits/sec 0 1.02 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-17.60 sec 18.4 GBytes 8.98 Gbits/sec 0 sender
[ 4] 0.00-17.60 sec 0.00 Bytes 0.00 bits/sec receiver
PiperOrigin-RevId: 245470590
Change-Id: I1c08c5ee8345de6ac070513656a4703312dc3c00
|
|
os.NewFile() accounts for 38% of CPU time in localFile.Walk().
This change switchs to use fd.FD which is much cheaper to create.
Now, fd.New() in localFile.Walk() accounts for only 4%.
PiperOrigin-RevId: 244944983
Change-Id: Ic892df96cf2633e78ad379227a213cb93ee0ca46
|
|
It provides an easy way to run commands to quickly test gVisor.
By default it maps the host root as the container root with a
writable overlay on top (so the host root is not modified).
Example:
sudo runsc do ls -lh --color
sudo runsc do ~/src/test/my-test.sh
PiperOrigin-RevId: 243178711
Change-Id: I05f3d6ce253fe4b5f1362f4a07b5387f6ddb5dd9
|
|
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.
Here are numbers for the nginx-1m test:
runsc: 579330.01 [Kbytes/sec] received
runsc-gso: 1794121.66 [Kbytes/sec] received
runc: 2122139.06 [Kbytes/sec] received
and for tcp_benchmark:
$ tcp_benchmark --duration 15 --ideal
[ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal
[ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal --gso 65536
[ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec
PiperOrigin-RevId: 241072403
Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
|
|
Properly handle propagation options for root and mounts. Now usage of
mount options shared, rshared, and noexec cause error to start. shared/
rshared breaks sandbox=>host isolation. slave however can be supported
because changes propagate from host to sandbox.
Root FS setup moved inside the gofer. Apart from simplifying the code,
it keeps all mounts inside the namespace. And they are torn down when
the namespace is destroyed (DestroyFS is no longer needed).
PiperOrigin-RevId: 239037661
Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0
|
|
Example:
runsc debug --root=<dir> \
--profile-heap=/tmp/heap.prof \
--profile-cpu=/tmp/cpu.prod --profile-delay=30 \
<container ID>
PiperOrigin-RevId: 237848456
Change-Id: Icff3f20c1b157a84d0922599eaea327320dad773
|