summaryrefslogtreecommitdiffhomepage
path: root/runsc/sandbox/network.go
AgeCommit message (Collapse)Author
2020-12-10Disable host reassembly for fragments.Bhasker Hariharan
fdbased endpoint was enabling fragment reassembly on the host AF_PACKET socket to ensure that fragments are delivered inorder to the right dispatcher. But this prevents fragments from being delivered to gvisor at all and makes testing of gvisor's fragment reassembly code impossible. The potential impact from this is minimal since IP Fragmentation is not really that prevelant and in cases where we do get fragments we may deliver the fragment out of order to the TCP layer as multiple network dispatchers may process the fragments and deliver a reassembled fragment after the next packet has been delivered to the TCP endpoint. While not desirable I believe the impact from this is minimal due to low prevalence of fragmentation. Also removed PktType and Hatype fields when binding the socket as these are not used when binding. Its just confusing to have them specified. See: https://man7.org/linux/man-pages/man7/packet.7.html "Fields used for binding are sll_family (should be AF_PACKET), sll_protocol, and sll_ifindex." Fixes #5055 PiperOrigin-RevId: 346919439
2020-11-19Propagate IP address prefix from host to netstackFabricio Voznika
Closes #4022 PiperOrigin-RevId: 343378647
2020-09-25Swallow SO_RCVBUFFORCE and SO_SNDBUFFORCE errors on SOCK_RAW socketsMarek Majkowski
In --network=sandbox mode, we create SOCK_RAW sockets on the given network device. The code tries to force-set 4MiB rcv and snd buffers on it, but in certain situations it might fail. There is no reason for refusing the sandbox startup in such case - we should bump the buffers to max availabe size and just move on.
2020-08-26Make flag propagation automaticFabricio Voznika
Use reflection and tags to provide automatic conversion from Config to flags. This makes adding new flags less error-prone, skips flags using default values (easier to read), and makes tests correctly use default flag values for test Configs. Updates #3494 PiperOrigin-RevId: 328662070
2020-08-19Move boot.Config to its own packageFabricio Voznika
Updates #3494 PiperOrigin-RevId: 327548511
2020-07-08Drop empty lineMichael Pratt
PiperOrigin-RevId: 320281516
2020-06-16Add runsc options to set checksum offloading statusgVisor bot
--tx-checksum-offload=<true|false> enable TX checksum offload (default: false) --rx-checksum-offload=<true|false> enable RX checksum offload (default: true) Fixes #2989 PiperOrigin-RevId: 316781309
2020-04-30FIFO QDisc implementationBhasker Hariharan
Updates #231 PiperOrigin-RevId: 309323808
2020-02-20Initial network namespace support.gVisor bot
TCP/IP will work with netstack networking. hostinet doesn't work, and sockets will have the same behavior as it is now. Before the userspace is able to create device, the default loopback device can be used to test. /proc/net and /sys/net will still be connected to the root network stack; this is the same behavior now. Issue #1833 PiperOrigin-RevId: 296309389
2020-02-11Disallow duplicate NIC names.gVisor bot
PiperOrigin-RevId: 294500858
2020-01-15Bump SO_SNDBUF for fdbased endpoint used by runsc.Bhasker Hariharan
Updates #231 PiperOrigin-RevId: 289897881
2019-12-11Enable IPv6 in runscBhasker Hariharan
Fixes #1341 PiperOrigin-RevId: 285108973
2019-10-22netstack/tcp: software segmentation offloadAndrei Vagin
Right now, we send each tcp packet separately, we call one system call per-packet. This patch allows to generate multiple tcp packets and send them by sendmmsg. The arguable part of this CL is a way how to handle multiple headers. This CL adds the next field to the Prepandable buffer. Nginx test results: Server Software: nginx/1.15.9 Server Hostname: 10.138.0.2 Server Port: 8080 Document Path: /10m.txt Document Length: 10485760 bytes w/o gso: Concurrency Level: 5 Time taken for tests: 5.491 seconds Complete requests: 100 Failed requests: 0 Total transferred: 1048600200 bytes HTML transferred: 1048576000 bytes Requests per second: 18.21 [#/sec] (mean) Time per request: 274.525 [ms] (mean) Time per request: 54.905 [ms] (mean, across all concurrent requests) Transfer rate: 186508.03 [Kbytes/sec] received sw-gso: Concurrency Level: 5 Time taken for tests: 3.852 seconds Complete requests: 100 Failed requests: 0 Total transferred: 1048600200 bytes HTML transferred: 1048576000 bytes Requests per second: 25.96 [#/sec] (mean) Time per request: 192.576 [ms] (mean) Time per request: 38.515 [ms] (mean, across all concurrent requests) Transfer rate: 265874.92 [Kbytes/sec] received w/o gso: $ ./tcp_benchmark --client --duration 15 --ideal [SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec software gso: $ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso [SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec PiperOrigin-RevId: 276112677
2019-08-21Use tcpip.Subnet in tcpip.RouteTamir Duberstein
This is the first step in replacing some of the redundant types with the standard library equivalents. PiperOrigin-RevId: 264706552
2019-07-30Remove unused const variablesIan Lewis
PiperOrigin-RevId: 260824989
2019-06-13Update canonical repository.Adin Scannell
This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620
2019-06-11Use net.HardwareAddr for FDBasedLink.LinkAddressFabricio Voznika
It prints formatted to the log. PiperOrigin-RevId: 252699551
2019-06-06Add multi-fd support to fdbased endpoint.Bhasker Hariharan
This allows an fdbased endpoint to have multiple underlying fd's from which packets can be read and dispatched/written to. This should allow for higher throughput as well as better scalability of the network stack as number of connections increases. Updates #231 PiperOrigin-RevId: 251852825
2019-05-15gvisor/runsc: use a veth link address instead of generating a new oneAndrei Vagin
PiperOrigin-RevId: 248367340 Change-Id: Id792afcfff9c9d2cfd62cae21048316267b4a924
2019-04-29Change copyright notice to "The gVisor Authors"Michael Pratt
Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-26Bump the AF_PACKET socket rcv buf size to 4MB by default.Bhasker Hariharan
Packet socket receive buffers default to the sysctl value of net.core.rmem_default and are capped by net.core.rmem_max both which are usually set to 208KB on most systems. Since we can't expect every gVisor user to bump these we use SO_RCVBUFFORCE to exceed the limit. This is possible as runsc runs with CAP_NET_ADMIN outside the sandbox and can do this before the FD is passed to the sentry inside the sandbox. Updates #211 iperf output w/ 4MB buffer. iperf3 -c 172.17.0.2 -t 100 Connecting to host 172.17.0.2, port 5201 [ 4] local 172.17.0.1 port 40378 connected to 172.17.0.2 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 1.15 GBytes 9.89 Gbits/sec 0 1.02 MBytes [ 4] 1.00-2.00 sec 1.18 GBytes 10.2 Gbits/sec 0 1.02 MBytes [ 4] 2.00-3.00 sec 965 MBytes 8.09 Gbits/sec 0 1.02 MBytes [ 4] 3.00-4.00 sec 942 MBytes 7.90 Gbits/sec 0 1.02 MBytes [ 4] 4.00-5.00 sec 952 MBytes 7.99 Gbits/sec 0 1.02 MBytes [ 4] 5.00-6.00 sec 1.14 GBytes 9.81 Gbits/sec 0 1.02 MBytes [ 4] 6.00-7.00 sec 1.13 GBytes 9.68 Gbits/sec 0 1.02 MBytes [ 4] 7.00-8.00 sec 930 MBytes 7.80 Gbits/sec 0 1.02 MBytes [ 4] 8.00-9.00 sec 1.15 GBytes 9.91 Gbits/sec 0 1.02 MBytes [ 4] 9.00-10.00 sec 938 MBytes 7.87 Gbits/sec 0 1.02 MBytes [ 4] 10.00-11.00 sec 737 MBytes 6.18 Gbits/sec 0 1.02 MBytes [ 4] 11.00-12.00 sec 1.16 GBytes 9.93 Gbits/sec 0 1.02 MBytes [ 4] 12.00-13.00 sec 917 MBytes 7.69 Gbits/sec 0 1.02 MBytes [ 4] 13.00-14.00 sec 1.19 GBytes 10.2 Gbits/sec 0 1.02 MBytes [ 4] 14.00-15.00 sec 1.01 GBytes 8.70 Gbits/sec 0 1.02 MBytes [ 4] 15.00-16.00 sec 1.20 GBytes 10.3 Gbits/sec 0 1.02 MBytes [ 4] 16.00-17.00 sec 1.14 GBytes 9.80 Gbits/sec 0 1.02 MBytes ^C[ 4] 17.00-17.60 sec 718 MBytes 10.1 Gbits/sec 0 1.02 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-17.60 sec 18.4 GBytes 8.98 Gbits/sec 0 sender [ 4] 0.00-17.60 sec 0.00 Bytes 0.00 bits/sec receiver PiperOrigin-RevId: 245470590 Change-Id: I1c08c5ee8345de6ac070513656a4703312dc3c00
2019-04-23Replace os.File with fd.FD in fsgoferFabricio Voznika
os.NewFile() accounts for 38% of CPU time in localFile.Walk(). This change switchs to use fd.FD which is much cheaper to create. Now, fd.New() in localFile.Walk() accounts for only 4%. PiperOrigin-RevId: 244944983 Change-Id: Ic892df96cf2633e78ad379227a213cb93ee0ca46
2019-03-29gvisor/runsc: enable generic segmentation offload (GSO)Andrei Vagin
The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 241072403 Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
2019-01-28check isRootNS by ns inodeShijiang Wei
Signed-off-by: Shijiang Wei <mountkin@gmail.com> Change-Id: I032f834edae5c716fb2d3538285eec07aa11a902 PiperOrigin-RevId: 231318438
2019-01-18Scrub runsc error messagesFabricio Voznika
Removed "error" and "failed to" prefix that don't add value from messages. Adjusted a few other messages. In particular, when the container fail to start, the message returned is easier for humans to read: $ docker run --rm --runtime=runsc alpine foobar docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory Closes #77 PiperOrigin-RevId: 230022798 Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
2018-10-19Use correct company name in copyright headerIan Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-01Make multi-container the default mode for runscFabricio Voznika
And remove multicontainer option. PiperOrigin-RevId: 215236981 Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
2018-09-06Enable network for multi-containerFabricio Voznika
PiperOrigin-RevId: 211834411 Change-Id: I52311a6c5407f984e5069359d9444027084e4d2a
2018-08-27Put fsgofer inside chrootFabricio Voznika
Now each container gets its own dedicated gofer that is chroot'd to the rootfs path. This is done to add an extra layer of security in case the gofer gets compromised. PiperOrigin-RevId: 210396476 Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0
2018-08-06Tiny reordering to network codeFabricio Voznika
PiperOrigin-RevId: 207581723 Change-Id: I6e4eb1227b5ed302de5e6c891040b670955f1eea
2018-07-19runsc: copy gateway from the pod network interface.Lantao Liu
PiperOrigin-RevId: 205334841 Change-Id: Ia60d486f9aae70182fdc4af50cf7c915986126d7
2018-06-01Ignores IPv6 addresses when configuring networkFabricio Voznika
Closes #60 PiperOrigin-RevId: 198887885 Change-Id: I9bf990ee3fde9259836e57d67257bef5b85c6008
2018-05-08Use the containerd annotation instead of detecting the "pause" application.Nicolas Lacasse
FIXED=72380268 PiperOrigin-RevId: 195846596 Change-Id: Ic87fed1433482a514631e1e72f5ee208e11290d1
2018-04-28Check in gVisor.Googler
PiperOrigin-RevId: 194583126 Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463