summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/link
AgeCommit message (Collapse)Author
2019-06-10Merge a00157cc (automated)gVisor bot
2019-06-06Add multi-fd support to fdbased endpoint.Bhasker Hariharan
This allows an fdbased endpoint to have multiple underlying fd's from which packets can be read and dispatched/written to. This should allow for higher throughput as well as better scalability of the network stack as number of connections increases. Updates #231 PiperOrigin-RevId: 251852825
2019-06-05netstack/sniffer: log GSO attributesAndrei Vagin
PiperOrigin-RevId: 251788534
2019-06-02Merge 216da0b7 (automated)gVisor bot
2019-05-30Add build guard to files using go:linknameFabricio Voznika
Funcion signatures are not validated during compilation. Since they are not exported, they can change at any time. The guard ensures that they are verified at least on every version upgrade. PiperOrigin-RevId: 250733742
2019-05-21Refactor fdbased endpoint dispatcher code.Bhasker Hariharan
This is in preparation to support an fdbased endpoint that can read/dispatch packets from multiple underlying fds. Updates #231 PiperOrigin-RevId: 249337074 Change-Id: Id7d375186cffcf55ae5e38986e7d605a96916d35
2019-05-08Set the FilesytemType in MountSource from the Filesystem.Nicolas Lacasse
And stop storing the Filesystem in the MountSource. This allows us to decouple the MountSource filesystem type from the name of the filesystem. PiperOrigin-RevId: 247292982 Change-Id: I49cbcce3c17883b7aa918ba76203dfd6d1b03cc8
2019-05-03Support IPv4 fragmentation in netstackGoogler
Testing: Unit tests and also large ping in Fuchsia OS PiperOrigin-RevId: 246563592 Change-Id: Ia12ab619f64f4be2c8d346ce81341a91724aef95
2019-04-29Change copyright notice to "The gVisor Authors"Michael Pratt
Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-26Bump the AF_PACKET socket rcv buf size to 4MB by default.Bhasker Hariharan
Packet socket receive buffers default to the sysctl value of net.core.rmem_default and are capped by net.core.rmem_max both which are usually set to 208KB on most systems. Since we can't expect every gVisor user to bump these we use SO_RCVBUFFORCE to exceed the limit. This is possible as runsc runs with CAP_NET_ADMIN outside the sandbox and can do this before the FD is passed to the sentry inside the sandbox. Updates #211 iperf output w/ 4MB buffer. iperf3 -c 172.17.0.2 -t 100 Connecting to host 172.17.0.2, port 5201 [ 4] local 172.17.0.1 port 40378 connected to 172.17.0.2 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 1.15 GBytes 9.89 Gbits/sec 0 1.02 MBytes [ 4] 1.00-2.00 sec 1.18 GBytes 10.2 Gbits/sec 0 1.02 MBytes [ 4] 2.00-3.00 sec 965 MBytes 8.09 Gbits/sec 0 1.02 MBytes [ 4] 3.00-4.00 sec 942 MBytes 7.90 Gbits/sec 0 1.02 MBytes [ 4] 4.00-5.00 sec 952 MBytes 7.99 Gbits/sec 0 1.02 MBytes [ 4] 5.00-6.00 sec 1.14 GBytes 9.81 Gbits/sec 0 1.02 MBytes [ 4] 6.00-7.00 sec 1.13 GBytes 9.68 Gbits/sec 0 1.02 MBytes [ 4] 7.00-8.00 sec 930 MBytes 7.80 Gbits/sec 0 1.02 MBytes [ 4] 8.00-9.00 sec 1.15 GBytes 9.91 Gbits/sec 0 1.02 MBytes [ 4] 9.00-10.00 sec 938 MBytes 7.87 Gbits/sec 0 1.02 MBytes [ 4] 10.00-11.00 sec 737 MBytes 6.18 Gbits/sec 0 1.02 MBytes [ 4] 11.00-12.00 sec 1.16 GBytes 9.93 Gbits/sec 0 1.02 MBytes [ 4] 12.00-13.00 sec 917 MBytes 7.69 Gbits/sec 0 1.02 MBytes [ 4] 13.00-14.00 sec 1.19 GBytes 10.2 Gbits/sec 0 1.02 MBytes [ 4] 14.00-15.00 sec 1.01 GBytes 8.70 Gbits/sec 0 1.02 MBytes [ 4] 15.00-16.00 sec 1.20 GBytes 10.3 Gbits/sec 0 1.02 MBytes [ 4] 16.00-17.00 sec 1.14 GBytes 9.80 Gbits/sec 0 1.02 MBytes ^C[ 4] 17.00-17.60 sec 718 MBytes 10.1 Gbits/sec 0 1.02 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-17.60 sec 18.4 GBytes 8.98 Gbits/sec 0 sender [ 4] 0.00-17.60 sec 0.00 Bytes 0.00 bits/sec receiver PiperOrigin-RevId: 245470590 Change-Id: I1c08c5ee8345de6ac070513656a4703312dc3c00
2019-04-23Fixes to PacketMMap dispatcher.Bhasker Hariharan
This CL fixes the following bugs: - Uses atomic to set/read status instead of binary.LittleEndian.PutUint32 etc which are not atomic. - Increments ringOffsets for frames that are truncated (i.e status is tpStatusCopy) - Does not ignore frames with tpStatusLost bit set as they are valid frames and only indicate that there some frames were lost before this one and metrics can be retrieved with a getsockopt call. - Adds checks to make sure blockSize is a multiple of page size. This is required as the kernel allocates in pages per block and rejects sizes that are not page aligned with an EINVAL. Updates #210 PiperOrigin-RevId: 244959464 Change-Id: I5d61337b7e4c0f8a3063dcfc07791d4c4521ba1f
2019-04-18netstack: use a proper network protocol to set gso.L3HdrLenAndrei Vagin
It is possible to create a listening socket which will accept IPv4 and IPv6 connections. In this case, we set IPv6ProtocolNumber for all accepted endpoints, even if they handle IPv4 connections. This means that we can't use endpoint.netProto to set gso.L3HdrLen. PiperOrigin-RevId: 244227948 Change-Id: I5e1863596cb9f3d216febacdb7dc75651882eef1
2019-04-17Return error from fdbased.NewFabricio Voznika
RELNOTES: n/a PiperOrigin-RevId: 244031742 Change-Id: Id0cdb73194018fb5979e67b58510ead19b5a2b81
2019-04-09Add TCP checksum verification.Bhasker Hariharan
PiperOrigin-RevId: 242704699 Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f
2019-03-28netstack/fdbased: add generic segmentation offload (GSO) supportAndrei Vagin
The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 240809103 Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-13Reduce PACKET_RX_RING memory usageFabricio Voznika
Previous memory allocation was excessive (80 MB). Changed it to use 2 MB instead. There is no drop in perfomance due to this change: ab -n 100 -c 10 http://server/latin10m.txt ==> 10 MB file 80 MB: 178 MB/s 2 MB: 181 MB/s PiperOrigin-RevId: 238321594 Change-Id: I1c8aed13cad5d75f4506d2b406b305117055fbe5
2019-03-12Make HandleLocal apply to all non-loopback interfaces.Ian Gudger
HandleLocal is very similar conceptually to MULTICAST_LOOP, so we can unify the implementations. This has the benefit of making HandleLocal apply even when the fdbased link endpoint isn't in use. In addition, move looping logic to route creation so that it doesn't need to be run for each packet. This should improve performance. PiperOrigin-RevId: 238099480 Change-Id: I72839f16f25310471453bc9d3fb8544815b25c23
2019-02-26Adds a WriteRawPacket method to the InjectableLinkEndpoint interface.Googler
Also exposes ipv4.MaxTotalSize since it is a generally useful constant. PiperOrigin-RevId: 235799755 Change-Id: I1fa8d5294bf355acf5527cfdf274b3687d3c8b13
2019-02-13Add support for using PACKET_RX_RING to receive packets.Bhasker Hariharan
PACKET_RX_RING allows the use of an mmapped buffer to receive packets from the kernel. This should cut down the number of host syscalls that need to be made to receive packets when the underlying fd is a socket of the AF_PACKET type. PiperOrigin-RevId: 233834998 Change-Id: I8060025c6ced206986e94cc46b8f382b81bfa47f
2019-02-08Fix build error.Ian Gudger
PiperOrigin-RevId: 233139020 Change-Id: I2e7089fa25d20e5662eb941054a684d41f5d3e12
2019-02-07Internal change.Googler
PiperOrigin-RevId: 232937200 Change-Id: I5c3709cc8f1313313ff618a45e48c14a3a111cb4
2019-01-31Remove license commentsMichael Pratt
Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-29Use recvmmsg() instead of readv() to read packets from NIC.Bhasker Hariharan
This should reduce the number of syscalls required to process packets significantly and improve throughputs. PiperOrigin-RevId: 231366886 Change-Id: I8b38077262bf9c53176bc4a94b530188d3d7c0ca
2019-01-11Internal change.Googler
PiperOrigin-RevId: 228979583 Change-Id: I69bd82def48ceb19bc8558c890622b8528d98764
2018-11-14Rename incorrectly named (dst, src) arguments in DeliverNetworkPacket prototypeBert Muthalaly
...to (remote, local), reflecting the (correct) names in the implementation of DeliverNetworkPacket (see tcpip/stack/nic.go). Also trim the names in DeliverNetworkPacket and elsewhere to avoid stuttering; since the type is tcpip.LinkAddress, there's no need to include "LinkAddr" in the parameter names. Note that every callsite passes arguments in the order (src, dst). PiperOrigin-RevId: 221514396 Change-Id: I3637454ad0d6e62a19e4dcbc2a16493798bd0f09
2018-10-31Use syserr style error translation in netstack's rawfileIan Gudger
Replacing map lookups with slice indexing is higher performance. PiperOrigin-RevId: 219569901 Change-Id: I9b7cd22abd4b95383025edbd5a80d1c1a4496936
2018-10-23Track paths and provide a rename hook.Adin Scannell
This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-19Use correct company name in copyright headerIan Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-10Enforce message size limits and avoid host calls with too many iovecsMichael Pratt
Currently, in the face of FileMem fragmentation and a large sendmsg or recvmsg call, host sockets may pass > 1024 iovecs to the host, which will immediately cause the host to return EMSGSIZE. When we detect this case, use a single intermediate buffer to pass to the kernel, copying to/from the src/dst buffer. To avoid creating unbounded intermediate buffers, enforce message size checks and truncation w.r.t. the send buffer size. The same functionality is added to netstack unix sockets for feature parity. PiperOrigin-RevId: 216590198 Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
2018-09-21Deflake TestSimpleReceiveTamir Duberstein
...by increasing the allotted timeout and using direct comparison rather than reflect.DeepEqual (which should be faster). PiperOrigin-RevId: 214027024 Change-Id: I0a2690e65c7e14b4cc118c7312dbbf5267dc78bc
2018-09-19Pass local link address to DeliverNetworkPacketBert Muthalaly
This allows a NetworkDispatcher to implement transparent bridging, assuming all implementations of LinkEndpoint.WritePacket call eth.Encode with header.EthernetFields.SrcAddr set to the passed Route.LocalLinkAddress, if it is provided. PiperOrigin-RevId: 213686651 Change-Id: I446a4ac070970202f0724ef796ff1056ae4dd72a
2018-09-14Remove buffer.Prependable.UsedBytesTamir Duberstein
It is the same as buffer.Prependable.View. PiperOrigin-RevId: 213064166 Change-Id: Ib33b8a2c4da864209d9a0be0a1c113be10b520d3
2018-09-14Pass buffer.Prependable by valueTamir Duberstein
PiperOrigin-RevId: 213053370 Change-Id: I60ea89572b4fca53fd126c870fcbde74fcf52562
2018-09-12Always pass buffer.VectorisedView by valueTamir Duberstein
PiperOrigin-RevId: 212757571 Change-Id: I04200df9e45c21eb64951cd2802532fa84afcb1a
2018-09-05Update {LinkEndpoint,NetworkEndpoint}#WritePacket to take a VectorisedViewBert Muthalaly
Makes it possible to avoid copying or allocating in cases where DeliverNetworkPacket (rx) needs to turn around and call WritePacket (tx) with its VectorisedView. Also removes the restriction on having VectorisedViews with multiple views in the write path. PiperOrigin-RevId: 211728717 Change-Id: Ie03a65ecb4e28bd15ebdb9c69f05eced18fdfcff
2018-09-04Automated rollback of changelist 211156845Bhasker Hariharan
PiperOrigin-RevId: 211525182 Change-Id: I462c20328955c77ecc7bfd8ee803ac91f15858e6
2018-08-31Automated rollback of changelist 211103930Googler
PiperOrigin-RevId: 211156845 Change-Id: Ie28011d7eb5f45f3a0158dbee2a68c5edf22f6e0
2018-08-31ipv6: ICMP supportTamir Duberstein
This CL does NDP link-address discovery for IPv6. It includes several small changes necessary to get linux to talk to this implementation. In particular, a hop limit of 255 is necessary for ICMPv6. PiperOrigin-RevId: 211103930 Change-Id: If25370ab84c6b1decfb15de917f3b0020f2c4e0e
2018-08-27Add various statisticsTamir Duberstein
PiperOrigin-RevId: 210442599 Change-Id: I9498351f461dc69c77b7f815d526c5693bec8e4a
2018-08-22Allow building on !linuxGoogler
PiperOrigin-RevId: 209819644 Change-Id: I329d054bf8f4999e7db0dcd95b13f7793c65d4e2
2018-08-21Build PCAP file with atomic blocking writesIan Gudger
The previous use of non-blocking writes could result in corrupt PCAP files if a partial write occurs. Using (*os.File).Write solves this problem by not allowing partial writes. This change does not increase allocations (in one path it actually reduces them), but does add additional copying. PiperOrigin-RevId: 209652974 Change-Id: I4b1cf2eda4cfd7f237a4245aceb7391b3055a66c
2018-08-08Basic support for ip link/addr and ifconfigFabricio Voznika
Closes #94 PiperOrigin-RevId: 207997580 Change-Id: I19b426f1586b5ec12f8b0cd5884d5b401d334924
2018-08-08Resend packets back to netstack if destined to itselfFabricio Voznika
Add option to redirect packet back to netstack if it's destined to itself. This fixes the problem where connecting to the local NIC address would not work, e.g.: echo bar | nc -l -p 8080 & echo foo | nc 192.168.0.2 8080 PiperOrigin-RevId: 207995083 Change-Id: I17adc2a04df48bfea711011a5df206326a1fb8ef
2018-07-30netstack: support disconnect-on-save option per fdbased link.Zhaozhong Ni
PiperOrigin-RevId: 206659972 Change-Id: I5e0e035f97743b6525ad36bed2c802791609beaf
2018-07-27stateify: support explicit annotation mode; convert refs and stack packages.Zhaozhong Ni
We have been unnecessarily creating too many savable types implicitly. PiperOrigin-RevId: 206334201 Change-Id: Idc5a3a14bfb7ee125c4f2bb2b1c53164e46f29a8
2018-07-17netstack: update goroutine save / restore safety comments.Zhaozhong Ni
PiperOrigin-RevId: 204930314 Change-Id: Ifc4c41ed28616cd57fafbf7c92e87141a945c41f
2018-07-11Automated rollback of changelist 203157739Bhasker Hariharan
PiperOrigin-RevId: 204196916 Change-Id: If632750fc6368acb835e22cfcee0ae55c8a04d16
2018-07-10netstack: only do connected TCP S/R for loopback connections.Zhaozhong Ni
PiperOrigin-RevId: 204006237 Change-Id: Ica8402ab54d9dd7d11cc41c6d74aacef51d140b7
2018-07-09Switch netstack licenses to Apache 2.0.Nicolas Lacasse
Fixes #27 PiperOrigin-RevId: 203825288 Change-Id: Ie9f3a2b2c1e296b026b024f75c07da1a7e118633
2018-07-06Add non-AMD64 support to rawfileIan Gudger
PiperOrigin-RevId: 203499064 Change-Id: I2cd5189638e94ce926f1e82c1264a8d3ece9dfa5