summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/link
AgeCommit message (Collapse)Author
2020-07-15Fix minor bugs in a couple of interface IOCTLs.Bhasker Hariharan
gVisor incorrectly returns the wrong ARP type for SIOGIFHWADDR. This breaks tcpdump as it tries to interpret the packets incorrectly. Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which fails with an EINVAL since we don't implement it. For now change it to return EOPNOTSUPP to indicate that we don't support the query rather than return EINVAL. NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities and NIC flags. In Linux all 3 exist eg. ARPHRD types are stored in dev->type field while NIC capabilities are more like the device features which can be queried using SIOCETHTOOL but not modified and NIC Flags are fields that can be modified from user space. eg. NIC status (UP/DOWN/MULTICAST/BROADCAST) etc. Updates #2746 PiperOrigin-RevId: 321436525
2020-07-13Fix recvMMsgDispatcher not slicing link header correctly.Ting-Yu Wang
PiperOrigin-RevId: 321035635
2020-07-06Fix NonBlockingWrite3 not writing b3 if b2 is zero-length.Ting-Yu Wang
PiperOrigin-RevId: 319882171
2020-06-22Extract common nested LinkEndpoint patternBruno Dal Bo
... and unify logic for detached netsted endpoints. sniffer.go caused crashes if a packet delivery is attempted when the dispatcher is nil. Extracted the endpoint nesting logic into a common composable type so it can be used by the Fuchsia Netstack (the pattern is widespread there). PiperOrigin-RevId: 317682842
2020-06-10Remove duplicate and incorrect size checkTamir Duberstein
Minimum header sizes are already checked in each `case` arm below. Worse, the ICMP entries in transportProtocolMinSizes are incorrect, and produce false "raw packet" logs. PiperOrigin-RevId: 315730073
2020-06-10Replace use of %v in snifferTamir Duberstein
PiperOrigin-RevId: 315711208
2020-06-03Pass PacketBuffer as pointer.Ting-Yu Wang
Historically we've been passing PacketBuffer by shallow copying through out the stack. Right now, this is only correct as the caller would not use PacketBuffer after passing into the next layer in netstack. With new buffer management effort in gVisor/netstack, PacketBuffer will own a Buffer (to be added). Internally, both PacketBuffer and Buffer may have pointers and shallow copying shouldn't be used. Updates #2404. PiperOrigin-RevId: 314610879
2020-05-29Update Go version build tagsMichael Pratt
None of the dependencies have changed in 1.15. It may be possible to simplify some of the wrappers in rawfile following 1.13, but that can come in a later change. PiperOrigin-RevId: 313863264
2020-05-27Remove linkEP from DeliverNetworkPacketSam Balana
The specified LinkEndpoint is not being used in a significant way. No behavior change, existing tests pass. This change is a breaking change. PiperOrigin-RevId: 313496602
2020-05-27Fix tiny typo.Kevin Krakauer
PiperOrigin-RevId: 313414690
2020-05-11Automated rollback of changelist 310417191Bhasker Hariharan
PiperOrigin-RevId: 310963404
2020-05-07Automated rollback of changelist 309339316Bhasker Hariharan
PiperOrigin-RevId: 310417191
2020-05-06sniffer: fix accidental logging of good packets as badKevin Krakauer
We need to check vv.Size() instead of len(tcp), as tcp will always be 20 bytes long. PiperOrigin-RevId: 310218351
2020-05-01Automated rollback of changelist 308674219Kevin Krakauer
PiperOrigin-RevId: 309491861
2020-04-30Enable FIFO QDisc by default in runsc.Bhasker Hariharan
Updates #231 PiperOrigin-RevId: 309339316
2020-04-30FIFO QDisc implementationBhasker Hariharan
Updates #231 PiperOrigin-RevId: 309323808
2020-04-27Automated rollback of changelist 308163542gVisor bot
PiperOrigin-RevId: 308674219
2020-04-23Remove View.First() and View.RemoveFirst()Kevin Krakauer
These methods let users eaily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when when all the headers fit in the first View. PiperOrigin-RevId: 308163542
2020-04-21Automated rollback of changelist 307477185gVisor bot
PiperOrigin-RevId: 307598974
2020-04-17Remove View.First() and View.RemoveFirst()Kevin Krakauer
These methods let users eaily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when when all the headers fit in the first View.
2020-04-17Allow caller-defined sinks for packet sniffing.Tamir Duberstein
PiperOrigin-RevId: 307053624
2020-04-16Properly delegate WaitTamir Duberstein
PiperOrigin-RevId: 306959393
2020-04-15Deduplicate packet loggingTamir Duberstein
PiperOrigin-RevId: 306677789
2020-04-14Reduce flakiness in tcp_test.Bhasker Hariharan
Tests now use a MinRTO of 3s instead of default 200ms. This reduced flakiness in a lot of the congestion control/recovery tests which were flaky due to retransmit timer firing too early in case the test executors were overloaded. This change also bumps some of the timeouts in tests which were too sensitive to timer variations and reduces the number of slow start iterations which can make the tests run for too long and also trigger retansmit timeouts etc if the executor is overloaded. PiperOrigin-RevId: 306562645
2020-04-08Fix all printf formatting errors.Adin Scannell
Updates #2243
2020-04-03Refactor software GSO code.Bhasker Hariharan
Software GSO implementation currently has a complicated code path with implicit assumptions that all packets to WritePackets carry same Data and it does this to avoid allocations on the path etc. But this makes it hard to reuse the WritePackets API. This change breaks all such assumptions by introducing a new Vectorised View API ReadToVV which can be used to cleanly split a VV into multiple independent VVs. Further this change also makes packet buffers linkable to form an intrusive list. This allows us to get rid of the array of packet buffers that are passed in the WritePackets API call and replace it with a list of packet buffers. While this code does introduce some more allocations in the benchmarks it doesn't cause any degradation. Updates #231 PiperOrigin-RevId: 304731742
2020-03-24Add support for setting TCP segment hash.Bhasker Hariharan
This allows the link layer endpoints to consistenly hash a TCP segment to a single underlying queue in case a link layer endpoint does support multiple underlying queues. Updates #231 PiperOrigin-RevId: 302760664
2020-03-24Move tcpip.PacketBuffer and IPTables to stack package.Bhasker Hariharan
This is a precursor to be being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-03-16Enable ARP resolution in TAP devices.Ting-Yu Wang
PiperOrigin-RevId: 301208471
2020-03-16Prevent vnetHdr from escaping in WritePacket.Bhasker Hariharan
PiperOrigin-RevId: 301157950
2020-03-13Fix typoMichael Pratt
PiperOrigin-RevId: 300832988
2020-02-21Implement tap/tun device in vfs.Ting-Yu Wang
PiperOrigin-RevId: 296526279
2020-01-31Use multicast Ethernet address for multicast NDPGhanan Gowripalan
As per RFC 2464 section 7, an IPv6 packet with a multicast destination address is transmitted to the mapped Ethernet multicast address. Test: - ipv6.TestLinkResolution - stack_test.TestDADResolve - stack_test.TestRouterSolicitation PiperOrigin-RevId: 292610529
2020-01-27Refactor to hide C from channel.Endpoint.Ting-Yu Wang
This is to aid later implementation for /dev/net/tun device. PiperOrigin-RevId: 291746025
2020-01-27Standardize on tools directory.Adin Scannell
PiperOrigin-RevId: 291745021
2020-01-09New sync package.Ian Gudger
* Rename syncutil to sync. * Add aliases to sync types. * Replace existing usage of standard library sync package. This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations. Updates #1472 PiperOrigin-RevId: 289033387
2019-11-25Fix panic in sniffer.Kevin Krakauer
Packets written via SOCK_RAW are guaranteed to have network headers, but not transport headers. Check first whether there are enough bytes left in the packet to contain a transport header before attempting to parse it. PiperOrigin-RevId: 282363895
2019-11-23Cleanup visibility.Adin Scannell
PiperOrigin-RevId: 282194656
2019-11-22Use PacketBuffers with GSO.Kevin Krakauer
PiperOrigin-RevId: 282045221
2019-11-15Automated rollback of changelist 280594395Bhasker Hariharan
PiperOrigin-RevId: 280763655
2019-11-14Fix panic when logging raw packets via sniffer.Kevin Krakauer
Sniffer assumed that outgoing packets have transport headers, but users can write packets via SOCK_RAW with arbitrary transport headers that netstack doesn't know about. We now explicitly check for the presence of network and transport headers before assuming they exist. PiperOrigin-RevId: 280594395
2019-11-14Use PacketBuffers for outgoing packets.Kevin Krakauer
PiperOrigin-RevId: 280455453
2019-11-06Use PacketBuffers, rather than VectorisedViews, in netstack.Kevin Krakauer
PacketBuffers are analogous to Linux's sk_buff. They hold all information about a packet, headers, and payload. This is important for: * iptables to access various headers of packets * Preventing the clutter of passing different net and link headers along with VectorisedViews to packet handling functions. This change only affects the incoming packet path, and a future change will change the outgoing path. Benchmark Regular PacketBufferPtr PacketBufferConcrete -------------------------------------------------------------------------------- BM_Recvmsg 400.715MB/s 373.676MB/s 396.276MB/s BM_Sendmsg 361.832MB/s 333.003MB/s 335.571MB/s BM_Recvfrom 453.336MB/s 393.321MB/s 381.650MB/s BM_Sendto 378.052MB/s 372.134MB/s 341.342MB/s BM_SendmsgTCP/0/1k 353.711MB/s 316.216MB/s 322.747MB/s BM_SendmsgTCP/0/2k 600.681MB/s 588.776MB/s 565.050MB/s BM_SendmsgTCP/0/4k 995.301MB/s 888.808MB/s 941.888MB/s BM_SendmsgTCP/0/8k 1.517GB/s 1.274GB/s 1.345GB/s BM_SendmsgTCP/0/16k 1.872GB/s 1.586GB/s 1.698GB/s BM_SendmsgTCP/0/32k 1.017GB/s 1.020GB/s 1.133GB/s BM_SendmsgTCP/0/64k 475.626MB/s 584.587MB/s 627.027MB/s BM_SendmsgTCP/0/128k 416.371MB/s 503.434MB/s 409.850MB/s BM_SendmsgTCP/0/256k 323.449MB/s 449.599MB/s 388.852MB/s BM_SendmsgTCP/0/512k 243.992MB/s 267.676MB/s 314.474MB/s BM_SendmsgTCP/0/1M 95.138MB/s 95.874MB/s 95.417MB/s BM_SendmsgTCP/0/2M 96.261MB/s 94.977MB/s 96.005MB/s BM_SendmsgTCP/0/4M 96.512MB/s 95.978MB/s 95.370MB/s BM_SendmsgTCP/0/8M 95.603MB/s 95.541MB/s 94.935MB/s BM_SendmsgTCP/0/16M 94.598MB/s 94.696MB/s 94.521MB/s BM_SendmsgTCP/0/32M 94.006MB/s 94.671MB/s 94.768MB/s BM_SendmsgTCP/0/64M 94.133MB/s 94.333MB/s 94.746MB/s BM_SendmsgTCP/0/128M 93.615MB/s 93.497MB/s 93.573MB/s BM_SendmsgTCP/0/256M 93.241MB/s 95.100MB/s 93.272MB/s BM_SendmsgTCP/1/1k 303.644MB/s 316.074MB/s 308.430MB/s BM_SendmsgTCP/1/2k 537.093MB/s 584.962MB/s 529.020MB/s BM_SendmsgTCP/1/4k 882.362MB/s 939.087MB/s 892.285MB/s BM_SendmsgTCP/1/8k 1.272GB/s 1.394GB/s 1.296GB/s BM_SendmsgTCP/1/16k 1.802GB/s 2.019GB/s 1.830GB/s BM_SendmsgTCP/1/32k 2.084GB/s 2.173GB/s 2.156GB/s BM_SendmsgTCP/1/64k 2.515GB/s 2.463GB/s 2.473GB/s BM_SendmsgTCP/1/128k 2.811GB/s 3.004GB/s 2.946GB/s BM_SendmsgTCP/1/256k 3.008GB/s 3.159GB/s 3.171GB/s BM_SendmsgTCP/1/512k 2.980GB/s 3.150GB/s 3.126GB/s BM_SendmsgTCP/1/1M 2.165GB/s 2.233GB/s 2.163GB/s BM_SendmsgTCP/1/2M 2.370GB/s 2.219GB/s 2.453GB/s BM_SendmsgTCP/1/4M 2.005GB/s 2.091GB/s 2.214GB/s BM_SendmsgTCP/1/8M 2.111GB/s 2.013GB/s 2.109GB/s BM_SendmsgTCP/1/16M 1.902GB/s 1.868GB/s 1.897GB/s BM_SendmsgTCP/1/32M 1.655GB/s 1.665GB/s 1.635GB/s BM_SendmsgTCP/1/64M 1.575GB/s 1.547GB/s 1.575GB/s BM_SendmsgTCP/1/128M 1.524GB/s 1.584GB/s 1.580GB/s BM_SendmsgTCP/1/256M 1.579GB/s 1.607GB/s 1.593GB/s PiperOrigin-RevId: 278940079
2019-10-30Deep copy dispatcher views.Kevin Krakauer
When VectorisedViews were passed up the stack from packet_dispatchers, we were passing a sub-slice of the dispatcher's views fields. The dispatchers then immediately set those views to nil. This wasn't caught before because every implementer copied the data in these views before returning. PiperOrigin-RevId: 277615351
2019-10-29Update build tags to allow Go 1.14Michael Pratt
Currently there are no ABI changes. We should check again closer to release. PiperOrigin-RevId: 277349744
2019-10-22netstack/tcp: software segmentation offloadAndrei Vagin
Right now, we send each tcp packet separately, we call one system call per-packet. This patch allows to generate multiple tcp packets and send them by sendmmsg. The arguable part of this CL is a way how to handle multiple headers. This CL adds the next field to the Prepandable buffer. Nginx test results: Server Software: nginx/1.15.9 Server Hostname: 10.138.0.2 Server Port: 8080 Document Path: /10m.txt Document Length: 10485760 bytes w/o gso: Concurrency Level: 5 Time taken for tests: 5.491 seconds Complete requests: 100 Failed requests: 0 Total transferred: 1048600200 bytes HTML transferred: 1048576000 bytes Requests per second: 18.21 [#/sec] (mean) Time per request: 274.525 [ms] (mean) Time per request: 54.905 [ms] (mean, across all concurrent requests) Transfer rate: 186508.03 [Kbytes/sec] received sw-gso: Concurrency Level: 5 Time taken for tests: 3.852 seconds Complete requests: 100 Failed requests: 0 Total transferred: 1048600200 bytes HTML transferred: 1048576000 bytes Requests per second: 25.96 [#/sec] (mean) Time per request: 192.576 [ms] (mean) Time per request: 38.515 [ms] (mean, across all concurrent requests) Transfer rate: 265874.92 [Kbytes/sec] received w/o gso: $ ./tcp_benchmark --client --duration 15 --ideal [SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec software gso: $ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso [SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec PiperOrigin-RevId: 276112677
2019-10-21AF_PACKET support for netstack (aka epsocket).Kevin Krakauer
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With runsc, you'll need to pass `--net-raw=true` to enable them. Binding isn't supported yet. PiperOrigin-RevId: 275909366
2019-10-14Use a different fanoutID for each new fdbased endpoint.Bhasker Hariharan
PiperOrigin-RevId: 274638272
2019-09-30Add a Stringer implementation to PacketDispatchModeBhasker Hariharan
PiperOrigin-RevId: 272083936
2019-09-20Allow waiting for LinkEndpoint worker goroutines to finish.Ian Gudger
Previously, the only safe way to use an fdbased endpoint was to leak the FD. This change makes it possible to safely close the FD. This is the first step towards having stoppable stacks. Updates #837 PiperOrigin-RevId: 270346582