summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip
AgeCommit message (Collapse)Author
2020-07-01Merge release-20200622.1-47-gc9446f053 (automated)gVisor bot
2020-06-30Fix two bugs in TCP sender.Bhasker Hariharan
a) When GSO is in use we should not cap the segment to maxPayloadSize in sender.maybeSendSegment as the GSO logic will cap the segment to the correct size. Without this the host GSO is not used as we end up breaking up large segments into small MSS sized segments before writing the packets to the host. b) The check to not split a segment due to it not fitting in the receiver window when there are pending segments is incorrect as segments in writeList can be really large as we just take the write call's buffer size and create a single large segment. So a write of say 128KB will just be 1 segment in the writeList. The linux code checks if 1 MSS sized segments fits in the receiver's window and if not then does not split the current segment. gVisor's check was incorrect that it was checking if the whole segment which could be >>> 1 MSS would fit in the receiver's window. This was causing us to prematurely stop sending and falling back to retransmit timer/probe from the other end to send data. This was seen when running HTTPD benchmarks where @ HEAD when sending large files the benchmark was taking forever to run. The tcp_splitseg_mss_test.go is being deleted as the test as written doesn't test what is intended correctly. This is because GSO is enabled by default and the reason the MSS+1 sized segment is sent is because GSO is in use. A proper test will require disabling GSO on linux and netstack which is going to take a bit of work in packetimpact to do it correctly. Separately a new test probably should be written that verifies that a segment > availableWindow is not split if the availableWindow is < 1 MSS. Fixes #3107 PiperOrigin-RevId: 319172089
2020-06-30Merge release-20200622.1-42-g4784ed46e (automated)gVisor bot
2020-06-30Avoid multiple atomic loadsTamir Duberstein
...by calling (*tcp.endpoint).EndpointState only once when possible. Avoid wrapping (*sleep.Waker).Assert in a useless func while I'm here. PiperOrigin-RevId: 319074149
2020-06-27Merge release-20200622.1-34-g66d166544 (automated)gVisor bot
2020-06-26IPv6 raw sockets. Needed for ip6tables.Kevin Krakauer
IPv6 raw sockets never include the IPv6 header. PiperOrigin-RevId: 318582989
2020-06-27Merge release-20200622.1-33-g8dbeac53c (automated)gVisor bot
2020-06-26Implement SO_NO_CHECK socket option.gVisor bot
SO_NO_CHECK is used to skip the UDP checksum generation on a TX socket (UDP checksum is optional on IPv4). Test: - TestNoChecksum - SoNoCheckOffByDefault (UdpSocketTest) - SoNoCheck (UdpSocketTest) Fixes #3055 PiperOrigin-RevId: 318575215
2020-06-26Merge release-20200622.1-23-g7fb6cc286 (automated)gVisor bot
2020-06-25conntrack refactor, no behavior changesKevin Krakauer
- Split connTrackForPacket into 2 functions instead of switching on flag - Replace hash with struct keys. - Remove prefixes where possible - Remove unused connStatus, timeout - Flatten ConnTrack struct a bit - some intermediate structs had no meaning outside of the context of their parent. - Protect conn.tcb with a mutex - Remove redundant error checking (e.g. when is pkt.NetworkHeader valid) - Clarify that HandlePacket and CreateConnFor are the expected entrypoints for ConnTrack PiperOrigin-RevId: 318407168
2020-06-24Merge release-20200608.0-120-gb070e218c (automated)gVisor bot
2020-06-24Add support for Stack level options.Bhasker Hariharan
Linux controls socket send/receive buffers using a few sysctl variables - net.core.rmem_default - net.core.rmem_max - net.core.wmem_max - net.core.wmem_default - net.ipv4.tcp_rmem - net.ipv4.tcp_wmem The first 4 control the default socket buffer sizes for all sockets raw/packet/tcp/udp and also the maximum permitted socket buffer that can be specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...). The last two control the TCP auto-tuning limits and override the default specified in rmem_default/wmem_default as well as the max limits. Netstack today only implements tcp_rmem/tcp_wmem and incorrectly uses it to limit the maximum size in setsockopt() as well as uses it for raw/udp sockets. This changelist introduces the other 4 and updates the udp/raw sockets to use the newly introduced variables. The values for min/max match the current tcp_rmem/wmem values and the default value buffers for UDP/RAW sockets is updated to match the linux value of 212KiB up from the really low current value of 32 KiB. Updates #3043 Fixes #3043 PiperOrigin-RevId: 318089805
2020-06-24Merge release-20200608.0-119-g364ac92ba (automated)gVisor bot
2020-06-24Merge release-20200608.0-116-g2141013dc (automated)gVisor bot
2020-06-23Add support for SO_REUSEADDR to TCP sockets/endpoints.Ian Gudger
For TCP sockets, SO_REUSEADDR relaxes the rules for binding addresses. gVisor/netstack already supported a behavior similar to SO_REUSEADDR, but did not allow disabling it. This change brings the SO_REUSEADDR behavior closer to the behavior implemented by Linux and adds a new SO_REUSEADDR disabled behavior. Like Linux, SO_REUSEADDR is now disabled by default. PiperOrigin-RevId: 317984380
2020-06-22Merge release-20200608.0-103-g282a6aea1 (automated)gVisor bot
2020-06-22Extract common nested LinkEndpoint patternBruno Dal Bo
... and unify logic for detached netsted endpoints. sniffer.go caused crashes if a packet delivery is attempted when the dispatcher is nil. Extracted the endpoint nesting logic into a common composable type so it can be used by the Fuchsia Netstack (the pattern is widespread there). PiperOrigin-RevId: 317682842
2020-06-19Merge release-20200608.0-95-gd962f9f38 (automated)gVisor bot
2020-06-19Implement UDP cheksum verification.gVisor bot
Test: - TestIncrementChecksumErrors Fixes #2943 PiperOrigin-RevId: 317348158
2020-06-19Merge release-20200608.0-88-g0c169b6ad (automated)gVisor bot
2020-06-18iptables: skip iptables if no rules are setKevin Krakauer
Users that never set iptables rules shouldn't incur the iptables performance cost. Suggested by Ian (@iangudger). PiperOrigin-RevId: 317232921
2020-06-19Merge release-20200608.0-87-g28b8a5cc3 (automated)gVisor bot
2020-06-18iptables: remove metadata structKevin Krakauer
Metadata was useful for debugging and safety, but enough tests exist that we should see failures when (de)serialization is broken. It made stack initialization more cumbersome and it's also getting in the way of ip6tables. PiperOrigin-RevId: 317210653
2020-06-18Merge release-20200608.0-82-g07ff909e7 (automated)gVisor bot
2020-06-18Support setsockopt SO_SNDBUF/SO_RCVBUF for raw/udp sockets.Bhasker Hariharan
Updates #173,#6 Fixes #2888 PiperOrigin-RevId: 317087652
2020-06-18Merge release-20200608.0-81-g09b2fca40 (automated)gVisor bot
2020-06-18Cleanup tcp.timer and tcpip.RouteGhanan Gowripalan
When a tcp.timer or tcpip.Route is no longer used, clean up its resources so that unused memory may be released. PiperOrigin-RevId: 317046582
2020-06-17Increase timeouts for NDP testsGhanan Gowripalan
... to help reduce flakes. When waiting for an event to occur, use a timeout of 10s. When waiting for an event to not occur, use a timeout of 1s. Test: Ran test locally w/ run count of 1000 with and without gotsan. PiperOrigin-RevId: 316998128
2020-06-17Merge release-20200608.0-70-g50afec55c (automated)gVisor bot
2020-06-17TCP stat fixesMithun Iyer
Ensure that CurrentConnected stat is updated on any errors and cleanups during connected state processing. Fixes #2968 PiperOrigin-RevId: 316919426
2020-06-16Replace use of %v in tcp testsMithun Iyer
PiperOrigin-RevId: 316767969
2020-06-15Merge release-20200608.0-61-g67f261a87 (automated)gVisor bot
2020-06-15TCP to honor updated window size during handshake.Mithun Iyer
In passive open cases, we transition to Established state after initializing endpoint's sender and receiver. With this we lose out on any updates coming from the ACK that completes the handshake. This change ensures that we uniformly transition to Established in all cases and does minor cleanups. Fixes #2938 PiperOrigin-RevId: 316567014
2020-06-13Merge release-20200522.0-151-g3b5eaad3c (automated)gVisor bot
2020-06-12Allow reading IP_MULTICAST_LOOP and IP_MULTICAST_TTL on TCP sockets.Ian Gudger
I am not really sure what the point of this is, but someone filed a bug about it, so I assume something relies on it. PiperOrigin-RevId: 316225127
2020-06-11Merge release-20200522.0-142-g4c0a8bdaf (automated)gVisor bot
2020-06-11Do not use tentative addresses for routesGhanan Gowripalan
Tentative addresses should not be used when finding a route. This change fixes a bug where a tentative address may have been used. Test: stack_test.TestDADResolve PiperOrigin-RevId: 315997624
2020-06-11Merge release-20200522.0-129-ga085e562d (automated)gVisor bot
2020-06-10Add support for SO_REUSEADDR to UDP sockets/endpoints.Ian Gudger
On UDP sockets, SO_REUSEADDR allows multiple sockets to bind to the same address, but only delivers packets to the most recently bound socket. This differs from the behavior of SO_REUSEADDR on TCP sockets. SO_REUSEADDR for TCP sockets will likely need an almost completely independent implementation. SO_REUSEADDR has some odd interactions with the similar SO_REUSEPORT. These interactions are tested fairly extensively and all but one particularly odd one (that honestly seems like a bug) behave the same on gVisor and Linux. PiperOrigin-RevId: 315844832
2020-06-10Merge release-20200522.0-115-gf004bb870 (automated)gVisor bot
2020-06-10Remove duplicate and incorrect size checkTamir Duberstein
Minimum header sizes are already checked in each `case` arm below. Worse, the ICMP entries in transportProtocolMinSizes are incorrect, and produce false "raw packet" logs. PiperOrigin-RevId: 315730073
2020-06-10Merge release-20200522.0-114-g9d2b2c121 (automated)gVisor bot
2020-06-10Replace use of %v in snifferTamir Duberstein
PiperOrigin-RevId: 315711208
2020-06-10Merge release-20200522.0-107-g4950ccde7 (automated)gVisor bot
2020-06-09Fix write hang bug found by syzkaller.gVisor bot
After this change e.mu is only promoted to exclusively locked during route.Resolve. It downgrades back to read-lock afterwards. This prevents the second RLock() call gets stuck later in the stack. https://syzkaller.appspot.com/bug?id=065b893bd8d1d04a4e0a1d53c578537cde1efe99 Syzkaller logs does not contain interesting stack traces. The following stack trace is obtained by running repro locally. goroutine 53 [semacquire, 3 minutes]: runtime.gopark(0xfd4278, 0x1896320, 0xc000301912, 0x4) GOROOT/src/runtime/proc.go:304 +0xe0 fp=0xc0000e25f8 sp=0xc0000e25d8 pc=0x437170 runtime.goparkunlock(...) GOROOT/src/runtime/proc.go:310 runtime.semacquire1(0xc0001220b0, 0xc00000a300, 0x1, 0x0) GOROOT/src/runtime/sema.go:144 +0x1c0 fp=0xc0000e2660 sp=0xc0000e25f8 pc=0x4484e0 sync.runtime_Semacquire(0xc0001220b0) GOROOT/src/runtime/sema.go:56 +0x42 fp=0xc0000e2690 sp=0xc0000e2660 pc=0x448132 gvisor.dev/gvisor/pkg/sync.(*RWMutex).RLock(...) pkg/sync/rwmutex_unsafe.go:76 gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).HandleControlPacket(0xc000122000, 0x7ee5, 0xc00053c16c, 0x4, 0x5e21, 0xc00053c224, 0x4, 0x1, 0x0, 0xc00007ed00) pkg/tcpip/transport/udp/endpoint.go:1345 +0x169 fp=0xc0000e26d8 sp=0xc0000e2690 pc=0x9843f9 ...... gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*protocol).HandleUnknownDestinationPacket(0x18bb5a0, 0xc000556540, 0x5e21, 0xc00053c16c, 0x4, 0x7ee5, 0xc00053c1ec, 0x4, 0xc00007e680, 0x4) pkg/tcpip/transport/udp/protocol.go:143 +0xb9a fp=0xc0000e8260 sp=0xc0000e7510 pc=0x9859ba ...... gvisor.dev/gvisor/pkg/tcpip/transport/udp.sendUDP(0xc0001220d0, 0xc00053ece0, 0x1, 0x1, 0x883, 0x1405e217ee5, 0x11100a0, 0xc000592000, 0xf88780) pkg/tcpip/transport/udp/endpoint.go:924 +0x3b0 fp=0xc0000ed390 sp=0xc0000ec750 pc=0x981af0 gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).write(0xc000122000, 0x11104e0, 0xc00020a460, 0x0, 0x0, 0x0, 0x0, 0x0) pkg/tcpip/transport/udp/endpoint.go:510 +0x4ad fp=0xc0000ed658 sp=0xc0000ed390 pc=0x97f2dd PiperOrigin-RevId: 315590041
2020-06-09Merge release-20200522.0-102-g2d3b9d18e (automated)gVisor bot
2020-06-09Handle removed NIC in NDP timer for packet txGhanan Gowripalan
NDP packets are sent periodically from NDP timers. These timers do not hold the NIC lock when sending packets as the packet write operation may take some time. While the lock is not held, the NIC may be removed by some other goroutine. This change handles that scenario gracefully. Test: stack_test.TestRemoveNICWhileHandlingRSTimer PiperOrigin-RevId: 315524143
2020-06-07Merge release-20200522.0-94-g32b823fc (automated)gVisor bot
2020-06-07netstack: parse incoming packet headers up-frontKevin Krakauer
Netstack has traditionally parsed headers on-demand as a packet moves up the stack. This is conceptually simple and convenient, but incompatible with iptables, where headers can be inspected and mangled before even a routing decision is made. This changes header parsing to happen early in the incoming packet path, as soon as the NIC gets the packet from a link endpoint. Even if an invalid packet is found (e.g. a TCP header of insufficient length), the packet is passed up the stack for proper stats bookkeeping. PiperOrigin-RevId: 315179302
2020-06-06Merge release-20200522.0-91-g427d2082 (automated)gVisor bot