summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/transport
AgeCommit message (Collapse)Author
2020-07-06Merge release-20200622.1-63-g043e5dddd (automated)gVisor bot
2020-07-06Remove dependency on pkg/binaryTamir Duberstein
PiperOrigin-RevId: 319770124
2020-07-05Merge release-20200622.1-62-g0c1353866 (automated)gVisor bot
2020-07-05Add wakers synchronouslyTamir Duberstein
Avoid a race where an arbitrary goroutine scheduling delay can cause the processor to miss events and hang indefinitely. Reduce allocations by storing processors by-value in the dispatcher, and by using a single WaitGroup rather than one per processor. PiperOrigin-RevId: 319665861
2020-07-01Merge release-20200622.1-54-g31b27adf9 (automated)gVisor bot
2020-07-01TCP receive should block when in SYN-SENT state.Mithun Iyer
The application can choose to initiate a non-blocking connect and later block on a read, when the endpoint is still in SYN-SENT state. PiperOrigin-RevId: 319311016
2020-07-01Merge release-20200622.1-47-gc9446f053 (automated)gVisor bot
2020-06-30Fix two bugs in TCP sender.Bhasker Hariharan
a) When GSO is in use we should not cap the segment to maxPayloadSize in sender.maybeSendSegment as the GSO logic will cap the segment to the correct size. Without this the host GSO is not used as we end up breaking up large segments into small MSS sized segments before writing the packets to the host. b) The check to not split a segment due to it not fitting in the receiver window when there are pending segments is incorrect as segments in writeList can be really large as we just take the write call's buffer size and create a single large segment. So a write of say 128KB will just be 1 segment in the writeList. The linux code checks if 1 MSS sized segments fits in the receiver's window and if not then does not split the current segment. gVisor's check was incorrect that it was checking if the whole segment which could be >>> 1 MSS would fit in the receiver's window. This was causing us to prematurely stop sending and falling back to retransmit timer/probe from the other end to send data. This was seen when running HTTPD benchmarks where @ HEAD when sending large files the benchmark was taking forever to run. The tcp_splitseg_mss_test.go is being deleted as the test as written doesn't test what is intended correctly. This is because GSO is enabled by default and the reason the MSS+1 sized segment is sent is because GSO is in use. A proper test will require disabling GSO on linux and netstack which is going to take a bit of work in packetimpact to do it correctly. Separately a new test probably should be written that verifies that a segment > availableWindow is not split if the availableWindow is < 1 MSS. Fixes #3107 PiperOrigin-RevId: 319172089
2020-06-30Merge release-20200622.1-42-g4784ed46e (automated)gVisor bot
2020-06-30Avoid multiple atomic loadsTamir Duberstein
...by calling (*tcp.endpoint).EndpointState only once when possible. Avoid wrapping (*sleep.Waker).Assert in a useless func while I'm here. PiperOrigin-RevId: 319074149
2020-06-27Merge release-20200622.1-34-g66d166544 (automated)gVisor bot
2020-06-26IPv6 raw sockets. Needed for ip6tables.Kevin Krakauer
IPv6 raw sockets never include the IPv6 header. PiperOrigin-RevId: 318582989
2020-06-27Merge release-20200622.1-33-g8dbeac53c (automated)gVisor bot
2020-06-26Implement SO_NO_CHECK socket option.gVisor bot
SO_NO_CHECK is used to skip the UDP checksum generation on a TX socket (UDP checksum is optional on IPv4). Test: - TestNoChecksum - SoNoCheckOffByDefault (UdpSocketTest) - SoNoCheck (UdpSocketTest) Fixes #3055 PiperOrigin-RevId: 318575215
2020-06-24Merge release-20200608.0-120-gb070e218c (automated)gVisor bot
2020-06-24Add support for Stack level options.Bhasker Hariharan
Linux controls socket send/receive buffers using a few sysctl variables - net.core.rmem_default - net.core.rmem_max - net.core.wmem_max - net.core.wmem_default - net.ipv4.tcp_rmem - net.ipv4.tcp_wmem The first 4 control the default socket buffer sizes for all sockets raw/packet/tcp/udp and also the maximum permitted socket buffer that can be specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...). The last two control the TCP auto-tuning limits and override the default specified in rmem_default/wmem_default as well as the max limits. Netstack today only implements tcp_rmem/tcp_wmem and incorrectly uses it to limit the maximum size in setsockopt() as well as uses it for raw/udp sockets. This changelist introduces the other 4 and updates the udp/raw sockets to use the newly introduced variables. The values for min/max match the current tcp_rmem/wmem values and the default value buffers for UDP/RAW sockets is updated to match the linux value of 212KiB up from the really low current value of 32 KiB. Updates #3043 Fixes #3043 PiperOrigin-RevId: 318089805
2020-06-24Merge release-20200608.0-119-g364ac92ba (automated)gVisor bot
2020-06-24Merge release-20200608.0-116-g2141013dc (automated)gVisor bot
2020-06-23Add support for SO_REUSEADDR to TCP sockets/endpoints.Ian Gudger
For TCP sockets, SO_REUSEADDR relaxes the rules for binding addresses. gVisor/netstack already supported a behavior similar to SO_REUSEADDR, but did not allow disabling it. This change brings the SO_REUSEADDR behavior closer to the behavior implemented by Linux and adds a new SO_REUSEADDR disabled behavior. Like Linux, SO_REUSEADDR is now disabled by default. PiperOrigin-RevId: 317984380
2020-06-19Merge release-20200608.0-95-gd962f9f38 (automated)gVisor bot
2020-06-19Implement UDP cheksum verification.gVisor bot
Test: - TestIncrementChecksumErrors Fixes #2943 PiperOrigin-RevId: 317348158
2020-06-18Merge release-20200608.0-82-g07ff909e7 (automated)gVisor bot
2020-06-18Support setsockopt SO_SNDBUF/SO_RCVBUF for raw/udp sockets.Bhasker Hariharan
Updates #173,#6 Fixes #2888 PiperOrigin-RevId: 317087652
2020-06-18Merge release-20200608.0-81-g09b2fca40 (automated)gVisor bot
2020-06-18Cleanup tcp.timer and tcpip.RouteGhanan Gowripalan
When a tcp.timer or tcpip.Route is no longer used, clean up its resources so that unused memory may be released. PiperOrigin-RevId: 317046582
2020-06-17Merge release-20200608.0-70-g50afec55c (automated)gVisor bot
2020-06-17TCP stat fixesMithun Iyer
Ensure that CurrentConnected stat is updated on any errors and cleanups during connected state processing. Fixes #2968 PiperOrigin-RevId: 316919426
2020-06-16Replace use of %v in tcp testsMithun Iyer
PiperOrigin-RevId: 316767969
2020-06-15Merge release-20200608.0-61-g67f261a87 (automated)gVisor bot
2020-06-15TCP to honor updated window size during handshake.Mithun Iyer
In passive open cases, we transition to Established state after initializing endpoint's sender and receiver. With this we lose out on any updates coming from the ACK that completes the handshake. This change ensures that we uniformly transition to Established in all cases and does minor cleanups. Fixes #2938 PiperOrigin-RevId: 316567014
2020-06-13Merge release-20200522.0-151-g3b5eaad3c (automated)gVisor bot
2020-06-12Allow reading IP_MULTICAST_LOOP and IP_MULTICAST_TTL on TCP sockets.Ian Gudger
I am not really sure what the point of this is, but someone filed a bug about it, so I assume something relies on it. PiperOrigin-RevId: 316225127
2020-06-11Merge release-20200522.0-129-ga085e562d (automated)gVisor bot
2020-06-10Add support for SO_REUSEADDR to UDP sockets/endpoints.Ian Gudger
On UDP sockets, SO_REUSEADDR allows multiple sockets to bind to the same address, but only delivers packets to the most recently bound socket. This differs from the behavior of SO_REUSEADDR on TCP sockets. SO_REUSEADDR for TCP sockets will likely need an almost completely independent implementation. SO_REUSEADDR has some odd interactions with the similar SO_REUSEPORT. These interactions are tested fairly extensively and all but one particularly odd one (that honestly seems like a bug) behave the same on gVisor and Linux. PiperOrigin-RevId: 315844832
2020-06-10Merge release-20200522.0-107-g4950ccde7 (automated)gVisor bot
2020-06-09Fix write hang bug found by syzkaller.gVisor bot
After this change e.mu is only promoted to exclusively locked during route.Resolve. It downgrades back to read-lock afterwards. This prevents the second RLock() call gets stuck later in the stack. https://syzkaller.appspot.com/bug?id=065b893bd8d1d04a4e0a1d53c578537cde1efe99 Syzkaller logs does not contain interesting stack traces. The following stack trace is obtained by running repro locally. goroutine 53 [semacquire, 3 minutes]: runtime.gopark(0xfd4278, 0x1896320, 0xc000301912, 0x4) GOROOT/src/runtime/proc.go:304 +0xe0 fp=0xc0000e25f8 sp=0xc0000e25d8 pc=0x437170 runtime.goparkunlock(...) GOROOT/src/runtime/proc.go:310 runtime.semacquire1(0xc0001220b0, 0xc00000a300, 0x1, 0x0) GOROOT/src/runtime/sema.go:144 +0x1c0 fp=0xc0000e2660 sp=0xc0000e25f8 pc=0x4484e0 sync.runtime_Semacquire(0xc0001220b0) GOROOT/src/runtime/sema.go:56 +0x42 fp=0xc0000e2690 sp=0xc0000e2660 pc=0x448132 gvisor.dev/gvisor/pkg/sync.(*RWMutex).RLock(...) pkg/sync/rwmutex_unsafe.go:76 gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).HandleControlPacket(0xc000122000, 0x7ee5, 0xc00053c16c, 0x4, 0x5e21, 0xc00053c224, 0x4, 0x1, 0x0, 0xc00007ed00) pkg/tcpip/transport/udp/endpoint.go:1345 +0x169 fp=0xc0000e26d8 sp=0xc0000e2690 pc=0x9843f9 ...... gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*protocol).HandleUnknownDestinationPacket(0x18bb5a0, 0xc000556540, 0x5e21, 0xc00053c16c, 0x4, 0x7ee5, 0xc00053c1ec, 0x4, 0xc00007e680, 0x4) pkg/tcpip/transport/udp/protocol.go:143 +0xb9a fp=0xc0000e8260 sp=0xc0000e7510 pc=0x9859ba ...... gvisor.dev/gvisor/pkg/tcpip/transport/udp.sendUDP(0xc0001220d0, 0xc00053ece0, 0x1, 0x1, 0x883, 0x1405e217ee5, 0x11100a0, 0xc000592000, 0xf88780) pkg/tcpip/transport/udp/endpoint.go:924 +0x3b0 fp=0xc0000ed390 sp=0xc0000ec750 pc=0x981af0 gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).write(0xc000122000, 0x11104e0, 0xc00020a460, 0x0, 0x0, 0x0, 0x0, 0x0) pkg/tcpip/transport/udp/endpoint.go:510 +0x4ad fp=0xc0000ed658 sp=0xc0000ed390 pc=0x97f2dd PiperOrigin-RevId: 315590041
2020-06-07Merge release-20200522.0-94-g32b823fc (automated)gVisor bot
2020-06-07netstack: parse incoming packet headers up-frontKevin Krakauer
Netstack has traditionally parsed headers on-demand as a packet moves up the stack. This is conceptually simple and convenient, but incompatible with iptables, where headers can be inspected and mangled before even a routing decision is made. This changes header parsing to happen early in the incoming packet path, as soon as the NIC gets the packet from a link endpoint. Even if an invalid packet is found (e.g. a TCP header of insufficient length), the packet is passed up the stack for proper stats bookkeeping. PiperOrigin-RevId: 315179302
2020-06-05Drop flaky tag.Adin Scannell
PiperOrigin-RevId: 315018295
2020-06-05Merge release-20200522.0-82-g6d9a68ca (automated)gVisor bot
2020-06-05Centralize the categories of endpoint states.Rahat Mahmood
PiperOrigin-RevId: 314996457
2020-06-05Merge release-20200522.0-81-g526df4f5 (automated)gVisor bot
2020-06-05Fix error code returned due to Port exhaustion.Bhasker Hariharan
For TCP sockets gVisor incorrectly returns EAGAIN when no ephemeral ports are available to bind during a connect. Linux returns EADDRNOTAVAIL. This change fixes gVisor to return the correct code and adds a test for the same. This change also fixes a minor bug for ping sockets where connect() would fail with EINVAL unless the socket was bound first. Also added tests for testing UDP Port exhaustion and Ping socket port exhaustion. PiperOrigin-RevId: 314988525
2020-06-05Merge release-20200522.0-76-g41da7a56 (automated)gVisor bot
2020-06-05Merge release-20200522.0-75-gf7663660 (automated)gVisor bot
2020-06-05Fix copylocks error about copying IPTables.Ting-Yu Wang
IPTables.connections contains a sync.RWMutex. Copying it will trigger copylocks analysis. Tested by manually enabling nogo tests. sync.RWMutex is added to IPTables for the additional race condition discovered. PiperOrigin-RevId: 314817019
2020-06-05Handle TCP segment split cases as per MSS.Mithun Iyer
- Always split segments larger than MSS. Currently, we base the segment split decision as a function of the send congestion window and MSS, which could be greater than the MSS advertised by remote. - While splitting segments, ensure the PSH flag is reset when there are segments that are queued to be sent. - With TCP_CORK, hold up segments up until MSS. Fix a bug in computing available send space before attempting to coalesce segments. Fixes #2832 PiperOrigin-RevId: 314802928
2020-06-03Merge release-20200522.0-72-gd3a8bffe (automated)gVisor bot
2020-06-03Pass PacketBuffer as pointer.Ting-Yu Wang
Historically we've been passing PacketBuffer by shallow copying through out the stack. Right now, this is only correct as the caller would not use PacketBuffer after passing into the next layer in netstack. With new buffer management effort in gVisor/netstack, PacketBuffer will own a Buffer (to be added). Internally, both PacketBuffer and Buffer may have pointers and shallow copying shouldn't be used. Updates #2404. PiperOrigin-RevId: 314610879
2020-06-03Merge release-20200522.0-66-g162848e1 (automated)gVisor bot