summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/transport/tcp
AgeCommit message (Collapse)Author
2021-03-17Merge release-20210309.0-36-g3dd7ad13b (automated)gVisor bot
2021-03-16Fix tcp_fin_retransmission_netstack_testZeling Feng
Netstack does not check ACK number for FIN-ACK packets and goes into TIMEWAIT unconditionally. Fixing the state machine will give us back the retransmission of FIN. PiperOrigin-RevId: 363301883
2021-03-16Merge release-20210309.0-35-g5eede4e75 (automated)gVisor bot
2021-03-16Fix a race with synRcvdCount and acceptMithun Iyer
There is a race in handling new incoming connections on a listening endpoint that causes the endpoint to reply to more incoming SYNs than what is permitted by the listen backlog. The race occurs when there is a successful passive connection handshake and the synRcvdCount counter is decremented, followed by the endpoint delivered to the accept queue. In the window of time between synRcvdCount decrementing and the endpoint being enqueued for accept, new incoming SYNs can be handled without honoring the listen backlog value, as the backlog could be perceived not full. Fixes #5637 PiperOrigin-RevId: 363279372
2021-03-16Merge release-20210309.0-27-gb1d578772 (automated)gVisor bot
2021-03-15Make netstack (//pkg/tcpip) buildable for 32 bitKevin Krakauer
Doing so involved breaking dependencies between //pkg/tcpip and the rest of gVisor, which are discouraged anyways. Tested on the Go branch via: gvisor.dev/gvisor/pkg/tcpip/... Addresses #1446. PiperOrigin-RevId: 363081778
2021-03-12Merge release-20210301.0-44-g82d7fb2cb (automated)gVisor bot
2021-03-11improve readability of ports packageKevin Krakauer
Lots of small changes: - simplify package API via Reservation type - rename some single-letter variable names that were hard to follow - rename some types PiperOrigin-RevId: 362442366
2021-03-10Merge release-20210301.0-31-g2a888a106 (automated)gVisor bot
2021-03-09Give TCP flags a dedicated typeZeling Feng
- Implement Stringer for it so that we can improve error messages. - Use TCPFlags through the code base. There used to be a mixed usage of byte, uint8 and int as TCP flags. PiperOrigin-RevId: 361940150
2021-03-09Merge release-20210301.0-29-gabbdcebc5 (automated)gVisor bot
2021-03-08Implement /proc/sys/net/ipv4/ip_local_port_rangeKevin Krakauer
Speeds up the socket stress tests by a couple orders of magnitude. PiperOrigin-RevId: 361721050
2021-03-04Merge release-20210301.0-14-ga9face757 (automated)gVisor bot
2021-03-04Nit fix: Should use maxTimeout in backoffTimerTing-Yu Wang
The only user is in (*handshake).complete and it specifies MaxRTO, so there is no behavior changes. PiperOrigin-RevId: 360954447
2021-03-04Merge release-20210301.0-12-g1cd76d958 (automated)gVisor bot
2021-03-03Make dedicated methods for data operations in PacketBufferTing-Yu Wang
One of the preparation to decouple underlying buffer implementation. There are still some methods that tie to VectorisedView, and they will be changed gradually in later CLs. This CL also introduce a new ICMPv6ChecksumParams to replace long list of parameters when calling ICMPv6Checksum, aiming to be more descriptive. PiperOrigin-RevId: 360778149
2021-03-03Merge release-20210301.0-8-g3e69f5d08 (automated)gVisor bot
2021-03-03Add checklocks analyzer.Bhasker Hariharan
This validates that struct fields if annotated with "// checklocks:mu" where "mu" is a mutex field in the same struct then access to the field is only done with "mu" locked. All types that are guarded by a mutex must be annotated with // +checklocks:<mutex field name> For more details please refer to README.md. PiperOrigin-RevId: 360729328
2021-03-01Merge release-20210208.0-106-g865ca64ee (automated)gVisor bot
2021-03-01tcp: endpoint.Write has to send all data that has been read from payloadAndrei Vagin
io.Reader.ReadFull returns the number of bytes copied and an error if fewer bytes were read. PiperOrigin-RevId: 360247614
2021-02-27Merge release-20210208.0-105-g037bb2f45 (automated)gVisor bot
2021-02-26Fix panic due to zero length writes in TCP.Bhasker Hariharan
There is a short race where in Write an endpoint can transition from writable to non-writable state due to say an incoming RST during the time we release the endpoint lock and reacquire after copying the payload. In such a case if the write happens to be a zero sized write we end up trying to call sendData() even though nothing was queued. This can panic when trying to enable/disable TCP timers if the endpoint had already transitioned to a CLOSED/ERROR state due to the incoming RST as we cleanup timers when the protocol goroutine terminates. Sadly the race window is small enough that my attempts at reproducing the panic in a syscall test has not been successful. PiperOrigin-RevId: 359887905
2021-02-26Merge release-20210208.0-99-gf3de211bb (automated)gVisor bot
2021-02-25RACK: recovery logic should check for receive window before re-transmitting.Nayana Bidari
Use maybeSendSegment while sending segments in RACK recovery which checks if the receiver has space and splits the segments when the segment size is greater than MSS. PiperOrigin-RevId: 359641097
2021-02-12Merge release-20210208.0-52-g845d0a65f (automated)gVisor bot
2021-02-11[rack] TLP: ACK Processing and PTO scheduling.Ayush Ranjan
This change implements TLP details enumerated in https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.5.3 Fixes #5085 PiperOrigin-RevId: 357125037
2021-02-12Merge release-20210201.0-91-g91cf7b3ca (automated)gVisor bot
2021-02-11[netstack] Fix recovery entry and exit checks.Ayush Ranjan
Entry check: - Earlier implementation was preventing us from entering recovery even if SND.UNA is lost but dupAckCount is still below threshold. Fixed that. - We should only enter recovery when at least one more byte of data beyond the highest byte that was outstanding when fast retransmit was last entered is acked. Added that check. Exit check: - Earlier we were checking if SEG.ACK is in range [SND.UNA, SND.NXT]. The intention was to check if any unacknowledged data was ACKed. Note that (SEG.ACK - 1) is actually the sequence number which was ACKed. So we were incorrectly including (SND.UNA - 1) in the range. Fixed the check to now be (SEG.ACK - 1) in range [SND.UNA, SND.NXT). Additionally, moved a RACK specific test to the rack tests file. Added tests for the changes I made. PiperOrigin-RevId: 357091322
2021-02-11Merge release-20210201.0-83-gff04d019e (automated)gVisor bot
2021-02-10RACK: Fix re-transmitting the segment twice when entering recovery.Nayana Bidari
TestRACKWithDuplicateACK is flaky as the reorder window can expire before receiving three duplicate ACKs which will result in sending the first unacknowledged segment twice: when reorder timer expired and again after receiving the third duplicate ACK. This CL will fix this behavior and will not resend the segment again if it was already re-transmittted when reorder timer expired. Update the TestRACKWithDuplicateACK to test that the first segment is considered as lost and is re-transmitted. PiperOrigin-RevId: 356855168
2021-02-10Merge release-20210201.0-72-g298c129cc (automated)gVisor bot
2021-02-09Add support for setting SO_SNDBUF for unix domain sockets.Bhasker Hariharan
The limits for snd/rcv buffers for unix domain socket is controlled by the following sysctls on linux - net.core.rmem_default - net.core.rmem_max - net.core.wmem_default - net.core.wmem_max Today in gVisor we do not expose these sysctls but we do support setting the equivalent in netstack via stack.Options() method. But AF_UNIX sockets in gVisor can be used without netstack, with hostinet or even without any networking stack at all. Which means ideally these sysctls need to live as globals in gVisor. But rather than make this a big change for now we hardcode the limits in the AF_UNIX implementation itself (which in itself is better than where we were before) where it SO_SNDBUF was hardcoded to 16KiB. Further we bump the initial limit to a default value of 208 KiB to match linux from the paltry 16 KiB we use today. Updates #5132 PiperOrigin-RevId: 356665498
2021-02-08Merge release-20210201.0-55-gfe63db2e9 (automated)gVisor bot
2021-02-08RACK: Detect lossNayana Bidari
Detect packet loss using reorder window and re-transmit them after the reorder timer expires. PiperOrigin-RevId: 356321786
2021-02-03Merge release-20210125.0-74-ge3bce9689 (automated)gVisor bot
2021-02-03Add a function to enable RACK in tests.Nayana Bidari
- Adds a function to enable RACK in tests. - RACK update functions are guarded behind the flag tcpRecovery. PiperOrigin-RevId: 355435973
2021-02-02Merge release-20210125.0-63-g49f783fb6 (automated)gVisor bot
2021-02-02Rename HandleNDupAcks in TCP.Nayana Bidari
Rename HandleNDupAcks() to HandleLossDetected() as it will enter this when is detected after: - reorder window expires and TLP (in case of RACK) - dupAckCount >= 3 PiperOrigin-RevId: 355237858
2021-02-02Merge release-20210125.0-58-g8c7c5abaf (automated)gVisor bot
2021-02-02Add support for rate limiting out of window ACKs.Bhasker Hariharan
Netstack today will send dupACK's with no rate limit for incoming out of window segments. This can result in ACK loops for example if a TCP socket connects to itself (actually permitted by TCP). Where the ACK sent in response to packets being out of order itself gets considered as an out of window segment resulting in another ACK being generated. PiperOrigin-RevId: 355206877
2021-02-01Merge release-20210125.0-47-gebd3912c0 (automated)gVisor bot
2021-02-01Refactor HandleControlPacket/SockErrorGhanan Gowripalan
...to remove the need for the transport layer to deduce the type of error it received. Rename HandleControlPacket to HandleError as HandleControlPacket only handles errors. tcpip.SockError now holds a tcpip.SockErrorCause interface that different errors can implement. PiperOrigin-RevId: 354994306
2021-01-29Merge release-20210125.0-24-gff4fc4278 (automated)gVisor bot
2021-01-28RACK: Update reorder window.Nayana Bidari
After receiving an ACK(cumulative or selective), RACK will update the reorder window which is used as a settling time before marking the packet as lost. This change will add an init function to initialize the variables in RACK and also store the reference to sender in rackControl. The reorder window is calculated as per rfc: https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.2 Step 4. PiperOrigin-RevId: 354453528
2021-01-29Merge release-20210125.0-21-g8d1afb418 (automated)gVisor bot
2021-01-28Change tcpip.Error to an interfaceTamir Duberstein
This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
2021-01-28Merge release-20210125.0-11-gb85b23e50 (automated)gVisor bot
2021-01-27Confirm neighbor reachability with TCP ACKsGhanan Gowripalan
As per RFC 4861 section 7.3.1, A neighbor is considered reachable if the node has recently received a confirmation that packets sent recently to the neighbor were received by its IP layer. Positive confirmation can be gathered in two ways: hints from upper-layer protocols that indicate a connection is making "forward progress", or receipt of a Neighbor Advertisement message that is a response to a Neighbor Solicitation message. This change adds support for TCP to let the IP/link layers know that a neighbor is reachable. Test: integration_test.TestTCPConfirmNeighborReachability PiperOrigin-RevId: 354222833
2021-01-28Merge release-20210112.0-104-g99988e45e (automated)gVisor bot
2021-01-27Add support for more fields in netstack for TCP_INFONayana Bidari
This CL adds support for the following fields: - RTT, RTTVar, RTO - send congestion window (sndCwnd) and send slow start threshold (sndSsthresh) - congestion control state(CaState) - ReorderSeen PiperOrigin-RevId: 354195361