summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/transport
AgeCommit message (Collapse)Author
2021-03-06Merge release-20210301.0-20-gfb733cdb8 (automated)gVisor bot
2021-03-05Increment the counters when sending Echo requestsArthur Sfez
Updates #5597 PiperOrigin-RevId: 361252003
2021-03-04Merge release-20210301.0-14-ga9face757 (automated)gVisor bot
2021-03-04Nit fix: Should use maxTimeout in backoffTimerTing-Yu Wang
The only user is in (*handshake).complete and it specifies MaxRTO, so there is no behavior changes. PiperOrigin-RevId: 360954447
2021-03-04Merge release-20210301.0-12-g1cd76d958 (automated)gVisor bot
2021-03-03Make dedicated methods for data operations in PacketBufferTing-Yu Wang
One of the preparation to decouple underlying buffer implementation. There are still some methods that tie to VectorisedView, and they will be changed gradually in later CLs. This CL also introduce a new ICMPv6ChecksumParams to replace long list of parameters when calling ICMPv6Checksum, aiming to be more descriptive. PiperOrigin-RevId: 360778149
2021-03-03Merge release-20210301.0-8-g3e69f5d08 (automated)gVisor bot
2021-03-03Add checklocks analyzer.Bhasker Hariharan
This validates that struct fields if annotated with "// checklocks:mu" where "mu" is a mutex field in the same struct then access to the field is only done with "mu" locked. All types that are guarded by a mutex must be annotated with // +checklocks:<mutex field name> For more details please refer to README.md. PiperOrigin-RevId: 360729328
2021-03-01Merge release-20210208.0-106-g865ca64ee (automated)gVisor bot
2021-03-01tcp: endpoint.Write has to send all data that has been read from payloadAndrei Vagin
io.Reader.ReadFull returns the number of bytes copied and an error if fewer bytes were read. PiperOrigin-RevId: 360247614
2021-02-27Merge release-20210208.0-105-g037bb2f45 (automated)gVisor bot
2021-02-26Fix panic due to zero length writes in TCP.Bhasker Hariharan
There is a short race where in Write an endpoint can transition from writable to non-writable state due to say an incoming RST during the time we release the endpoint lock and reacquire after copying the payload. In such a case if the write happens to be a zero sized write we end up trying to call sendData() even though nothing was queued. This can panic when trying to enable/disable TCP timers if the endpoint had already transitioned to a CLOSED/ERROR state due to the incoming RST as we cleanup timers when the protocol goroutine terminates. Sadly the race window is small enough that my attempts at reproducing the panic in a syscall test has not been successful. PiperOrigin-RevId: 359887905
2021-02-26Merge release-20210208.0-101-gda2505df9 (automated)gVisor bot
2021-02-26Use closure to avoid manual unlockingTamir Duberstein
Also increase refcount of raw.endpoint.route while in use. Avoid allocating an array of size zero. PiperOrigin-RevId: 359797788
2021-02-26Merge release-20210208.0-99-gf3de211bb (automated)gVisor bot
2021-02-25RACK: recovery logic should check for receive window before re-transmitting.Nayana Bidari
Use maybeSendSegment while sending segments in RACK recovery which checks if the receiver has space and splits the segments when the segment size is greater than MSS. PiperOrigin-RevId: 359641097
2021-02-25Merge release-20210208.0-97-g38c42bbf4 (automated)gVisor bot
2021-02-25Remove deadlock in raw.endpoint caused by recursive read lockingKevin Krakauer
Prevents the following deadlock: - Raw packet is sent via e.Write(), which read locks e.mu - Connect() is called, blocking on write locking e.mu - The packet is routed to loopback and back to e.HandlePacket(), which read locks e.mu Per the atomic.RWMutex documentation, this deadlocks: "If a goroutine holds a RWMutex for reading and another goroutine might call Lock, no goroutine should expect to be able to acquire a read lock until the initial read lock is released. In particular, this prohibits recursive read locking. This is to ensure that the lock eventually becomes available; a blocked Lock call excludes new readers from acquiring the lock." Also, release eps.mu earlier in deliverRawPacket. PiperOrigin-RevId: 359600926
2021-02-12Merge release-20210208.0-52-g845d0a65f (automated)gVisor bot
2021-02-11[rack] TLP: ACK Processing and PTO scheduling.Ayush Ranjan
This change implements TLP details enumerated in https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.5.3 Fixes #5085 PiperOrigin-RevId: 357125037
2021-02-12Merge release-20210201.0-91-g91cf7b3ca (automated)gVisor bot
2021-02-11[netstack] Fix recovery entry and exit checks.Ayush Ranjan
Entry check: - Earlier implementation was preventing us from entering recovery even if SND.UNA is lost but dupAckCount is still below threshold. Fixed that. - We should only enter recovery when at least one more byte of data beyond the highest byte that was outstanding when fast retransmit was last entered is acked. Added that check. Exit check: - Earlier we were checking if SEG.ACK is in range [SND.UNA, SND.NXT]. The intention was to check if any unacknowledged data was ACKed. Note that (SEG.ACK - 1) is actually the sequence number which was ACKed. So we were incorrectly including (SND.UNA - 1) in the range. Fixed the check to now be (SEG.ACK - 1) in range [SND.UNA, SND.NXT). Additionally, moved a RACK specific test to the rack tests file. Added tests for the changes I made. PiperOrigin-RevId: 357091322
2021-02-11Merge release-20210201.0-83-gff04d019e (automated)gVisor bot
2021-02-10RACK: Fix re-transmitting the segment twice when entering recovery.Nayana Bidari
TestRACKWithDuplicateACK is flaky as the reorder window can expire before receiving three duplicate ACKs which will result in sending the first unacknowledged segment twice: when reorder timer expired and again after receiving the third duplicate ACK. This CL will fix this behavior and will not resend the segment again if it was already re-transmittted when reorder timer expired. Update the TestRACKWithDuplicateACK to test that the first segment is considered as lost and is re-transmitted. PiperOrigin-RevId: 356855168
2021-02-10Merge release-20210201.0-72-g298c129cc (automated)gVisor bot
2021-02-09Add support for setting SO_SNDBUF for unix domain sockets.Bhasker Hariharan
The limits for snd/rcv buffers for unix domain socket is controlled by the following sysctls on linux - net.core.rmem_default - net.core.rmem_max - net.core.wmem_default - net.core.wmem_max Today in gVisor we do not expose these sysctls but we do support setting the equivalent in netstack via stack.Options() method. But AF_UNIX sockets in gVisor can be used without netstack, with hostinet or even without any networking stack at all. Which means ideally these sysctls need to live as globals in gVisor. But rather than make this a big change for now we hardcode the limits in the AF_UNIX implementation itself (which in itself is better than where we were before) where it SO_SNDBUF was hardcoded to 16KiB. Further we bump the initial limit to a default value of 208 KiB to match linux from the paltry 16 KiB we use today. Updates #5132 PiperOrigin-RevId: 356665498
2021-02-09Merge release-20210201.0-60-g95500ece5 (automated)gVisor bot
2021-02-08Allow UDP sockets connect()ing to port 0Zeling Feng
We previously return EINVAL when connecting to port 0, however this is not the observed behavior on Linux. One of the observable effects after connecting to port 0 on Linux is that getpeername() will fail with ENOTCONN. PiperOrigin-RevId: 356413451
2021-02-08Merge release-20210201.0-55-gfe63db2e9 (automated)gVisor bot
2021-02-08RACK: Detect lossNayana Bidari
Detect packet loss using reorder window and re-transmit them after the reorder timer expires. PiperOrigin-RevId: 356321786
2021-02-03Merge release-20210125.0-74-ge3bce9689 (automated)gVisor bot
2021-02-03Add a function to enable RACK in tests.Nayana Bidari
- Adds a function to enable RACK in tests. - RACK update functions are guarded behind the flag tcpRecovery. PiperOrigin-RevId: 355435973
2021-02-02Merge release-20210125.0-63-g49f783fb6 (automated)gVisor bot
2021-02-02Rename HandleNDupAcks in TCP.Nayana Bidari
Rename HandleNDupAcks() to HandleLossDetected() as it will enter this when is detected after: - reorder window expires and TLP (in case of RACK) - dupAckCount >= 3 PiperOrigin-RevId: 355237858
2021-02-02Merge release-20210125.0-58-g8c7c5abaf (automated)gVisor bot
2021-02-02Add support for rate limiting out of window ACKs.Bhasker Hariharan
Netstack today will send dupACK's with no rate limit for incoming out of window segments. This can result in ACK loops for example if a TCP socket connects to itself (actually permitted by TCP). Where the ACK sent in response to packets being out of order itself gets considered as an out of window segment resulting in another ACK being generated. PiperOrigin-RevId: 355206877
2021-02-01Merge release-20210125.0-47-gebd3912c0 (automated)gVisor bot
2021-02-01Refactor HandleControlPacket/SockErrorGhanan Gowripalan
...to remove the need for the transport layer to deduce the type of error it received. Rename HandleControlPacket to HandleError as HandleControlPacket only handles errors. tcpip.SockError now holds a tcpip.SockErrorCause interface that different errors can implement. PiperOrigin-RevId: 354994306
2021-01-31Hide neighbor table kind from NetworkEndpointGhanan Gowripalan
The network endpoint should not need to have logic to handle different kinds of neighbor tables. Network endpoints can let the NIC know about differnt neighbor discovery messages and let the NIC decide which table to update. This allows us to remove the LinkAddressCache interface. PiperOrigin-RevId: 354812584
2021-01-29Merge release-20210125.0-24-gff4fc4278 (automated)gVisor bot
2021-01-28RACK: Update reorder window.Nayana Bidari
After receiving an ACK(cumulative or selective), RACK will update the reorder window which is used as a settling time before marking the packet as lost. This change will add an init function to initialize the variables in RACK and also store the reference to sender in rackControl. The reorder window is calculated as per rfc: https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.2 Step 4. PiperOrigin-RevId: 354453528
2021-01-29Merge release-20210125.0-21-g8d1afb418 (automated)gVisor bot
2021-01-28Change tcpip.Error to an interfaceTamir Duberstein
This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
2021-01-28Merge release-20210125.0-12-g6012fe9b5 (automated)gVisor bot
2021-01-28Respect SO_BINDTODEVICE in unconnected UDP writesMarina Ciocea
Previously, sending on an unconnected UDP socket would ignore the SO_BINDTODEVICE option. Send on the configured interface when an UDP socket is bound to an interface through setsockop SO_BINDTODEVICE. Add packetimpact tests exercising UDP reads and writes with every combination of bound/unbound, broadcast/multicast/unicast destination, and bound/not-bound to device. PiperOrigin-RevId: 354299670
2021-01-28Merge release-20210125.0-11-gb85b23e50 (automated)gVisor bot
2021-01-27Confirm neighbor reachability with TCP ACKsGhanan Gowripalan
As per RFC 4861 section 7.3.1, A neighbor is considered reachable if the node has recently received a confirmation that packets sent recently to the neighbor were received by its IP layer. Positive confirmation can be gathered in two ways: hints from upper-layer protocols that indicate a connection is making "forward progress", or receipt of a Neighbor Advertisement message that is a response to a Neighbor Solicitation message. This change adds support for TCP to let the IP/link layers know that a neighbor is reachable. Test: integration_test.TestTCPConfirmNeighborReachability PiperOrigin-RevId: 354222833
2021-01-28Merge release-20210112.0-104-g99988e45e (automated)gVisor bot
2021-01-27Add support for more fields in netstack for TCP_INFONayana Bidari
This CL adds support for the following fields: - RTT, RTTVar, RTO - send congestion window (sndCwnd) and send slow start threshold (sndSsthresh) - congestion control state(CaState) - ReorderSeen PiperOrigin-RevId: 354195361
2021-01-27Merge release-20210112.0-98-g8e6604474 (automated)gVisor bot