path: root/pkg/tcpip/transport
Age | Commit message | Author
2021-03-17 | Merge release-20210309.0-36-g3dd7ad13b (automated) | gVisor bot
2021-03-16 | Fix tcp_fin_retransmission_netstack_test | Zeling Feng
Netstack does not check the ACK number for FIN-ACK packets and goes into TIMEWAIT unconditionally. Fixing the state machine will give us back the retransmission of FIN.
PiperOrigin-RevId: 363301883
2021-03-16 | Merge release-20210309.0-35-g5eede4e75 (automated) | gVisor bot
2021-03-16 | Fix a race with synRcvdCount and accept | Mithun Iyer
There is a race in handling new incoming connections on a listening endpoint that causes the endpoint to reply to more incoming SYNs than the listen backlog permits. The race occurs when a passive connection handshake succeeds and the synRcvdCount counter is decremented before the endpoint is delivered to the accept queue. In the window between the decrement and the enqueue, new incoming SYNs can be handled without honoring the listen backlog value, because the backlog appears not to be full.
Fixes #5637
PiperOrigin-RevId: 363279372
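One way to picture the race window described above is a toy model in which both counters live under one mutex, so a completed handshake leaves the SYN-RCVD count and enters the accept queue in a single step. This is only a sketch: the `listener` type, its fields, and the rule that the backlog bounds the sum of both stages are assumptions for illustration, not gVisor's actual data structures or its actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// listener is a hypothetical model of a listening endpoint (not gVisor code).
type listener struct {
	mu      sync.Mutex
	backlog int // assumed limit on pending connections across both stages
	synRcvd int // handshakes still in progress
	acceptQ int // completed connections waiting for accept
}

// admitSYN reports whether a new SYN may be answered, and if so counts it.
func (l *listener) admitSYN() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.synRcvd+l.acceptQ >= l.backlog {
		return false
	}
	l.synRcvd++
	return true
}

// completeHandshake moves one connection from SYN-RCVD to the accept queue.
// Both updates happen in one critical section, so admitSYN never observes
// the state "decremented but not yet enqueued".
func (l *listener) completeHandshake() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.synRcvd--
	l.acceptQ++
}

// accept removes one completed connection, if any.
func (l *listener) accept() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.acceptQ == 0 {
		return false
	}
	l.acceptQ--
	return true
}

func main() {
	l := &listener{backlog: 2}
	fmt.Println(l.admitSYN(), l.admitSYN(), l.admitSYN())
	l.completeHandshake()
	fmt.Println(l.admitSYN()) // still full: one in SYN-RCVD, one queued
}
```

If the decrement and the enqueue were two separately locked steps, a SYN arriving between them would see a sum below the backlog and be admitted, which is the symptom the commit message describes.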
2021-03-16 | Merge release-20210309.0-27-gb1d578772 (automated) | gVisor bot
2021-03-15 | Make netstack (//pkg/tcpip) buildable for 32 bit | Kevin Krakauer
Doing so involved breaking dependencies between //pkg/tcpip and the rest of gVisor, which are discouraged anyway. Tested on the Go branch via: gvisor.dev/gvisor/pkg/tcpip/...
Addresses #1446.
PiperOrigin-RevId: 363081778
2021-03-12 | Merge release-20210301.0-44-g82d7fb2cb (automated) | gVisor bot
2021-03-11 | improve readability of ports package | Kevin Krakauer
Lots of small changes:
- simplify package API via Reservation type
- rename some single-letter variable names that were hard to follow
- rename some types
PiperOrigin-RevId: 362442366
2021-03-10 | Merge release-20210301.0-31-g2a888a106 (automated) | gVisor bot
2021-03-09 | Give TCP flags a dedicated type | Zeling Feng
- Implement Stringer for it so that we can improve error messages.
- Use TCPFlags throughout the code base. There used to be a mixed usage of byte, uint8 and int as TCP flags.
PiperOrigin-RevId: 361940150
2021-03-09 | Merge release-20210301.0-29-gabbdcebc5 (automated) | gVisor bot
2021-03-08 | Implement /proc/sys/net/ipv4/ip_local_port_range | Kevin Krakauer
Speeds up the socket stress tests by a couple orders of magnitude.
PiperOrigin-RevId: 361721050
2021-03-06 | Merge release-20210301.0-20-gfb733cdb8 (automated) | gVisor bot
2021-03-05 | Increment the counters when sending Echo requests | Arthur Sfez
Updates #5597
PiperOrigin-RevId: 361252003
2021-03-04 | Merge release-20210301.0-14-ga9face757 (automated) | gVisor bot
2021-03-04 | Nit fix: Should use maxTimeout in backoffTimer | Ting-Yu Wang
The only user is in (*handshake).complete and it specifies MaxRTO, so there is no behavior change.
PiperOrigin-RevId: 360954447
2021-03-04 | Merge release-20210301.0-12-g1cd76d958 (automated) | gVisor bot
2021-03-03 | Make dedicated methods for data operations in PacketBuffer | Ting-Yu Wang
One of the preparatory steps for decoupling the underlying buffer implementation. There are still some methods that are tied to VectorisedView, and they will be changed gradually in later CLs. This CL also introduces a new ICMPv6ChecksumParams to replace the long list of parameters when calling ICMPv6Checksum, aiming to be more descriptive.
PiperOrigin-RevId: 360778149
2021-03-03 | Merge release-20210301.0-8-g3e69f5d08 (automated) | gVisor bot
2021-03-03 | Add checklocks analyzer. | Bhasker Hariharan
This validates that struct fields annotated with "// +checklocks:mu", where "mu" is a mutex field in the same struct, are only accessed with "mu" locked. All types that are guarded by a mutex must be annotated with // +checklocks:<mutex field name>. For more details please refer to README.md.
PiperOrigin-RevId: 360729328
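As a rough illustration of the annotation described above, the sketch below guards one field with a mutex in the same struct. The `counter` type and its method are hypothetical; the annotation text follows the `// +checklocks:<mutex field name>` form quoted in the commit message, and placing it in the field's doc comment is an assumption (the README mentioned above is the authority on exact placement).

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type used only to show the annotation.
type counter struct {
	mu sync.Mutex

	// n is the number of recorded events.
	//
	// +checklocks:mu
	n int
}

// inc accesses n only while mu is held, which is the property the
// analyzer is described as validating.
func (c *counter) inc() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
	return c.n
}

func main() {
	var c counter
	c.inc()
	fmt.Println(c.inc())
}
```

To the Go compiler the annotation is an ordinary comment, so this program builds and runs without the analyzer; only the analyzer gives it meaning.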
2021-03-01 | Merge release-20210208.0-106-g865ca64ee (automated) | gVisor bot
2021-03-01 | tcp: endpoint.Write has to send all data that has been read from payload | Andrei Vagin
io.ReadFull returns the number of bytes copied and an error if fewer bytes were read.
PiperOrigin-RevId: 360247614
2021-02-27 | Merge release-20210208.0-105-g037bb2f45 (automated) | gVisor bot
2021-02-26 | Fix panic due to zero length writes in TCP. | Bhasker Hariharan
There is a short race in Write where an endpoint can transition from a writable to a non-writable state, due to, say, an incoming RST, between the time we release the endpoint lock and the time we reacquire it after copying the payload. In such a case, if the write happens to be a zero-sized write, we end up trying to call sendData() even though nothing was queued. This can panic when trying to enable/disable TCP timers if the endpoint has already transitioned to a CLOSED/ERROR state due to the incoming RST, as we clean up timers when the protocol goroutine terminates. Sadly the race window is small enough that my attempts at reproducing the panic in a syscall test have not been successful.
PiperOrigin-RevId: 359887905
2021-02-26 | Merge release-20210208.0-101-gda2505df9 (automated) | gVisor bot
2021-02-26 | Use closure to avoid manual unlocking | Tamir Duberstein
Also increase refcount of raw.endpoint.route while in use. Avoid allocating an array of size zero.
PiperOrigin-RevId: 359797788
2021-02-26 | Merge release-20210208.0-99-gf3de211bb (automated) | gVisor bot
2021-02-25 | RACK: recovery logic should check for receive window before re-transmitting. | Nayana Bidari
Use maybeSendSegment while sending segments in RACK recovery which checks if the receiver has space and splits the segments when the segment size is greater than MSS.
PiperOrigin-RevId: 359641097
2021-02-25 | Merge release-20210208.0-97-g38c42bbf4 (automated) | gVisor bot
2021-02-25 | Remove deadlock in raw.endpoint caused by recursive read locking | Kevin Krakauer
Prevents the following deadlock:
- Raw packet is sent via e.Write(), which read locks e.mu
- Connect() is called, blocking on write locking e.mu
- The packet is routed to loopback and back to e.HandlePacket(), which read locks e.mu
Per the sync.RWMutex documentation, this deadlocks: "If a goroutine holds a RWMutex for reading and another goroutine might call Lock, no goroutine should expect to be able to acquire a read lock until the initial read lock is released. In particular, this prohibits recursive read locking. This is to ensure that the lock eventually becomes available; a blocked Lock call excludes new readers from acquiring the lock."
Also, release eps.mu earlier in deliverRawPacket.
PiperOrigin-RevId: 359600926
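The three steps above can be reproduced in miniature with a plain `sync.RWMutex`. The sketch below uses `TryRLock` (available since Go 1.18) in place of the second `RLock`, so the program reports that the read lock is unavailable instead of hanging. The function name is hypothetical and the sleep is only a crude way to let the writer start waiting; none of this is the gVisor code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// secondReadLockBlocked reports whether a second read lock is refused
// while one read lock is held and a writer is waiting.
func secondReadLockBlocked() bool {
	var mu sync.RWMutex
	mu.RLock() // step 1: first read lock (the Write path in the message)

	done := make(chan struct{})
	go func() {
		mu.Lock() // step 2: writer (the Connect path) waits for the reader
		mu.Unlock()
		close(done)
	}()

	// Give the writer time to block in Lock. This is a demonstration aid,
	// not a synchronization technique to copy.
	time.Sleep(100 * time.Millisecond)

	// Step 3: a recursive read lock. A plain RLock would block here
	// forever; TryRLock returns false instead.
	ok := mu.TryRLock()
	if ok {
		mu.RUnlock()
	}

	mu.RUnlock() // release the first read lock so the writer can finish
	<-done
	return !ok
}

func main() {
	fmt.Println("second read lock refused:", secondReadLockBlocked())
}
```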
2021-02-12 | Merge release-20210208.0-52-g845d0a65f (automated) | gVisor bot
2021-02-11 | [rack] TLP: ACK Processing and PTO scheduling. | Ayush Ranjan
This change implements TLP details enumerated in https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.5.3
Fixes #5085
PiperOrigin-RevId: 357125037
2021-02-12 | Merge release-20210201.0-91-g91cf7b3ca (automated) | gVisor bot
2021-02-11 | [netstack] Fix recovery entry and exit checks. | Ayush Ranjan
Entry check:
- Earlier implementation was preventing us from entering recovery even if SND.UNA is lost but dupAckCount is still below threshold. Fixed that.
- We should only enter recovery when at least one more byte of data beyond the highest byte that was outstanding when fast retransmit was last entered is acked. Added that check.
Exit check:
- Earlier we were checking if SEG.ACK is in range [SND.UNA, SND.NXT]. The intention was to check if any unacknowledged data was ACKed. Note that (SEG.ACK - 1) is actually the sequence number which was ACKed. So we were incorrectly including (SND.UNA - 1) in the range. Fixed the check to now be (SEG.ACK - 1) in range [SND.UNA, SND.NXT).
Additionally, moved a RACK specific test to the rack tests file. Added tests for the changes I made.
PiperOrigin-RevId: 357091322
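The corrected exit check can be written as a half-open range test on wrapping 32-bit sequence numbers. The sketch below is a standalone illustration: the `seq` type and the helper names are hypothetical and are not the netstack implementation.

```go
package main

import "fmt"

// seq is a 32-bit TCP sequence number; arithmetic wraps modulo 2^32.
type seq uint32

// inRange reports whether v lies in the half-open interval [lo, hi).
// Subtracting lo first keeps the comparison correct across wraparound.
func (v seq) inRange(lo, hi seq) bool {
	return v-lo < hi-lo
}

// ackCoversUnacked reports whether SEG.ACK acknowledges at least one
// previously unacknowledged byte: (SEG.ACK - 1) in [SND.UNA, SND.NXT).
func ackCoversUnacked(segAck, sndUna, sndNxt seq) bool {
	return (segAck - 1).inRange(sndUna, sndNxt)
}

func main() {
	const sndUna, sndNxt = seq(100), seq(200)
	for _, ack := range []seq{100, 101, 200, 201} {
		fmt.Println(ack, ackCoversUnacked(ack, sndUna, sndNxt))
	}
}
```

With SND.UNA = 100 and SND.NXT = 200, an ACK of 100 acknowledges nothing new, because byte 99 was already acknowledged, so it falls outside the range. That is the (SND.UNA - 1) case the commit message says was wrongly included before.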
2021-02-11 | Merge release-20210201.0-83-gff04d019e (automated) | gVisor bot
2021-02-10 | RACK: Fix re-transmitting the segment twice when entering recovery. | Nayana Bidari
TestRACKWithDuplicateACK is flaky because the reorder window can expire before three duplicate ACKs are received, which results in sending the first unacknowledged segment twice: once when the reorder timer expires and again after receiving the third duplicate ACK. This CL fixes this behavior and does not resend the segment if it was already re-transmitted when the reorder timer expired. Update TestRACKWithDuplicateACK to test that the first segment is considered lost and is re-transmitted.
PiperOrigin-RevId: 356855168
2021-02-10 | Merge release-20210201.0-72-g298c129cc (automated) | gVisor bot
2021-02-09 | Add support for setting SO_SNDBUF for unix domain sockets. | Bhasker Hariharan
The limits for snd/rcv buffers for unix domain sockets are controlled by the following sysctls on linux:
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_default
- net.core.wmem_max
Today in gVisor we do not expose these sysctls, but we do support setting the equivalent in netstack via the stack.Options() method. But AF_UNIX sockets in gVisor can be used without netstack, with hostinet, or even without any networking stack at all, which means ideally these sysctls need to live as globals in gVisor. But rather than make this a big change, for now we hardcode the limits in the AF_UNIX implementation itself (which in itself is better than where we were before, where SO_SNDBUF was hardcoded to 16 KiB). Further, we bump the initial limit to a default value of 208 KiB to match linux, from the paltry 16 KiB we use today.
Updates #5132
PiperOrigin-RevId: 356665498
2021-02-09 | Merge release-20210201.0-60-g95500ece5 (automated) | gVisor bot
2021-02-08 | Allow UDP sockets connect()ing to port 0 | Zeling Feng
We previously returned EINVAL when connecting to port 0; however, this is not the observed behavior on Linux. One of the observable effects after connecting to port 0 on Linux is that getpeername() will fail with ENOTCONN.
PiperOrigin-RevId: 356413451
2021-02-08 | Merge release-20210201.0-55-gfe63db2e9 (automated) | gVisor bot
2021-02-08 | RACK: Detect loss | Nayana Bidari
Detect lost packets using the reorder window and re-transmit them after the reorder timer expires.
PiperOrigin-RevId: 356321786
2021-02-03 | Merge release-20210125.0-74-ge3bce9689 (automated) | gVisor bot
2021-02-03 | Add a function to enable RACK in tests. | Nayana Bidari
- Adds a function to enable RACK in tests.
- RACK update functions are guarded behind the flag tcpRecovery.
PiperOrigin-RevId: 355435973
2021-02-02 | Merge release-20210125.0-63-g49f783fb6 (automated) | gVisor bot
2021-02-02 | Rename HandleNDupAcks in TCP. | Nayana Bidari
Rename HandleNDupAcks() to HandleLossDetected(), as it is entered when loss is detected after:
- reorder window expires and TLP (in case of RACK)
- dupAckCount >= 3
PiperOrigin-RevId: 355237858
2021-02-02 | Merge release-20210125.0-58-g8c7c5abaf (automated) | gVisor bot
2021-02-02 | Add support for rate limiting out of window ACKs. | Bhasker Hariharan
Netstack today will send dupACKs with no rate limit for incoming out-of-window segments. This can result in ACK loops, for example if a TCP socket connects to itself (actually permitted by TCP), where the ACK sent in response to packets being out of order itself gets considered an out-of-window segment, resulting in another ACK being generated.
PiperOrigin-RevId: 355206877
2021-02-01 | Merge release-20210125.0-47-gebd3912c0 (automated) | gVisor bot
2021-02-01 | Refactor HandleControlPacket/SockError | Ghanan Gowripalan
...to remove the need for the transport layer to deduce the type of error it received. Rename HandleControlPacket to HandleError as HandleControlPacket only handles errors. tcpip.SockError now holds a tcpip.SockErrorCause interface that different errors can implement.
PiperOrigin-RevId: 354994306