summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/transport/tcp/snd.go
AgeCommit message (Collapse)Author
2020-01-14Changes TCP packet dispatch to use a pool of goroutines.Bhasker Hariharan
All inbound segments for connections in ESTABLISHED state are delivered to the endpoint's queue but for every segment delivered we also queue the endpoint for processing to a selected processor. This ensures that when there are a large number of connections in ESTABLISHED state the inbound packets are all handled by a small number of goroutines and significantly reduces the amount of work the goscheduler has to perform. We let connections in other states follow the current path where the endpoint's goroutine directly handles the segments. Updates #231 PiperOrigin-RevId: 289728325
2020-01-10panic fix in retransmitTimerExpired.Bhasker Hariharan
This is a band-aid fix for now to prevent panics. PiperOrigin-RevId: 289078453
2020-01-09New sync package.Ian Gudger
* Rename syncutil to sync. * Add aliases to sync types. * Replace existing usage of standard library sync package. This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations. Updates #1472 PiperOrigin-RevId: 289033387
2019-12-11Add support for TCP_USER_TIMEOUT option.Bhasker Hariharan
The implementation follows the linux behavior where specifying a TCP_USER_TIMEOUT will cause the resend timer to honor the user specified timeout rather than the default rto based timeout. Further it alters when connections are timedout due to keepalive failures. It does not alter the behavior of when keepalives are sent. This is as per the linux behavior. PiperOrigin-RevId: 285099795
2019-12-06Add TCP stats for connection close and keep-alive timeouts.Mithun Iyer
Fix bugs in updates to TCP CurrentEstablished stat. Fixes #1277 PiperOrigin-RevId: 284292459
2019-10-15netstack: add counters for tcp CurrEstab and EstabResetsJianfeng Tan
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-10-09Internal change.gVisor bot
PiperOrigin-RevId: 273861936
2019-08-26netstack/tcp: Add LastAck transition.Rahat Mahmood
Add missing state transition to LastAck, which should happen when the endpoint has already recieved a FIN from the remote side, and is sending its own FIN. PiperOrigin-RevId: 265568314
2019-08-09Add congestion control states to sender.Bhasker Hariharan
This change just introduces different congestion control states and ensures the sender.state is updated to reflect the current state of the connection. It is not used for any decisions yet but this is required before algorithms like Eiffel/PRR can be implemented. Fixes #394 PiperOrigin-RevId: 262638292
2019-08-02Automated rollback of changelist 261191548Rahat Mahmood
PiperOrigin-RevId: 261373749
2019-08-01Implement getsockopt(TCP_INFO).Rahat Mahmood
Export some readily-available fields for TCP_INFO and stub out the rest. PiperOrigin-RevId: 261191548
2019-06-13Add support for TCP receive buffer auto tuning.Bhasker Hariharan
The implementation is similar to linux where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accomodate for the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the linux receive buffer auto- tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283
2019-06-13Update canonical repository.Adin Scannell
This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620
2019-06-12Add support for TCP_CONGESTION socket option.Bhasker Hariharan
This CL also cleans up the error returned for setting congestion control which was incorrectly returning EINVAL instead of ENOENT. PiperOrigin-RevId: 252889093
2019-06-06Track and export socket state.Rahat Mahmood
This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850
2019-06-05netstack/tcp: fix calculating a number of outstanding packetsAndrei Vagin
In case of GSO, a segment can container more than one packet and we need to use the pCount() helper to get a number of packets. PiperOrigin-RevId: 251743020
2019-05-03Implement support for SACK based recovery(RFC 6675).Bhasker Hariharan
PiperOrigin-RevId: 246536003 Change-Id: I118b745f45040be9c70cb6a1028acdb06c78d8c9
2019-04-29Change copyright notice to "The gVisor Authors"Michael Pratt
Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-03-28netstack/fdbased: add generic segmentation offload (GSO) supportAndrei Vagin
The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 240809103 Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-20netstack: adjust the sequence number after trimming the packetAndrei Vagin
PiperOrigin-RevId: 239417224 Change-Id: I14a9adc31a6330a79a6156c105969cd5f1f63d20
2019-03-19netstack: reduce MSS from SYN to account tcp optionsAndrei Vagin
See: https://tools.ietf.org/html/rfc6691#section-2 PiperOrigin-RevId: 239305632 Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-14Remove duplicate TCP flag definitionsTamir Duberstein
PiperOrigin-RevId: 238467634 Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9
2019-03-05Add new retransmissions and recovery related metrics.Bhasker Hariharan
PiperOrigin-RevId: 236945145 Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b
2019-02-25Add a SACK scoreboard to TCP endpoints.Bhasker Hariharan
This change does not make use of SACK information but adds support to track SACK information and store it in the endpoint. The actual SACK based recovery will be in a separate CL. Part of commits to add RFC 6675 support to Netstack. PiperOrigin-RevId: 235612264 Change-Id: I261f94844d7bad5abda803152ce6cc6125a467ff
2019-02-11Do not drop packets w/ missing TCP timestamps.Bhasker Hariharan
RFC7323 recommends that if the timestamp option was negotiated then all packets should carry a TCP Timestamp and any packets that do not should be dropped. Netstack implemented this behaviour. Linux OTOH does not and will accept such packets. This change makes Netstack behaviour compatible with Linux. Also now that we allow such packets, we do need to update RTO calculations based on these packets even if timestamp option is enabled. PiperOrigin-RevId: 233432268 Change-Id: I9f4742ae6b63930ac3b5e37d8c238761e6a4b29f
2018-12-04Fix available calculation when merging TCP segmentsIan Gudger
PiperOrigin-RevId: 224033418 Change-Id: I780be973e8be68ac93e8c9e7a100002e912f40d2
2018-11-14Clean up tcp.sendDataIan Gudger
PiperOrigin-RevId: 221484739 Change-Id: I44c71f79f99d0d00a2e70a7f06d7024a62a5de0a
2018-11-13Implement TCP_NODELAY and TCP_CORKIan Gudger
Previously, TCP_NODELAY was always enabled and we would lie about it being configurable. TCP_NODELAY is now disabled by default (to match Linux) in the socket layer so that non-gVisor users don't automatically start using this questionable optimization. PiperOrigin-RevId: 221368472 Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-11-12Remove obsolete TODOIan Gudger
PiperOrigin-RevId: 221117846 Change-Id: I2a43fd8135b1d1194ff81e98644ce6b6182ece50
2018-11-05Merge segments in sender's writeListIan Gudger
PiperOrigin-RevId: 220185891 Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
2018-10-31Fix a race where keepalives could be sent while there is pending dataIan Gudger
PiperOrigin-RevId: 219571556 Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6
2018-10-19Use correct company name in copyright headerIan Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-09-19Fix RTT estimation when timestamp option is enabled.Bhasker Hariharan
From RFC7323#Section-4 The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an implicit assumption that at most one RTTM will be sampled per RTT. When multiple RTTMs per RTT are available to update the RTT estimator, an implementation SHOULD try to adhere to the spirit of the history specified in [RFC6298]. An implementation suggestion is detailed in Appendix G. From RFC7323#appendix-G Appendix G. RTO Calculation Modification Taking multiple RTT samples per window would shorten the history calculated by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a similar history as originally intended by [RFC6298]. It is roughly known how many samples a congestion window worth of data will yield, not accounting for ACK compression, and ACK losses. Such events will result in more history of the path being reflected in the final value for RTO, and are uncritical. This modification will ensure that a similar amount of time is taken into account for the RTO estimation, regardless of how many samples are taken per window: ExpectedSamples = ceiling(FlightSize / (SMSS * 2)) alpha' = alpha / ExpectedSamples beta' = beta / ExpectedSamples Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs". Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and beta' instead: RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'| SRTT <- (1 - alpha') * SRTT + alpha' * R' (for each sample R') PiperOrigin-RevId: 213644795 Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e
2018-09-05Update {LinkEndpoint,NetworkEndpoint}#WritePacket to take a VectorisedViewBert Muthalaly
Makes it possible to avoid copying or allocating in cases where DeliverNetworkPacket (rx) needs to turn around and call WritePacket (tx) with its VectorisedView. Also removes the restriction on having VectorisedViews with multiple views in the write path. PiperOrigin-RevId: 211728717 Change-Id: Ie03a65ecb4e28bd15ebdb9c69f05eced18fdfcff
2018-09-05Implement TCP keepalivesTamir Duberstein
PiperOrigin-RevId: 211670620 Change-Id: Ia8a3d8ae53a7fece1dee08ee9c74964bd7f71bb7
2018-09-04Expose TCP RTTTamir Duberstein
PiperOrigin-RevId: 211504634 Change-Id: I9a7bcbbdd40e5036894930f709278725ef477293
2018-08-03Cubic implementation for Netstack.Bhasker Hariharan
This CL implements CUBIC as described in https://tools.ietf.org/html/rfc8312. PiperOrigin-RevId: 207353142 Change-Id: I329cbf3277f91127e99e488f07d906f6779c6603
2018-08-02Automated rollback of changelist 207037226Zhaozhong Ni
PiperOrigin-RevId: 207125440 Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
2018-08-01Automated rollback of changelist 207007153Michael Pratt
PiperOrigin-RevId: 207037226 Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
2018-08-01stateify: convert all packages to use explicit mode.Zhaozhong Ni
PiperOrigin-RevId: 207007153 Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9
2018-07-23Refactor new reno congestion control logic out of sender.Bhasker Hariharan
This CL also puts the congestion control logic behind an interface so that we can easily swap it out for say CUBIC in the future. PiperOrigin-RevId: 205732848 Change-Id: I891cdfd17d4d126b658b5faa0c6bd6083187944b
2018-07-10netstack: tcp socket connected state S/R support.Zhaozhong Ni
PiperOrigin-RevId: 203958972 Change-Id: Ia6fe16547539296d48e2c6731edacdd96bd6e93c
2018-07-09Switch netstack licenses to Apache 2.0.Nicolas Lacasse
Fixes #27 PiperOrigin-RevId: 203825288 Change-Id: Ie9f3a2b2c1e296b026b024f75c07da1a7e118633
2018-06-29Panic in netstack during cleanup where a FIN becomes a RST.Brian Geffon
There is a subtle bug where during cleanup with unread data a FIN can be converted to a RST, at that point the entire connection should be aborted as we're not expecting any ACKs to the RST. PiperOrigin-RevId: 202691271 Change-Id: Idae70800208ca26e07a379bc6b2b8090805d0a22
2018-06-26Automated rollback of changelist 201596247Brian Geffon
PiperOrigin-RevId: 202151720 Change-Id: I0491172c436bbb32b977f557953ba0bc41cfe299
2018-06-21netstack: tcp socket connected state S/R support.Zhaozhong Ni
PiperOrigin-RevId: 201596247 Change-Id: Id22f47b2cdcbe14aa0d930f7807ba75f91a56724
2018-05-22When sending a RST the acceptable ACK window shouldn't change.Brian Geffon
Today when we transmit a RST it's happening during the time-wait flow. Because a FIN is allowed to advance the acceptable ACK window we're incorrectly doing that for a RST. PiperOrigin-RevId: 197637565 Change-Id: I080190b06bd0225326cd68c1fbf37bd3fdbd414e
2018-05-03Fix misspellings.Cyrille Hemidy
PiperOrigin-RevId: 195307689 Change-Id: I499f19af49875a43214797d63376f20ae788d2f4
2018-04-28Check in gVisor.Googler
PiperOrigin-RevId: 194583126 Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463