Age | Commit message (Collapse) | Author |
|
PiperOrigin-RevId: 242704699
Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f
|
|
Having raw socket code together will make it easier to add support for other raw
network protocols. Currently, only ICMP uses the raw endpoint. However, adding
support for other protocols such as UDP shouldn't be much more difficult than
adding a few switch cases.
PiperOrigin-RevId: 241564875
Change-Id: I77e03adafe4ce0fd29ba2d5dfdc547d2ae8f25bf
|
|
PiperOrigin-RevId: 241025361
Change-Id: I292e7aea9a4b294b11e4f736e107010d9524586b
|
|
The panic was caused by modifying the tree while iterating which invalidated the
iterator.
Also fixes another bug in SACKScoreboard.Insert() which was causing blocks to be
merged incorrectly.
PiperOrigin-RevId: 240895053
Change-Id: Ia72b8244297962df5c04283346da5226434740af
|
|
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.
Here are numbers for the nginx-1m test:
runsc: 579330.01 [Kbytes/sec] received
runsc-gso: 1794121.66 [Kbytes/sec] received
runc: 2122139.06 [Kbytes/sec] received
and for tcp_benchmark:
$ tcp_benchmark --duration 15 --ideal
[ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal
[ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec
$ tcp_benchmark --client --duration 15 --ideal --gso 65536
[ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec
PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
|
|
This is a preparation for GSO changes (cl/234508902).
RELNOTES[gofers]: Refactor checksum code to include length, which
it already did, but in a convoluted way. Should be a no-op.
PiperOrigin-RevId: 240460794
Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2
|
|
PiperOrigin-RevId: 239417224
Change-Id: I14a9adc31a6330a79a6156c105969cd5f1f63d20
|
|
See: https://tools.ietf.org/html/rfc6691#section-2
PiperOrigin-RevId: 239305632
Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
|
|
PiperOrigin-RevId: 238467634
Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9
|
|
IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default
route are looped back. In order to implement this switch, support for sending
and looping back multicast packets on the default route had to be implemented.
For now we only support IPv4 multicast.
PiperOrigin-RevId: 237534603
Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c
|
|
PiperOrigin-RevId: 236945145
Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b
|
|
PiperOrigin-RevId: 236926132
Change-Id: I5cf103f22766e6e65a581de780c7bb9ca0fa3181
|
|
Broadly, this change:
* Enables sockets to be created via `socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)`.
* Passes the network-layer (IP) header up the stack to the transport endpoint,
which can pass it up to the socket layer. This allows a raw socket to return
the entire IP packet to users.
* Adds functions to stack.TransportProtocol, stack.Stack, stack.transportDemuxer
that enable incoming packets to be delivered to raw endpoints. New raw sockets
of other protocols (not ICMP) just need to register with the stack.
* Enables ping.endpoint to return IP headers when created via SOCK_RAW.
PiperOrigin-RevId: 235993280
Change-Id: I60ed994f5ff18b2cbd79f063a7fdf15d093d845a
|
|
This change does not make use of SACK information but adds support to track
SACK information and store it in the endpoint.
The actual SACK based recovery will be in a separate CL.
Part of commits to add RFC 6675 support to Netstack.
PiperOrigin-RevId: 235612264
Change-Id: I261f94844d7bad5abda803152ce6cc6125a467ff
|
|
This change adds support for the SO_BROADCAST socket option in gVisor Netstack.
This support includes getsockopt()/setsockopt() functionality for both UDP and
TCP endpoints (the latter being a NOOP), dispatching broadcast messages up and
down the stack, and route finding/creation for broadcast packets. Finally, a
suite of tests have been implemented, exercising this functionality through the
Linux syscall API.
PiperOrigin-RevId: 234850781
Change-Id: If3e666666917d39f55083741c78314a06defb26c
|
|
RFC7323 recommends that if the timestamp option was negotiated
then all packets should carry a TCP Timestamp and any packets that
do not should be dropped.
Netstack implemented this behaviour. Linux OTOH does not and will
accept such packets. This change makes Netstack behaviour compatible
with Linux.
Also now that we allow such packets, we do need to update RTO calculations
based on these packets even if timestamp option is enabled.
PiperOrigin-RevId: 233432268
Change-Id: I9f4742ae6b63930ac3b5e37d8c238761e6a4b29f
|
|
Nothing reads them and they can simply get stale.
Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD
PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
|
|
quoting what "rscheff@gmx.at" pointed out over email.
"IsLost in RFC3517 is defined as >= (DupThresh * SMSS) while
RFC6675 improves upon this, and defines IsLost as >
((DupThresh - 1) * SMSS + 1).
The latter addresses situations where partial segments (size < MSS)
are sent (eg. last segment of a http protocol message sent with PSH
being less than MSS is common)."
PiperOrigin-RevId: 231512331
Change-Id: I1addd4a92e3e7baeb0bdda46463ebfae435da958
|
|
PiperOrigin-RevId: 229427852
Change-Id: I9de8ed63f4a7672dacd3b282c863c599d00acd52
|
|
PiperOrigin-RevId: 229243918
Change-Id: Ie14ef34e66ae851ed080f57b7d26a369a66f7664
|
|
This option allows multiple sockets to be bound to the same port.
Incoming packets are distributed to sockets using a hash based on source and
destination addresses. This means that all packets from one sender will be
received by the same server socket.
PiperOrigin-RevId: 227153413
Change-Id: I59b6edda9c2209d5b8968671e9129adb675920cf
|
|
We don't explicitly support out-of-band data and treat it like normal in-band
data. This is equilivent to SO_OOBINLINE being enabled, so always report that
it is enabled.
PiperOrigin-RevId: 226572742
Change-Id: I4c30ccb83265e76c30dea631cbf86822e6ee1c1b
|
|
PiperOrigin-RevId: 226542979
Change-Id: Ife11ebd0a85b8a63078e6daa71b4a99a82080ac9
|
|
Within gVisor, plumb new socket options to netstack.
Within netstack, fix GetSockOpt and SetSockOpt return value logic.
PiperOrigin-RevId: 226532229
Change-Id: If40734e119eed633335f40b4c26facbebc791c74
|
|
PiperOrigin-RevId: 225421480
Change-Id: I1e9259b0b7e8490164e830b73338a615129c7f0e
|
|
PiperOrigin-RevId: 224696233
Change-Id: I45c425d9e32adee5dcce29ca7439a06567b26014
|
|
* Clarify tcpip.Endpoint.Write contract regarding short writes.
* Enforce tcpip.Endpoint.Write contract regarding short writes.
* Update relevant users of tcpip.Endpoint.Write.
PiperOrigin-RevId: 224377586
Change-Id: I24299ecce902eb11317ee13dae3b8d8a7c5b097d
|
|
PiperOrigin-RevId: 224227677
Change-Id: I08b0e0c0574170556269900653e5bcf9e9e5c9c9
|
|
PiperOrigin-RevId: 224214981
Change-Id: I4c1dd5b1c856f7a4f9866a5dda44a5297e92486a
|
|
PiperOrigin-RevId: 224033418
Change-Id: I780be973e8be68ac93e8c9e7a100002e912f40d2
|
|
PiperOrigin-RevId: 224033238
Change-Id: Ie5b1854b29340843b02c123766d290a8738d7631
|
|
PiperOrigin-RevId: 223425575
Change-Id: Idd777e04c69e6ffcbfb0bdbea828a8b8b42d7672
|
|
Moving the wakeup logic into the disable blocks is an optimization.
PiperOrigin-RevId: 221677028
Change-Id: Ib5a5a6d52cc77b4bbc5dedcad9ee1dbb3da98deb
|
|
PiperOrigin-RevId: 221484739
Change-Id: I44c71f79f99d0d00a2e70a7f06d7024a62a5de0a
|
|
Previously, TCP_NODELAY was always enabled and we would lie about it being
configurable. TCP_NODELAY is now disabled by default (to match Linux) in the
socket layer so that non-gVisor users don't automatically start using this
questionable optimization.
PiperOrigin-RevId: 221368472
Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
|
|
PiperOrigin-RevId: 221117846
Change-Id: I2a43fd8135b1d1194ff81e98644ce6b6182ece50
|
|
PiperOrigin-RevId: 220866996
Change-Id: I89d48215df57c00d6a6ec512fc18712a2ea9080b
|
|
PiperOrigin-RevId: 220185891
Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
|
|
PiperOrigin-RevId: 219571556
Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6
|
|
PiperOrigin-RevId: 218537640
Change-Id: I1c5f55a46390174e1f5caeff74b1a364fa3268d9
|
|
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.
PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
|
|
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
|
|
Currently, in the face of FileMem fragmentation and a large sendmsg or
recvmsg call, host sockets may pass > 1024 iovecs to the host, which
will immediately cause the host to return EMSGSIZE.
When we detect this case, use a single intermediate buffer to pass to
the kernel, copying to/from the src/dst buffer.
To avoid creating unbounded intermediate buffers, enforce message size
checks and truncation w.r.t. the send buffer size. The same
functionality is added to netstack unix sockets for feature parity.
PiperOrigin-RevId: 216590198
Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
|
|
Previously, if address resolution for UDP or Ping sockets required sending
packets using Write in Transport layer, Resolve would return ErrWouldBlock
and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution
would run in background. Further calls to Write using same address would also
return ErrNoLinkAddress until resolution has been completed successfully.
Since Write is not allowed to block and System Calls need to be
interruptible in System Call layer, the caller to Write is responsible for
blocking upon return of ErrWouldBlock.
Now, when startAddressResolution is called a notification channel for
the completion of the address resolution is returned.
The channel will traverse up to the calling function of Write as well as
ErrNoLinkAddress. Once address resolution is complete (success or not) the
channel is closed. The caller would call Write again to send packets and
check if address resolution was compeleted successfully or not.
Fixes google/gvisor#5
Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93
PiperOrigin-RevId: 214962200
|
|
tcp.endpoint.hardError is protected by tcp.endpoint.mu.
PiperOrigin-RevId: 213730698
Change-Id: I4e4f322ac272b145b500b1a652fbee0c7b985be2
|
|
From RFC7323#Section-4
The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an
implicit assumption that at most one RTTM will be sampled per RTT. When
multiple RTTMs per RTT are available to update the RTT estimator, an
implementation SHOULD try to adhere to the spirit of the history specified in
[RFC6298]. An implementation suggestion is detailed in Appendix G.
From RFC7323#appendix-G
Appendix G. RTO Calculation Modification
Taking multiple RTT samples per window would shorten the history calculated
by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a
similar history as originally intended by [RFC6298].
It is roughly known how many samples a congestion window worth of data will
yield, not accounting for ACK compression, and ACK losses. Such events will
result in more history of the path being reflected in the final value for
RTO, and are uncritical. This modification will ensure that a similar amount
of time is taken into account for the RTO estimation, regardless of how many
samples are taken per window:
ExpectedSamples = ceiling(FlightSize / (SMSS * 2))
alpha' = alpha / ExpectedSamples
beta' = beta / ExpectedSamples
Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs".
Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and
beta' instead:
RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'|
SRTT <- (1 - alpha') * SRTT + alpha' * R'
(for each sample R')
PiperOrigin-RevId: 213644795
Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e
|
|
PiperOrigin-RevId: 213387851
Change-Id: Icc6850761bc11afd0525f34863acd77584155140
|
|
PiperOrigin-RevId: 213053370
Change-Id: I60ea89572b4fca53fd126c870fcbde74fcf52562
|
|
PiperOrigin-RevId: 212757571
Change-Id: I04200df9e45c21eb64951cd2802532fa84afcb1a
|
|
PiperOrigin-RevId: 212750821
Change-Id: I822fd63e48c684b45fd91f9ce057867b7eceb792
|