Age | Commit message (Collapse) | Author |
|
|
|
TCP now tracks the overhead of the segment structure itself in it's out-of-order
queue (pending). This is required to ensure that a malicious sender sending 1
byte out-of-order segments cannot queue like 1000's of segments which bloat up
memory usage.
We also reduce the default receive window to 32KB. With TCP moderation there is
no need to keep this window at 1MB which means that for new connections the
default out-of-order queue will be small unless the application actually reads
the data that is being sent. This prevents a sender from just maliciously
filling up pending buf with lots of tiny out-of-order segments.
PiperOrigin-RevId: 323450913
|
|
|
|
Changes the API of tcpip.Clock to also provide a method for scheduling and
rescheduling work after a specified duration. This change also implements the
AfterFunc method for existing implementations of tcpip.Clock.
This is the groundwork required to mock time within tests. All references to
CancellableTimer has been replaced with the tcpip.Job interface, allowing for
custom implementations of scheduling work.
This is a BREAKING CHANGE for clients that implement their own tcpip.Clock or
use tcpip.CancellableTimer. Migration plan:
1. Add AfterFunc(d, f) to tcpip.Clock
2. Replace references of tcpip.CancellableTimer with tcpip.Job
3. Replace calls to tcpip.CancellableTimer#StopLocked with tcpip.Job#Cancel
4. Replace calls to tcpip.CancellableTimer#Reset with tcpip.Job#Schedule
5. Replace calls to tcpip.NewCancellableTimer with tcpip.NewJob.
PiperOrigin-RevId: 322906897
|
|
|
|
PiperOrigin-RevId: 322853192
|
|
|
|
Fixes #3334
PiperOrigin-RevId: 322846384
|
|
Previously, ICMP destination unreachable datagrams were ignored by TCP
endpoints. This caused connect to hang when an intermediate router
couldn't find a route to the host.
This manifested as a Kokoro error when Docker IPv6 was enabled. The Ruby
image test would try to install the sinatra gem and hang indefinitely
attempting to use an IPv6 address.
Fixes #3079.
|
|
|
|
Updates #173
PiperOrigin-RevId: 322665518
|
|
|
|
Updates #173
PiperOrigin-RevId: 321690756
|
|
|
|
Packet sockets also seem to allow double binding and do not return an error on
linux. This was tested by running the syscall test in a linux namespace as root
and the current test DoubleBind fails@HEAD.
Passes after this change.
Updates #173
PiperOrigin-RevId: 321445137
|
|
|
|
As in Linux, we must periodically clean up unused connections.
PiperOrigin-RevId: 321003353
|
|
|
|
Updates #2746
PiperOrigin-RevId: 320757963
|
|
|
|
RFC-1122 (and others) specify that UDP should not receive
datagrams that have a source address that is a multicast address.
Packets should never be received FROM a multicast address.
See also, RFC 768: 'User Datagram Protocol'
J. Postel, ISI, 28 August 1980
A UDP datagram received with an invalid IP source address
(e.g., a broadcast or multicast address) must be discarded
by UDP or by the IP layer (see rfc 1122 Section 3.2.1.3).
This CL does not address TCP or broadcast which is more complicated.
Also adds a test for both ipv6 and ipv4 UDP.
Fixes #3154
PiperOrigin-RevId: 320547674
|
|
|
|
Updates #2746
Fixes #3158
PiperOrigin-RevId: 320497190
|
|
PiperOrigin-RevId: 320250773
|
|
|
|
RFC 6864 imposes various restrictions on the uniqueness of the IPv4
Identification field for non-atomic datagrams, defined as an IP datagram that
either can be fragmented (DF=0) or is already a fragment (MF=1 or positive
fragment offset). In order to be compliant, the ID field is assigned for all
non-atomic datagrams.
Add a TCP unit test that induces retransmissions and checks that the IPv4
ID field is unique every time. Add basic handling of the IP_MTU_DISCOVER
socket option so that the option can be used to disable PMTU discovery,
effectively setting DF=0. Attempting to set the sockopt to anything other
than disabled will fail because PMTU discovery is currently not implemented,
and the default behavior matches that of disabled.
PiperOrigin-RevId: 320081842
|
|
|
|
The current convention is when a header is set to pkt.XxxHeader field, it
gets removed from pkt.Data. ICMP does not currently follow this convention.
PiperOrigin-RevId: 320078606
|
|
|
|
Updates #2746
PiperOrigin-RevId: 319887810
|
|
stack_x_test: 2m -> 20s
tcp_x_test: 80s -> 25s
PiperOrigin-RevId: 319828101
|
|
|
|
PiperOrigin-RevId: 319770124
|
|
|
|
Avoid a race where an arbitrary goroutine scheduling delay can cause the
processor to miss events and hang indefinitely.
Reduce allocations by storing processors by-value in the dispatcher, and
by using a single WaitGroup rather than one per processor.
PiperOrigin-RevId: 319665861
|
|
|
|
The application can choose to initiate a non-blocking connect and
later block on a read, when the endpoint is still in SYN-SENT state.
PiperOrigin-RevId: 319311016
|
|
|
|
a) When GSO is in use we should not cap the segment to maxPayloadSize in
sender.maybeSendSegment as the GSO logic will cap the segment to the correct
size. Without this the host GSO is not used as we end up breaking up large
segments into small MSS sized segments before writing the packets to the
host.
b) The check to not split a segment due to it not fitting in the receiver window
when there are pending segments is incorrect as segments in writeList can be
really large as we just take the write call's buffer size and create a single
large segment. So a write of say 128KB will just be 1 segment in the
writeList.
The linux code checks if 1 MSS sized segments fits in the receiver's window
and if not then does not split the current segment. gVisor's check was
incorrect that it was checking if the whole segment which could be >>> 1 MSS
would fit in the receiver's window. This was causing us to prematurely stop
sending and falling back to retransmit timer/probe from the other end to send
data.
This was seen when running HTTPD benchmarks where @ HEAD when sending large
files the benchmark was taking forever to run.
The tcp_splitseg_mss_test.go is being deleted as the test as written doesn't
test what is intended correctly. This is because GSO is enabled by default and
the reason the MSS+1 sized segment is sent is because GSO is in use. A proper
test will require disabling GSO on linux and netstack which is going to take a
bit of work in packetimpact to do it correctly.
Separately a new test probably should be written that verifies that a segment >
availableWindow is not split if the availableWindow is < 1 MSS.
Fixes #3107
PiperOrigin-RevId: 319172089
|
|
|
|
...by calling (*tcp.endpoint).EndpointState only once when possible.
Avoid wrapping (*sleep.Waker).Assert in a useless func while I'm here.
PiperOrigin-RevId: 319074149
|
|
|
|
IPv6 raw sockets never include the IPv6 header.
PiperOrigin-RevId: 318582989
|
|
|
|
SO_NO_CHECK is used to skip the UDP checksum generation on a TX socket
(UDP checksum is optional on IPv4).
Test:
- TestNoChecksum
- SoNoCheckOffByDefault (UdpSocketTest)
- SoNoCheck (UdpSocketTest)
Fixes #3055
PiperOrigin-RevId: 318575215
|
|
|
|
Linux controls socket send/receive buffers using a few sysctl variables
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_max
- net.core.wmem_default
- net.ipv4.tcp_rmem
- net.ipv4.tcp_wmem
The first 4 control the default socket buffer sizes for all sockets
raw/packet/tcp/udp and also the maximum permitted socket buffer that can be
specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...).
The last two control the TCP auto-tuning limits and override the default
specified in rmem_default/wmem_default as well as the max limits.
Netstack today only implements tcp_rmem/tcp_wmem and incorrectly uses it
to limit the maximum size in setsockopt() as well as uses it for raw/udp
sockets.
This changelist introduces the other 4 and updates the udp/raw sockets to use
the newly introduced variables. The values for min/max match the current
tcp_rmem/wmem values and the default value buffers for UDP/RAW sockets is
updated to match the linux value of 212KiB up from the really low current value
of 32 KiB.
Updates #3043
Fixes #3043
PiperOrigin-RevId: 318089805
|
|
|
|
|
|
For TCP sockets, SO_REUSEADDR relaxes the rules for binding addresses.
gVisor/netstack already supported a behavior similar to SO_REUSEADDR, but did
not allow disabling it. This change brings the SO_REUSEADDR behavior closer to
the behavior implemented by Linux and adds a new SO_REUSEADDR disabled
behavior. Like Linux, SO_REUSEADDR is now disabled by default.
PiperOrigin-RevId: 317984380
|