Age | Commit message | Author |
|
Updates #173
PiperOrigin-RevId: 321690756
|
|
Updates #2746
PiperOrigin-RevId: 320757963
|
|
RFC 1122 (and others) specify that UDP should not receive
datagrams that have a source address that is a multicast address.
Packets should never be received FROM a multicast address.
See also, RFC 768: 'User Datagram Protocol'
J. Postel, ISI, 28 August 1980
A UDP datagram received with an invalid IP source address
(e.g., a broadcast or multicast address) must be discarded
by UDP or by the IP layer (see rfc 1122 Section 3.2.1.3).
This CL does not address TCP or broadcast, which are more complicated.
Also adds a test for both IPv6 and IPv4 UDP.
Fixes #3154
PiperOrigin-RevId: 320547674
|
|
Updates #2746
Fixes #3158
PiperOrigin-RevId: 320497190
|
|
RFC 6864 imposes various restrictions on the uniqueness of the IPv4
Identification field for non-atomic datagrams, defined as an IP datagram that
either can be fragmented (DF=0) or is already a fragment (MF=1 or positive
fragment offset). In order to be compliant, the ID field is assigned for all
non-atomic datagrams.
Add a TCP unit test that induces retransmissions and checks that the IPv4
ID field is unique every time. Add basic handling of the IP_MTU_DISCOVER
socket option so that the option can be used to disable PMTU discovery,
effectively setting DF=0. Attempting to set the sockopt to anything other
than disabled will fail because PMTU discovery is currently not implemented,
and the default behavior matches that of disabled.
PiperOrigin-RevId: 320081842
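The atomic/non-atomic distinction from RFC 6864 can be sketched as a check on the 16-bit flags-and-fragment-offset field of the IPv4 header. The constant and function names below are illustrative assumptions, not netstack identifiers.

```go
package main

import "fmt"

// Bit layout of the 16-bit IPv4 "flags + fragment offset" field.
const (
	flagDontFragment  = 0x4000 // DF bit
	flagMoreFragments = 0x2000 // MF bit
	fragOffsetMask    = 0x1fff // 13-bit fragment offset
)

// isAtomic reports whether a datagram is atomic in the RFC 6864 sense:
// it cannot be fragmented (DF=1) and is not itself a fragment
// (MF=0 and fragment offset 0). Every other datagram is non-atomic and
// needs an Identification value.
func isAtomic(flagsAndOffset uint16) bool {
	df := flagsAndOffset&flagDontFragment != 0
	mf := flagsAndOffset&flagMoreFragments != 0
	offset := flagsAndOffset & fragOffsetMask
	return df && !mf && offset == 0
}

func main() {
	fmt.Println(isAtomic(flagDontFragment))      // DF=1, not a fragment
	fmt.Println(isAtomic(0))                     // DF=0, may be fragmented
	fmt.Println(isAtomic(flagMoreFragments | 1)) // already a fragment
}
```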
|
|
SO_NO_CHECK is used to skip the UDP checksum generation on a TX socket
(UDP checksum is optional on IPv4).
Test:
- TestNoChecksum
- SoNoCheckOffByDefault (UdpSocketTest)
- SoNoCheck (UdpSocketTest)
Fixes #3055
PiperOrigin-RevId: 318575215
|
|
Linux controls socket send/receive buffers using a few sysctl variables
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_max
- net.core.wmem_default
- net.ipv4.tcp_rmem
- net.ipv4.tcp_wmem
The first 4 control the default socket buffer sizes for all sockets
raw/packet/tcp/udp and also the maximum permitted socket buffer that can be
specified in setsockopt(SOL_SOCKET, SO_(RCV|SND)BUF,...).
The last two control the TCP auto-tuning limits and override the default
specified in rmem_default/wmem_default as well as the max limits.
Netstack today implements only tcp_rmem/tcp_wmem and incorrectly uses them
both to limit the maximum size in setsockopt() and to size raw/udp
sockets.
This changelist introduces the other 4 and updates the udp/raw sockets to use
the newly introduced variables. The min/max values match the current
tcp_rmem/wmem values, and the default buffer size for UDP/RAW sockets is
raised to match the Linux value of 212 KiB, up from the very low current value
of 32 KiB.
Updates #3043
Fixes #3043
PiperOrigin-RevId: 318089805
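The role of the min/default/max limits can be sketched as a simple clamp applied to a size requested through setsockopt(). The type and function names are hypothetical. The 212 KiB default comes from the message above, while the min and max values are placeholders chosen for illustration only. Linux additionally adjusts the requested value before storing it, which this sketch does not model.

```go
package main

import "fmt"

// bufferLimits mirrors the idea of rmem/wmem style limits: a floor, a
// default for new sockets, and a ceiling for setsockopt() requests.
// The type is hypothetical and not a netstack API.
type bufferLimits struct {
	Min, Default, Max int
}

// clampBufferSize bounds a requested SO_RCVBUF/SO_SNDBUF size to the
// configured range.
func clampBufferSize(requested int, l bufferLimits) int {
	if requested < l.Min {
		return l.Min
	}
	if requested > l.Max {
		return l.Max
	}
	return requested
}

func main() {
	// Min and Max are placeholder values; Default is the 212 KiB from the text.
	udp := bufferLimits{Min: 4 << 10, Default: 212 << 10, Max: 4 << 20}
	fmt.Println(clampBufferSize(1, udp))
	fmt.Println(clampBufferSize(64<<10, udp))
	fmt.Println(clampBufferSize(1<<30, udp))
}
```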
|
|
Test:
- TestIncrementChecksumErrors
Fixes #2943
PiperOrigin-RevId: 317348158
|
|
Updates #173,#6
Fixes #2888
PiperOrigin-RevId: 317087652
|
|
PiperOrigin-RevId: 312559963
|
|
As per RFC 1122 and Linux retransmit timeout handling:
- The segment retransmit timeout needs to increase exponentially and
cap at a predefined value.
- A TCP connection needs to time out after a predefined number of
segment retransmissions.
- A TCP connection should not time out when the retransmission timeout
exceeds MaxRTO, the predefined upper bound.
Fixes #2673
PiperOrigin-RevId: 311463961
|
|
This change adds support for TCP_SYNCNT and TCP_WINDOW_CLAMP options
in GetSockOpt/SetSockOpt. This change does not really change any
behaviour in Netstack and only stores/returns the stored value.
Actual honoring of these options will be added as required.
Fixes #2626, #2625
PiperOrigin-RevId: 311453777
|
|
This change makes SynRcvdCountThreshold and the global synRcvdCount
per-stack configurable values. This is required because in cases like mod_proxy,
which creates multiple Stack instances, the count would otherwise be a global
value that impacts all Stack instances.
Further, the tests relied on modifying the global threshold to simulate cases
where we want to verify SYN cookie based behaviour. This led to data races due
to the global being modified/read without locks or atomics.
PiperOrigin-RevId: 306947723
|
|
Tests now use a MinRTO of 3s instead of the default 200ms. This reduces flakiness in
many of the congestion control/recovery tests, which were flaky because the
retransmit timer fired too early when the test executors were overloaded.
This change also bumps some of the timeouts in tests that were too sensitive to
timer variations and reduces the number of slow start iterations, which can
make the tests run for too long and also trigger retransmit timeouts etc. if
the executor is overloaded.
PiperOrigin-RevId: 306562645
|
|
PiperOrigin-RevId: 305699233
|
|
This feature will match UID and GID of the packet creator, for locally
generated packets. This match is only valid in the OUTPUT and POSTROUTING
chains. Forwarded packets do not have any socket associated with them.
Packets from kernel threads do have a socket, but usually no owner.
|
|
Protocol dispatchers were previously leaked. Bypassing TIME_WAIT is required to
test this change.
Also fix a race when a socket in SYN-RCVD is closed. This is also required to
test this change.
PiperOrigin-RevId: 296922548
|
|
Added the ability to get/set the IP_RECVTCLASS socket option on UDP endpoints.
If enabled, the traffic class from the incoming Network Header is passed as ancillary
data in the ControlMessages.
Adding Get/SetSockOptBool to decrease the overhead of getting/setting simple
options. (This was absorbed in a CL that will be landing before this one).
Test:
* Added unit test to udp_test.go that tests getting/setting as well as
verifying that we receive expected TOS from incoming packet.
* Added a syscall test for verifying getting/setting
* Removed test skip for existing syscall test to enable end to end test.
PiperOrigin-RevId: 295840218
|
|
PiperOrigin-RevId: 294952610
|
|
Addresses may be added before a NIC is enabled. Make sure DAD is
performed on the permanent IPv6 addresses when the NIC gets enabled.
Test:
- stack_test.TestDoDADWhenNICEnabled
- stack.TestDisabledRxStatsWhenNICDisabled
PiperOrigin-RevId: 293697429
|
|
From RFC 793 s3.9 p58 Event Processing:
If RECEIVE Call arrives in CLOSED state and the user has access to such a
connection, the return should be "error: connection does not exist"
Fixes #1598
PiperOrigin-RevId: 293494287
|
|
PiperOrigin-RevId: 292233574
|
|
Such a stat accounts for all connections that are currently
established and not yet transitioned to close state.
Also fix bug in double increment of CurrentEstablished stat.
Fixes #1579
PiperOrigin-RevId: 290827365
|
|
PiperOrigin-RevId: 290793754
|
|
CERT Advisory CA-96.21 III. Solution advises that devices drop packets which
could not have correctly arrived on the wire, such as receiving a packet where
the source IP address is owned by the device that sent it.
Fixes #1507
PiperOrigin-RevId: 290378240
|
|
PiperOrigin-RevId: 289718534
|
|
|
|
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.
This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.
Updates #1472
PiperOrigin-RevId: 289033387
|
|
This makes it possible to call the sockopt from go even when the NIC has no
name.
PiperOrigin-RevId: 288955236
|
|
ending up with the wrong chains and is indexing -1 into rules.
|
|
...and port V6OnlyOption to it.
PiperOrigin-RevId: 288789451
|
|
PiperOrigin-RevId: 288772878
|
|
PiperOrigin-RevId: 287217899
|
|
Added the ability to get/set the IP_RECVTOS socket option on UDP endpoints. If
enabled, the TOS from the incoming Network Header is passed as ancillary data in the
ControlMessages.
Test:
* Added unit test to udp_test.go that tests getting/setting as well as
verifying that we receive expected TOS from incoming packet.
* Added a syscall test
PiperOrigin-RevId: 287029703
|
|
The implementation follows the Linux behavior where specifying
a TCP_USER_TIMEOUT will cause the resend timer to honor the
user-specified timeout rather than the default RTO-based timeout.
Further, it alters when connections are timed out due to keepalive
failures. It does not alter the behavior of when keepalives are
sent. This is as per the Linux behavior.
PiperOrigin-RevId: 285099795
|
|
Fix bugs in updates to TCP CurrentEstablished stat.
Fixes #1277
PiperOrigin-RevId: 284292459
|
|
This involves allowing getsockopt/setsockopt for the corresponding socket
options, as well as allowing hostinet to process control messages received from
the actual recvmsg syscall.
PiperOrigin-RevId: 282851425
|
|
This change adds explicit support for honoring the 2MSL timeout
for sockets in TIME_WAIT state. It also adds support for the
TCP_LINGER2 option that allows modification of the FIN_WAIT2
state timeout duration for a given socket.
It also adds an option to modify the Stack wide TIME_WAIT timeout
but this is only for testing. On Linux this is fixed at 60s.
Further, we also now correctly process RSTs in CLOSE_WAIT and
close the socket similar to Linux without moving it to error
state.
We also now handle SYN in ESTABLISHED state as per
RFC5961#section-4.1. Earlier we would just drop these SYNs,
which could cause some tests that pass on Linux to fail on
gVisor.
Netstack now honors TIME_WAIT correctly as well as handles the
following cases correctly.
- TCP RSTs in TIME_WAIT are ignored.
- A duplicate TCP FIN during TIME_WAIT extends the TIME_WAIT
and a dup ACK is sent in response to the FIN as the dup FIN
indicates potential loss of the original final ACK.
- An out of order segment during TIME_WAIT generates a dup ACK.
- A new SYN w/ a sequence number > the highest sequence number
in the previous connection closes the TIME_WAIT early and
opens a new connection.
Further, to make the SYN case work correctly, the ISN (Initial
Sequence Number) generation for Netstack has been updated to
be as per the RFC. It is no longer a purely random number and follows
the recommendation in https://tools.ietf.org/html/rfc6528#page-3.
The current hash used is not a cryptographically secure hash
function. A separate change will update the hash function used
to Siphash similar to what is used in Linux.
PiperOrigin-RevId: 279106406
|
|
This change allows the netstack to do NDP's Router Discovery as outlined by
RFC 4861 section 6.3.4.
Note, this change will not break existing uses of netstack as the default
configuration for the stack options is set in such a way that Router Discovery
will not be performed. See `stack.Options` and `stack.NDPConfigurations` for
more details.
This change introduces 2 options required to take advantage of Router Discovery,
all available under NDPConfigurations:
- HandleRAs: Whether or not NDP RAs are processed
- DiscoverDefaultRouters: Whether or not Router Discovery is performed
Another note: for a NIC to process Router Advertisements, it must not be a
router itself. Currently the netstack does not have per-interface routing
configuration; the routing/forwarding configuration is controlled stack-wide.
Therefore, if the stack is configured to enable forwarding/routing, no Router
Advertisements will be processed.
Tests: Unit test to make sure that Router Discovery and updates to the routing
table only occur if explicitly configured to do so. Unit test to make sure at
most stack.MaxDiscoveredDefaultRouters discovered default routers are remembered.
PiperOrigin-RevId: 278965143
|
|
DelayOption is set on all new endpoints in gVisor.
PiperOrigin-RevId: 276746791
|
|
PiperOrigin-RevId: 276380008
|
|
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With
runsc, you'll need to pass `--net-raw=true` to enable them.
Binding isn't supported yet.
PiperOrigin-RevId: 275909366
|
|
...and do not populate link address cache at dispatch. This partially
reverts 313c767b0001bf6271405f1b765b60a334d6e911, which caused malformed
packets (e.g. NDP Neighbor Adverts with incorrect hop limit values) to
populate the address cache. In particular, this masked a bug that was
introduced to the Neighbor Advert generation code in
7c1587e3401a010d1865df61dbaf117c77dd062e.
PiperOrigin-RevId: 274865182
|
|
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
|
|
PiperOrigin-RevId: 274700093
|
|
Strengthen the header.IPv4.IsValid check to correctly validate
the IHL/TotalLength fields. Also add a check to make sure
the fragment offset plus the size of the fragment does not wrap
around when computing the end of the fragment.
PiperOrigin-RevId: 274049313
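The two checks described above can be sketched over a raw IPv4 header. This is an illustration under stated assumptions, not the actual header.IPv4.IsValid code: both function names are hypothetical, and the wrap-around check is modelled as "the end of the fragment must fit in 16 bits".

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const minIPv4HeaderLen = 20

// hasValidLengths checks that the IHL and TotalLength fields are consistent
// with each other and with the number of bytes actually received.
func hasValidLengths(b []byte) bool {
	if len(b) < minIPv4HeaderLen {
		return false
	}
	hlen := int(b[0]&0x0f) * 4 // IHL is counted in 32-bit words
	tlen := int(binary.BigEndian.Uint16(b[2:4]))
	return hlen >= minIPv4HeaderLen && hlen <= tlen && tlen <= len(b)
}

// fragmentEndWraps reports whether the end of a fragment would not fit in a
// 16-bit offset. fragOffset is the 13-bit header field, in 8-byte units.
func fragmentEndWraps(fragOffset uint16, payloadLen int) bool {
	start := int(fragOffset&0x1fff) * 8
	return start+payloadLen > 0xffff
}

func main() {
	hdr := make([]byte, 20)
	hdr[0] = 0x45 // version 4, IHL 5
	binary.BigEndian.PutUint16(hdr[2:4], 20)
	fmt.Println(hasValidLengths(hdr))
	fmt.Println(fragmentEndWraps(0x1fff, 8))
}
```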
|
|
PiperOrigin-RevId: 273861936
|
|
Also change the default TTL to 64 to match Linux.
PiperOrigin-RevId: 273430341
|
|
The behavior for sending and receiving local broadcast (255.255.255.255)
traffic is as follows:
Outgoing
--------
* A broadcast packet sent on a socket that is bound to an interface goes out
that interface
* A broadcast packet sent on an unbound socket follows the route table to
select the outgoing interface
+ if an explicit route entry exists for 255.255.255.255/32, use that one
+ else use the default route
* Broadcast packets are looped back and delivered following the rules for
incoming packets (see next). This is the same behavior as for multicast
packets, except that it cannot be disabled via sockopt.
Incoming
--------
* Sockets wishing to receive broadcast packets must bind to either INADDR_ANY
(0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives
broadcast packets.
* Broadcast packets are multiplexed to all sockets matching them. This is the
same behavior as for multicast packets.
* A socket can bind to 255.255.255.255:<port> and then receive its own
broadcast packets sent to 255.255.255.255:<port>
In addition, this change implicitly fixes an issue with multicast reception:
previously, if two sockets wanted to receive a given multicast stream and one
was bound to ANY while the other was bound to the multicast address, only one
of them would receive the traffic.
PiperOrigin-RevId: 272792377
|
|
PiperOrigin-RevId: 271644926
|