Age | Commit message (Collapse) | Author |
|
Link address resolution is performed at the link layer (if required) so
we can defer it from the transport layer. When link resolution is
required, packets will be queued and sent once link resolution
completes. If link resolution fails, the transport layer will receive a
control message indicating that the stack failed to route the packet.
tcpip.Endpoint.Write no longer returns a channel now that writes do not
wait for link resolution at the transport layer.
tcpip.ErrNoLinkAddress is no longer used so it is removed.
Removed calls to stack.Route.ResolveWith from the transport layer so
that link resolution is performed when a route is created in response
to an incoming packet (e.g. to complete TCP handshakes or send a RST).
Tests:
- integration_test.TestForwarding
- integration_test.TestTCPLinkResolutionFailure
Fixes #4458
RELNOTES: n/a
PiperOrigin-RevId: 351684158
|
|
Whether the variable was found is already returned by syscall.Getenv.
os.Getenv drops this value while os.Lookupenv passes it along.
PiperOrigin-RevId: 351674032
|
|
It is now composed by a NetworkInterface interface which lets us delete
the methods we don't need.
PiperOrigin-RevId: 351613267
|
|
This change implements TLP details enumerated in
https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.6
Fixes #5131
PiperOrigin-RevId: 351558449
|
|
When a control packet is delivered, it is delivered to a transport
endpoint with a matching stack.TransportEndpointID so there is no
need to pass the ID to the endpoint as it already knows its ID.
PiperOrigin-RevId: 351497588
|
|
PiperOrigin-RevId: 351491836
|
|
This change implements TLP details enumerated in
https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.5.1.
Fixes #5083
PiperOrigin-RevId: 351467357
|
|
These are primarily simplification and lint mistakes. However, minor
fixes are also included and tests added where appropriate.
PiperOrigin-RevId: 351425971
|
|
Read now takes a destination io.Writer, count, options. Keeping the method name
Read, in contrast to the Write method.
This enables:
* direct transfer of views under VV
* zero copy
It also eliminates the need for sentry to keep a slice of view because
userspace had requested a read that is smaller than the view returned, removing
the complexity there.
Read/Peek/ReadPacket are now consolidated together and some duplicate code is
removed.
PiperOrigin-RevId: 350636322
|
|
Ethernet frames are usually filtered at the hardware-level so there is
no need to filter the frames in software.
For test purposes, a new link endpoint was introduced to filter frames
based on their destination.
PiperOrigin-RevId: 350422941
|
|
IPv4 was always supported but UDP never supported joining/leaving IPv6
multicast groups via socket options.
Add: IPPROTO_IPV6, IPV6_JOIN_GROUP/IPV6_ADD_MEMBERSHIP
Remove: IPPROTO_IPV6, IPV6_LEAVE_GROUP/IPV6_DROP_MEMBERSHIP
Test: integration_test.TestUDPAddRemoveMembershipSocketOption
PiperOrigin-RevId: 350396072
|
|
PiperOrigin-RevId: 348696094
|
|
This condition was inverted in 360006d.
PiperOrigin-RevId: 348679088
|
|
Removes the period of time in which subseqeuent traffic to a Failed neighbor
immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which
address resolution fails; or in other words, the neighbor's IP address cannot
be translated to a MAC address.
This means removing the Failed state for linkAddrCache and allowing transitiong
out of Failed into Incomplete for neighborCache. Previously, both caches would
transition entries to Failed after address resolution fails. In this state, any
subsequent traffic requested within an unreachable time would immediately fail
with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3:
If address resolution fails, the entry SHOULD be deleted, so that subsequent
traffic to that neighbor invokes the next-hop determination procedure again.
Invoking next-hop determination at this point ensures that alternate default
routers are tried.
The API for getting a link address for a given address, whether through the link
address cache or the neighbor table, is updated to optionally take a callback
which will be called when address resolution completes. This allows `Route` to
handle completing link resolution internally, so callers of (*Route).Resolve
(e.g. endpoints) don’t have to keep track of when it completes and update the
Route accordingly.
This change also removes the wakers from LinkAddressCache, NeighborCache, and
Route in favor of the callbacks, and callers that previously used a waker can
now just pass a callback to (*Route).Resolve that will notify the waker on
resolution completion.
Fixes #4796
Startblock:
has LGTM from sbalana
and then
add reviewer ghanan
PiperOrigin-RevId: 348597478
|
|
...when performing source address selection for IPv6.
These are defined in RFC 6724 section 5 rule 6 (prefer matching label)
and rule 8 (use longest matching prefix).
This change also considers ULA of global scope instead of its own scope,
as per RFC 6724 section 3.1:
Also, note that ULAs are considered as global, not
site-local, scope but are handled via the prefix policy table as
discussed in Section 10.6.
Test: stack_test.TestIPv6SourceAddressSelectionScope
Startblock:
has LGTM from peterjohnston
and then
add reviewer brunodalbo
PiperOrigin-RevId: 348580996
|
|
Reported-by: syzbot+48c43f82fe7738fceae9@syzkaller.appspotmail.com
PiperOrigin-RevId: 348540796
|
|
PiperOrigin-RevId: 348530530
|
|
PiperOrigin-RevId: 348055514
|
|
Introduces the per-socket error queue and the necessary cmsg mechanisms.
PiperOrigin-RevId: 348028508
|
|
PiperOrigin-RevId: 347974624
|
|
Startblock:
has LGTM from asfez
and then
add reviewer tamird
PiperOrigin-RevId: 347928471
|
|
PiperOrigin-RevId: 347911316
|
|
sacked_out is required in RACK to check the number of duplicate
acknowledgements during updating the reorder window. If there is no reordering
and the value for sacked_out is greater than the classic threshold value 3,
then reorder window is set to zero.
It is calculated by counting the number of segments sacked in the ACK and is
reduced when a cumulative ACK is received which covers the SACK blocks. This
value is set to zero when the connection enters recovery.
PiperOrigin-RevId: 347872246
|
|
When the scaled receive window size > 65535 (max uint16), we advertise
the scaled value as 65535, but are not adjusting the saved receive
window value when doing so. This would keep our current window
calculation logic to be incorrect, as the saved receive window value
is different from what was advertised.
Fixes #4903
PiperOrigin-RevId: 347771340
|
|
RFC 2711 specifies that the router alert's length field is always 2
so we should make sure only 2 bytes are read from a router alert
option's data field.
Test: header.TestIPv6OptionsExtHdrIterErr
PiperOrigin-RevId: 347727876
|
|
Startblock:
has LGTM from asfez
and then
add reviewer brunodalbo
PiperOrigin-RevId: 347716242
|
|
PiperOrigin-RevId: 347650354
|
|
packetEPs may get into a state that `len < cap`, casuing append() modifying the
original slice storage.
Reported-by: syzbot+978dd0e9c2600ab7a76b@syzkaller.appspotmail.com
PiperOrigin-RevId: 347634351
|
|
PiperOrigin-RevId: 347437786
|
|
SO_OOBINLINE option is set/get as boolean value, which is the same as linux.
As we currently do not support disabling this option, we always return it as
true.
PiperOrigin-RevId: 347413905
|
|
- Deep-copy pkt.Data and hold it instead of shallow-copy (vv.Clone).
This allows the pkt's backing array, which includes the header portion,
to be freed.
- Remove fragHeap. The fragments are now held in holes struct instead.
- Stop reserving the initial capacity of holes slice.
PiperOrigin-RevId: 347198744
|
|
Adds IPv6 extension header serializer and Hop by Hop options serializer.
Add RouterAlert option serializer and use it in MLD.
Fixed #4996
Startblock:
has LGTM from marinaciocea
and then
add reviewer ghanan
PiperOrigin-RevId: 347174537
|
|
We do not rely on error for getsockopt options(which have boolean values)
anymore. This will cause issue in sendmsg where we used to return error
for IPV6_V6Only option. Fix the panic by returning error (for sockets other
than TCP and UDP) if the address does not match the type(AF_INET/AF_INET6) of
the socket.
PiperOrigin-RevId: 347063838
|
|
tcpip.ControlMessages can not contain Linux specific structures which makes it
painful to convert back and forth from Linux to tcpip back to Linux when passing
around control messages in hostinet and raw sockets.
Now we convert to the Linux version of the control message as soon as we are
out of tcpip.
PiperOrigin-RevId: 347027065
|
|
fdbased endpoint was enabling fragment reassembly on the host AF_PACKET socket
to ensure that fragments are delivered inorder to the right dispatcher. But this
prevents fragments from being delivered to gvisor at all and makes testing of
gvisor's fragment reassembly code impossible.
The potential impact from this is minimal since IP Fragmentation is not really
that prevelant and in cases where we do get fragments we may deliver the
fragment out of order to the TCP layer as multiple network dispatchers may
process the fragments and deliver a reassembled fragment after the next packet
has been delivered to the TCP endpoint. While not desirable I believe the impact
from this is minimal due to low prevalence of fragmentation.
Also removed PktType and Hatype fields when binding the socket as these are not
used when binding. Its just confusing to have them specified.
See: https://man7.org/linux/man-pages/man7/packet.7.html
"Fields used for binding are
sll_family (should be AF_PACKET), sll_protocol, and sll_ifindex."
Fixes #5055
PiperOrigin-RevId: 346919439
|
|
This change also considers interfaces and network endpoints enabled up
up to the point all work to disable them are complete. This was needed
so that protocols can perform shutdown work while being disabled (e.g.
sending a packet which requires the endpoint to be enabled to obtain a
source address).
Bug #4682, #4861
Fixes #4888
Startblock:
has LGTM from peterjohnston
and then
add reviewer brunodalbo
PiperOrigin-RevId: 346869702
|
|
Fixes #5004
PiperOrigin-RevId: 346643745
|
|
Earlier we could not save tcpip.Error objects in structs because upon restore
the constant's address changes in netstack's error translation map and
translating the error would panic because the map is based on the address of the
tcpip.Error instead of the error itself.
Now I made that translations map use the error message as key instead of the
address. Added relevant synchronization mechanisms to protect the structure
and initialize it upon restore.
PiperOrigin-RevId: 346590485
|
|
The loopback interface will never have any neighbouring nodes so
advertising its interest in multicast groups is unnecessary.
Bug #4682, #4861
Startblock:
has LGTM from asfez
and then
add reviewer tamird
PiperOrigin-RevId: 346587604
|
|
startblock:
has LGTM from peterjohnston
and then
add reviewer ghanan,tamird
PiperOrigin-RevId: 346565589
|
|
PiperOrigin-RevId: 346487763
|
|
PiperOrigin-RevId: 346197760
|
|
Removes comment lines about MaxUnsolicitedReportDelay. This is already
documented in the comment for GenericMulticastProtocolOptions.
PiperOrigin-RevId: 346185053
|
|
With the recent changes db36d948fa63ce950d94a5e8e9ebc37956543661, we try
to balance the receive window advertisements between payload lengths vs
segment overhead length. This works fine when segment size are much
higher than the overhead, but not otherwise. In cases where the segment
length is smaller than the segment overhead, we may end up not
advertising zero receive window for long time and end up tail-dropping
segments. This is especially pronounced when application socket reads
are slow or stopped. In this change we do not grow the right edge of
the receive window for smaller segment sizes similar to Linux.
Also, we keep track of the socket buffer usage and let the window grow
if the application is actively reading data.
Fixes #4903
PiperOrigin-RevId: 345832012
|
|
Startblock:
has LGTM from asfez
and then
add reviewer tamird
PiperOrigin-RevId: 345815146
|
|
PiperOrigin-RevId: 345701623
|
|
Currently we rely on the user to take the lock on the endpoint that owns the
route, in order to modify it safely. We can instead move
`Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and
thread-safe access to other fields of `Route`.
PiperOrigin-RevId: 345461586
|
|
PiperOrigin-RevId: 345399936
|
|
This change lets us split the v4 stats from the v6 stats, which will be
useful when adding stats for each network endpoint.
PiperOrigin-RevId: 345322615
|
|
However, receiving duplicated fragments will not cause reassembly to
fail. This is what Linux does too:
https://github.com/torvalds/linux/blob/38525c6/net/ipv4/inet_fragment.c#L355
PiperOrigin-RevId: 345309546
|