Age | Commit message (Collapse) | Author |
|
Note, this includes static entries to match linux's behaviour.
```
$ ip neigh show dev eth0
192.168.42.1 lladdr fc:ec:da:70:6e:f9 STALE
$ sudo ip neigh add 192.168.42.172 lladdr 22:33:44:55:66:77 dev eth0
$ ip neigh show dev eth0
192.168.42.1 lladdr fc:ec:da:70:6e:f9 STALE
192.168.42.172 lladdr 22:33:44:55:66:77 PERMANENT
$ sudo ifconfig eth0 down
$ ip neigh show dev eth0
$ sudo ifconfig eth0 up
$ ip neigh show dev eth0
```
Test: stack_test.TestClearNeighborCacheOnNICDisable
PiperOrigin-RevId: 351696306
|
|
Link address resolution is performed at the link layer (if required) so
we can defer it from the transport layer. When link resolution is
required, packets will be queued and sent once link resolution
completes. If link resolution fails, the transport layer will receive a
control message indicating that the stack failed to route the packet.
tcpip.Endpoint.Write no longer returns a channel now that writes do not
wait for link resolution at the transport layer.
tcpip.ErrNoLinkAddress is no longer used so it is removed.
Removed calls to stack.Route.ResolveWith from the transport layer so
that link resolution is performed when a route is created in response
to an incoming packet (e.g. to complete TCP handshakes or send a RST).
Tests:
- integration_test.TestForwarding
- integration_test.TestTCPLinkResolutionFailure
Fixes #4458
RELNOTES: n/a
PiperOrigin-RevId: 351684158
|
|
When a control packet is delivered, it is delivered to a transport
endpoint with a matching stack.TransportEndpointID so there is no
need to pass the ID to the endpoint as it already knows its ID.
PiperOrigin-RevId: 351497588
|
|
These are primarily simplification and lint mistakes. However, minor
fixes are also included and tests added where appropriate.
PiperOrigin-RevId: 351425971
|
|
Read now takes a destination io.Writer, count, options. Keeping the method name
Read, in contrast to the Write method.
This enables:
* direct transfer of views under VV
* zero copy
It also eliminates the need for sentry to keep a slice of view because
userspace had requested a read that is smaller than the view returned, removing
the complexity there.
Read/Peek/ReadPacket are now consolidated together and some duplicate code is
removed.
PiperOrigin-RevId: 350636322
|
|
PiperOrigin-RevId: 348696094
|
|
Removes the period of time in which subseqeuent traffic to a Failed neighbor
immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which
address resolution fails; or in other words, the neighbor's IP address cannot
be translated to a MAC address.
This means removing the Failed state for linkAddrCache and allowing transitiong
out of Failed into Incomplete for neighborCache. Previously, both caches would
transition entries to Failed after address resolution fails. In this state, any
subsequent traffic requested within an unreachable time would immediately fail
with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3:
If address resolution fails, the entry SHOULD be deleted, so that subsequent
traffic to that neighbor invokes the next-hop determination procedure again.
Invoking next-hop determination at this point ensures that alternate default
routers are tried.
The API for getting a link address for a given address, whether through the link
address cache or the neighbor table, is updated to optionally take a callback
which will be called when address resolution completes. This allows `Route` to
handle completing link resolution internally, so callers of (*Route).Resolve
(e.g. endpoints) don’t have to keep track of when it completes and update the
Route accordingly.
This change also removes the wakers from LinkAddressCache, NeighborCache, and
Route in favor of the callbacks, and callers that previously used a waker can
now just pass a callback to (*Route).Resolve that will notify the waker on
resolution completion.
Fixes #4796
Startblock:
has LGTM from sbalana
and then
add reviewer ghanan
PiperOrigin-RevId: 348597478
|
|
...when performing source address selection for IPv6.
These are defined in RFC 6724 section 5 rule 6 (prefer matching label)
and rule 8 (use longest matching prefix).
This change also considers ULA of global scope instead of its own scope,
as per RFC 6724 section 3.1:
Also, note that ULAs are considered as global, not
site-local, scope but are handled via the prefix policy table as
discussed in Section 10.6.
Test: stack_test.TestIPv6SourceAddressSelectionScope
Startblock:
has LGTM from peterjohnston
and then
add reviewer brunodalbo
PiperOrigin-RevId: 348580996
|
|
PiperOrigin-RevId: 347974624
|
|
sacked_out is required in RACK to check the number of duplicate
acknowledgements during updating the reorder window. If there is no reordering
and the value for sacked_out is greater than the classic threshold value 3,
then reorder window is set to zero.
It is calculated by counting the number of segments sacked in the ACK and is
reduced when a cumulative ACK is received which covers the SACK blocks. This
value is set to zero when the connection enters recovery.
PiperOrigin-RevId: 347872246
|
|
packetEPs may get into a state that `len < cap`, casuing append() modifying the
original slice storage.
Reported-by: syzbot+978dd0e9c2600ab7a76b@syzkaller.appspotmail.com
PiperOrigin-RevId: 347634351
|
|
Adds IPv6 extension header serializer and Hop by Hop options serializer.
Add RouterAlert option serializer and use it in MLD.
Fixed #4996
Startblock:
has LGTM from marinaciocea
and then
add reviewer ghanan
PiperOrigin-RevId: 347174537
|
|
tcpip.ControlMessages can not contain Linux specific structures which makes it
painful to convert back and forth from Linux to tcpip back to Linux when passing
around control messages in hostinet and raw sockets.
Now we convert to the Linux version of the control message as soon as we are
out of tcpip.
PiperOrigin-RevId: 347027065
|
|
This change also considers interfaces and network endpoints enabled up
up to the point all work to disable them are complete. This was needed
so that protocols can perform shutdown work while being disabled (e.g.
sending a packet which requires the endpoint to be enabled to obtain a
source address).
Bug #4682, #4861
Fixes #4888
Startblock:
has LGTM from peterjohnston
and then
add reviewer brunodalbo
PiperOrigin-RevId: 346869702
|
|
Startblock:
has LGTM from asfez
and then
add reviewer tamird
PiperOrigin-RevId: 345815146
|
|
PiperOrigin-RevId: 345701623
|
|
Currently we rely on the user to take the lock on the endpoint that owns the
route, in order to modify it safely. We can instead move
`Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and
thread-safe access to other fields of `Route`.
PiperOrigin-RevId: 345461586
|
|
PiperOrigin-RevId: 345399936
|
|
This change lets us split the v4 stats from the v6 stats, which will be
useful when adding stats for each network endpoint.
PiperOrigin-RevId: 345322615
|
|
...by using the fake clock.
TestRouterSolicitation no longer runs its sub-tests in parallel now that
the sub-tests are not long-running - the fake clock simulates time
moving forward.
PiperOrigin-RevId: 345165794
|
|
PiperOrigin-RevId: 345162450
|
|
Before this change, the join count and the state for IGMP/MLD was held
across different types which required multiple locks to be held when
accessing a multicast group's state.
Bug #4682, #4861
Fixes #4916
PiperOrigin-RevId: 345019091
|
|
Test: ip_test.TestMGPWithNICLifecycle
Bug #4682, #4861
PiperOrigin-RevId: 344888091
|
|
Ports the following options:
- TCP_NODELAY
- TCP_CORK
- TCP_QUICKACK
Also deletes the {Get/Set}SockOptBool interface methods from all implementations
PiperOrigin-RevId: 344378824
|
|
We will use SocketOptions for all kinds of options, not just SOL_SOCKET options
because (1) it is consistent with Linux which defines all option variables on
the top level socket struct, (2) avoid code complexity. Appropriate checks
have been added for matching option level to the endpoint type.
Ported the following options to this new utility:
- IP_MULTICAST_LOOP
- IP_RECVTOS
- IPV6_RECVTCLASS
- IP_PKTINFO
- IP_HDRINCL
- IPV6_V6ONLY
Changes in behavior (these are consistent with what Linux does AFAICT):
- Now IP_MULTICAST_LOOP can be set for TCP (earlier it was a noop) but does not
affect the endpoint itself.
- We can now getsockopt IP_HDRINCL (earlier we would get an error).
- Now we return ErrUnknownProtocolOption if SOL_IP or SOL_IPV6 options are used
on unix sockets.
- Now we return ErrUnknownProtocolOption if SOL_IPV6 options are used on non
AF_INET6 endpoints.
This change additionally makes the following modifications:
- Add State() uint32 to commonEndpoint because both tcpip.Endpoint and
transport.Endpoint interfaces have it. It proves to be quite useful.
- Gets rid of SocketOptionsHandler.IsListening(). It was an anomaly as it was
not a handler. It is now implemented on netstack itself.
- Gets rid of tcp.endpoint.EndpointInfo and directly embeds
stack.TransportEndpointInfo. There was an unnecessary level of embedding
which served no purpose.
- Removes some checks dual_stack_test.go that used the errors from
GetSockOptBool(tcpip.V6OnlyOption) to confirm some state. This is not
consistent with the new design and also seemed to be testing the
implementation instead of behavior.
PiperOrigin-RevId: 344354051
|
|
...as defined by RFC 2710. Querier (router)-side MLDv1 is not yet
supported.
The core state machine is shared with IGMPv2.
This is guarded behind a flag (ipv6.Options.MLDEnabled).
Tests: ip_test.TestMGP*
Bug #4861
PiperOrigin-RevId: 344344095
|
|
Multiple goroutines may use the same stack.Route concurrently so
the stack.Route should make sure that any functions called on it
are thread-safe.
Fixes #4073
PiperOrigin-RevId: 344320491
|
|
Fix a panic when two entries in Failed state are removed at the same time.
PiperOrigin-RevId: 344143777
|
|
Add a NIC-specific neighbor table statistic so we can determine how many
packets have been queued to Failed neighbors, indicating an unhealthy local
network. This change assists us to debug in-field issues where subsequent
traffic to a neighbor fails.
Fixes #4819
PiperOrigin-RevId: 344131119
|
|
PiperOrigin-RevId: 344009602
|
|
Added headers, stats, checksum parsing capabilities from RFC 2236 describing
IGMPv2.
IGMPv2 state machine is implemented for each condition, sending and receiving
IGMP Membership Reports and Leave Group messages with backwards compatibility
with IGMPv1 routers.
Test:
* Implemented igmp header parser and checksum calculator in header/igmp_test.go
* ipv4/igmp_test.go tests incoming and outgoing IGMP messages and pathways.
* Added unit test coverage for IGMPv2 RFC behavior + IGMPv1 backwards
compatibility in ipv4/igmp_test.go.
Fixes #4682
PiperOrigin-RevId: 343408809
|
|
Closes #4022
PiperOrigin-RevId: 343378647
|
|
Group addressable endpoints can simply check if it has joined the
multicast group without maintaining address endpoints. This also
helps remove the dependency on AddressableEndpoint from
GroupAddressableEndpoint.
Now that group addresses are not tracked with address endpoints, we can
avoid accidentally obtaining a route with a multicast local address.
PiperOrigin-RevId: 343336912
|
|
PiperOrigin-RevId: 343211553
|
|
This changes also introduces:
- `SocketOptionsHandler` interface which can be implemented by endpoints to
handle endpoint specific behavior on SetSockOpt. This is analogous to what
Linux does.
- `DefaultSocketOptionsHandler` which is a default implementation of the above.
This is embedded in all endpoints so that we don't have to uselessly
implement empty functions. Endpoints with specific behavior can override the
embedded method by manually defining its own implementation.
PiperOrigin-RevId: 343158301
|
|
Packets should be properly routed when sending packets to addresses
in the loopback subnet which are not explicitly assigned to the loopback
interface.
Tests:
- integration_test.TestLoopbackAcceptAllInSubnetUDP
- integration_test.TestLoopbackAcceptAllInSubnetTCP
PiperOrigin-RevId: 343135643
|
|
Redefine stack.WritePacket into stack.WritePacketToRemote which lets the NIC
decide whether to append link headers.
PiperOrigin-RevId: 343071742
|
|
- Make AddressableEndpoint optional for NetworkEndpoint.
Not all NetworkEndpoints need to support addressing (e.g. ARP), so
AddressableEndpoint should only be implemented for protocols that
support addressing such as IPv4 and IPv6.
With this change, tcpip.ErrNotSupported will be returned by the stack
when attempting to modify addresses on a network endpoint that does
not support addressing.
Now that packets are fully handled at the network layer, and (with this
change) addresses are optional for network endpoints, we no longer need
the workaround for ARP where a fake ARP address was added to each NIC
that performs ARP so that packets would be delivered to the ARP layer.
PiperOrigin-RevId: 342722547
|
|
Detect if the ACK is a duplicate and update in RACK.
PiperOrigin-RevId: 342332569
|
|
Store all the socket level options in a struct and call {Get/Set}SockOpt on
this struct. This will avoid implementing socket level options on all
endpoints. This CL contains implementing one socket level option for tcp and
udp endpoints.
PiperOrigin-RevId: 342203981
|
|
RELNOTES: n/a
PiperOrigin-RevId: 342176296
|
|
The NIC should not hold network-layer state or logic - network packet
handling/forwarding should be performed at the network layer instead
of the NIC.
Fixes #4688
PiperOrigin-RevId: 342166985
|
|
Most packets don't have options but they are an integral part of the
standard. Teaching the ipv4 code how to handle them will simplify future
testing and use. Because Options are so rare it is worth making sure
that the extra work is kept out of the fast path as much as possible.
Prior to this change, all usages of the IHL field of the IPv4Fields/Encode
system set it to the same constant value except in a couple of tests
for bad values. From this change IHL will not be a constant as it will
depend on the size of any Options. Since ipv4.Encode() now handles the
options it becomes a possible source of errors to let the callers set
this value, so remove it entirely and calculate the value from the size
of the Options if present (or not) therefore guaranteeing a correct value.
Fixes #4709
RELNOTES: n/a
PiperOrigin-RevId: 341864765
|
|
cl/340002915 modified the code to return EADDRNOTAVAIL if connect
is called for a localhost address which isn't set.
But actually, Linux returns EADDRNOTAVAIL for ipv6 addresses and ENETUNREACH
for ipv4 addresses.
Updates #4735
PiperOrigin-RevId: 341479129
|
|
This change adds a Subnet() method to AddressableEndpoint so that we
can avoid repeated calls to AddressableEndpoint.AddressWithPrefix().Subnet().
Updates #231
PiperOrigin-RevId: 340969877
|
|
* Remove stack.Route from incoming packet path.
There is no need to pass around a stack.Route during the incoming path
of a packet. Instead, pass around the packet's link/network layer
information in the packet buffer since all layers may need this
information.
* Support address bound and outgoing packet NIC in routes.
When forwarding is enabled, the source address of a packet may be bound
to a different interface than the outgoing interface. This change
updates stack.Route to hold both NICs so that one can be used to write
packets while the other is used to check if the route's bound address
is valid. Note, we need to hold the address's interface so we can check
if the address is a spoofed address.
* Introduce the concept of a local route.
Local routes are routes where the packet never needs to leave the stack;
the destination is stack-local. We can now route between interfaces
within a stack if the packet never needs to leave the stack, even when
forwarding is disabled.
* Always obtain a route from the stack before sending a packet.
If a packet needs to be sent in response to an incoming packet, a route
must be obtained from the stack to ensure the stack is configured to
send packets to the packet's source from the packet's destination.
* Enable spoofing if a stack may send packets from unowned addresses.
This change required changes to some netgophers since previously,
promiscuous mode was enough to let the netstack respond to all
incoming packets regardless of the packet's destination address. Now
that a stack.Route is not held for each incoming packet, finding a route
may fail with local addresses we don't own but accepted packets for
while in promiscuous mode. Since we also want to be able to send from
any address (in response the received promiscuous mode packets), we need
to enable spoofing.
* Skip transport layer checksum checks for locally generated packets.
If a packet is locally generated, the stack can safely assume that no
errors were introduced while being locally routed since the packet is
never sent out the wire.
Some bugs fixed:
- transport layer checksum was never calculated after NAT.
- handleLocal didn't handle routing across interfaces.
- stack didn't support forwarding across interfaces.
- always consult the routing table before creating an endpoint.
Updates #4688
Fixes #3906
PiperOrigin-RevId: 340943442
|
|
Send NUD probes in another gorountine to free the thread of execution for
finishing the state transition. This is necessary to avoid deadlock where
sending and processing probes are done in the same call stack, such as loopback
and integration tests.
Fixes #4701
PiperOrigin-RevId: 340362481
|
|
In the docker container, the ipv6 loopback address is not set,
and connect("::1") has to return ENEADDRNOTAVAIL in this case.
Without this fix, it returns EHOSTUNREACH.
PiperOrigin-RevId: 340002915
|
|
PiperOrigin-RevId: 339945377
|
|
PiperOrigin-RevId: 339750876
|