path: root/pkg/tcpip/stack
Age | Commit message | Author
2021-01-31 | Default to NUD/neighborCache instead of linkAddrCache | Ghanan Gowripalan
This change flips gVisor to use Neighbor Unreachability Detection (NUD) by default to populate the neighbor table, as defined by RFC 4861 section 7. Although RFC 4861 is targeted at IPv6, the same algorithm is used for link resolution on IPv4 networks using ARP. Integrators may still use the legacy link address cache by setting stack.Options.UseLinkAddrCache to true; stack.Options.UseNeighborCache is now unused and will be removed. A later change will remove linkAddrCache and associated code. Updates #4658. PiperOrigin-RevId: 354850531
2021-01-31 | Use different neighbor tables per network endpoint | Ghanan Gowripalan
This stores each protocol's neighbor state separately. This change also removes the need for each neighbor entry to keep track of their own link address resolver now that all the entries in a cache will use the same resolver. PiperOrigin-RevId: 354818155
2021-01-31 | Hide neighbor table kind from NetworkEndpoint | Ghanan Gowripalan
The network endpoint should not need to have logic to handle different kinds of neighbor tables. Network endpoints can let the NIC know about different neighbor discovery messages and let the NIC decide which table to update. This allows us to remove the LinkAddressCache interface. PiperOrigin-RevId: 354812584
2021-01-30 | Extract route table from Stack lock | Tamir Duberstein
PiperOrigin-RevId: 354746864
2021-01-30 | Implement LinkAddressResolver on NetworkEndpoints | Ghanan Gowripalan
This removes the need to provide the link address request with the NIC the request is being performed on since the NetworkEndpoints already have a reference to the NIC. PiperOrigin-RevId: 354721940
2021-01-29 | Make fragmentation return a reassembled PacketBuffer | Ting-Yu Wang
This allows later decoupling of the backing network buffer implementation. PiperOrigin-RevId: 354643297
2021-01-28 | Avoid locking when route doesn't require resolution | Ghanan Gowripalan
When a route does not need to resolve a remote link address to send a packet, avoid having to obtain the pending packets queue's lock. PiperOrigin-RevId: 354456280
2021-01-28 | RACK: Update reorder window. | Nayana Bidari
After receiving an ACK (cumulative or selective), RACK updates the reorder window, which is used as a settling time before marking a packet as lost. This change adds an init function to initialize the variables in RACK and also stores a reference to the sender in rackControl. The reorder window is calculated as per the RACK draft: https://tools.ietf.org/html/draft-ietf-tcpm-rack-08#section-7.2 Step 4. PiperOrigin-RevId: 354453528
2021-01-28 | Acquire entry lock with cache lock held | Tamir Duberstein
Avoid a race condition in which an entry is acquired while it is being evicted by overlapping the entry lock with the cache lock. PiperOrigin-RevId: 354452639
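The lock overlap described above can be sketched as follows. This is a minimal illustration, not the netstack code: the `cache` and `entry` types and the `acquire`/`evict` methods are hypothetical names, and the sketch assumes that code holding an entry lock never takes the cache lock.

```go
package main

import (
	"fmt"
	"sync"
)

// entry and cache are hypothetical stand-ins, not the netstack types.
type entry struct {
	mu      sync.Mutex
	addr    string
	evicted bool
}

type cache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

// acquire returns the entry for key with its lock held. The entry lock is
// taken before the cache lock is released, so evict cannot run between the
// lookup and the entry lock acquisition.
func (c *cache) acquire(key string) *entry {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok {
		e = &entry{}
		c.entries[key] = e
	}
	e.mu.Lock()
	return e
}

// evict removes the entry using the same lock order: cache first, then entry.
func (c *cache) evict(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[key]; ok {
		e.mu.Lock()
		e.evicted = true
		e.mu.Unlock()
		delete(c.entries, key)
	}
}

func main() {
	c := &cache{entries: make(map[string]*entry)}
	e := c.acquire("10.0.0.1")
	e.addr = "02:00:00:00:00:01"
	e.mu.Unlock() // the caller releases the entry lock when done
	c.evict("10.0.0.1")
	fmt.Println(e.evicted, len(c.entries))
}
```

Without the overlap, a caller could look up an entry, drop the cache lock, and then lock an entry that was evicted in the gap.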
2021-01-28 | Change tcpip.Error to an interface | Tamir Duberstein
This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
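To illustrate why an interface makes it possible to attach data to errors, here is a minimal sketch. The `Error` interface below, its method set, and both error types are assumptions made for illustration; the real tcpip.Error definition may differ.

```go
package main

import "fmt"

// Error is a hypothetical stand-in; the real tcpip.Error method set may differ.
type Error interface {
	fmt.Stringer
	isError() // unexported marker so only this package can define errors
}

// ErrNoRoute carries no data.
type ErrNoRoute struct{}

func (*ErrNoRoute) isError()       {}
func (*ErrNoRoute) String() string { return "no route" }

// ErrUnknownNICID carries the offending NIC ID as per-instance data.
type ErrUnknownNICID struct{ ID int }

func (*ErrUnknownNICID) isError() {}
func (e *ErrUnknownNICID) String() string {
	return fmt.Sprintf("unknown NIC ID %d", e.ID)
}

// lookup is a hypothetical function that only knows NIC 1.
func lookup(id int) Error {
	if id != 1 {
		return &ErrUnknownNICID{ID: id}
	}
	return nil
}

func main() {
	switch e := lookup(7).(type) {
	case nil:
		fmt.Println("ok")
	case *ErrUnknownNICID:
		fmt.Println("bad NIC:", e.ID) // the data is available to the caller
	default:
		fmt.Println(e)
	}
}
```

Callers can use a type switch to recover the extra fields, while code that only needs a message can rely on `String`.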
2021-01-27 | Confirm neighbor reachability with TCP ACKs | Ghanan Gowripalan
As per RFC 4861 section 7.3.1, A neighbor is considered reachable if the node has recently received a confirmation that packets sent recently to the neighbor were received by its IP layer. Positive confirmation can be gathered in two ways: hints from upper-layer protocols that indicate a connection is making "forward progress", or receipt of a Neighbor Advertisement message that is a response to a Neighbor Solicitation message. This change adds support for TCP to let the IP/link layers know that a neighbor is reachable. Test: integration_test.TestTCPConfirmNeighborReachability PiperOrigin-RevId: 354222833
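A rough sketch of the upper-layer hint path, under stated assumptions: the `neighbor` type, the `onACK` hook, and the choice to treat any ACK that advances the oldest unacknowledged sequence number as "forward progress" are all hypothetical and only illustrate the idea.

```go
package main

import "fmt"

type nudState int

const (
	Incomplete nudState = iota
	Reachable
	Stale
	Delay
	Probe
)

// neighbor is a hypothetical, lock-free stand-in for a neighbor table entry.
type neighbor struct{ state nudState }

// handleUpperLevelConfirmation models a reachability hint from an upper
// layer. An Incomplete entry has no link address yet, so it is left alone.
func (n *neighbor) handleUpperLevelConfirmation() {
	switch n.state {
	case Reachable, Stale, Delay, Probe:
		n.state = Reachable
	}
}

// onACK is a hypothetical TCP hook. An ACK that advances sndUna means the
// peer's IP layer received our data, so the neighbor is confirmed.
// It returns the new value of sndUna.
func onACK(n *neighbor, sndUna, ack uint32) uint32 {
	if int32(ack-sndUna) > 0 { // wrap-safe sequence number comparison
		n.handleUpperLevelConfirmation()
		return ack
	}
	return sndUna
}

func main() {
	n := &neighbor{state: Stale}
	una := onACK(n, 100, 100) // duplicate ACK: no forward progress
	fmt.Println(n.state == Stale, una)
	una = onACK(n, una, 150) // new data acknowledged
	fmt.Println(n.state == Reachable, una)
}
```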
2021-01-27 | Rename anonymous struct "mu" | Tamir Duberstein
This clarifies that there is a lock involved. PiperOrigin-RevId: 354213848
2021-01-27 | Move protected fields under anonymous mutex | Tamir Duberstein
Fixes #5150. PiperOrigin-RevId: 354194385
2021-01-27 | Merge pull request #4705 from mlevesquedion:fix-cmp-diff-reporting-in-nud-tests | gVisor bot
PiperOrigin-RevId: 354187603
2021-01-26 | Initialize the send buffer handler in endpoint creation. | Nayana Bidari
- This CL will initialize the function handler used for getting the send buffer size limits during endpoint creation and does not require the caller of SetSendBufferSize(..) to know the endpoint type(tcp/udp/..) PiperOrigin-RevId: 353992634
2021-01-26 | Drop nicID from transport endpoint reg/cleanup fns | Ghanan Gowripalan
...as it is unused. PiperOrigin-RevId: 353896981
2021-01-26 | Move SO_SNDBUF to socketops. | Nayana Bidari
This CL moves {S,G}etsockopt of SO_SNDBUF from all endpoints to socketops. For unix sockets, we do not support setting of this option. PiperOrigin-RevId: 353871484
2021-01-22 | Refactor GetMainNICAddress | Arthur Sfez
It previously returned an error, but that error could only be UnknownNICID. It now returns a boolean indicating whether the NIC exists. PiperOrigin-RevId: 353337489
2021-01-22 | Pass RouteInfo to the route resolve callback | Ghanan Gowripalan
The route resolution callback will be called with a stack.ResolvedFieldsResult which will hold the route info so callers can avoid attempting resolution again to check if a previous resolution attempt succeeded or not. Test: integration_test.TestRouteResolvedFields PiperOrigin-RevId: 353319019
2021-01-22 | Define tcpip.Payloader in terms of io.Reader | Tamir Duberstein
Fixes #1509. PiperOrigin-RevId: 353295589
2021-01-21 | Resolve static link addresses in GetLinkAddress | Ghanan Gowripalan
If a network address has a static mapping to a link address, calculate it in GetLinkAddress. Test: stack_test.TestStaticGetLinkAddress PiperOrigin-RevId: 353179616
2021-01-21 | iptables: support matching the input interface name | Toshi Kikuchi
We have support for the output interface name, but not for the input interface name. This change adds the support for the input interface name, and adds the test cases for it. Fixes #5300 PiperOrigin-RevId: 353179389
2021-01-21 | Only use callback for GetLinkAddress | Ghanan Gowripalan
GetLinkAddress's callback will be called immediately with a stack.LinkResolutionResult which will hold the link address so no need to also return the link address from the function. Fixes #5151. PiperOrigin-RevId: 353157857
2021-01-21 | Do not cache remote link address in Route | Ghanan Gowripalan
...unless explicitly requested via ResolveWith. Remove cancelled channels from pending packets as we can use the link resolution channel in a FIFO to limit the number of maximum pending resolutions we should queue packets for. This change also defers starting the goroutine that handles link resolution completion to when link resolution succeeds, fails or gets cancelled due to the max number of pending resolutions being reached. Fixes #751. PiperOrigin-RevId: 353130577
2021-01-21 | Queue packets in WritePackets when resolving link address | Ghanan Gowripalan
Test: integration_test.TestWritePacketsLinkResolution Fixes #4458. PiperOrigin-RevId: 353108826
2021-01-21 | Populate EgressRoute, GSO, Netproto in NIC | Ghanan Gowripalan
fdbased and qdisc layers expect these fields to already be populated before being reached. PiperOrigin-RevId: 353099492
2021-01-20 | rewrite diff check to match example in cmp.Diff docs | Michaël Lévesque-Dion
2021-01-19 | Do not have a stack-wide linkAddressCache | Ghanan Gowripalan
Link addresses are cached on a per NIC basis so instead of having a single cache that includes the NIC ID for neighbor entry lookups, use a single cache per NIC. PiperOrigin-RevId: 352684111
2021-01-19 | Per NIC NetworkEndpoint statistics | Arthur Sfez
To facilitate the debugging of multi-homed setups, track network protocol statistics for each endpoint. Note that the original stack-wide stats still exist. A new type of statistic counter is introduced, which tracks two versions of a stat at the same time. This lets a network endpoint increment both the local stat and the stack-wide stat at the same time. Fixes #4605 PiperOrigin-RevId: 352663276
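A minimal sketch of a counter that updates a per-endpoint value and a stack-wide value together. The `Counter` and `MultiCounter` names and fields here are illustrative assumptions and are not taken from the change itself.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Counter is a hypothetical atomic statistic counter.
type Counter struct{ count uint64 }

func (c *Counter) Increment()    { atomic.AddUint64(&c.count, 1) }
func (c *Counter) Value() uint64 { return atomic.LoadUint64(&c.count) }

// MultiCounter pairs an endpoint-local counter with a shared stack-wide one,
// so a single Increment call updates both.
type MultiCounter struct {
	local  *Counter
	global *Counter
}

func (m MultiCounter) Increment() {
	m.local.Increment()
	m.global.Increment()
}

func main() {
	stackWide := &Counter{}
	nic1 := MultiCounter{local: &Counter{}, global: stackWide}
	nic2 := MultiCounter{local: &Counter{}, global: stackWide}

	nic1.Increment()
	nic1.Increment()
	nic2.Increment()

	// Each NIC sees only its own traffic; the stack-wide counter sees all of it.
	fmt.Println(nic1.local.Value(), nic2.local.Value(), stackWide.Value())
}
```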
2021-01-19 | Drop CheckLocalAddress from LinkAddressCache | Ghanan Gowripalan
PiperOrigin-RevId: 352623277
2021-01-17 | Do not use a stack-wide queue of pending packets | Ghanan Gowripalan
Packets may be pending on link resolution to complete before being sent. Link resolution is performed for neighbors which are unique to a NIC so hold link resolution related state under the NIC, not the stack. Note, this change may result in more queued packets but that is okay as RFC 4861 section 7.2.2 recommends that the stack maintain a queue of packets for each neighbor that is waiting for link resolution to complete, not a fixed limit per stack. PiperOrigin-RevId: 352322155
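The per-neighbor queueing described above might look roughly like the sketch below. The `pendingPackets` type, its methods, and the limit of two packets are hypothetical. Replacing the oldest packet on overflow follows the recommendation in RFC 4861 section 7.2.2. Locking is omitted for brevity.

```go
package main

import "fmt"

// pendingPackets is a hypothetical per-NIC structure holding one bounded
// queue per neighbor that is awaiting link resolution.
type pendingPackets struct {
	limit  int                 // maximum packets queued per neighbor
	queues map[string][]string // neighbor address -> queued packets
}

// enqueue adds pkt to the neighbor's queue. On overflow the oldest packet is
// dropped in favor of the new arrival, and dropped reports that this happened.
func (p *pendingPackets) enqueue(neighbor, pkt string) (dropped bool) {
	q := append(p.queues[neighbor], pkt)
	if len(q) > p.limit {
		q = q[1:]
		dropped = true
	}
	p.queues[neighbor] = q
	return dropped
}

// flush removes and returns the packets queued for neighbor, for example
// once link resolution for that neighbor completes.
func (p *pendingPackets) flush(neighbor string) []string {
	q := p.queues[neighbor]
	delete(p.queues, neighbor)
	return q
}

func main() {
	p := &pendingPackets{limit: 2, queues: make(map[string][]string)}
	p.enqueue("10.0.0.2", "p1")
	p.enqueue("10.0.0.2", "p2")
	fmt.Println(p.enqueue("10.0.0.2", "p3")) // overflow replaces p1
	p.enqueue("10.0.0.3", "q1")              // other neighbors are unaffected
	fmt.Println(p.flush("10.0.0.2"), len(p.queues["10.0.0.3"]))
}
```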
2021-01-15 | Resolve known link address on route creation | Ghanan Gowripalan
If a Route is being created through a link that requires link address resolution and a remote address that has a known mapping to a link address, populate the link address when the route is created. This removes the need for neighbor/link address caches to perform this check. Fixes #5149 PiperOrigin-RevId: 352122401
2021-01-15 | Support GetLinkAddress with neighborCache | Ghanan Gowripalan
Test: integration_test.TestGetLinkAddress PiperOrigin-RevId: 352119404
2021-01-15 | Only pass stack.Route's fields to LinkEndpoints | Ghanan Gowripalan
stack.Route is used to send network packets and resolve link addresses. A LinkEndpoint does not need to do either of these and only needs the route's fields at the time of the packet write request. Since LinkEndpoints only need the route's fields when writing packets, pass a stack.RouteInfo instead. PiperOrigin-RevId: 352108405
2021-01-15 | Remove count argument from tcpip.Endpoint.Read | Tamir Duberstein
The same intent can be specified via the io.Writer. PiperOrigin-RevId: 352098747
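One way a byte limit can be expressed through the io.Writer rather than a count argument is sketched below. The `endpoint`, `limitedWriter`, and `readAtMost` names are hypothetical, and the real Read signature takes further arguments that are left out here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// endpoint is a hypothetical receiver holding queued data views.
type endpoint struct{ views [][]byte }

// Read drains queued views into dst and returns the bytes transferred. When
// dst accepts only part of a view, the remainder stays queued.
func (e *endpoint) Read(dst io.Writer) (int, error) {
	total := 0
	for len(e.views) > 0 {
		v := e.views[0]
		n, err := dst.Write(v)
		total += n
		if n < len(v) {
			e.views[0] = v[n:]
			if err == io.ErrShortWrite {
				err = nil // a full destination is not a failure here
			}
			return total, err
		}
		e.views = e.views[1:]
		if err != nil {
			return total, err
		}
	}
	return total, nil
}

// limitedWriter forwards at most n bytes to w.
type limitedWriter struct {
	w io.Writer
	n int
}

func (l *limitedWriter) Write(p []byte) (int, error) {
	short := len(p) > l.n
	if short {
		p = p[:l.n]
	}
	n, err := l.w.Write(p)
	l.n -= n
	if err == nil && short {
		err = io.ErrShortWrite
	}
	return n, err
}

// readAtMost reads up to limit bytes from e and returns them as a string.
func readAtMost(e *endpoint, limit int) string {
	var buf bytes.Buffer
	e.Read(&limitedWriter{w: &buf, n: limit})
	return buf.String()
}

func main() {
	e := &endpoint{views: [][]byte{[]byte("hello "), []byte("world")}}
	fmt.Printf("%q\n", readAtMost(e, 8))
	fmt.Printf("%q\n", readAtMost(e, 100))
}
```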
2021-01-13 | Clear neighbor table on NIC down | Ghanan Gowripalan
Note, this includes static entries to match Linux's behaviour.
```
$ ip neigh show dev eth0
192.168.42.1 lladdr fc:ec:da:70:6e:f9 STALE
$ sudo ip neigh add 192.168.42.172 lladdr 22:33:44:55:66:77 dev eth0
$ ip neigh show dev eth0
192.168.42.1 lladdr fc:ec:da:70:6e:f9 STALE
192.168.42.172 lladdr 22:33:44:55:66:77 PERMANENT
$ sudo ifconfig eth0 down
$ ip neigh show dev eth0
$ sudo ifconfig eth0 up
$ ip neigh show dev eth0
```
Test: stack_test.TestClearNeighborCacheOnNICDisable PiperOrigin-RevId: 351696306
2021-01-13 | Do not resolve remote link address at transport layer | Ghanan Gowripalan
Link address resolution is performed at the link layer (if required) so we can defer it from the transport layer. When link resolution is required, packets will be queued and sent once link resolution completes. If link resolution fails, the transport layer will receive a control message indicating that the stack failed to route the packet. tcpip.Endpoint.Write no longer returns a channel now that writes do not wait for link resolution at the transport layer. tcpip.ErrNoLinkAddress is no longer used so it is removed. Removed calls to stack.Route.ResolveWith from the transport layer so that link resolution is performed when a route is created in response to an incoming packet (e.g. to complete TCP handshakes or send a RST). Tests: - integration_test.TestForwarding - integration_test.TestTCPLinkResolutionFailure Fixes #4458 RELNOTES: n/a PiperOrigin-RevId: 351684158
2021-01-12 | Drop TransportEndpointID from HandleControlPacket | Ghanan Gowripalan
When a control packet is delivered, it is delivered to a transport endpoint with a matching stack.TransportEndpointID so there is no need to pass the ID to the endpoint as it already knows its ID. PiperOrigin-RevId: 351497588
2021-01-12 | Fix simple mistakes identified by goreportcard. | Adin Scannell
These are primarily simplification and lint mistakes. However, minor fixes are also included and tests added where appropriate. PiperOrigin-RevId: 351425971
2021-01-07 | netstack: Refactor tcpip.Endpoint.Read | Ting-Yu Wang
Read now takes a destination io.Writer, count, options. The method name Read is kept, in contrast to the Write method. This enables:
* direct transfer of views under VV
* zero copy
It also eliminates the need for sentry to keep a slice of view because userspace had requested a read that is smaller than the view returned, removing the complexity there. Read/Peek/ReadPacket are now consolidated together and some duplicate code is removed. PiperOrigin-RevId: 350636322
2020-12-22 | Move SO_BINDTODEVICE to socketops. | Nayana Bidari
PiperOrigin-RevId: 348696094
2020-12-22 | Invoke address resolution upon subsequent traffic to Failed neighbor | Peter Johnston
Removes the period of time in which subsequent traffic to a Failed neighbor immediately fails with ErrNoLinkAddress. A Failed neighbor is one for which address resolution fails; in other words, the neighbor's IP address cannot be translated to a MAC address. This means removing the Failed state for linkAddrCache and allowing transitioning out of Failed into Incomplete for neighborCache. Previously, both caches would transition entries to Failed after address resolution fails. In this state, any subsequent traffic requested within an unreachable time would immediately fail with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3: "If address resolution fails, the entry SHOULD be deleted, so that subsequent traffic to that neighbor invokes the next-hop determination procedure again. Invoking next-hop determination at this point ensures that alternate default routers are tried." The API for getting a link address for a given address, whether through the link address cache or the neighbor table, is updated to optionally take a callback which will be called when address resolution completes. This allows `Route` to handle completing link resolution internally, so callers of (*Route).Resolve (e.g. endpoints) don’t have to keep track of when it completes and update the Route accordingly. This change also removes the wakers from LinkAddressCache, NeighborCache, and Route in favor of the callbacks, and callers that previously used a waker can now just pass a callback to (*Route).Resolve that will notify the waker on resolution completion. Fixes #4796 Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 348597478
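The optional-callback API and the "do not cache failures" behaviour can be sketched as below. This is an illustration only: `neighborTable`, `get`, `complete`, and `result` are hypothetical names, and an empty link address stands in for a failed resolution.

```go
package main

import (
	"fmt"
	"sync"
)

// result is what a callback receives when resolution finishes.
type result struct {
	linkAddr string
	ok       bool
}

// neighborTable is a hypothetical table with callback-based completion.
type neighborTable struct {
	mu       sync.Mutex
	resolved map[string]string
	waiters  map[string][]func(result)
}

func newNeighborTable() *neighborTable {
	return &neighborTable{
		resolved: make(map[string]string),
		waiters:  make(map[string][]func(result)),
	}
}

// get returns the link address if it is known. Otherwise it records the
// optional onResolve callback and reports that resolution is pending.
func (t *neighborTable) get(addr string, onResolve func(result)) (string, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if la, ok := t.resolved[addr]; ok {
		return la, true
	}
	if onResolve != nil {
		t.waiters[addr] = append(t.waiters[addr], onResolve)
	}
	return "", false
}

// complete finishes resolution for addr. An empty linkAddr means failure, in
// which case nothing is cached and a later get starts from scratch.
func (t *neighborTable) complete(addr, linkAddr string) {
	t.mu.Lock()
	ws := t.waiters[addr]
	delete(t.waiters, addr)
	if linkAddr != "" {
		t.resolved[addr] = linkAddr
	}
	t.mu.Unlock()
	for _, w := range ws { // run callbacks without holding the lock
		w(result{linkAddr: linkAddr, ok: linkAddr != ""})
	}
}

func main() {
	tbl := newNeighborTable()
	report := func(r result) { fmt.Println("resolved:", r.ok, r.linkAddr) }

	tbl.get("10.0.0.2", report)
	tbl.complete("10.0.0.2", "") // failure: callback fires, nothing cached

	tbl.get("10.0.0.2", report) // a new attempt, not an immediate error
	tbl.complete("10.0.0.2", "02:00:00:00:00:02")
	fmt.Println(tbl.get("10.0.0.2", nil))
}
```

A caller that previously blocked on a waker could pass a callback that notifies the waker, as the message above describes.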
2020-12-21 | Prefer matching labels and longest matching prefix | Ghanan Gowripalan
...when performing source address selection for IPv6. These are defined in RFC 6724 section 5 rule 6 (prefer matching label) and rule 8 (use longest matching prefix). This change also considers ULA of global scope instead of its own scope, as per RFC 6724 section 3.1: Also, note that ULAs are considered as global, not site-local, scope but are handled via the prefix policy table as discussed in Section 10.6. Test: stack_test.TestIPv6SourceAddressSelectionScope Startblock: has LGTM from peterjohnston and then add reviewer brunodalbo PiperOrigin-RevId: 348580996
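As an illustration of rule 8 only, the sketch below picks the candidate source address sharing the longest bit prefix with the destination. The function names are hypothetical, and the sketch deliberately ignores rules 1 through 7 (including the label match of rule 6) and RFC 6724's cap of the comparison at the source prefix length.

```go
package main

import (
	"fmt"
	"math/bits"
	"net"
)

// commonPrefixLen returns the number of leading bits shared by addresses a
// and b. Unparseable input yields 0.
func commonPrefixLen(a, b string) int {
	a16, b16 := net.ParseIP(a).To16(), net.ParseIP(b).To16()
	if a16 == nil || b16 == nil {
		return 0
	}
	n := 0
	for i := range a16 {
		if x := a16[i] ^ b16[i]; x != 0 {
			return n + bits.LeadingZeros8(x)
		}
		n += 8
	}
	return n
}

// selectSource returns the candidate with the longest prefix in common with
// dst. Earlier candidates win ties.
func selectSource(dst string, candidates ...string) string {
	best, bestLen := "", -1
	for _, c := range candidates {
		if l := commonPrefixLen(c, dst); l > bestLen {
			best, bestLen = c, l
		}
	}
	return best
}

func main() {
	dst := "2001:db8:2::2"
	fmt.Println(commonPrefixLen("2001:db8:1::1", dst))
	fmt.Println(commonPrefixLen("2001:db8:2::1", dst))
	fmt.Println(selectSource(dst, "2001:db8:1::1", "2001:db8:2::1"))
}
```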
2020-12-17 | Remove duplicate `return` | Tamir Duberstein
PiperOrigin-RevId: 347974624
2020-12-16 | Add support to count the number of packets SACKed. | Nayana Bidari
sacked_out is required in RACK to check the number of duplicate acknowledgements during updating the reorder window. If there is no reordering and the value for sacked_out is greater than the classic threshold value 3, then reorder window is set to zero. It is calculated by counting the number of segments sacked in the ACK and is reduced when a cumulative ACK is received which covers the SACK blocks. This value is set to zero when the connection enters recovery. PiperOrigin-RevId: 347872246
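A simplified sketch of how sacked_out could feed the reorder window decision. Everything here is an assumption for illustration: the `sender` type and its methods are hypothetical, the base window of min_RTT/4 capped at SRTT is a simplification of the RACK draft, the DSACK-driven multiplier is omitted, and the threshold comparison is written as `>=`, where the message above says "greater than".

```go
package main

import (
	"fmt"
	"time"
)

// dupThresh is the classic duplicate ACK threshold.
const dupThresh = 3

// sender is a hypothetical stand-in for the TCP sender state.
type sender struct {
	sackedOut   int  // segments currently SACKed but not cumulatively ACKed
	reorderSeen bool // whether reordering has been observed
	inRecovery  bool
	minRTT      time.Duration
	srtt        time.Duration
}

// onSACK accounts for segments newly reported in SACK blocks.
func (s *sender) onSACK(newlySacked int) { s.sackedOut += newlySacked }

// onCumulativeACK removes SACKed segments now covered by a cumulative ACK.
func (s *sender) onCumulativeACK(covered int) {
	s.sackedOut -= covered
	if s.sackedOut < 0 {
		s.sackedOut = 0
	}
}

// enterRecovery resets sacked_out when the connection enters recovery.
func (s *sender) enterRecovery() {
	s.inRecovery = true
	s.sackedOut = 0
}

// reorderWindow returns zero when no reordering has been seen and either the
// sender is in recovery or enough segments are SACKed. Otherwise it returns a
// simplified base window.
func (s *sender) reorderWindow() time.Duration {
	if !s.reorderSeen && (s.inRecovery || s.sackedOut >= dupThresh) {
		return 0
	}
	w := s.minRTT / 4
	if w > s.srtt {
		w = s.srtt
	}
	return w
}

func main() {
	s := &sender{minRTT: 40 * time.Millisecond, srtt: 100 * time.Millisecond}
	fmt.Println(s.reorderWindow())
	s.onSACK(3)
	fmt.Println(s.reorderWindow())
	s.reorderSeen = true
	fmt.Println(s.reorderWindow())
}
```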
2020-12-15 | Fix a data race in packetEPs | Ting-Yu Wang
packetEPs may get into a state where `len < cap`, causing append() to modify the original slice storage. Reported-by: syzbot+978dd0e9c2600ab7a76b@syzkaller.appspotmail.com PiperOrigin-RevId: 347634351
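The aliasing behind this kind of bug can be reproduced in a few lines. The sketch below is sequential and uses made-up values; it only shows how append reuses spare capacity. How the actual fix was done is not stated here, and the three-index slice shown is just one common remedy.

```go
package main

import "fmt"

// sharedAppend appends twice to a slice with spare capacity. Both results
// share one backing array, so the second append overwrites the first.
func sharedAppend() (first, second int) {
	base := make([]int, 2, 4) // len 2, cap 4: append can write in place
	a := append(base, 10)
	b := append(base, 20) // same storage: this also changes a[2]
	return a[2], b[2]
}

// clippedAppend caps the slice at its length first, forcing each append to
// allocate and copy instead of writing into shared storage.
func clippedAppend() (first, second int) {
	base := make([]int, 2, 4)
	clipped := base[:len(base):len(base)] // cap == len
	a := append(clipped, 10)
	b := append(clipped, 20)
	return a[2], b[2]
}

func main() {
	fmt.Println(sharedAppend())  // 20 20
	fmt.Println(clippedAppend()) // 10 20
}
```

When two goroutines perform such appends concurrently on the same slice, the in-place writes become a data race.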
2020-12-12 | Introduce IPv6 extension header serialization facilities | Bruno Dal Bo
Adds IPv6 extension header serializer and Hop by Hop options serializer. Add RouterAlert option serializer and use it in MLD. Fixed #4996 Startblock: has LGTM from marinaciocea and then add reviewer ghanan PiperOrigin-RevId: 347174537
2020-12-11 | [netstack] Decouple tcpip.ControlMessages from the IP control messages. | Ayush Ranjan
tcpip.ControlMessages cannot contain Linux-specific structures, which makes it painful to convert back and forth from Linux to tcpip and back to Linux when passing around control messages in hostinet and raw sockets. Now we convert to the Linux version of the control message as soon as we are out of tcpip. PiperOrigin-RevId: 347027065
2020-12-10 | Use specified source address for IGMP/MLD packets | Ghanan Gowripalan
This change also considers interfaces and network endpoints enabled up to the point that all work to disable them is complete. This was needed so that protocols can perform shutdown work while being disabled (e.g. sending a packet which requires the endpoint to be enabled to obtain a source address). Bug #4682, #4861 Fixes #4888 Startblock: has LGTM from peterjohnston and then add reviewer brunodalbo PiperOrigin-RevId: 346869702
2020-12-04 | Remove stack.ReadOnlyAddressableEndpointState | Ghanan Gowripalan
Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 345815146