summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/stack/registration.go
AgeCommit message (Collapse)Author
2021-02-08Support performing DAD for any addressGhanan Gowripalan
...as long as the network protocol supports duplicate address detection. This CL provides the facilities for a netstack integrator to perform DAD. DHCP recommends that clients effectively perform DAD before accepting an offer. As per RFC 2131 section 4.4.1 pg 38, The client SHOULD perform a check on the suggested address to ensure that the address is not already in use. For example, if the client is on a network that supports ARP, the client may issue an ARP request for the suggested request. The implementation of ARP-based IPv4 DAD effectively operates the same as IPv6's NDP DAD - using ARP requests and responses in place of NDP neighbour solicitations and advertisements, respectively. DAD performed by calls to (*Stack).CheckDuplicateAddress don't interfere with DAD performed when a new IPv6 address is added. This is so that integrator requests to check for duplicate addresses aren't unexpectedly aborted when addresses are removed. A network package internal package provides protocol agnostic DAD state management that specific protocols that provide DAD can use. Fixes #4550. Tests: - internal/ip_test.* - integration_test.TestDAD - arp_test.TestDADARPRequestPacket - ipv6.TestCheckDuplicateAddress PiperOrigin-RevId: 356405593
2021-02-06Check local address directly through NICGhanan Gowripalan
Network endpoints that wish to check addresses on another NIC-local network endpoint may now do so through the NetworkInterface. This fixes a lock ordering issue between NIC removal and link resolution. Before this change: NIC Removal takes the stack lock, neighbor cache lock then neighbor entries' locks. When performing IPv4 link resolution, we take the entry lock then ARP would try check IPv4 local addresses through the stack which tries to obtain the stack's lock. Now that ARP can check IPv4 addreses through the NIC, we avoid the lock ordering issue, while also removing the need for stack to lookup the NIC. PiperOrigin-RevId: 356034245
2021-02-01Refactor HandleControlPacket/SockErrorGhanan Gowripalan
...to remove the need for the transport layer to deduce the type of error it received. Rename HandleControlPacket to HandleError as HandleControlPacket only handles errors. tcpip.SockError now holds a tcpip.SockErrorCause interface that different errors can implement. PiperOrigin-RevId: 354994306
2021-01-31Use different neighbor tables per network endpointGhanan Gowripalan
This stores each protocol's neighbor state separately. This change also removes the need for each neighbor entry to keep track of their own link address resolver now that all the entries in a cache will use the same resolver. PiperOrigin-RevId: 354818155
2021-01-31Hide neighbor table kind from NetworkEndpointGhanan Gowripalan
The network endpoint should not need to have logic to handle different kinds of neighbor tables. Network endpoints can let the NIC know about differnt neighbor discovery messages and let the NIC decide which table to update. This allows us to remove the LinkAddressCache interface. PiperOrigin-RevId: 354812584
2021-01-30Implement LinkAddressResolver on NetworkEndpointsGhanan Gowripalan
This removes the need to provide the link address request with the NIC the request is being performed on since the NetworkEndpoints already have a reference to the NIC. PiperOrigin-RevId: 354721940
2021-01-28Change tcpip.Error to an interfaceTamir Duberstein
This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
2021-01-19Do not have a stack-wide linkAddressCacheGhanan Gowripalan
Link addresses are cached on a per NIC basis so instead of having a single cache that includes the NIC ID for neighbor entry lookups, use a single cache per NIC. PiperOrigin-RevId: 352684111
2021-01-19Per NIC NetworkEndpoint statisticsArthur Sfez
To facilitate the debugging of multi-homed setup, track Network protocols statistics for each endpoint. Note that the original stack-wide stats still exist. A new type of statistic counter is introduced, which track two versions of a stat at the same time. This lets a network endpoint increment both the local stat and the stack-wide stat at the same time. Fixes #4605 PiperOrigin-RevId: 352663276
2021-01-19Drop CheckLocalAddress from LinkAddressCacheGhanan Gowripalan
PiperOrigin-RevId: 352623277
2021-01-15Support GetLinkAddress with neighborCacheGhanan Gowripalan
Test: integration_test.TestGetLinkAddress PiperOrigin-RevId: 352119404
2021-01-15Only pass stack.Route's fields to LinkEndpointsGhanan Gowripalan
stack.Route is used to send network packets and resolve link addresses. A LinkEndpoint does not need to do either of these and only needs the route's fields at the time of the packet write request. Since LinkEndpoints only need the route's fields when writing packets, pass a stack.RouteInfo instead. PiperOrigin-RevId: 352108405
2021-01-13Do not resolve remote link address at transport layerGhanan Gowripalan
Link address resolution is performed at the link layer (if required) so we can defer it from the transport layer. When link resolution is required, packets will be queued and sent once link resolution completes. If link resolution fails, the transport layer will receive a control message indicating that the stack failed to route the packet. tcpip.Endpoint.Write no longer returns a channel now that writes do not wait for link resolution at the transport layer. tcpip.ErrNoLinkAddress is no longer used so it is removed. Removed calls to stack.Route.ResolveWith from the transport layer so that link resolution is performed when a route is created in response to an incoming packet (e.g. to complete TCP handshakes or send a RST). Tests: - integration_test.TestForwarding - integration_test.TestTCPLinkResolutionFailure Fixes #4458 RELNOTES: n/a PiperOrigin-RevId: 351684158
2021-01-12Drop TransportEndpointID from HandleControlPacketGhanan Gowripalan
When a control packet is delivered, it is delivered to a transport endpoint with a matching stack.TransportEndpointID so there is no need to pass the ID to the endpoint as it already knows its ID. PiperOrigin-RevId: 351497588
2020-12-22Invoke address resolution upon subsequent traffic to Failed neighborPeter Johnston
Removes the period of time in which subseqeuent traffic to a Failed neighbor immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which address resolution fails; or in other words, the neighbor's IP address cannot be translated to a MAC address. This means removing the Failed state for linkAddrCache and allowing transitiong out of Failed into Incomplete for neighborCache. Previously, both caches would transition entries to Failed after address resolution fails. In this state, any subsequent traffic requested within an unreachable time would immediately fail with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3: If address resolution fails, the entry SHOULD be deleted, so that subsequent traffic to that neighbor invokes the next-hop determination procedure again. Invoking next-hop determination at this point ensures that alternate default routers are tried. The API for getting a link address for a given address, whether through the link address cache or the neighbor table, is updated to optionally take a callback which will be called when address resolution completes. This allows `Route` to handle completing link resolution internally, so callers of (*Route).Resolve (e.g. endpoints) don’t have to keep track of when it completes and update the Route accordingly. This change also removes the wakers from LinkAddressCache, NeighborCache, and Route in favor of the callbacks, and callers that previously used a waker can now just pass a callback to (*Route).Resolve that will notify the waker on resolution completion. Fixes #4796 Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 348597478
2020-12-04Introduce IPv4 options serializer and add RouterAlert to IGMPBruno Dal Bo
PiperOrigin-RevId: 345701623
2020-12-01Track join count in multicast group protocol stateGhanan Gowripalan
Before this change, the join count and the state for IGMP/MLD was held across different types which required multiple locks to be held when accessing a multicast group's state. Bug #4682, #4861 Fixes #4916 PiperOrigin-RevId: 345019091
2020-11-18Introduce stack.WritePacketToRemote, remove LinkEndpoint.WriteRawPacketBruno Dal Bo
Redefine stack.WritePacket into stack.WritePacketToRemote which lets the NIC decide whether to append link headers. PiperOrigin-RevId: 343071742
2020-11-16Remove ARP address workaroundGhanan Gowripalan
- Make AddressableEndpoint optional for NetworkEndpoint. Not all NetworkEndpoints need to support addressing (e.g. ARP), so AddressableEndpoint should only be implemented for protocols that support addressing such as IPv4 and IPv6. With this change, tcpip.ErrNotSupported will be returned by the stack when attempting to modify addresses on a network endpoint that does not support addressing. Now that packets are fully handled at the network layer, and (with this change) addresses are optional for network endpoints, we no longer need the workaround for ARP where a fake ARP address was added to each NIC that performs ARP so that packets would be delivered to the ARP layer. PiperOrigin-RevId: 342722547
2020-11-12Change AllocationSize to SizeWithPadding as requestedJulian Elischer
RELNOTES: n/a PiperOrigin-RevId: 342176296
2020-11-12Move packet handling to NetworkEndpointGhanan Gowripalan
The NIC should not hold network-layer state or logic - network packet handling/forwarding should be performed at the network layer instead of the NIC. Fixes #4688 PiperOrigin-RevId: 342166985
2020-11-11Teach netstack how to add options to IPv4 packetsJulian Elischer
Most packets don't have options but they are an integral part of the standard. Teaching the ipv4 code how to handle them will simplify future testing and use. Because Options are so rare it is worth making sure that the extra work is kept out of the fast path as much as possible. Prior to this change, all usages of the IHL field of the IPv4Fields/Encode system set it to the same constant value except in a couple of tests for bad values. From this change IHL will not be a constant as it will depend on the size of any Options. Since ipv4.Encode() now handles the options it becomes a possible source of errors to let the callers set this value, so remove it entirely and calculate the value from the size of the Options if present (or not) therefore guaranteeing a correct value. Fixes #4709 RELNOTES: n/a PiperOrigin-RevId: 341864765
2020-11-05Cache addressEndpoint.addr.Subnet() to avoid allocations.Bhasker Hariharan
This change adds a Subnet() method to AddressableEndpoint so that we can avoid repeated calls to AddressableEndpoint.AddressWithPrefix().Subnet(). Updates #231 PiperOrigin-RevId: 340969877
2020-11-05Use stack.Route exclusively for writing packetsGhanan Gowripalan
* Remove stack.Route from incoming packet path. There is no need to pass around a stack.Route during the incoming path of a packet. Instead, pass around the packet's link/network layer information in the packet buffer since all layers may need this information. * Support address bound and outgoing packet NIC in routes. When forwarding is enabled, the source address of a packet may be bound to a different interface than the outgoing interface. This change updates stack.Route to hold both NICs so that one can be used to write packets while the other is used to check if the route's bound address is valid. Note, we need to hold the address's interface so we can check if the address is a spoofed address. * Introduce the concept of a local route. Local routes are routes where the packet never needs to leave the stack; the destination is stack-local. We can now route between interfaces within a stack if the packet never needs to leave the stack, even when forwarding is disabled. * Always obtain a route from the stack before sending a packet. If a packet needs to be sent in response to an incoming packet, a route must be obtained from the stack to ensure the stack is configured to send packets to the packet's source from the packet's destination. * Enable spoofing if a stack may send packets from unowned addresses. This change required changes to some netgophers since previously, promiscuous mode was enough to let the netstack respond to all incoming packets regardless of the packet's destination address. Now that a stack.Route is not held for each incoming packet, finding a route may fail with local addresses we don't own but accepted packets for while in promiscuous mode. Since we also want to be able to send from any address (in response the received promiscuous mode packets), we need to enable spoofing. * Skip transport layer checksum checks for locally generated packets. If a packet is locally generated, the stack can safely assume that no errors were introduced while being locally routed since the packet is never sent out the wire. Some bugs fixed: - transport layer checksum was never calculated after NAT. - handleLocal didn't handle routing across interfaces. - stack didn't support forwarding across interfaces. - always consult the routing table before creating an endpoint. Updates #4688 Fixes #3906 PiperOrigin-RevId: 340943442
2020-10-22Pass NetworkInterface to LinkAddressRequestGhanan Gowripalan
Previously a link endpoint was passed to stack.LinkAddressResolver.LinkAddressRequest. With this change, implementations that want a route for the link address request may find one through the stack. Other implementations that want to send a packet without a route may continue to do so using the network interface directly. Test: - arp_test.TestLinkAddressRequest - ipv6.TestLinkAddressRequest PiperOrigin-RevId: 338577474
2020-10-09Automated rollback of changelist 336304024Ghanan Gowripalan
PiperOrigin-RevId: 336339194
2020-10-09Automated rollback of changelist 336185457Bhasker Hariharan
PiperOrigin-RevId: 336304024
2020-10-08Do not resolve routes immediatelyGhanan Gowripalan
When a response needs to be sent to an incoming packet, the stack should consult its neighbour table to determine the remote address's link address. When an entry does not exist in the stack's neighbor table, the stack should queue the packet while link resolution completes. See comments. PiperOrigin-RevId: 336185457
2020-10-05Remove AssignableAddressEndpoint.NetworkEndpointGhanan Gowripalan
We can get the network endpoint directly from the NIC. This is a preparatory CL for when a Route needs to hold a dedicated NIC as its output interface. This is because when forwarding is enabled, packets may be sent from a NIC different from the NIC a route's local address is associated with. PiperOrigin-RevId: 335484500
2020-09-30Use the ICMP error response facilityJulian Elischer
Add code in IPv6 to send ICMP packets while processing extension headers. Add some accounting in processing IPV6 Extension headers which allows us to report meaningful information back in ICMP parameter problem packets. IPv4 also needs to send a message when an unsupported protocol is requested. Add some tests to generate both ipv4 and ipv6 packets with various errors and check the responses. Add some new checkers and cleanup some inconsistencies in the messages in that file. Add new error types for the ICMPv4/6 generators. Fix a bug in the ICMPv4 generator that stopped it from generating "Unknown protocol" messages. Updates #2211 PiperOrigin-RevId: 334661716
2020-09-29Return permanent addresses when NIC is downGhanan Gowripalan
Test: stack_test.TestGetMainNICAddressWhenNICDisabled PiperOrigin-RevId: 334513286
2020-09-29Trim Network/Transport Endpoint/ProtocolGhanan Gowripalan
* Remove Capabilities and NICID methods from NetworkEndpoint. * Remove linkEP and stack parameters from NetworkProtocol.NewEndpoint. The LinkEndpoint can be fetched from the NetworkInterface. The stack is passed to the NetworkProtocol when it is created so the NetworkEndpoint can get it from its protocol. * Remove stack parameter from TransportProtocol.NewEndpoint. Like the NetworkProtocol/Endpoint, the stack is passed to the TransportProtocol when it is created. PiperOrigin-RevId: 334332721
2020-09-29Move IP state from NIC to NetworkEndpoint/ProtocolGhanan Gowripalan
* Add network address to network endpoints. Hold network-specific state in the NetworkEndpoint instead of the stack. This results in the stack no longer needing to "know" about the network endpoints and special case certain work for various endpoints (e.g. IPv6 DAD). * Provide NetworkEndpoints with an NetworkInterface interface. Instead of just passing the NIC ID of a NIC, pass an interface so the network endpoint may query other information about the NIC such as whether or not it is a loopback device. * Move NDP code and state to the IPv6 package. NDP is IPv6 specific so there is no need for it to live in the stack. * Control forwarding through NetworkProtocols instead of Stack Forwarding should be controlled on a per-network protocol basis so forwarding configurations are now controlled through network protocols. * Remove stack.referencedNetworkEndpoint. Now that addresses are exposed via AddressEndpoint and only one NetworkEndpoint is created per interface, there is no need for a referenced NetworkEndpoint. * Assume network teardown methods are infallible. Fixes #3871, #3916 PiperOrigin-RevId: 334319433
2020-09-26Remove generic ICMP errorsGhanan Gowripalan
Generic ICMP errors were required because the transport dispatcher was given the responsibility of sending ICMP errors in response to transport packet delivery failures. Instead, the transport dispatcher should let network layer know it failed to deliver a packet (and why) and let the network layer make the decision as to what error to send (if any). Fixes #4068 PiperOrigin-RevId: 333962333
2020-09-23Extract ICMP error sender from UDPJulian Elischer
Store transport protocol number on packet buffers for use in ICMP error generation. Updates #2211. PiperOrigin-RevId: 333252762
2020-09-08Improve type safety for transport protocol optionsGhanan Gowripalan
The existing implementation for TransportProtocol.{Set}Option take arguments of an empty interface type which all types (implicitly) implement; any type may be passed to the functions. This change introduces marker interfaces for transport protocol options that may be set or queried which transport protocol option types implement to ensure that invalid types are caught at compile time. Different interfaces are used to allow the compiler to enforce read-only or set-only socket options. RELNOTES: n/a PiperOrigin-RevId: 330559811
2020-08-28Improve type safety for network protocol optionsGhanan Gowripalan
The existing implementation for NetworkProtocol.{Set}Option take arguments of an empty interface type which all types (implicitly) implement; any type may be passed to the functions. This change introduces marker interfaces for network protocol options that may be set or queried which network protocol option types implement to ensure that invalid types are caught at compile time. Different interfaces are used to allow the compiler to enforce read-only or set-only socket options. PiperOrigin-RevId: 328980359
2020-08-25Add option to replace linkAddrCache with neighborCacheSam Balana
This change adds an option to replace the current implementation of ARP through linkAddrCache, with an implementation of NUD through neighborCache. Switching to using NUD for both ARP and NDP is beneficial for the reasons described by RFC 4861 Section 3.1: "[Using NUD] significantly improves the robustness of packet delivery in the presence of failing routers, partially failing or partitioned links, or nodes that change their link-layer addresses. For instance, mobile nodes can move off-link without losing any connectivity due to stale ARP caches." "Unlike ARP, Neighbor Unreachability Detection detects half-link failures and avoids sending traffic to neighbors with which two-way connectivity is absent." Along with these changes exposes the API for querying and operating the neighbor cache. Operations include: - Create a static entry - List all entries - Delete all entries - Remove an entry by address This also exposes the API to change the NUD protocol constants on a per-NIC basis to allow Neighbor Discovery to operate over links with widely varying performance characteristics. See [RFC 4861 Section 10][1] for the list of constants. Finally, an API for subscribing to NUD state changes is exposed through NUDDispatcher. See [RFC 4861 Appendix C][3] for the list of edges. Tests: pkg/tcpip/network/arp:arp_test + TestDirectRequest pkg/tcpip/network/ipv6:ipv6_test + TestLinkResolution + TestNDPValidation + TestNeighorAdvertisementWithTargetLinkLayerOption + TestNeighorSolicitationResponse + TestNeighorSolicitationWithSourceLinkLayerOption + TestRouterAdvertValidation pkg/tcpip/stack:stack_test + TestCacheWaker + TestForwardingWithFakeResolver + TestForwardingWithFakeResolverManyPackets + TestForwardingWithFakeResolverManyResolutions + TestForwardingWithFakeResolverPartialTimeout + TestForwardingWithFakeResolverTwoPackets + TestIPv6SourceAddressSelectionScopeAndSameAddress [1]: https://tools.ietf.org/html/rfc4861#section-10 [2]: https://tools.ietf.org/html/rfc4861#appendix-C Fixes #1889 Fixes #1894 Fixes #1895 Fixes #1947 Fixes #1948 Fixes #1949 Fixes #1950 PiperOrigin-RevId: 328365034
2020-08-14Use a single NetworkEndpoint per NIC per protocolGhanan Gowripalan
The NetworkEndpoint does not need to be created for each address. Most of the work the NetworkEndpoint does is address agnostic. PiperOrigin-RevId: 326759605
2020-08-10Set the NetworkProtocolNumber of all PacketBuffers.Kevin Krakauer
NetworkEndpoints set the number on outgoing packets in Write() and NetworkProtocols set them on incoming packets in Parse(). Needed for #3549. PiperOrigin-RevId: 325938745
2020-07-27Add ability to send unicast ARP requests and Neighbor SolicitationsSam Balana
The previous implementation of LinkAddressRequest only supported sending broadcast ARP requests and multicast Neighbor Solicitations. The ability to send these packets as unicast is required for Neighbor Unreachability Detection. Tests: pkg/tcpip/network/arp:arp_test - TestLinkAddressRequest pkg/tcpip/network/ipv6:ipv6_test - TestLinkAddressRequest Updates #1889 Updates #1894 Updates #1895 Updates #1947 Updates #1948 Updates #1949 Updates #1950 PiperOrigin-RevId: 323451569
2020-07-22make connect(2) fail when dest is unreachableKevin Krakauer
Previously, ICMP destination unreachable datagrams were ignored by TCP endpoints. This caused connect to hang when an intermediate router couldn't find a route to the host. This manifested as a Kokoro error when Docker IPv6 was enabled. The Ruby image test would try to install the sinatra gem and hang indefinitely attempting to use an IPv6 address. Fixes #3079.
2020-07-22Support for receiving outbound packets in AF_PACKET.Bhasker Hariharan
Updates #173 PiperOrigin-RevId: 322665518
2020-07-15Fix minor bugs in a couple of interface IOCTLs.Bhasker Hariharan
gVisor incorrectly returns the wrong ARP type for SIOGIFHWADDR. This breaks tcpdump as it tries to interpret the packets incorrectly. Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which fails with an EINVAL since we don't implement it. For now change it to return EOPNOTSUPP to indicate that we don't support the query rather than return EINVAL. NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities and NIC flags. In Linux all 3 exist eg. ARPHRD types are stored in dev->type field while NIC capabilities are more like the device features which can be queried using SIOCETHTOOL but not modified and NIC Flags are fields that can be modified from user space. eg. NIC status (UP/DOWN/MULTICAST/BROADCAST) etc. Updates #2746 PiperOrigin-RevId: 321436525
2020-06-07netstack: parse incoming packet headers up-frontKevin Krakauer
Netstack has traditionally parsed headers on-demand as a packet moves up the stack. This is conceptually simple and convenient, but incompatible with iptables, where headers can be inspected and mangled before even a routing decision is made. This changes header parsing to happen early in the incoming packet path, as soon as the NIC gets the packet from a link endpoint. Even if an invalid packet is found (e.g. a TCP header of insufficient length), the packet is passed up the stack for proper stats bookkeeping. PiperOrigin-RevId: 315179302
2020-06-03Pass PacketBuffer as pointer.Ting-Yu Wang
Historically we've been passing PacketBuffer by shallow copying through out the stack. Right now, this is only correct as the caller would not use PacketBuffer after passing into the next layer in netstack. With new buffer management effort in gVisor/netstack, PacketBuffer will own a Buffer (to be added). Internally, both PacketBuffer and Buffer may have pointers and shallow copying shouldn't be used. Updates #2404. PiperOrigin-RevId: 314610879
2020-05-29Update WritePacket* API to take ownership of packets to be written.Ting-Yu Wang
Updates #2404. PiperOrigin-RevId: 313834784
2020-05-27Remove linkEP from DeliverNetworkPacketSam Balana
The specified LinkEndpoint is not being used in a significant way. No behavior change, existing tests pass. This change is a breaking change. PiperOrigin-RevId: 313496602
2020-04-30FIFO QDisc implementationBhasker Hariharan
Updates #231 PiperOrigin-RevId: 309323808
2020-04-03Refactor software GSO code.Bhasker Hariharan
Software GSO implementation currently has a complicated code path with implicit assumptions that all packets to WritePackets carry same Data and it does this to avoid allocations on the path etc. But this makes it hard to reuse the WritePackets API. This change breaks all such assumptions by introducing a new Vectorised View API ReadToVV which can be used to cleanly split a VV into multiple independent VVs. Further this change also makes packet buffers linkable to form an intrusive list. This allows us to get rid of the array of packet buffers that are passed in the WritePackets API call and replace it with a list of packet buffers. While this code does introduce some more allocations in the benchmarks it doesn't cause any degradation. Updates #231 PiperOrigin-RevId: 304731742