summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/network
AgeCommit message (Collapse)Author
2021-02-09add IPv4 options processing for forwarding and reassemblyJulian Elischer
IPv4 forwarding and reassembly needs support for option processing and regular processing also needs options to be processed before being passed to the transport layer. This patch extends option processing to those cases and provides additional testing. A small change to the ICMP error generation API code was required to allow it to know when a packet was being forwarded or not. Updates #4586 PiperOrigin-RevId: 356446681
2021-02-08Remove unnecessary lockingGhanan Gowripalan
The thing the lock protects will never be accessed concurrently. PiperOrigin-RevId: 356423331
2021-02-08Support performing DAD for any addressGhanan Gowripalan
...as long as the network protocol supports duplicate address detection. This CL provides the facilities for a netstack integrator to perform DAD. DHCP recommends that clients effectively perform DAD before accepting an offer. As per RFC 2131 section 4.4.1 pg 38, The client SHOULD perform a check on the suggested address to ensure that the address is not already in use. For example, if the client is on a network that supports ARP, the client may issue an ARP request for the suggested request. The implementation of ARP-based IPv4 DAD effectively operates the same as IPv6's NDP DAD - using ARP requests and responses in place of NDP neighbour solicitations and advertisements, respectively. DAD performed by calls to (*Stack).CheckDuplicateAddress don't interfere with DAD performed when a new IPv6 address is added. This is so that integrator requests to check for duplicate addresses aren't unexpectedly aborted when addresses are removed. A network package internal package provides protocol agnostic DAD state management that specific protocols that provide DAD can use. Fixes #4550. Tests: - internal/ip_test.* - integration_test.TestDAD - arp_test.TestDADARPRequestPacket - ipv6.TestCheckDuplicateAddress PiperOrigin-RevId: 356405593
2021-02-06Remove linkAddrCacheGhanan Gowripalan
It was replaced by NUD/neighborCache. Fixes #4658. PiperOrigin-RevId: 356085221
2021-02-06Use fine grained locks while sending NDP packetsGhanan Gowripalan
Previously when sending NDP DAD or RS messages, we would hold a shared lock which lead to deadlocks (due to synchronous packet loooping (e.g. pipe and loopback link endpoints)) and lock contention. Writing packets may be an expensive operation which could prevent other goroutines from doing meaningful work if a shared lock is held while writing packets. This change upates the NDP DAD/RS timers to not hold shared locks while sending packets. PiperOrigin-RevId: 356053146
2021-02-06Remove (*stack.Stack).FindNetworkEndpointGhanan Gowripalan
The network endpoints only look for other network endpoints of the same kind. Since the network protocols keeps track of all endpoints, go through the protocol to find an endpoint with an address instead of the stack. PiperOrigin-RevId: 356051498
2021-02-06Check local address directly through NICGhanan Gowripalan
Network endpoints that wish to check addresses on another NIC-local network endpoint may now do so through the NetworkInterface. This fixes a lock ordering issue between NIC removal and link resolution. Before this change: NIC Removal takes the stack lock, neighbor cache lock then neighbor entries' locks. When performing IPv4 link resolution, we take the entry lock then ARP would try check IPv4 local addresses through the stack which tries to obtain the stack's lock. Now that ARP can check IPv4 addreses through the NIC, we avoid the lock ordering issue, while also removing the need for stack to lookup the NIC. PiperOrigin-RevId: 356034245
2021-02-05Batch write packets after iptables checksGhanan Gowripalan
After IPTables checks a batch of packets, we can write packets that are not dropped or locally destined as a batch instead of individually. This previously caused a bug since WritePacket* functions expect to take ownership of passed PacketBuffer{List}. WritePackets assumed the list of PacketBuffers will not be invalidated when calling WritePacket for each PacketBuffer in the list, but this is not true. WritePacket may add the passed PacketBuffer into a different list which would modify the PacketBuffer in such a way that it no longer points to the next PacketBuffer to write. Example: Given a PB list of PB_a -> PB_b -> PB_c WritePackets may be iterating over the list and calling WritePacket for each PB. When WritePacket takes PB_a, it may add it to a new list which would update pointers such that PB_a no longer points to PB_b. Test: integration_test.TestIPTableWritePackets PiperOrigin-RevId: 355969560
2021-02-05Refactor locally delivered packetsGhanan Gowripalan
Make it clear that failing to parse a looped back is not a packet sending error but a malformed received packet error. FindNetworkEndpoint returns nil when no network endpoint is found instead of an error. PiperOrigin-RevId: 355954946
2021-02-01Refactor HandleControlPacket/SockErrorGhanan Gowripalan
...to remove the need for the transport layer to deduce the type of error it received. Rename HandleControlPacket to HandleError as HandleControlPacket only handles errors. tcpip.SockError now holds a tcpip.SockErrorCause interface that different errors can implement. PiperOrigin-RevId: 354994306
2021-01-31Default to NUD/neighborCache instead of linkAddrCacheGhanan Gowripalan
This change flips gvisor to use Neighbor unreachability detection by default to populate the neighbor table as defined by RFC 4861 section 7. Although RFC 4861 is targeted at IPv6, the same algorithm is used for link resolution on IPv4 networks using ARP. Integrators may still use the legacy link address cache by setting stack.Options.UseLinkAddrCache to true; stack.Options.UseNeighborCache is now unused and will be removed. A later change will remove linkAddrCache and associated code. Updates #4658. PiperOrigin-RevId: 354850531
2021-01-31Use closure for IPv6 testContext cleanupGhanan Gowripalan
PiperOrigin-RevId: 354827491
2021-01-31Remove NICs before closing their link endpointsGhanan Gowripalan
...in IPv6 ICMP tests. A channel link endpoint's channel is closed when the link endpoint is closed. When the stack tries to send packets through a NIC with a closed channel endpoint, a panic will occur when attempting to write to a closed channel (https://golang.org/ref/spec#Close). To make sure the stack does not try to send packets through a NIC, we remove it. PiperOrigin-RevId: 354822085
2021-01-31Use different neighbor tables per network endpointGhanan Gowripalan
This stores each protocol's neighbor state separately. This change also removes the need for each neighbor entry to keep track of their own link address resolver now that all the entries in a cache will use the same resolver. PiperOrigin-RevId: 354818155
2021-01-31Hide neighbor table kind from NetworkEndpointGhanan Gowripalan
The network endpoint should not need to have logic to handle different kinds of neighbor tables. Network endpoints can let the NIC know about differnt neighbor discovery messages and let the NIC decide which table to update. This allows us to remove the LinkAddressCache interface. PiperOrigin-RevId: 354812584
2021-01-30Implement LinkAddressResolver on NetworkEndpointsGhanan Gowripalan
This removes the need to provide the link address request with the NIC the request is being performed on since the NetworkEndpoints already have a reference to the NIC. PiperOrigin-RevId: 354721940
2021-01-29Make fragmentation return a reassembled PacketBufferTing-Yu Wang
This allows later decoupling of the backing network buffer implementation. PiperOrigin-RevId: 354643297
2021-01-29Clear IGMPv1 present flag on NIC downGhanan Gowripalan
This is dynamic state that can be re-learned when the NIC comes back up. Test: ipv4_test.TestIgmpV1Present PiperOrigin-RevId: 354630921
2021-01-29Refresh delayed report timers on query messagesGhanan Gowripalan
...as per As per RFC 2236 section 3 page 3 (for IGMPv2) and RFC 2710 section 4 page 5 (for MLDv1). See comments in code for more details. Test: ip_test.TestHandleQuery PiperOrigin-RevId: 354603068
2021-01-29Discard invalid Neighbor AdvertisementsPeter Johnston
...per RFC 4861 s7.1.2. Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 354539026
2021-01-28Change tcpip.Error to an interfaceTamir Duberstein
This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
2021-01-27Merge pull request #4705 from mlevesquedion:fix-cmp-diff-reporting-in-nud-testsgVisor bot
PiperOrigin-RevId: 354187603
2021-01-26Do not use stack.Route to send NDP NSGhanan Gowripalan
When sending packets through a stack.Route, we attempt to perform link resolution. Neighbor Solicitation messages do not need link resolution to be performed so send the packets out the interface directly instead. PiperOrigin-RevId: 353967435
2021-01-25Adjust included data size on icmp errorsJulian Elischer
The RFC for icmpv6 specifies that an errant packet should be included in the returned ICMP packet, and that it should include up to the amount needed to fill the minimum MTU (1280 bytes) if possible. The current code included the Link header in that calculation but the RFC is referring to the IP MTU not the link MTU. Some conformance tests check this and report an error agains the stack for this. The full header length shoudl however continue to be used when allocating header space. Make the same change for IPv4 for consistency. Add a test for icmp payload sizing. Test that the included data in an ICMP error packet conforms to the requirements of RFC 972, RFC 4443 section 2.4 and RFC 1812 Section 4.3.2.3. Fixes #5311 PiperOrigin-RevId: 353790203
2021-01-25Add per endpoint ARP statisticsArthur Sfez
The ARP stat NetworkUnreachable was removed, and was replaced by InterfaceHasNoLocalAddress. No stats are recorded when dealing with an missing endpoint (ErrNotConnected) (because if there is no endpoint, there is no valid per-endpoint stats). PiperOrigin-RevId: 353759462
2021-01-22Refactor GetMainNICAddressArthur Sfez
It previously returned an error but it could only be UnknownNICID. It now returns a boolean to indicate whether the nic exists or not. PiperOrigin-RevId: 353337489
2021-01-22Do not modify IGMP packets when verifying checksumGhanan Gowripalan
PiperOrigin-RevId: 353336894
2021-01-22Define tcpip.Payloader in terms of io.ReaderTamir Duberstein
Fixes #1509. PiperOrigin-RevId: 353295589
2021-01-21iptables: support matching the input interface nameToshi Kikuchi
We have support for the output interface name, but not for the input interface name. This change adds the support for the input interface name, and adds the test cases for it. Fixes #5300 PiperOrigin-RevId: 353179389
2021-01-21Only use callback for GetLinkAddressGhanan Gowripalan
GetLinkAddress's callback will be called immediately with a stack.LinkResolutionResult which will hold the link address so no need to also return the link address from the function. Fixes #5151. PiperOrigin-RevId: 353157857
2021-01-20Change the way the IP options report problemsJulian Elischer
The error messages are not needed or used as these are not processing errors so much as errors to be reported back to the packet sender. Implicitly describe whether each error should generate ICMP packets or not. Most do but there are a couple that do not. Slightly alter some test expectations for Linux compatibility and add a couple more. Improve Linux compatibility on error packet returns. Some cosmetic changes to tests to match the upcoming packet impact version of the same tests. PiperOrigin-RevId: 352889785
2021-01-20rewrite diff check to match example in cmp.Diff docsMichaël Lévesque-Dion
2021-01-19Do not have a stack-wide linkAddressCacheGhanan Gowripalan
Link addresses are cached on a per NIC basis so instead of having a single cache that includes the NIC ID for neighbor entry lookups, use a single cache per NIC. PiperOrigin-RevId: 352684111
2021-01-19Per NIC NetworkEndpoint statisticsArthur Sfez
To facilitate the debugging of multi-homed setup, track Network protocols statistics for each endpoint. Note that the original stack-wide stats still exist. A new type of statistic counter is introduced, which track two versions of a stat at the same time. This lets a network endpoint increment both the local stat and the stack-wide stat at the same time. Fixes #4605 PiperOrigin-RevId: 352663276
2021-01-19Drop CheckLocalAddress from LinkAddressCacheGhanan Gowripalan
PiperOrigin-RevId: 352623277
2021-01-15Only pass stack.Route's fields to LinkEndpointsGhanan Gowripalan
stack.Route is used to send network packets and resolve link addresses. A LinkEndpoint does not need to do either of these and only needs the route's fields at the time of the packet write request. Since LinkEndpoints only need the route's fields when writing packets, pass a stack.RouteInfo instead. PiperOrigin-RevId: 352108405
2021-01-15Remove count argument from tcpip.Endpoint.ReadTamir Duberstein
The same intent can be specified via the io.Writer. PiperOrigin-RevId: 352098747
2021-01-15Correctly return EMSGSIZE when packet is too big in raw socket.Ting-Yu Wang
IPv4 previously accepts the packet, while IPv6 panics. Neither is the behavior in Linux. splice() in Linux has different behavior than in gVisor. This change documents it in the SpliceTooLong test. Reported-by: syzbot+b550e78e5c24d1d521f2@syzkaller.appspotmail.com PiperOrigin-RevId: 352091286
2021-01-14Add stats for ARPArthur Sfez
Fixes #4963 Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 351886320
2021-01-13Do not resolve remote link address at transport layerGhanan Gowripalan
Link address resolution is performed at the link layer (if required) so we can defer it from the transport layer. When link resolution is required, packets will be queued and sent once link resolution completes. If link resolution fails, the transport layer will receive a control message indicating that the stack failed to route the packet. tcpip.Endpoint.Write no longer returns a channel now that writes do not wait for link resolution at the transport layer. tcpip.ErrNoLinkAddress is no longer used so it is removed. Removed calls to stack.Route.ResolveWith from the transport layer so that link resolution is performed when a route is created in response to an incoming packet (e.g. to complete TCP handshakes or send a RST). Tests: - integration_test.TestForwarding - integration_test.TestTCPLinkResolutionFailure Fixes #4458 RELNOTES: n/a PiperOrigin-RevId: 351684158
2021-01-12Fix simple mistakes identified by goreportcard.Adin Scannell
These are primarily simplification and lint mistakes. However, minor fixes are also included and tests added where appropriate. PiperOrigin-RevId: 351425971
2021-01-07netstack: Refactor tcpip.Endpoint.ReadTing-Yu Wang
Read now takes a destination io.Writer, count, options. Keeping the method name Read, in contrast to the Write method. This enables: * direct transfer of views under VV * zero copy It also eliminates the need for sentry to keep a slice of view because userspace had requested a read that is smaller than the view returned, removing the complexity there. Read/Peek/ReadPacket are now consolidated together and some duplicate code is removed. PiperOrigin-RevId: 350636322
2020-12-22Invoke address resolution upon subsequent traffic to Failed neighborPeter Johnston
Removes the period of time in which subseqeuent traffic to a Failed neighbor immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which address resolution fails; or in other words, the neighbor's IP address cannot be translated to a MAC address. This means removing the Failed state for linkAddrCache and allowing transitiong out of Failed into Incomplete for neighborCache. Previously, both caches would transition entries to Failed after address resolution fails. In this state, any subsequent traffic requested within an unreachable time would immediately fail with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3: If address resolution fails, the entry SHOULD be deleted, so that subsequent traffic to that neighbor invokes the next-hop determination procedure again. Invoking next-hop determination at this point ensures that alternate default routers are tried. The API for getting a link address for a given address, whether through the link address cache or the neighbor table, is updated to optionally take a callback which will be called when address resolution completes. This allows `Route` to handle completing link resolution internally, so callers of (*Route).Resolve (e.g. endpoints) don’t have to keep track of when it completes and update the Route accordingly. This change also removes the wakers from LinkAddressCache, NeighborCache, and Route in favor of the callbacks, and callers that previously used a waker can now just pass a callback to (*Route).Resolve that will notify the waker on resolution completion. Fixes #4796 Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 348597478
2020-12-21Prefer matching labels and longest matching prefixGhanan Gowripalan
...when performing source address selection for IPv6. These are defined in RFC 6724 section 5 rule 6 (prefer matching label) and rule 8 (use longest matching prefix). This change also considers ULA of global scope instead of its own scope, as per RFC 6724 section 3.1: Also, note that ULAs are considered as global, not site-local, scope but are handled via the prefix policy table as discussed in Section 10.6. Test: stack_test.TestIPv6SourceAddressSelectionScope Startblock: has LGTM from peterjohnston and then add reviewer brunodalbo PiperOrigin-RevId: 348580996
2020-12-16Cleanup locking in multicast group protocol testsGhanan Gowripalan
Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 347928471
2020-12-15Don't split enabled flag across multicast group stateGhanan Gowripalan
Startblock: has LGTM from asfez and then add reviewer brunodalbo PiperOrigin-RevId: 347716242
2020-12-12Reduce the memory overhead in IP fragment managementToshi Kikuchi
- Deep-copy pkt.Data and hold it instead of shallow-copy (vv.Clone). This allows the pkt's backing array, which includes the header portion, to be freed. - Remove fragHeap. The fragments are now held in holes struct instead. - Stop reserving the initial capacity of holes slice. PiperOrigin-RevId: 347198744
2020-12-12Introduce IPv6 extension header serialization facilitiesBruno Dal Bo
Adds IPv6 extension header serializer and Hop by Hop options serializer. Add RouterAlert option serializer and use it in MLD. Fixed #4996 Startblock: has LGTM from marinaciocea and then add reviewer ghanan PiperOrigin-RevId: 347174537
2020-12-10Use specified source address for IGMP/MLD packetsGhanan Gowripalan
This change also considers interfaces and network endpoints enabled up up to the point all work to disable them are complete. This was needed so that protocols can perform shutdown work while being disabled (e.g. sending a packet which requires the endpoint to be enabled to obtain a source address). Bug #4682, #4861 Fixes #4888 Startblock: has LGTM from peterjohnston and then add reviewer brunodalbo PiperOrigin-RevId: 346869702
2020-12-09Do not perform IGMP/MLD on loopback interfacesGhanan Gowripalan
The loopback interface will never have any neighbouring nodes so advertising its interest in multicast groups is unnecessary. Bug #4682, #4861 Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 346587604