summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/stack
AgeCommit message (Collapse)Author
2021-05-21Add aggregated NIC statsArthur Sfez
This change also includes miscellaneous improvements: * UnknownProtocolRcvdPackets has been separated into two stats, to specify at which layer the unknown protocol was found (L3 or L4) * MalformedRcvdPacket is not aggregated across every endpoint anymore. Doing it this way did not add useful information, and it was also error-prone (example: ipv6 forgot to increment this aggregated stat, it only incremented its own ipv6.MalformedPacketsReceived). It is now only incremented the NIC. * Removed TestStatsString test which was outdated and had no real utility. PiperOrigin-RevId: 375057472
2021-05-19Send ICMP errors when link address resolution failsNick Brown
Before this change, we would silently drop packets when link resolution failed. This change brings us into line with RFC 792 (IPv4) and RFC 4443 (IPv6), both of which specify that gateways should return an ICMP error to the sender when link resolution fails. PiperOrigin-RevId: 374699789
2021-05-18Emit more information on panicTamir Duberstein
PiperOrigin-RevId: 374464969
2021-05-14Validate DAD configs when initializing DAD stateGhanan Gowripalan
Make sure that the initial configurations used by the DAD state is valid. Before this change, an invalid DAD configuration (with a zero-valued retransmit timer) was used so the DAD state would attempt to resolve DAD immediately. This lead to a deadlock in TestDADResolve as when DAD resolves, the stack notifies the NDP dispatcher which would attempt to write to an unbuffered channel while holding a lock. The test goroutine also attempts to obtain a stack.Route (before receiving from the channel) which ends up attempting to take the same lock. Test: stack_test.TestDADResolve PiperOrigin-RevId: 373888540
2021-05-14Control forwarding per NetworkEndpointGhanan Gowripalan
...instead of per NetworkProtocol to better conform with linux (https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt): ``` conf/interface/* forwarding - BOOLEAN Enable IP forwarding on this interface. This controls whether packets received _on_ this interface can be forwarded. ``` Fixes #5932. PiperOrigin-RevId: 373888000
2021-05-14Fix panic on consume in a mixed push/consume caseTing-Yu Wang
headerOffset() is incorrectly taking account of previous push(), so it thinks there is more data to consume. This change switches to use pk.reserved as pivot point. Reported-by: syzbot+64fef9acd509976f9ce7@syzkaller.appspotmail.com PiperOrigin-RevId: 373846283
2021-05-13Check filter table when forwarding IP packetsGhanan Gowripalan
This change updates the forwarding path to perform the forwarding hook with iptables so that the filter table is consulted before a packet is forwarded Updates #170. Test: iptables_test.TestForwardingHook PiperOrigin-RevId: 373702359
2021-05-13Migrate PacketBuffer to use pkg/bufferTing-Yu Wang
Benchmark iperf3: Before After native->runsc 5.14 5.01 (Gbps) runsc->native 4.15 4.07 (Gbps) It did introduce overhead, mainly at the bridge between pkg/buffer and VectorisedView, the ExtractVV method. Once endpoints start migrating away from VV, this overhead will be gone. Updates #2404 PiperOrigin-RevId: 373651666
2021-05-13Rename SetForwarding to SetForwardingDefaultAndAllNICsGhanan Gowripalan
...to make it clear to callers that all interfaces are updated with the forwarding flag and that future NICs will be created with the new forwarding state. PiperOrigin-RevId: 373618435
2021-05-12Send ICMP errors when unable to forward fragmented packetsNick Brown
Before this change, we would silently drop packets when the packet was too big to be sent out through the NIC (and, for IPv4 packets, if DF was set). This change brings us into line with RFC 792 (IPv4) and RFC 4443 (IPv6), both of which specify that gateways should return an ICMP error to the sender when the packet can't be fragmented. PiperOrigin-RevId: 373480078
2021-05-11Merge pull request #5694 from kevinGC:align32gVisor bot
PiperOrigin-RevId: 373271579
2021-05-11Change AcquireAssignedAddress to use RLock.Bhasker Hariharan
This is a hot path for all incoming packets and we don't need an exclusive lock here as we are not modifying any of the fields protected by mu here. PiperOrigin-RevId: 373181254
2021-05-06Moved to atomicbitops and renamed filesKevin Krakauer
2021-05-06Solicit routers as long as RAs are handledGhanan Gowripalan
...to conform with Linux's `accept_ra` sysctl option. ``` accept_ra - INTEGER Accept Router Advertisements; autoconfigure using them. It also determines whether or not to transmit Router Solicitations. If and only if the functional setting is to accept Router Advertisements, Router Solicitations will be transmitted. Possible values are: 0 Do not accept Router Advertisements. 1 Accept Router Advertisements if forwarding is disabled. 2 Overrule forwarding behaviour. Accept Router Advertisements even if forwarding is enabled. Functional default: enabled if local forwarding is disabled. disabled if local forwarding is enabled. ``` With this change, routers may be solicited even if the stack is has forwarding enabled, as long as the interface is configured to handle RAs when forwarding is enabled. PiperOrigin-RevId: 372406501
2021-05-05Allow handling RAs when forwarding is enabledGhanan Gowripalan
...to conform with Linux's `accept_ra` sysctl option. ``` accept_ra - INTEGER Accept Router Advertisements; autoconfigure using them. It also determines whether or not to transmit Router Solicitations. If and only if the functional setting is to accept Router Advertisements, Router Solicitations will be transmitted. Possible values are: 0 Do not accept Router Advertisements. 1 Accept Router Advertisements if forwarding is disabled. 2 Overrule forwarding behaviour. Accept Router Advertisements even if forwarding is enabled. Functional default: enabled if local forwarding is disabled. disabled if local forwarding is enabled. ``` PiperOrigin-RevId: 372214640
2021-05-05Don't cleanup NDP state when enabling forwardingGhanan Gowripalan
...to match linux behaviour: ``` $ sudo sysctl net.ipv6.conf.eno1.forwarding net.ipv6.conf.eno1.forwarding = 0 $ ip addr list dev eno1 2: eno1: <...> ... inet6 PREFIX:TEMP_IID/64 scope global temporary dynamic valid_lft 209363sec preferred_lft 64024sec inet6 PREFIX:GLOBAL_STABLE_IID/64 scope global dynamic mngtmpaddr ... valid_lft 209363sec preferred_lft 209363sec inet6 fe80::LINKLOCAL_STABLE_IID/64 scope link valid_lft forever preferred_lft forever $ sudo sysctl -w "net.ipv6.conf.all.forwarding=1" net.ipv6.conf.all.forwarding = 1 $ sudo sysctl net.ipv6.conf.eno1.forwarding net.ipv6.conf.eno1.forwarding = 1 $ ip addr list dev eno1 2: eno1: <...> ... inet6 PREFIX:TEMP_IID/64 scope global temporary dynamic valid_lft 209339sec preferred_lft 64000sec inet6 PREFIX:GLOBAL_STABLE_IID/64 scope global dynamic mngtmpaddr ... valid_lft 209339sec preferred_lft 209339sec inet6 fe80::LINKLOCAL_STABLE_IID/64 scope link valid_lft forever preferred_lft forever $ ip -6 route list ... PREFIX::/64 dev eno1 proto ra metric 100 expires 209241sec pref medium default via fe80::ROUTER_IID dev eno1 proto ra ... ``` PiperOrigin-RevId: 372146689
2021-05-03Implement standard clock safelyGhanan Gowripalan
Previously, tcpip.StdClock depended on linking with the unexposed method time.now to implement tcpip.Clock using the time package. This change updates the standard clock to not require manually linking to this unexported method and use publicly documented functions from the time package. PiperOrigin-RevId: 371805101
2021-05-03Convey GSO capabilities through GSOEndpointGhanan Gowripalan
...as all GSO capable endpoints must implement GSOEndpoint. PiperOrigin-RevId: 371804175
2021-05-03netstack: Add a test for mixed Push/ConsumeTing-Yu Wang
Not really designed to be used this way, but it works and it's been relied upon. Add a test. PiperOrigin-RevId: 371802756
2021-04-29Fix up TODOs in the codeFabricio Voznika
PiperOrigin-RevId: 371231148
2021-04-29netstack: Rename pkt.Data().TrimFront() to DeleteFront(), and ...Ting-Yu Wang
... it may now invalidate backing slice references This is currently safe because TrimFront() in VectorisedView only shrinks the view. This may not hold under the a different buffer implementation. Reordering method calls order to allow this. PiperOrigin-RevId: 371167610
2021-04-21Only carry GSO options in the packet bufferGhanan Gowripalan
With this change, GSO options no longer needs to be passed around as a function argument in the write path. This change is done in preparation for a later change that defers segmentation, and may change GSO options for a packet as it flows down the stack. Updates #170. PiperOrigin-RevId: 369774872
2021-04-20Move SO_RCVBUF to socketops.Nayana Bidari
Fixes #2926, #674 PiperOrigin-RevId: 369457123
2021-04-19De-duplicate TCP state in TCPEndpointState vs tcp.endpointNick Brown
This change replaces individual private members in tcp.endpoint with a single private TCPEndpointState member. Some internal substructures within endpoint (receiver, sender) have been broken into a public substructure (which is then copied into the TCPEndpointState returned from completeState()) alongside other private fields. Fixes #4466 PiperOrigin-RevId: 369329514
2021-04-17Avoid ignoring incoming packet by demuxer on endpoint lookup failureMithun Iyer
This fixes a race that occurs while the endpoint is being unregistered and the transport demuxer attempts to match the incoming packet to any endpoint. The race specifically occurs when the unregistration (and deletion of the endpoint) occurs, after a successful endpointsByNIC lookup and before the endpoints map is further looked up with ingress NICID of the packet. The fix is to notify the caller of lookup-with-NICID failure, so that the logic falls through to handling unknown destination packets. For TCP this can mean replying back with RST. The syscall test in this CL catches this race as the ACK completing the handshake could get silently dropped on a listener close, causing no RST sent to the peer and timing out the poll waiting for POLLHUP. Fixes #5850 PiperOrigin-RevId: 369023779
2021-04-15Use nicer formatting for IP addresses in testsKevin Krakauer
This was semi-automated -- there are many addresses that were not replaced. Future commits should clean those up. Parse4 and Parse6 were given their own package because //pkg/test can introduce dependency cycles, as it depends transitively on //pkg/tcpip and some other netstack packages. PiperOrigin-RevId: 368726528
2021-04-09iptables: support postrouting hook and SNAT targetToshi Kikuchi
The current SNAT implementation has several limitations: - SNAT source port has to be specified. It is not optional. - SNAT source port range is not supported. - SNAT for UDP is a one-way translation. No response packets are handled (because conntrack doesn't support UDP currently). - SNAT and REDIRECT can't work on the same connection. Fixes #5489 PiperOrigin-RevId: 367750325
2021-04-09Rename IsV6LinkLocalAddress to IsV6LinkLocalUnicastAddressGhanan Gowripalan
To match the V4 variant. PiperOrigin-RevId: 367691981
2021-04-08Join all routers group when forwarding is enabledGhanan Gowripalan
See comments inline code for rationale. Test: ip_test.TestJoinLeaveAllRoutersGroup PiperOrigin-RevId: 367449434
2021-03-24Add POLLRDNORM/POLLWRNORM support.Bhasker Hariharan
On Linux these are meant to be equivalent to POLLIN/POLLOUT. Rather than hack these on in sys_poll etc it felt cleaner to just cleanup the call sites to notify for both events. This is what linux does as well. Fixes #5544 PiperOrigin-RevId: 364859977
2021-03-24Unexpose immutable fields in stack.RouteNick Brown
This change sets the inner `routeInfo` struct to be a named private member and replaces direct access with access through getters. Note that direct access to the fields of `routeInfo` is still possible through the `RouteInfo` struct. Fixes #4902 PiperOrigin-RevId: 364822872
2021-03-22Return tcpip.Error from (*Stack).GetMainNICAddressGhanan Gowripalan
PiperOrigin-RevId: 364381970
2021-03-17Do not use martian loopback packets in testsGhanan Gowripalan
Transport demuxer and UDP tests should not use a loopback address as the source address for packets injected into the stack as martian loopback packets will be dropped in a later change. PiperOrigin-RevId: 363479681
2021-03-16Detect looped-back NDP DAD messagesGhanan Gowripalan
...as per RFC 7527. If a looped-back DAD message is received, do not fail DAD since our own DAD message does not indicate that a neighbor has the address assigned. Test: ndp_test.TestDADResolveLoopback PiperOrigin-RevId: 363224288
2021-03-16Do not call into Stack from LinkAddressRequestGhanan Gowripalan
Calling into the stack from LinkAddressRequest is not needed as we already have a reference to the network endpoint (IPv6) or network interface (IPv4/ARP). PiperOrigin-RevId: 363213973
2021-03-11improve readability of ports packageKevin Krakauer
Lots of small changes: - simplify package API via Reservation type - rename some single-letter variable names that were hard to follow - rename some types PiperOrigin-RevId: 362442366
2021-03-08Implement /proc/sys/net/ipv4/ip_local_port_rangeKevin Krakauer
Speeds up the socket stress tests by a couple orders of magnitude. PiperOrigin-RevId: 361721050
2021-03-05Include duplicate address holder info in DADResultGhanan Gowripalan
The integrator may be interested in who owns a duplicate address so pass this information (if available) along. Fixes #5605. PiperOrigin-RevId: 361213556
2021-03-05Make stack.DADResult an interfaceGhanan Gowripalan
While I'm here, update NDPDispatcher.OnDuplicateAddressDetectionStatus to take a DADResult and rename it to OnDuplicateAddressDetectionResult. Fixes #5606. PiperOrigin-RevId: 360965416
2021-03-03Make dedicated methods for data operations in PacketBufferTing-Yu Wang
One of the preparation to decouple underlying buffer implementation. There are still some methods that tie to VectorisedView, and they will be changed gradually in later CLs. This CL also introduce a new ICMPv6ChecksumParams to replace long list of parameters when calling ICMPv6Checksum, aiming to be more descriptive. PiperOrigin-RevId: 360778149
2021-03-03Assert UpdatedAtNanos in neighbor cache testsSam Balana
Changes the neighbor_cache_test.go tests to always assert UpdatedAtNanos. Completes the assertion of UpdatedAtNanos in every NUD test, a field that was historically not checked due to the lack of a deterministic, controllable clock. This is no longer true with the tcpip.Clock interface. While the tests have been adjusted to use Clock, asserting by the UpdatedAtNanos was neglected. Fixes #4663 PiperOrigin-RevId: 360730077
2021-03-02Plumb link address request errors up to requesterTamir Duberstein
Prevent the situation where callers to (*stack).GetLinkAddress provide incorrect arguments and are unable to observe this condition. Updates #5583. PiperOrigin-RevId: 360481557
2021-02-26Assert UpdatedAtNanos in neighbor entry testsSam Balana
Changes the neighbor_entry_test.go tests to always assert UpdatedAtNanos. This field was historically not checked due to the lack of a deterministic, controllable clock. This is no longer true with the tcpip.Clock interface. While the tests have been adjusted to use Clock, asserting by the UpdatedAtNanos was neglected. Subsequent work is needed to assert UpdatedAtNanos in the neighbor cache tests. Updates #4663 PiperOrigin-RevId: 359868254
2021-02-26Embed sync.Mutex for entryTestLinkResolver and testNUDDispatcherSam Balana
Converts entryTestLinkResolver and testNUDDispatcher to use the embedded sync.Mutex pattern for fields that may be accessed concurrently from different gorountines. Fixes #5541 PiperOrigin-RevId: 359826169
2021-02-26Use helper functions in neighbor entry testsSam Balana
Adds helper functions for transitioning into common states. This reduces the boilerplate by a fair amount, decreasing the barriers to entry for new features added to neighborEntry. PiperOrigin-RevId: 359810465
2021-02-26Use closure to avoid manual unlockingTamir Duberstein
Also increase refcount of raw.endpoint.route while in use. Avoid allocating an array of size zero. PiperOrigin-RevId: 359797788
2021-02-25Remove deadlock in raw.endpoint caused by recursive read lockingKevin Krakauer
Prevents the following deadlock: - Raw packet is sent via e.Write(), which read locks e.mu - Connect() is called, blocking on write locking e.mu - The packet is routed to loopback and back to e.HandlePacket(), which read locks e.mu Per the atomic.RWMutex documentation, this deadlocks: "If a goroutine holds a RWMutex for reading and another goroutine might call Lock, no goroutine should expect to be able to acquire a read lock until the initial read lock is released. In particular, this prohibits recursive read locking. This is to ensure that the lock eventually becomes available; a blocked Lock call excludes new readers from acquiring the lock." Also, release eps.mu earlier in deliverRawPacket. PiperOrigin-RevId: 359600926
2021-02-24Cleanup temp SLAAC address jobs on DAD conflictsGhanan Gowripalan
Previously, when DAD would detect a conflict for a temporary address, the address would be removed but its timers would not be stopped, resulting in a panic when the removed address's invalidation timer fired. While I'm here, remove the check for unicast-ness on removed address endpoints since multicast addresses are no longer stored in the same structure as unicast addresses as of 27ee4fe76ad586ac8751951a842b3681f93. Test: stack_test.TestMixedSLAACAddrConflictRegen PiperOrigin-RevId: 359344849
2021-02-18Remove deprecated NUD types Failed and FailedEntryLookupsSam Balana
Completes the soft migration to Unreachable state by removing the Failed state and the the FailedEntryLookups StatCounter. Fixes #4667 PiperOrigin-RevId: 358226380
2021-02-17Move Name() out of netstack Matcher. It can live in the sentry.Kevin Krakauer
PiperOrigin-RevId: 358078157