summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/stack
AgeCommit message (Collapse)Author
2020-08-24Merge release-20200818.0-32-g339d266be (automated)gVisor bot
2020-08-24Consider loopback bound to all addresses in subnetGhanan Gowripalan
When a loopback interface is configurd with an address and associated subnet, the loopback should treat all addresses in that subnet as an address it owns. This is mimicking linux behaviour as seen below: ``` $ ip addr show dev lo 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group ... link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever $ ping 192.0.2.1 PING 192.0.2.1 (192.0.2.1) 56(84) bytes of data. ^C --- 192.0.2.1 ping statistics --- 2 packets transmitted, 0 received, 100% packet loss, time 1018ms $ ping 192.0.2.2 PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data. ^C --- 192.0.2.2 ping statistics --- 3 packets transmitted, 0 received, 100% packet loss, time 2039ms $ sudo ip addr add 192.0.2.1/24 dev lo $ ip addr show dev lo 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group ... link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet 192.0.2.1/24 scope global lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever $ ping 192.0.2.1 PING 192.0.2.1 (192.0.2.1) 56(84) bytes of data. 64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=0.131 ms 64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=0.046 ms 64 bytes from 192.0.2.1: icmp_seq=3 ttl=64 time=0.048 ms ^C --- 192.0.2.1 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2042ms rtt min/avg/max/mdev = 0.046/0.075/0.131/0.039 ms $ ping 192.0.2.2 PING 192.0.2.2 (192.0.2.2) 56(84) bytes of data. 64 bytes from 192.0.2.2: icmp_seq=1 ttl=64 time=0.131 ms 64 bytes from 192.0.2.2: icmp_seq=2 ttl=64 time=0.069 ms 64 bytes from 192.0.2.2: icmp_seq=3 ttl=64 time=0.049 ms 64 bytes from 192.0.2.2: icmp_seq=4 ttl=64 time=0.035 ms ^C --- 192.0.2.2 ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 3049ms rtt min/avg/max/mdev = 0.035/0.071/0.131/0.036 ms ``` Test: integration_test.TestLoopbackAcceptAllInSubnet PiperOrigin-RevId: 328188546
2020-08-20Merge release-20200810.0-74-g129018ab3 (automated)gVisor bot
2020-08-20Consistent precondition formattingMichael Pratt
Our "Preconditions:" blocks are very useful to determine the input invariants, but they are bit inconsistent throughout the codebase, which makes them harder to read (particularly cases with 5+ conditions in a single paragraph). I've reformatted all of the cases to fit in simple rules: 1. Cases with a single condition are placed on a single line. 2. Cases with multiple conditions are placed in a bulleted list. This format has been added to the style guide. I've also mentioned "Postconditions:", though those are much less frequently used, and all uses already match this style. PiperOrigin-RevId: 327687465
2020-08-17Merge release-20200810.0-40-ge3c4bbd10 (automated)gVisor bot
2020-08-17Remove address range functionsGhanan Gowripalan
Should have been removed in cl/326791119 https://github.com/google/gvisor/commit/9a7b5830aa063895f67ca0fdf653a46906374613 PiperOrigin-RevId: 327074156
2020-08-15Merge release-20200810.0-36-g9a7b5830a (automated)gVisor bot
2020-08-15Don't support address rangesGhanan Gowripalan
Previously the netstack supported assignment of a range of addresses. This feature is not used so remove it. PiperOrigin-RevId: 326791119
2020-08-15Merge release-20200810.0-35-g1736b2208 (automated)gVisor bot
2020-08-14Use a single NetworkEndpoint per NIC per protocolGhanan Gowripalan
The NetworkEndpoint does not need to be created for each address. Most of the work the NetworkEndpoint does is address agnostic. PiperOrigin-RevId: 326759605
2020-08-13Merge release-20200810.0-23-g47515f475 (automated)gVisor bot
2020-08-13Migrate to PacketHeader API for PacketBuffer.Ting-Yu Wang
Formerly, when a packet is constructed or parsed, all headers are set by the client code. This almost always involved prepending to pk.Header buffer or trimming pk.Data portion. This is known to prone to bugs, due to the complexity and number of the invariants assumed across netstack to maintain. In the new PacketHeader API, client will call Push()/Consume() method to construct/parse an outgoing/incoming packet. All invariants, such as slicing and trimming, are maintained by the API itself. NewPacketBuffer() is introduced to create new PacketBuffer. Zero value is no longer valid. PacketBuffer now assumes the packet is a concatenation of following portions: * LinkHeader * NetworkHeader * TransportHeader * Data Any of them could be empty, or zero-length. PiperOrigin-RevId: 326507688
2020-08-11Merge release-20200804.0-61-g8e31f0dc5 (automated)gVisor bot
2020-08-10Set the NetworkProtocolNumber of all PacketBuffers.Kevin Krakauer
NetworkEndpoints set the number on outgoing packets in Write() and NetworkProtocols set them on incoming packets in Parse(). Needed for #3549. PiperOrigin-RevId: 325938745
2020-08-09Merge release-20200804.0-54-gb404b5c25 (automated)gVisor bot
2020-08-08Use unicast source for ICMP echo repliesGhanan Gowripalan
Packets MUST NOT use a non-unicast source address for ICMP Echo Replies. Test: integration_test.TestPingMulticastBroadcast PiperOrigin-RevId: 325634380
2020-08-07Merge release-20200804.0-43-g94447aeab (automated)gVisor bot
2020-08-07Fix panic during Address Resolution of neighbor entry created by NSSam Balana
When a Neighbor Solicitation is received, a neighbor entry is created with the remote host's link layer address, but without a link layer address resolver. If the host decides to send a packet addressed to the IP address of that neighbor entry, Address Resolution starts with a nil pointer to the link layer address resolver. This causes the netstack to panic and crash. This change ensures that when a packet is sent in that situation, the link layer address resolver will be set before Address Resolution begins. Tests: pkg/tcpip/stack:stack_test + TestEntryUnknownToStaleToProbeToReachable - TestNeighborCacheEntryNoLinkAddress Updates #1889 Updates #1894 Updates #1895 Updates #1947 Updates #1948 Updates #1949 Updates #1950 PiperOrigin-RevId: 325516471
2020-08-06Merge release-20200804.0-28-gfc4dd3ef4 (automated)gVisor bot
2020-08-06Join IPv4 all-systems group on NIC enableGhanan Gowripalan
Test: - stack_test.TestJoinLeaveMulticastOnNICEnableDisable - integration_test.TestIncomingMulticastAndBroadcast PiperOrigin-RevId: 325185259
2020-08-06Merge release-20200804.0-24-g90a2d4e82 (automated)gVisor bot
2020-08-05Support receiving broadcast IPv4 packetsGhanan Gowripalan
Test: integration_test.TestIncomingSubnetBroadcast PiperOrigin-RevId: 325135617
2020-08-05Merge release-20200804.0-22-ge7b232a5b (automated)gVisor bot
2020-08-05Prefer RLock over Lock in functions that don't need Lock().Bhasker Hariharan
Updates #231 PiperOrigin-RevId: 325097683
2020-08-05Merge release-20200622.1-336-g0e6f7a12c (automated)gVisor bot
2020-08-04Update variables for implementation of RACK in TCPNayana Bidari
RACK (Recent Acknowledgement) is a new loss detection algorithm in TCP. These are the fields which should be stored on connections to implement RACK algorithm. PiperOrigin-RevId: 324948703
2020-07-31Merge release-20200622.1-300-ga7d9aa6d5 (automated)gVisor bot
2020-07-31iptables: support SO_ORIGINAL_DSTKevin Krakauer
Envoy (#170) uses this to get the original destination of redirected packets.
2020-07-30Implement neighbor unreachability detection for ARP and NDP.Sam Balana
This change implements the Neighbor Unreachability Detection (NUD) state machine, as per RFC 4861 [1]. The state machine operates on a single neighbor in the local network. This requires the state machine to be implemented on each entry of the neighbor table. This change also adds, but does not expose, several APIs. The first API is for performing basic operations on the neighbor table: - Create a static entry - List all entries - Delete all entries - Remove an entry by address The second API is used for changing the NUD protocol constants on a per-NIC basis to allow Neighbor Discovery to operate over links with widely varying performance characteristics. See [RFC 4861 Section 10][2] for the list of constants. Finally, the last API is for allowing users to subscribe to NUD state changes. See [RFC 4861 Appendix C][3] for the list of edges. [1]: https://tools.ietf.org/html/rfc4861 [2]: https://tools.ietf.org/html/rfc4861#section-10 [3]: https://tools.ietf.org/html/rfc4861#appendix-C Tests: pkg/tcpip/stack:stack_test - TestNeighborCacheAddStaticEntryThenOverflow - TestNeighborCacheClear - TestNeighborCacheClearThenOverflow - TestNeighborCacheConcurrent - TestNeighborCacheDuplicateStaticEntryWithDifferentLinkAddress - TestNeighborCacheDuplicateStaticEntryWithSameLinkAddress - TestNeighborCacheEntry - TestNeighborCacheEntryNoLinkAddress - TestNeighborCacheGetConfig - TestNeighborCacheKeepFrequentlyUsed - TestNeighborCacheNotifiesWaker - TestNeighborCacheOverflow - TestNeighborCacheOverwriteWithStaticEntryThenOverflow - TestNeighborCacheRemoveEntry - TestNeighborCacheRemoveEntryThenOverflow - TestNeighborCacheRemoveStaticEntry - TestNeighborCacheRemoveStaticEntryThenOverflow - TestNeighborCacheRemoveWaker - TestNeighborCacheReplace - TestNeighborCacheResolutionFailed - TestNeighborCacheResolutionTimeout - TestNeighborCacheSetConfig - TestNeighborCacheStaticResolution - TestEntryAddsAndClearsWakers - TestEntryDelayToProbe - TestEntryDelayToReachableWhenSolicitedOverrideConfirmation - TestEntryDelayToReachableWhenUpperLevelConfirmation - TestEntryDelayToStaleWhenConfirmationWithDifferentAddress - TestEntryDelayToStaleWhenProbeWithDifferentAddress - TestEntryFailedGetsDeleted - TestEntryIncompleteToFailed - TestEntryIncompleteToIncompleteDoesNotChangeUpdatedAt - TestEntryIncompleteToReachable - TestEntryIncompleteToReachableWithRouterFlag - TestEntryIncompleteToStale - TestEntryInitiallyUnknown - TestEntryProbeToFailed - TestEntryProbeToReachableWhenSolicitedConfirmationWithSameAddress - TestEntryProbeToReachableWhenSolicitedOverrideConfirmation - TestEntryProbeToStaleWhenConfirmationWithDifferentAddress - TestEntryProbeToStaleWhenProbeWithDifferentAddress - TestEntryReachableToStaleWhenConfirmationWithDifferentAddress - TestEntryReachableToStaleWhenConfirmationWithDifferentAddressAndOverride - TestEntryReachableToStaleWhenProbeWithDifferentAddress - TestEntryReachableToStaleWhenTimeout - TestEntryStaleToDelay - TestEntryStaleToReachableWhenSolicitedOverrideConfirmation - TestEntryStaleToStaleWhenOverrideConfirmation - TestEntryStaleToStaleWhenProbeUpdateAddress - TestEntryStaysDelayWhenOverrideConfirmationWithSameAddress - TestEntryStaysProbeWhenOverrideConfirmationWithSameAddress - TestEntryStaysReachableWhenConfirmationWithRouterFlag - TestEntryStaysReachableWhenProbeWithSameAddress - TestEntryStaysStaleWhenProbeWithSameAddress - TestEntryUnknownToIncomplete - TestEntryUnknownToStale - TestEntryUnknownToUnknownWhenConfirmationWithUnknownAddress pkg/tcpip/stack:stack_x_test - TestDefaultNUDConfigurations - TestNUDConfigurationFailsForNotSupported - TestNUDConfigurationsBaseReachableTime - TestNUDConfigurationsDelayFirstProbeTime - TestNUDConfigurationsMaxMulticastProbes - TestNUDConfigurationsMaxRandomFactor - TestNUDConfigurationsMaxUnicastProbes - TestNUDConfigurationsMinRandomFactor - TestNUDConfigurationsRetransmitTimer - TestNUDConfigurationsUnreachableTime - TestNUDStateReachableTime - TestNUDStateRecomputeReachableTime - TestSetNUDConfigurationFailsForBadNICID - TestSetNUDConfigurationFailsForNotSupported [1]: https://tools.ietf.org/html/rfc4861 [2]: https://tools.ietf.org/html/rfc4861#section-10 [3]: https://tools.ietf.org/html/rfc4861#appendix-C Updates #1889 Updates #1894 Updates #1895 Updates #1947 Updates #1948 Updates #1949 Updates #1950 PiperOrigin-RevId: 324070795
2020-07-30Use brodcast MAC for broadcast IPv4 packetsGhanan Gowripalan
When sending packets to a known network's broadcast address, use the broadcast MAC address. Test: - stack_test.TestOutgoingSubnetBroadcast - udp_test.TestOutgoingSubnetBroadcast PiperOrigin-RevId: 324062407
2020-07-28Redirect TODO to GitHub issuesFabricio Voznika
PiperOrigin-RevId: 323715260
2020-07-27Merge release-20200622.1-240-g8dbf428a1 (automated)gVisor bot
2020-07-27Add ability to send unicast ARP requests and Neighbor SolicitationsSam Balana
The previous implementation of LinkAddressRequest only supported sending broadcast ARP requests and multicast Neighbor Solicitations. The ability to send these packets as unicast is required for Neighbor Unreachability Detection. Tests: pkg/tcpip/network/arp:arp_test - TestLinkAddressRequest pkg/tcpip/network/ipv6:ipv6_test - TestLinkAddressRequest Updates #1889 Updates #1894 Updates #1895 Updates #1947 Updates #1948 Updates #1949 Updates #1950 PiperOrigin-RevId: 323451569
2020-07-24Merge release-20200622.1-208-g82a5cada5 (automated)gVisor bot
2020-07-23Add AfterFunc to tcpip.ClockSam Balana
Changes the API of tcpip.Clock to also provide a method for scheduling and rescheduling work after a specified duration. This change also implements the AfterFunc method for existing implementations of tcpip.Clock. This is the groundwork required to mock time within tests. All references to CancellableTimer has been replaced with the tcpip.Job interface, allowing for custom implementations of scheduling work. This is a BREAKING CHANGE for clients that implement their own tcpip.Clock or use tcpip.CancellableTimer. Migration plan: 1. Add AfterFunc(d, f) to tcpip.Clock 2. Replace references of tcpip.CancellableTimer with tcpip.Job 3. Replace calls to tcpip.CancellableTimer#StopLocked with tcpip.Job#Cancel 4. Replace calls to tcpip.CancellableTimer#Reset with tcpip.Job#Schedule 5. Replace calls to tcpip.NewCancellableTimer with tcpip.NewJob. PiperOrigin-RevId: 322906897
2020-07-23Merge release-20200622.1-200-gdd530eeef (automated)gVisor bot
2020-07-23iptables: use keyed array literalsKevin Krakauer
PiperOrigin-RevId: 322882426
2020-07-23Merge release-20200622.1-198-gfc26b3764 (automated)gVisor bot
2020-07-23Merge pull request #3207 from kevinGC:icmp-connectgVisor bot
PiperOrigin-RevId: 322853192
2020-07-23Merge release-20200622.1-191-g36257e6b7 (automated)gVisor bot
2020-07-22make connect(2) fail when dest is unreachableKevin Krakauer
Previously, ICMP destination unreachable datagrams were ignored by TCP endpoints. This caused connect to hang when an intermediate router couldn't find a route to the host. This manifested as a Kokoro error when Docker IPv6 was enabled. The Ruby image test would try to install the sinatra gem and hang indefinitely attempting to use an IPv6 address. Fixes #3079.
2020-07-22iptables: don't NAT existing connectionsKevin Krakauer
Fixes a NAT bug that manifested as: - A SYN was sent from gVisor to another host, unaffected by iptables. - The corresponding SYN/ACK was NATted by a PREROUTING REDIRECT rule despite being part of the existing connection. - The socket that sent the SYN never received the SYN/ACK and thus a connection could not be established. We handle this (as Linux does) by tracking all connections, inserting a no-op conntrack rule for new connections with no rules of their own. Needed for istio support (#170).
2020-07-22Merge release-20200622.1-187-gbd98f8201 (automated)gVisor bot
2020-07-22iptables: replace maps with arraysKevin Krakauer
For iptables users, Check() is a hot path called for every packet one or more times. Let's avoid a bunch of map lookups. PiperOrigin-RevId: 322678699
2020-07-22Merge release-20200622.1-184-g71bf90c55 (automated)gVisor bot
2020-07-22Support for receiving outbound packets in AF_PACKET.Bhasker Hariharan
Updates #173 PiperOrigin-RevId: 322665518
2020-07-15Merge release-20200622.1-167-ge92f38ff0 (automated)gVisor bot
2020-07-15iptables: remove check for NetworkHeaderKevin Krakauer
This is no longer necessary, as we always set NetworkHeader before calling iptables.Check. PiperOrigin-RevId: 321461978
2020-07-15Merge release-20200622.1-162-gfef90c61c (automated)gVisor bot
2020-07-15Fix minor bugs in a couple of interface IOCTLs.Bhasker Hariharan
gVisor incorrectly returns the wrong ARP type for SIOGIFHWADDR. This breaks tcpdump as it tries to interpret the packets incorrectly. Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which fails with an EINVAL since we don't implement it. For now change it to return EOPNOTSUPP to indicate that we don't support the query rather than return EINVAL. NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities and NIC flags. In Linux all 3 exist eg. ARPHRD types are stored in dev->type field while NIC capabilities are more like the device features which can be queried using SIOCETHTOOL but not modified and NIC Flags are fields that can be modified from user space. eg. NIC status (UP/DOWN/MULTICAST/BROADCAST) etc. Updates #2746 PiperOrigin-RevId: 321436525