gvisor - Container Runtime Sandbox

Age	Commit message	Author
2021-02-06	Unexpose NIC	Ghanan Gowripalan
	The NIC structure is not to be used outside of the stack package directly. PiperOrigin-RevId: 356036737
2021-01-31	Use different neighbor tables per network endpoint	Ghanan Gowripalan
	This stores each protocol's neighbor state separately. This change also removes the need for each neighbor entry to keep track of their own link address resolver now that all the entries in a cache will use the same resolver. PiperOrigin-RevId: 354818155
2021-01-30	Implement LinkAddressResolver on NetworkEndpoints	Ghanan Gowripalan
	This removes the need to provide the link address request with the NIC the request is being performed on since the NetworkEndpoints already have a reference to the NIC. PiperOrigin-RevId: 354721940
2021-01-28	Change tcpip.Error to an interface	Tamir Duberstein
	This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
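A minimal sketch of what an interface-typed error makes possible. The names below (`Error`, `ErrNoRoute`, `ErrMessageTooLong` and its `MTU` field) are hypothetical and are not the definitions in the tcpip package; the point is only that each error can be its own type and carry fields that callers recover with a type switch.

```go
package main

import "fmt"

// Error is a hypothetical stand-in for an error interface. It is not the
// actual tcpip.Error definition.
type Error interface {
	fmt.Stringer
	isError()
}

// ErrNoRoute is a hypothetical error that carries no data.
type ErrNoRoute struct{}

func (*ErrNoRoute) isError()       {}
func (*ErrNoRoute) String() string { return "no route" }

// ErrMessageTooLong is a hypothetical error that carries extra data.
type ErrMessageTooLong struct {
	MTU uint32
}

func (*ErrMessageTooLong) isError() {}
func (e *ErrMessageTooLong) String() string {
	return fmt.Sprintf("message too long (mtu=%d)", e.MTU)
}

// describe uses a type switch to recover the data attached to an error.
func describe(err Error) string {
	switch e := err.(type) {
	case *ErrMessageTooLong:
		return fmt.Sprintf("retry with payload <= %d bytes", e.MTU)
	default:
		return e.String()
	}
}

func main() {
	fmt.Println(describe(&ErrNoRoute{}))
	fmt.Println(describe(&ErrMessageTooLong{MTU: 1280}))
}
```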
2021-01-27	Confirm neighbor reachability with TCP ACKs	Ghanan Gowripalan
	As per RFC 4861 section 7.3.1, A neighbor is considered reachable if the node has recently received a confirmation that packets sent recently to the neighbor were received by its IP layer. Positive confirmation can be gathered in two ways: hints from upper-layer protocols that indicate a connection is making "forward progress", or receipt of a Neighbor Advertisement message that is a response to a Neighbor Solicitation message. This change adds support for TCP to let the IP/link layers know that a neighbor is reachable. Test: integration_test.TestTCPConfirmNeighborReachability PiperOrigin-RevId: 354222833
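The upper-layer hint described above can be sketched as a simplified neighbor state machine. The type and method names are hypothetical and timers are left out. The sketch assumes, following the RFC 4861 Appendix C state table, that an upper-layer confirmation moves any entry except an Incomplete one to Reachable.

```go
package main

import "fmt"

// state is a simplified set of NUD states from RFC 4861.
type state int

const (
	incomplete state = iota
	reachable
	stale
	delay
	probe
)

// neighborEntry is a hypothetical, heavily simplified neighbor cache entry.
type neighborEntry struct {
	state state
}

// handleUpperLayerConfirmation is what a transport protocol such as TCP
// would trigger when an ACK shows forward progress. An Incomplete entry has
// no link address yet, so the hint is ignored in that state.
func (e *neighborEntry) handleUpperLayerConfirmation() {
	if e.state == incomplete {
		return
	}
	e.state = reachable
}

func main() {
	e := neighborEntry{state: stale}
	e.handleUpperLayerConfirmation()
	fmt.Println(e.state == reachable)
}
```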
2021-01-22	Pass RouteInfo to the route resolve callback	Ghanan Gowripalan
	The route resolution callback will be called with a stack.ResolvedFieldsResult which will hold the route info so callers can avoid attempting resolution again to check if a previous resolution attempt succeeded or not. Test: integration_test.TestRouteResolvedFields PiperOrigin-RevId: 353319019
2021-01-21	Only use callback for GetLinkAddress	Ghanan Gowripalan
	GetLinkAddress's callback will be called immediately with a stack.LinkResolutionResult which will hold the link address so no need to also return the link address from the function. Fixes #5151. PiperOrigin-RevId: 353157857
2021-01-21	Do not cache remote link address in Route	Ghanan Gowripalan
	...unless explicitly requested via ResolveWith. Remove cancelled channels from pending packets as we can use the link resolution channel in a FIFO to limit the number of maximum pending resolutions we should queue packets for. This change also defers starting the goroutine that handles link resolution completion to when link resolution succeeds, fails or gets cancelled due to the max number of pending resolutions being reached. Fixes #751. PiperOrigin-RevId: 353130577
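A rough sketch of bounding pending resolutions with a FIFO. The type below is hypothetical, and it assumes that the oldest pending resolution is the one cancelled when the limit is reached; the commit message does not say which entry is dropped.

```go
package main

import "fmt"

// pendingResolutions is a hypothetical FIFO of remote addresses that have
// packets queued while link resolution is in progress.
type pendingResolutions struct {
	max   int
	queue []string // oldest first
}

// add records a new pending resolution. If the limit is exceeded, the oldest
// pending resolution is removed and returned so the caller can cancel it.
func (p *pendingResolutions) add(addr string) (cancelled string, ok bool) {
	p.queue = append(p.queue, addr)
	if len(p.queue) <= p.max {
		return "", false
	}
	cancelled = p.queue[0]
	p.queue = p.queue[1:]
	return cancelled, true
}

func main() {
	p := pendingResolutions{max: 2}
	for _, a := range []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"} {
		if old, ok := p.add(a); ok {
			fmt.Println("cancelled resolution for", old)
		}
	}
}
```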
2021-01-15	Resolve known link address on route creation	Ghanan Gowripalan
	If a Route is being created through a link that requires link address resolution and a remote address that has a known mapping to a link address, populate the link address when the route is created. This removes the need for neighbor/link address caches to perform this check. Fixes #5149 PiperOrigin-RevId: 352122401
2021-01-15	Support GetLinkAddress with neighborCache	Ghanan Gowripalan
	Test: integration_test.TestGetLinkAddress PiperOrigin-RevId: 352119404
2021-01-15	Only pass stack.Route's fields to LinkEndpoints	Ghanan Gowripalan
	stack.Route is used to send network packets and resolve link addresses. A LinkEndpoint does not need to do either of these and only needs the route's fields at the time of the packet write request. Since LinkEndpoints only need the route's fields when writing packets, pass a stack.RouteInfo instead. PiperOrigin-RevId: 352108405
2020-12-22	Invoke address resolution upon subsequent traffic to Failed neighbor	Peter Johnston
	Removes the period of time in which subsequent traffic to a Failed neighbor immediately fails with ErrNoLinkAddress. A Failed neighbor is one for which address resolution failed; in other words, the neighbor's IP address cannot be translated to a MAC address. This means removing the Failed state for linkAddrCache and allowing transitioning out of Failed into Incomplete for neighborCache. Previously, both caches would transition entries to Failed after address resolution failed. In this state, any subsequent traffic requested within an unreachable time would immediately fail with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3: "If address resolution fails, the entry SHOULD be deleted, so that subsequent traffic to that neighbor invokes the next-hop determination procedure again. Invoking next-hop determination at this point ensures that alternate default routers are tried." The API for getting a link address for a given address, whether through the link address cache or the neighbor table, is updated to optionally take a callback which will be called when address resolution completes. This allows `Route` to handle completing link resolution internally, so callers of (Route).Resolve (e.g. endpoints) don’t have to keep track of when it completes and update the Route accordingly. This change also removes the wakers from LinkAddressCache, NeighborCache, and Route in favor of the callbacks, and callers that previously used a waker can now just pass a callback to (Route).Resolve that will notify the waker on resolution completion. Fixes #4796 Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 348597478
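The callback-based lookup and the "delete on failure" rule can be sketched roughly as follows. All names here (`cache`, `get`, `complete`, `result`) are hypothetical and much simpler than the real linkAddrCache and neighborCache. The sketch only shows that a failed entry is dropped, so the next lookup starts resolution again, and that callers are notified through a callback instead of a waker.

```go
package main

import (
	"fmt"
	"sync"
)

// result is what a resolution callback receives.
type result struct {
	linkAddr string
	ok       bool
}

type entry struct {
	linkAddr  string
	resolved  bool
	onResolve []func(result)
}

// cache is a hypothetical link address cache keyed by IP address.
type cache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newCache() *cache { return &cache{entries: make(map[string]*entry)} }

// get returns the link address if it is known. Otherwise it starts (or joins)
// a resolution and, if cb is non-nil, registers it to be called on completion.
func (c *cache) get(addr string, cb func(result)) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, found := c.entries[addr]
	if !found {
		// Unknown or previously failed neighbor: start a fresh resolution.
		e = &entry{}
		c.entries[addr] = e
	}
	if e.resolved {
		return e.linkAddr, true
	}
	if cb != nil {
		e.onResolve = append(e.onResolve, cb)
	}
	return "", false
}

// complete finishes a resolution. On failure the entry is deleted so that
// later traffic triggers resolution again instead of failing immediately.
func (c *cache) complete(addr, linkAddr string, ok bool) {
	c.mu.Lock()
	e, found := c.entries[addr]
	if !found {
		c.mu.Unlock()
		return
	}
	cbs := e.onResolve
	e.onResolve = nil
	if ok {
		e.linkAddr, e.resolved = linkAddr, true
	} else {
		delete(c.entries, addr)
	}
	c.mu.Unlock()
	// Run callbacks without holding the lock.
	for _, cb := range cbs {
		cb(result{linkAddr: linkAddr, ok: ok})
	}
}

func main() {
	c := newCache()
	c.get("10.0.0.2", func(r result) { fmt.Println("first attempt ok:", r.ok) })
	c.complete("10.0.0.2", "", false)
	c.get("10.0.0.2", func(r result) { fmt.Println("second attempt:", r.linkAddr) })
	c.complete("10.0.0.2", "02:00:00:00:00:02", true)
}
```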
2020-12-03	Make `stack.Route` thread safe	Peter Johnston
	Currently we rely on the user to take the lock on the endpoint that owns the route, in order to modify it safely. We can instead move `Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and thread-safe access to other fields of `Route`. PiperOrigin-RevId: 345461586
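As an illustration of the locking scheme described above, the sketch below guards only the mutable link address with the route's own mutex and leaves immutable fields lock-free. The `route` type and its methods are hypothetical simplifications, not the actual stack.Route.

```go
package main

import (
	"fmt"
	"sync"
)

// route is a hypothetical, simplified route.
type route struct {
	// localAddress is immutable after creation, so it is safe to read
	// without locking.
	localAddress string

	mu sync.RWMutex
	// remoteLinkAddress is guarded by mu.
	remoteLinkAddress string
}

// RemoteLinkAddress returns the link address under a read lock.
func (r *route) RemoteLinkAddress() string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.remoteLinkAddress
}

// ResolveWith sets the link address under the write lock.
func (r *route) ResolveWith(addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.remoteLinkAddress = addr
}

func main() {
	r := &route{localAddress: "10.0.0.1"}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Callers no longer need to hold an endpoint lock.
			r.ResolveWith("02:00:00:00:00:01")
			_ = r.RemoteLinkAddress()
		}()
	}
	wg.Wait()
	fmt.Println(r.RemoteLinkAddress())
}
```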
2020-11-25	Make stack.Route safe to access concurrently	Ghanan Gowripalan
	Multiple goroutines may use the same stack.Route concurrently so the stack.Route should make sure that any functions called on it are thread-safe. Fixes #4073 PiperOrigin-RevId: 344320491
2020-11-18	Remove unused methods from stack.Route	Ghanan Gowripalan
	PiperOrigin-RevId: 343211553
2020-11-18	Fix loopback subnet routing error	Ghanan Gowripalan
	Packets should be properly routed when sending packets to addresses in the loopback subnet which are not explicitly assigned to the loopback interface. Tests: - integration_test.TestLoopbackAcceptAllInSubnetUDP - integration_test.TestLoopbackAcceptAllInSubnetTCP PiperOrigin-RevId: 343135643
2020-11-12	Move packet handling to NetworkEndpoint	Ghanan Gowripalan
	The NIC should not hold network-layer state or logic - network packet handling/forwarding should be performed at the network layer instead of the NIC. Fixes #4688 PiperOrigin-RevId: 342166985
2020-11-05	Cache addressEndpoint.addr.Subnet() to avoid allocations.	Bhasker Hariharan
	This change adds a Subnet() method to AddressableEndpoint so that we can avoid repeated calls to AddressableEndpoint.AddressWithPrefix().Subnet(). Updates #231 PiperOrigin-RevId: 340969877
2020-11-05	Use stack.Route exclusively for writing packets	Ghanan Gowripalan
	* Remove stack.Route from incoming packet path. There is no need to pass around a stack.Route during the incoming path of a packet. Instead, pass around the packet's link/network layer information in the packet buffer since all layers may need this information. * Support address bound and outgoing packet NIC in routes. When forwarding is enabled, the source address of a packet may be bound to a different interface than the outgoing interface. This change updates stack.Route to hold both NICs so that one can be used to write packets while the other is used to check if the route's bound address is valid. Note, we need to hold the address's interface so we can check if the address is a spoofed address. * Introduce the concept of a local route. Local routes are routes where the packet never needs to leave the stack; the destination is stack-local. We can now route between interfaces within a stack if the packet never needs to leave the stack, even when forwarding is disabled. * Always obtain a route from the stack before sending a packet. If a packet needs to be sent in response to an incoming packet, a route must be obtained from the stack to ensure the stack is configured to send packets to the packet's source from the packet's destination. * Enable spoofing if a stack may send packets from unowned addresses. This change required changes to some netgophers since previously, promiscuous mode was enough to let the netstack respond to all incoming packets regardless of the packet's destination address. Now that a stack.Route is not held for each incoming packet, finding a route may fail with local addresses we don't own but accepted packets for while in promiscuous mode. Since we also want to be able to send from any address (in response to the received promiscuous mode packets), we need to enable spoofing. * Skip transport layer checksum checks for locally generated packets. If a packet is locally generated, the stack can safely assume that no errors were introduced while being locally routed since the packet is never sent out the wire. Some bugs fixed: - transport layer checksum was never calculated after NAT. - handleLocal didn't handle routing across interfaces. - stack didn't support forwarding across interfaces. - always consult the routing table before creating an endpoint. Updates #4688 Fixes #3906 PiperOrigin-RevId: 340943442
2020-10-30	Automated rollback of changelist 339750876	Dean Deng
	PiperOrigin-RevId: 339945377
2020-10-29	Automated rollback of changelist 339675182	Dean Deng
	PiperOrigin-RevId: 339750876
2020-10-29	Delay goroutine creation during TCP handshake for accept/connect.	Dean Deng
	Refactor TCP handshake code so that when connect is initiated, the initial SYN is sent before creating a goroutine to handle the rest of the handshake (which blocks). Similarly, the initial SYN-ACK is sent inline when SYN is received during accept. Some additional cleanup is done as well. Eventually we would like to complete connections in the dispatcher without requiring a wakeup to complete the handshake. This refactor makes that easier. Updates #231 PiperOrigin-RevId: 339675182
2020-10-14	Find route before sending NA response	Ghanan Gowripalan
	This change also brings back the stack.Route.ResolveWith method so that we can immediately resolve a route when sending an NA in response to an NS with a source link layer address option. Test: ipv6_test.TestNeighorSolicitationResponse PiperOrigin-RevId: 337185461
2020-10-09	Automated rollback of changelist 336304024	Ghanan Gowripalan
	PiperOrigin-RevId: 336339194
2020-10-09	Automated rollback of changelist 336185457	Bhasker Hariharan
	PiperOrigin-RevId: 336304024
2020-10-08	Do not resolve routes immediately	Ghanan Gowripalan
	When a response needs to be sent to an incoming packet, the stack should consult its neighbor table to determine the remote address's link address. When an entry does not exist in the stack's neighbor table, the stack should queue the packet while link resolution completes. See comments. PiperOrigin-RevId: 336185457
2020-10-05	Remove AssignableAddressEndpoint.NetworkEndpoint	Ghanan Gowripalan
	We can get the network endpoint directly from the NIC. This is a preparatory CL for when a Route needs to hold a dedicated NIC as its output interface. This is because when forwarding is enabled, packets may be sent from a NIC different from the NIC a route's local address is associated with. PiperOrigin-RevId: 335484500
2020-09-30	Count IP OutgoingPacketErrors in the NetworkEndpoint methods	Arthur Sfez
	Before this change, OutgoingPacketErrors was incremented in the stack.Route methods. This was going to be a problem once IPv4/IPv6 WritePackets support fragmentation because Route.WritePackets might not know how many packets are left after an error occurs. Test: - pkg/tcpip/network/ipv4:ipv4_test - pkg/tcpip/network/ipv6:ipv6_test PiperOrigin-RevId: 334687983
2020-09-29	Trim Network/Transport Endpoint/Protocol	Ghanan Gowripalan
	* Remove Capabilities and NICID methods from NetworkEndpoint. * Remove linkEP and stack parameters from NetworkProtocol.NewEndpoint. The LinkEndpoint can be fetched from the NetworkInterface. The stack is passed to the NetworkProtocol when it is created so the NetworkEndpoint can get it from its protocol. * Remove stack parameter from TransportProtocol.NewEndpoint. Like the NetworkProtocol/Endpoint, the stack is passed to the TransportProtocol when it is created. PiperOrigin-RevId: 334332721
2020-09-29	Move IP state from NIC to NetworkEndpoint/Protocol	Ghanan Gowripalan
	* Add network address to network endpoints. Hold network-specific state in the NetworkEndpoint instead of the stack. This results in the stack no longer needing to "know" about the network endpoints and special case certain work for various endpoints (e.g. IPv6 DAD). * Provide NetworkEndpoints with a NetworkInterface interface. Instead of just passing the NIC ID of a NIC, pass an interface so the network endpoint may query other information about the NIC such as whether or not it is a loopback device. * Move NDP code and state to the IPv6 package. NDP is IPv6 specific so there is no need for it to live in the stack. * Control forwarding through NetworkProtocols instead of Stack. Forwarding should be controlled on a per-network protocol basis so forwarding configurations are now controlled through network protocols. * Remove stack.referencedNetworkEndpoint. Now that addresses are exposed via AddressEndpoint and only one NetworkEndpoint is created per interface, there is no need for a referenced NetworkEndpoint. * Assume network teardown methods are infallible. Fixes #3871, #3916 PiperOrigin-RevId: 334319433
2020-09-15	Don't conclude broadcast from route destination	Ghanan Gowripalan
	The routing table (in its current form) should not be used to make decisions about whether a remote address is a broadcast address or not (for IPv4). Note, a destination subnet does not always map to a network. E.g. RouterA may have a route to 192.168.0.0/22 through RouterB, but RouterB may be configured with 4x /24 subnets on 4 different interfaces. See https://github.com/google/gvisor/issues/3938. PiperOrigin-RevId: 331819868
2020-08-25	Add option to replace linkAddrCache with neighborCache	Sam Balana
	This change adds an option to replace the current implementation of ARP through linkAddrCache, with an implementation of NUD through neighborCache. Switching to using NUD for both ARP and NDP is beneficial for the reasons described by RFC 4861 Section 3.1: "[Using NUD] significantly improves the robustness of packet delivery in the presence of failing routers, partially failing or partitioned links, or nodes that change their link-layer addresses. For instance, mobile nodes can move off-link without losing any connectivity due to stale ARP caches." "Unlike ARP, Neighbor Unreachability Detection detects half-link failures and avoids sending traffic to neighbors with which two-way connectivity is absent." This change also exposes the API for querying and operating the neighbor cache. Operations include: - Create a static entry - List all entries - Delete all entries - Remove an entry by address This also exposes the API to change the NUD protocol constants on a per-NIC basis to allow Neighbor Discovery to operate over links with widely varying performance characteristics. See [RFC 4861 Section 10][1] for the list of constants. Finally, an API for subscribing to NUD state changes is exposed through NUDDispatcher. See [RFC 4861 Appendix C][2] for the list of edges. Tests: pkg/tcpip/network/arp:arp_test + TestDirectRequest pkg/tcpip/network/ipv6:ipv6_test + TestLinkResolution + TestNDPValidation + TestNeighorAdvertisementWithTargetLinkLayerOption + TestNeighorSolicitationResponse + TestNeighorSolicitationWithSourceLinkLayerOption + TestRouterAdvertValidation pkg/tcpip/stack:stack_test + TestCacheWaker + TestForwardingWithFakeResolver + TestForwardingWithFakeResolverManyPackets + TestForwardingWithFakeResolverManyResolutions + TestForwardingWithFakeResolverPartialTimeout + TestForwardingWithFakeResolverTwoPackets + TestIPv6SourceAddressSelectionScopeAndSameAddress [1]: https://tools.ietf.org/html/rfc4861#section-10 [2]: https://tools.ietf.org/html/rfc4861#appendix-C Fixes #1889 Fixes #1894 Fixes #1895 Fixes #1947 Fixes #1948 Fixes #1949 Fixes #1950 PiperOrigin-RevId: 328365034
2020-08-13	Migrate to PacketHeader API for PacketBuffer.	Ting-Yu Wang
	Formerly, when a packet was constructed or parsed, all headers were set by the client code. This almost always involved prepending to the pk.Header buffer or trimming the pk.Data portion. This is known to be prone to bugs, due to the complexity and number of invariants that must be maintained across netstack. In the new PacketHeader API, the client calls the Push()/Consume() methods to construct/parse an outgoing/incoming packet. All invariants, such as slicing and trimming, are maintained by the API itself. NewPacketBuffer() is introduced to create a new PacketBuffer. The zero value is no longer valid. PacketBuffer now assumes the packet is a concatenation of the following portions: * LinkHeader * NetworkHeader * TransportHeader * Data Any of them could be empty, or zero-length. PiperOrigin-RevId: 326507688
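The push/consume idea can be sketched with a toy packet type. Everything below (`packet`, `newPacket`, `push`, `consume`, `bytes`, `data`) is hypothetical and far simpler than the real PacketBuffer; it only illustrates that the type itself, rather than each caller, keeps track of where headers begin and end.

```go
package main

import "fmt"

// packet is a hypothetical buffer: reserved header space followed by payload.
type packet struct {
	buf      []byte
	reserved int // bytes of header space reserved in front of the payload
	pushed   int // header bytes pushed so far (outgoing path)
	consumed int // payload bytes already parsed as headers (incoming path)
}

func newPacket(reserve int, payload []byte) *packet {
	buf := make([]byte, reserve+len(payload))
	copy(buf[reserve:], payload)
	return &packet{buf: buf, reserved: reserve}
}

// push returns n bytes located directly in front of everything pushed so far.
func (p *packet) push(n int) []byte {
	if p.pushed+n > p.reserved {
		panic("not enough reserved header space")
	}
	p.pushed += n
	start := p.reserved - p.pushed
	return p.buf[start : start+n]
}

// consume marks the next n bytes of data as a parsed header and returns them.
func (p *packet) consume(n int) ([]byte, bool) {
	off := p.reserved + p.consumed
	if off+n > len(p.buf) {
		return nil, false
	}
	p.consumed += n
	return p.buf[off : off+n], true
}

// bytes returns all pushed headers followed by the payload.
func (p *packet) bytes() []byte { return p.buf[p.reserved-p.pushed:] }

// data returns the payload that has not been consumed as a header.
func (p *packet) data() []byte { return p.buf[p.reserved+p.consumed:] }

func main() {
	// Outgoing: the transport header is pushed first, then the network header.
	out := newPacket(4, []byte("data"))
	copy(out.push(2), "TH")
	copy(out.push(2), "NH")
	fmt.Printf("%s\n", out.bytes())

	// Incoming: headers are consumed from the front.
	in := newPacket(0, out.bytes())
	nh, _ := in.consume(2)
	fmt.Printf("%s %s\n", nh, in.data())
}
```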
2020-08-08	Use unicast source for ICMP echo replies	Ghanan Gowripalan
	Packets MUST NOT use a non-unicast source address for ICMP Echo Replies. Test: integration_test.TestPingMulticastBroadcast PiperOrigin-RevId: 325634380
2020-07-30	Use broadcast MAC for broadcast IPv4 packets	Ghanan Gowripalan
	When sending packets to a known network's broadcast address, use the broadcast MAC address. Test: - stack_test.TestOutgoingSubnetBroadcast - udp_test.TestOutgoingSubnetBroadcast PiperOrigin-RevId: 324062407
2020-06-09	Handle removed NIC in NDP timer for packet tx	Ghanan Gowripalan
	NDP packets are sent periodically from NDP timers. These timers do not hold the NIC lock when sending packets as the packet write operation may take some time. While the lock is not held, the NIC may be removed by some other goroutine. This change handles that scenario gracefully. Test: stack_test.TestRemoveNICWhileHandlingRSTimer PiperOrigin-RevId: 315524143
2020-06-03	Pass PacketBuffer as pointer.	Ting-Yu Wang
	Historically we've been passing PacketBuffer by shallow copying throughout the stack. Right now, this is only correct as the caller would not use PacketBuffer after passing it into the next layer in netstack. With the new buffer management effort in gVisor/netstack, PacketBuffer will own a Buffer (to be added). Internally, both PacketBuffer and Buffer may have pointers and shallow copying shouldn't be used. Updates #2404. PiperOrigin-RevId: 314610879
2020-05-29	Update WritePacket* API to take ownership of packets to be written.	Ting-Yu Wang
	Updates #2404. PiperOrigin-RevId: 313834784
2020-05-01	Support for connection tracking of TCP packets.	Nayana Bidari
	Connection tracking is used to track packets in prerouting and output hooks of iptables. The NAT rules modify the tuples in connections. The connection tracking code modifies the packets by looking at the modified tuples.
2020-04-30	FIFO QDisc implementation	Bhasker Hariharan
	Updates #231 PiperOrigin-RevId: 309323808
2020-04-03	Refactor software GSO code.	Bhasker Hariharan
	Software GSO implementation currently has a complicated code path with implicit assumptions that all packets to WritePackets carry same Data and it does this to avoid allocations on the path etc. But this makes it hard to reuse the WritePackets API. This change breaks all such assumptions by introducing a new Vectorised View API ReadToVV which can be used to cleanly split a VV into multiple independent VVs. Further this change also makes packet buffers linkable to form an intrusive list. This allows us to get rid of the array of packet buffers that are passed in the WritePackets API call and replace it with a list of packet buffers. While this code does introduce some more allocations in the benchmarks it doesn't cause any degradation. Updates #231 PiperOrigin-RevId: 304731742
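A small sketch of what "linkable packet buffers forming an intrusive list" can look like. The types are hypothetical simplifications: the link pointer lives inside each packet buffer, so a batch of packets can be handed to a write call as a list rather than as a separately allocated slice.

```go
package main

import "fmt"

// packetBuffer is a hypothetical packet with an embedded (intrusive) link.
type packetBuffer struct {
	next *packetBuffer
	data []byte
}

// packetBufferList is a singly linked list threaded through the packets.
type packetBufferList struct {
	head, tail *packetBuffer
	length     int
}

// pushBack appends pkt to the list without allocating a list node.
func (l *packetBufferList) pushBack(pkt *packetBuffer) {
	pkt.next = nil
	if l.tail == nil {
		l.head = pkt
	} else {
		l.tail.next = pkt
	}
	l.tail = pkt
	l.length++
}

// writePackets stands in for a batch write: it walks the list and returns
// the total number of payload bytes it would send.
func writePackets(l *packetBufferList) int {
	n := 0
	for pkt := l.head; pkt != nil; pkt = pkt.next {
		n += len(pkt.data)
	}
	return n
}

func main() {
	var l packetBufferList
	l.pushBack(&packetBuffer{data: []byte("seg1")})
	l.pushBack(&packetBuffer{data: []byte("segment2")})
	fmt.Println(l.length, writePackets(&l))
}
```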
2020-03-24	Move tcpip.PacketBuffer and IPTables to stack package.	Bhasker Hariharan
	This is a precursor to being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-01-31	Use multicast Ethernet address for multicast NDP	Ghanan Gowripalan
	As per RFC 2464 section 7, an IPv6 packet with a multicast destination address is transmitted to the mapped Ethernet multicast address. Test: - ipv6.TestLinkResolution - stack_test.TestDADResolve - stack_test.TestRouterSolicitation PiperOrigin-RevId: 292610529
2020-01-08	Remove redundant function argument	Tamir Duberstein
	PacketLooping is already a member on the passed Route. PiperOrigin-RevId: 288721500
2019-11-22	Use PacketBuffers with GSO.	Kevin Krakauer
	PiperOrigin-RevId: 282045221
2019-11-14	Use PacketBuffers for outgoing packets.	Kevin Krakauer
	PiperOrigin-RevId: 280455453
2019-10-22	netstack/tcp: software segmentation offload	Andrei Vagin
	Right now, we send each TCP packet separately, with one system call per packet. This patch allows generating multiple TCP packets and sending them with sendmmsg. The arguable part of this CL is how to handle multiple headers. This CL adds the next field to the Prependable buffer. Nginx test results: Server Software: nginx/1.15.9 Server Hostname: 10.138.0.2 Server Port: 8080 Document Path: /10m.txt Document Length: 10485760 bytes w/o gso: Concurrency Level: 5 Time taken for tests: 5.491 seconds Complete requests: 100 Failed requests: 0 Total transferred: 1048600200 bytes HTML transferred: 1048576000 bytes Requests per second: 18.21 [#/sec] (mean) Time per request: 274.525 [ms] (mean) Time per request: 54.905 [ms] (mean, across all concurrent requests) Transfer rate: 186508.03 [Kbytes/sec] received sw-gso: Concurrency Level: 5 Time taken for tests: 3.852 seconds Complete requests: 100 Failed requests: 0 Total transferred: 1048600200 bytes HTML transferred: 1048576000 bytes Requests per second: 25.96 [#/sec] (mean) Time per request: 192.576 [ms] (mean) Time per request: 38.515 [ms] (mean, across all concurrent requests) Transfer rate: 265874.92 [Kbytes/sec] received w/o gso: $ ./tcp_benchmark --client --duration 15 --ideal [SUM] 0.0-15.1 sec 2.20 GBytes 1.25 Gbits/sec software gso: $ tcp_benchmark --client --duration 15 --ideal --gso $((1<<16)) --swgso [SUM] 0.0-15.1 sec 3.99 GBytes 2.26 Gbits/sec PiperOrigin-RevId: 276112677
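For a quick read of the figures above, the relative gains can be computed directly from the reported numbers. The helper is purely illustrative.

```go
package main

import "fmt"

// speedup returns how many times larger after is than before.
func speedup(before, after float64) float64 { return after / before }

func main() {
	// Requests per second from the nginx runs (w/o gso vs. sw-gso).
	fmt.Printf("nginx requests/sec: %.2fx\n", speedup(18.21, 25.96))
	// Gbits/sec from the tcp_benchmark runs.
	fmt.Printf("tcp_benchmark throughput: %.2fx\n", speedup(1.25, 2.26))
}
```

That works out to roughly a 1.4x improvement for the nginx test and roughly 1.8x for tcp_benchmark.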
2019-10-14	Internal change.	gVisor bot
	PiperOrigin-RevId: 274700093
2019-10-07	Implement IP_TTL.	Ian Gudger
	Also change the default TTL to 64 to match Linux. PiperOrigin-RevId: 273430341
2019-10-03	Implement proper local broadcast behavior	Chris Kuiper
	The behavior for sending and receiving local broadcast (255.255.255.255) traffic is as follows: Outgoing -------- * A broadcast packet sent on a socket that is bound to an interface goes out that interface * A broadcast packet sent on an unbound socket follows the route table to select the outgoing interface + if an explicit route entry exists for 255.255.255.255/32, use that one + else use the default route * Broadcast packets are looped back and delivered following the rules for incoming packets (see next). This is the same behavior as for multicast packets, except that it cannot be disabled via sockopt. Incoming -------- * Sockets wishing to receive broadcast packets must bind to either INADDR_ANY (0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives broadcast packets. * Broadcast packets are multiplexed to all sockets matching it. This is the same behavior as for multicast packets. * A socket can bind to 255.255.255.255:<port> and then receive its own broadcast packets sent to 255.255.255.255:<port> In addition, this change implicitly fixes an issue with multicast reception. If two sockets want to receive a given multicast stream and one is bound to ANY while the other is bound to the multicast address, only one of them will receive the traffic. PiperOrigin-RevId: 272792377
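The outgoing rules above can be sketched as a small selection function. The `route` type, the convention that NIC ID 0 means "unbound", and the function name are assumptions made for this illustration, not the stack's actual API.

```go
package main

import (
	"fmt"
	"net/netip"
)

// route is a hypothetical routing table entry.
type route struct {
	dest netip.Prefix
	nic  int
}

var (
	limitedBroadcast = netip.MustParsePrefix("255.255.255.255/32")
	defaultRoute     = netip.MustParsePrefix("0.0.0.0/0")
)

// outgoingNIC picks the interface for a packet sent to 255.255.255.255.
// boundNIC == 0 means the socket is not bound to an interface.
func outgoingNIC(boundNIC int, table []route) (int, bool) {
	// A bound socket always sends out of its own interface.
	if boundNIC != 0 {
		return boundNIC, true
	}
	// Otherwise prefer an explicit route for 255.255.255.255/32.
	for _, r := range table {
		if r.dest == limitedBroadcast {
			return r.nic, true
		}
	}
	// Fall back to the default route.
	for _, r := range table {
		if r.dest == defaultRoute {
			return r.nic, true
		}
	}
	return 0, false
}

func main() {
	table := []route{
		{dest: defaultRoute, nic: 1},
		{dest: limitedBroadcast, nic: 2},
	}
	fmt.Println(outgoingNIC(3, table)) // bound socket
	fmt.Println(outgoingNIC(0, table)) // explicit broadcast route
	fmt.Println(outgoingNIC(0, table[:1]))
}
```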