gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2021-06-04	Honor data and FIN from the ACK completing handshake	Mithun Iyer
	If the ACK completing the handshake has FIN or data, requeue the segment for further processing by the newly established endpoint. Otherwise, the segments would have to be retransmitted by the peer to be processed by the established endpoint. Doing this, keeps the behavior in parity with Linux. This also addresses a test flake with TCPNonBlockingConnectClose where the ACK (completing the handshake) and multiple retransmitted FINACKs from the peer could be dropped by the listener, when using syncookies and the accept queue is full. The handshake could eventually get completed with a retransmitted FINACK, without actual processing of FIN. This can cause the poll with POLLRDHUP on the accepted socket to sometimes time out before the next FINACK retransmission. PiperOrigin-RevId: 377651695
2021-05-27	Support SO_BINDTODEVICE in ICMP sockets	Sam Balana
	Adds support for the SO_BINDTODEVICE socket option in ICMP sockets with an accompanying packetimpact test to exercise use of this socket option. Adds a unit test to exercise the NIC selection logic introduced by this change. The remaining unit tests for ICMP sockets need to be added in a subsequent CL. See https://gvisor.dev/issues/5623 for the list of remaining unit tests. Adds a "timeout" field to PacketimpactTestInfo, necessary due to the long runtime of the newly added packetimpact test. Fixes #5678 Fixes #4896 Updates #5623 Updates #5681 Updates #5763 Updates #5956 Updates #5966 Updates #5967 PiperOrigin-RevId: 376271581
2021-05-20	Add protocol state to TCPINFO	Mithun Iyer
	Add missing protocol state to TCPINFO struct and update packetimpact. This re-arranges the TCP state definitions to align with Linux. Fixes #478 PiperOrigin-RevId: 374996751
2021-05-05	Fix a race in reading last seen ICMP error during handshake	Mithun Iyer
	On receiving an ICMP error during handshake, the error is propagated by reading `endpoint.lastError`. This can race with the socket layer invoking getsockopt() with SO_ERROR where the same value is read and cleared, causing the handshake to bail out with a non-error state. Fix the race by checking for lastError state and failing the handshake with ErrConnectionAborted if the lastError was read and cleared by say SO_ERROR. The race mentioned in the bug, is caught only with the newly added tcp_test unit test, where we have control over stopping/resuming protocol loop. Adding a packetimpact test as well for sanity testing of ICMP error handling during handshake. Fixes #5922 PiperOrigin-RevId: 372135662
2021-04-20	Clean test tags.	Adin Scannell
	PiperOrigin-RevId: 369505182
2021-04-05	Fix listen backlog handling to be in parity with Linux	Mithun Iyer
	- Change the accept queue full condition for a listening endpoint to only honor completed (and delivered) connections. - Use syncookies if the number of incomplete connections is beyond listen backlog. This also cleans up the SynThreshold option code as that is no longer used with this change. - Added a new stack option to unconditionally generate syncookies. Similar to sysctl -w net.ipv4.tcp_syncookies=2 on Linux. - Enable keeping of incomplete connections beyond listen backlog. - Drop incoming SYNs only if the accept queue is filled up. - Drop incoming ACKs that complete handshakes when accept queue is full - Enable the stack to accept one more connection than programmed by listen backlog. - Handle backlog argument being zero, negative for listen, as Linux. - Add syscall and packetimpact tests to reflect the changes above. - Remove TCPConnectBacklog test which is polling for completed connections on the client side which is not reflective of whether the accept queue is filled up by the test. The modified syscall test in this CL addresses testing of connecting sockets. Fixes #3153 PiperOrigin-RevId: 366935921
2021-03-29	[syserror] Split usermem package	Zach Koopmans
	Split usermem package to help remove syserror dependency in go_marshal. New hostarch package contains code not dependent on syserror. PiperOrigin-RevId: 365651233
2021-03-22	Fix and merge tcp_{outside_the_window,tcp_unacc_seq_ack}_closing	Zeling Feng
	The tests were not using the correct windowSize so the testing segments were actually within the window for seqNumOffset=0 tests. The issue is already fixed by #5674. PiperOrigin-RevId: 364252630
2021-03-15	Packetimpact test for ACK to OTW Seq segments behavior in CLOSING	Zeling Feng
	TCP, in CLOSING state, MUST send an ACK with next expected SEQ number after receiving any segment with OTW SEQ number and remain in the same state. While I am here, I also changed shutdown to behave the same as other calls in posix_server. PiperOrigin-RevId: 362976955
2021-02-25	Move SetNonblocking into posix_server	Zeling Feng
	- open flags can be different on different OSs, by putting SetNonblocking into the posix_server rather than the testbench, we can always get the right value for O_NONBLOCK - merged the tcp_queue_{send,receive}_in_syn_sent into a single file PiperOrigin-RevId: 359620630
2021-02-17	Use TCP_INFO to get RTO in tcp_retransmits_test	Nayana Bidari
	- TCP_INFO is used to get the RTO instead of calculating it manually. PiperOrigin-RevId: 358032487
2021-02-12	Remove packetimpact test tcp_reordering	Nayana Bidari
	Remove flaky tcp_reordering_test as it does not check reordering. We have added new reorder tests in tcp_rack_test.go PiperOrigin-RevId: 357278769
2021-01-28	Respect SO_BINDTODEVICE in unconnected UDP writes	Marina Ciocea
	Previously, sending on an unconnected UDP socket would ignore the SO_BINDTODEVICE option. Send on the configured interface when an UDP socket is bound to an interface through setsockop SO_BINDTODEVICE. Add packetimpact tests exercising UDP reads and writes with every combination of bound/unbound, broadcast/multicast/unicast destination, and bound/not-bound to device. PiperOrigin-RevId: 354299670
2021-01-27	Add support for more fields in netstack for TCP_INFO	Nayana Bidari
	This CL adds support for the following fields: - RTT, RTTVar, RTO - send congestion window (sndCwnd) and send slow start threshold (sndSsthresh) - congestion control state(CaState) - ReorderSeen PiperOrigin-RevId: 354195361
2021-01-22	Add tests for RACK	Nayana Bidari
	- Added packetimpact tests for RACK. PiperOrigin-RevId: 353282342
2020-12-09	Refactor the Makefile to avoid recursive Make.	Adin Scannell
	Recursive make is difficult to follow and debug. Drop this by using internal functions, which, while difficult, are easier than trying to following recursive invokations. Further simplify the Makefile by collapsing the image bits and removing the tools/vm directory, which is effectively unused. Fixes #4952 PiperOrigin-RevId: 346569133
2020-12-05	Fix zero receive window advertisements.	Mithun Iyer
	With the recent changes db36d948fa63ce950d94a5e8e9ebc37956543661, we try to balance the receive window advertisements between payload lengths vs segment overhead length. This works fine when segment size are much higher than the overhead, but not otherwise. In cases where the segment length is smaller than the segment overhead, we may end up not advertising zero receive window for long time and end up tail-dropping segments. This is especially pronounced when application socket reads are slow or stopped. In this change we do not grow the right edge of the receive window for smaller segment sizes similar to Linux. Also, we keep track of the socket buffer usage and let the window grow if the application is actively reading data. Fixes #4903 PiperOrigin-RevId: 345832012
2020-11-24	[2/3] Support isolated containers for parallel packetimpact tests	Zeling Feng
	Added a new flag num_duts to the test runner to create multiple DUTs for the testbench can connect to. PiperOrigin-RevId: 344195435
2020-11-16	Add packetimpact tests for ICMPv6 Error message for fragment	Toshi Kikuchi
	Updates #4427 PiperOrigin-RevId: 342703931
2020-10-29	Add IPv4 reassembly packetimpact test	Arthur Sfez
	The IPv6 reassembly test was also refactored to be easily extended with more cases. PiperOrigin-RevId: 339768605
2020-10-15	Syncing packetimpact tests in different directories	Zeling Feng
	By exposing an ALL_TESTS list in defs.bzl we can make sure all packetimpact users get to agree on the list of all tests. A defect in this approach is that we have to keep a list of packetimpact_testbench rules in the BUILD file. An helper validate_all_tests has been added to help keep BUILD and .bzl files in sync. PiperOrigin-RevId: 337411839
2020-10-06	Add support for IPv6 fragmentation	Arthur Sfez
	Most of the IPv4 fragmentation code was moved in the fragmentation package and it is reused by IPv6 fragmentation. Test: - pkg/tcpip/network/ipv4:ipv4_test - pkg/tcpip/network/ipv6:ipv6_test - pkg/tcpip/network/fragmentation:fragmentation_test Fixes #4389 PiperOrigin-RevId: 335714280
2020-09-24	Change segment/pending queue to use receive buffer limits.	Bhasker Hariharan
	segment_queue today has its own standalone limit of MaxUnprocessedSegments but this can be a problem in UnlockUser() we do not release the lock till there are segments to be processed. What can happen is as handleSegments dequeues packets more keep getting queued and we will never release the lock. This can keep happening even if the receive buffer is full because nothing can read() till we release the lock. Further having a separate limit for pending segments makes it harder to track memory usage etc. Unifying the limits makes it easier to reason about memory in use and makes the overall buffer behaviour more consistent. PiperOrigin-RevId: 333508122
2020-09-18	Enqueue TCP sends arriving in SYN_SENT state.	Mithun Iyer
	TCP needs to enqueue any send requests arriving when the connection is in SYN_SENT state. The data should be sent out soon after completion of the connection handshake. Fixes #3995 PiperOrigin-RevId: 332482041
2020-09-16	Automated rollback of changelist 329526153	Nayana Bidari
	PiperOrigin-RevId: 332097286
2020-09-14	Test RST handling in TIME_WAIT.	Mithun Iyer
	gVisor stack ignores RSTs when in TIME_WAIT which is not the default Linux behavior. Add a packetimpact test to test the same. Also update code comments to reflect the rationale for the current gVisor behavior. PiperOrigin-RevId: 331629879
2020-09-01	Fix handling of unacceptable ACKs during close.	Mithun Iyer
	On receiving an ACK with unacceptable ACK number, in a closing state, TCP, needs to reply back with an ACK with correct seq and ack numbers and remain in same state. This change is as per RFC793 page 37, but with a difference that it does not apply to ESTABLISHED state, just as in Linux. Also add more tests to check for OTW sequence number and unacceptable ack numbers in these states. Fixes #3785 PiperOrigin-RevId: 329616283
2020-09-01	Automated rollback of changelist 328350576	Nayana Bidari
	PiperOrigin-RevId: 329526153
2020-08-25	Support SO_LINGER socket option.	Nayana Bidari
	When SO_LINGER option is enabled, the close will not return until all the queued messages are sent and acknowledged for the socket or linger timeout is reached. If the option is not set, close will return immediately. This option is mainly supported for connection oriented protocols such as TCP. PiperOrigin-RevId: 328350576
2020-08-06	Join IPv4 all-systems group on NIC enable	Ghanan Gowripalan
	Test: - stack_test.TestJoinLeaveMulticastOnNICEnableDisable - integration_test.TestIncomingMulticastAndBroadcast PiperOrigin-RevId: 325185259
2020-07-29	Test UDP socket bound to ANY can receive unicast	Jay Zhuang
	PiperOrigin-RevId: 323773771
2020-07-22	make connect(2) fail when dest is unreachable	Kevin Krakauer
	Previously, ICMP destination unreachable datagrams were ignored by TCP endpoints. This caused connect to hang when an intermediate router couldn't find a route to the host. This manifested as a Kokoro error when Docker IPv6 was enabled. The Ruby image test would try to install the sinatra gem and hang indefinitely attempting to use an IPv6 address. Fixes #3079.
2020-07-17	Test UDP packets with mcast source addr are discarded	Jay Zhuang
	PiperOrigin-RevId: 321790802
2020-07-14	Test IPv6 fragment reassembly	Zeling Feng
	A packetimpact test for: "A node must be able to accept a fragmented packet that, after reassembly, is as large as 1500 octets." PiperOrigin-RevId: 321210729
2020-07-13	Create packetimpact test for UDP broadcast	Jay Zhuang
	PiperOrigin-RevId: 321000340
2020-07-07	Set IPv4 ID on all non-atomic datagrams	Tony Gong
	RFC 6864 imposes various restrictions on the uniqueness of the IPv4 Identification field for non-atomic datagrams, defined as an IP datagram that either can be fragmented (DF=0) or is already a fragment (MF=1 or positive fragment offset). In order to be compliant, the ID field is assigned for all non-atomic datagrams. Add a TCP unit test that induces retransmissions and checks that the IPv4 ID field is unique every time. Add basic handling of the IP_MTU_DISCOVER socket option so that the option can be used to disable PMTU discovery, effectively setting DF=0. Attempting to set the sockopt to anything other than disabled will fail because PMTU discovery is currently not implemented, and the default behavior matches that of disabled. PiperOrigin-RevId: 320081842
2020-07-01	TCP receive should block when in SYN-SENT state.	Mithun Iyer
	The application can choose to initiate a non-blocking connect and later block on a read, when the endpoint is still in SYN-SENT state. PiperOrigin-RevId: 319311016
2020-06-30	Fix two bugs in TCP sender.	Bhasker Hariharan
	a) When GSO is in use we should not cap the segment to maxPayloadSize in sender.maybeSendSegment as the GSO logic will cap the segment to the correct size. Without this the host GSO is not used as we end up breaking up large segments into small MSS sized segments before writing the packets to the host. b) The check to not split a segment due to it not fitting in the receiver window when there are pending segments is incorrect as segments in writeList can be really large as we just take the write call's buffer size and create a single large segment. So a write of say 128KB will just be 1 segment in the writeList. The linux code checks if 1 MSS sized segments fits in the receiver's window and if not then does not split the current segment. gVisor's check was incorrect that it was checking if the whole segment which could be >>> 1 MSS would fit in the receiver's window. This was causing us to prematurely stop sending and falling back to retransmit timer/probe from the other end to send data. This was seen when running HTTPD benchmarks where @ HEAD when sending large files the benchmark was taking forever to run. The tcp_splitseg_mss_test.go is being deleted as the test as written doesn't test what is intended correctly. This is because GSO is enabled by default and the reason the MSS+1 sized segment is sent is because GSO is in use. A proper test will require disabling GSO on linux and netstack which is going to take a bit of work in packetimpact to do it correctly. Separately a new test probably should be written that verifies that a segment > availableWindow is not split if the availableWindow is < 1 MSS. Fixes #3107 PiperOrigin-RevId: 319172089
2020-06-26	Packetimpact test for IPv6 unknown options action	Zeling Feng
	The Option Type identifiers are internally encoded such that their highest-order two bits specify the action that must be taken if the processing IPv6 node does not recognize the Option Type: 00 - skip over this option and continue processing the header. 01 - discard the packet. 10 - discard the packet and, regardless of whether or not the packet's Destination Address was a multicast address, send an ICMP Parameter Problem, Code 2, message to the packet's Source Address, pointing to the unrecognized Option Type. 11 - discard the packet and, only if the packet's Destination Address was not a multicast address, send an ICMP Parameter Problem, Code 2, message to the packet's Source Address, pointing to the unrecognized Option Type. PiperOrigin-RevId: 318566613
2020-06-15	TCP to honor updated window size during handshake.	Mithun Iyer
	In passive open cases, we transition to Established state after initializing endpoint's sender and receiver. With this we lose out on any updates coming from the ACK that completes the handshake. This change ensures that we uniformly transition to Established in all cases and does minor cleanups. Fixes #2938 PiperOrigin-RevId: 316567014
2020-06-11	Add test for reordering.	Ian Gudger
	Tests the effect of reordering on retransmission and window size. Test covers the expected behavior of both Linux and netstack, however, netstack does not behave as expected. Further, the current expected behavior of netstack is not ideal and should be adjusted in the future. PiperOrigin-RevId: 316015184
2020-06-05	Handle TCP segment split cases as per MSS.	Mithun Iyer
	- Always split segments larger than MSS. Currently, we base the segment split decision as a function of the send congestion window and MSS, which could be greater than the MSS advertised by remote. - While splitting segments, ensure the PSH flag is reset when there are segments that are queued to be sent. - With TCP_CORK, hold up segments up until MSS. Fix a bug in computing available send space before attempting to coalesce segments. Fixes #2832 PiperOrigin-RevId: 314802928
2020-06-03	Avoid TCP segment split when out of sender window.	Mithun Iyer
	If the entire segment cannot be accommodated in the receiver advertised window and if there are still unacknowledged pending segments, skip splitting the segment. The segment transmit would get retried by the retransmit handler. PiperOrigin-RevId: 314538523
2020-05-29	Test TCP should queue RECEIVE request in SYN-SENT	Zeling Feng
	PiperOrigin-RevId: 313878910
2020-05-29	Move TCP to CLOSED from SYN-RCVD on RST.	Mithun Iyer
	RST handling is broken when the TCP state transitions from SYN-SENT to SYN-RCVD in case of simultaneous open. An incoming RST should trigger cleanup of the endpoint. RFC793, section 3.9, page 70. Fixes #2814 PiperOrigin-RevId: 313828777
2020-05-29	Internal change.	gVisor bot
	PiperOrigin-RevId: 313821986
2020-05-26	Automated rollback of changelist 311424257	gVisor bot
	PiperOrigin-RevId: 313300554
2020-05-20	Test that we have PAWS mechanism	Zeling Feng
	If there is a Timestamps option in the arriving segment and SEG.TSval < TS.Recent and if TS.Recent is valid, then treat the arriving segment as not acceptable: Send an acknowledgement in reply as specified in RFC-793 page 69 and drop the segment. https://tools.ietf.org/html/rfc1323#page-19 PiperOrigin-RevId: 312590678
2020-05-20	Internal change.	gVisor bot
	PiperOrigin-RevId: 312559963
2020-05-14	Internal change.	gVisor bot
	PiperOrigin-RevId: 311645222