gvisor - Container Runtime Sandbox

Age	Commit message	Author
2020-05-13	Fix TCP segment retransmit timeout handling.	Mithun Iyer
	As per RFC 1122 and Linux retransmit timeout handling:
	- The segment retransmit timeout needs to increase exponentially and cap at a predefined value.
	- The TCP connection needs to time out after a predefined number of segment retransmissions.
	- The TCP connection should not time out when the retransmission timeout exceeds MaxRTO, the predefined upper bound.
	Fixes #2673 PiperOrigin-RevId: 311463961
2020-05-13	Stub support for TCP_SYNCNT and TCP_WINDOW_CLAMP.	Bhasker Hariharan
	This change adds support for TCP_SYNCNT and TCP_WINDOW_CLAMP options in GetSockOpt/SetSockOpt. This change does not really change any behaviour in Netstack and only stores/returns the stored value. Actual honoring of these options will be added as required. Fixes #2626, #2625 PiperOrigin-RevId: 311453777
2020-05-08	Send ACK to OTW SEQs/unacc ACKs in CLOSE_WAIT	Zeling Feng
	This fixed the corresponding packetimpact test. PiperOrigin-RevId: 310593470
2020-05-07	Capture range variable in parallel subtests	Sam Balana
	Only the last test was running before, since the goroutines are not executed until after this loop. I added t.Log(test.name) and this was the result:
	TestListenNoAcceptNonUnicastV4/SourceUnspecified: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/DestUnspecified: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/DestOtherMulticast: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/SourceBroadcast: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/DestOurMulticast: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/DestBroadcast: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/SourceOtherMulticast: DestOtherMulticast
	TestListenNoAcceptNonUnicastV4/SourceOurMulticast: DestOtherMulticast
	https://github.com/golang/go/wiki/TableDrivenTests#parallel-testing PiperOrigin-RevId: 310440629
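The fix pattern can be shown outside the testing package with plain goroutines. This is a sketch with a hypothetical helper `runAll`; it is not the test code from the commit. In Go versions before 1.22 the loop variable is shared by all iterations, so closures that run later all see its final value unless each iteration takes its own copy. From Go 1.22 onward the copy is redundant but harmless.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runAll starts one goroutine per name and returns, sorted, the names that
// the goroutines actually observed.
func runAll(names []string) []string {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		seen []string
	)
	for _, name := range names {
		name := name // per-iteration copy; required before Go 1.22
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			seen = append(seen, name)
		}()
	}
	wg.Wait()
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(runAll([]string{"SourceUnspecified", "DestUnspecified", "DestOtherMulticast"}))
}
```

The same `test := test` line placed before `t.Run` gives each parallel subtest its own table entry.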
2020-05-07	Fix bugs in SACK recovery.	Bhasker Hariharan
	Every call to sender.NextSeg does not need to iterate from the front of the writeList, as in a given recovery episode we can cache the last nextSeg returned. There cannot be a lower sequenced segment that matches the next call to NextSeg, as otherwise we would have returned that instead in the previous call. This fixes the issue of excessive CPU usage with large send buffers, where we spend a lot of time iterating from the front of the list on every NextSeg invocation.
	Further, the following other bugs were also fixed:
	* Iteration of segments never sent in NextSeg() when looking for segments for retransmission that match step1/3/4 of the NextSeg algorithm.
	* Correctly setting rescueRxt only if the rescue segment was actually sent.
	* Correctly initializing rescueRxt/highRxt when entering SACK recovery.
	* Correctly re-arming the timer only on retransmissions when SACK is in use and not for every segment being sent as it was being done before.
	* Copy over xmitTime and xmitCount on segment clone.
	* Move writeNext along when skipping over SACKED segments. This is required to prevent spurious retransmissions where we end up retransmitting data that was never lost.
	PiperOrigin-RevId: 310387671
2020-05-05	Support TCP zero window probes.	Mithun Iyer
	As per RFC 1122 4.2.2.17, when the remote advertises a zero receive window, the sender needs to probe for the window size to become non-zero, starting from the next retransmission interval. The TCP connection needs to be kept open as long as the remote is acknowledging the zero window probes. We reuse the retransmission timers to support this. Fixes #1644 PiperOrigin-RevId: 310021575
2020-05-01	Support for connection tracking of TCP packets.	Nayana Bidari
	Connection tracking is used to track packets in prerouting and output hooks of iptables. The NAT rules modify the tuples in connections. The connection tracking code modifies the packets by looking at the modified tuples.
2020-05-01	Automated rollback of changelist 308674219	Kevin Krakauer
	PiperOrigin-RevId: 309491861
2020-04-30	FIFO QDisc implementation	Bhasker Hariharan
	Updates #231 PiperOrigin-RevId: 309323808
2020-04-27	Reduce flakiness in tcp_test.	Bhasker Hariharan
	Poll for metric updates, as immediately trying to read them can sometimes be flaky if, due to goroutine scheduling, the check happens before the sender has had a chance to update the corresponding sent metric. PiperOrigin-RevId: 308712817
2020-04-27	Automated rollback of changelist 308163542	gVisor bot
	PiperOrigin-RevId: 308674219
2020-04-23	Remove View.First() and View.RemoveFirst()	Kevin Krakauer
	These methods let users easily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when all the headers fit in the first View. PiperOrigin-RevId: 308163542
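To illustrate why a PullUp-style method is cheap in the common case, here is a simplified sketch. The type `vectorisedView` and method `pullUp` are hypothetical stand-ins, not the netstack implementation: the fast path returns a sub-slice of the first view without copying, and only the slow path merges leading views.

```go
package main

import "fmt"

// vectorisedView is a hypothetical stand-in: several byte slices treated as
// one logical buffer.
type vectorisedView struct {
	views [][]byte
}

// pullUp returns the first n bytes as one contiguous slice. It returns false
// if the buffer holds fewer than n bytes.
func (vv *vectorisedView) pullUp(n int) ([]byte, bool) {
	if n <= 0 {
		return nil, true
	}
	// Common case: the first view already holds n bytes, so no copy is needed.
	if len(vv.views) > 0 && len(vv.views[0]) >= n {
		return vv.views[0][:n], true
	}
	total := 0
	for _, v := range vv.views {
		total += len(v)
	}
	if total < n {
		return nil, false
	}
	// Slow path: merge leading views into a new first view of at least n bytes.
	var merged []byte
	i := 0
	for ; i < len(vv.views) && len(merged) < n; i++ {
		merged = append(merged, vv.views[i]...)
	}
	vv.views = append([][]byte{merged}, vv.views[i:]...)
	return merged[:n], true
}

func main() {
	vv := &vectorisedView{views: [][]byte{[]byte("ab"), []byte("cde"), []byte("f")}}
	hdr, ok := vv.pullUp(4)
	fmt.Printf("%q %v %d\n", hdr, ok, len(vv.views))
}
```

Callers ask for the number of header bytes they need instead of assuming that the first view contains them.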
2020-04-23	Simplify Docker test infrastructure.	Adin Scannell
	This change adds a layer of abstraction around the internal Docker APIs, and eliminates all direct dependencies on Dockerfiles in the infrastructure. A subsequent change will automate the generation of local images (with efficient caching). Note that this change drops the use of bazel container rules, as that experiment does not seem to be viable. PiperOrigin-RevId: 308095430
2020-04-22	tcp: handle listen after shutdown properly	Andrei Vagin
	Right now, sentry panics in this case:
	panic: close of nil channel
	goroutine 67 [running]:
	pkg/tcpip/transport/tcp/tcp.(endpoint).listen(0xc0000ce000, 0x9, 0x0)
	pkg/tcpip/transport/tcp/endpoint.go:2208 +0x170
	pkg/tcpip/transport/tcp/tcp.(endpoint).Listen(0xc0000ce000, 0x9, 0xc0003a1ad0)
	pkg/tcpip/transport/tcp/endpoint.go:2179 +0x50
	Fixes #2468 PiperOrigin-RevId: 307896725
2020-04-21	Automated rollback of changelist 307477185	gVisor bot
	PiperOrigin-RevId: 307598974
2020-04-20	Merge pull request #2313 from kevinGC:firstn	gVisor bot
	PiperOrigin-RevId: 307477185
2020-04-19	Don't accept segments outside the receive window	Eyal Soha
	Fixed to match RFC 793 page 69. Fixes #1607 PiperOrigin-RevId: 307334892
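The acceptability test on page 69 of RFC 793 has four cases, depending on whether the segment length and the receive window are zero. A sketch of that table follows; the names `seqnum`, `inWindow`, and `acceptable` are chosen for this illustration and are not netstack's.

```go
package main

import "fmt"

// seqnum is a TCP sequence number; arithmetic wraps modulo 2^32.
type seqnum uint32

// inWindow reports whether first <= s < first+size in modulo 2^32 arithmetic.
func inWindow(s, first seqnum, size uint32) bool {
	return uint32(s-first) < size
}

// acceptable implements the four-case segment acceptability test.
func acceptable(segSeq seqnum, segLen uint32, rcvNxt seqnum, rcvWnd uint32) bool {
	if segLen == 0 {
		if rcvWnd == 0 {
			return segSeq == rcvNxt
		}
		return inWindow(segSeq, rcvNxt, rcvWnd)
	}
	if rcvWnd == 0 {
		// Data can never be accepted into a zero window.
		return false
	}
	// Accept if either the first or the last byte falls inside the window.
	last := segSeq + seqnum(segLen-1)
	return inWindow(segSeq, rcvNxt, rcvWnd) || inWindow(last, rcvNxt, rcvWnd)
}

func main() {
	fmt.Println(acceptable(95, 10, 100, 50))  // overlaps the start of the window
	fmt.Println(acceptable(150, 10, 100, 50)) // starts just past the window
}
```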
2020-04-17	Remove View.First() and View.RemoveFirst()	Kevin Krakauer
	These methods let users easily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when all the headers fit in the first View.
2020-04-17	Permit setting unknown options	Tamir Duberstein
	This previously changed in 305699233, but this behaviour turned out to be load bearing. PiperOrigin-RevId: 307033802
2020-04-16	Reset pending connections on listener shutdown.	Mithun Iyer
	When the listening socket is read shutdown, we need to reset all pending and incoming connections. Ensure that the endpoint is not cleaned up from the demuxer and subsequent bind to same port does not go through. PiperOrigin-RevId: 306958038
2020-04-16	Fix data race in tcp_test.	Bhasker Hariharan
	This change makes SynRcvdCountThreshold and the global synRcvdCount into a stack configurable value. This is required because in cases like mod_proxy, which create multiple Stack instances, the count will be a global value that impacts all Stack instances. Further, the tests relied on modifying the global threshold to simulate tests where we want to verify SYN cookie based behaviour. This led to data races due to the global being modified/read without locks or atomics. PiperOrigin-RevId: 306947723
2020-04-15	Remove unnecessary code	Tamir Duberstein
	Remove useless casts and duplicate return statements. PiperOrigin-RevId: 306627916
2020-04-15	Reset pending connections on listener close	Mithun Iyer
	Attempt to redeliver TCP segments that are enqueued into a closing TCP endpoint. This was being done for Established endpoints but not for those that are listening or performing connection handshake. Fixes #2417 PiperOrigin-RevId: 306598155
2020-04-14	Reduce flakiness in tcp_test.	Bhasker Hariharan
	Tests now use a MinRTO of 3s instead of the default 200ms. This reduced flakiness in a lot of the congestion control/recovery tests, which were flaky due to the retransmit timer firing too early when the test executors were overloaded. This change also bumps some of the timeouts in tests which were too sensitive to timer variations, and reduces the number of slow start iterations, which can make the tests run for too long and also trigger retransmit timeouts etc. if the executor is overloaded. PiperOrigin-RevId: 306562645
2020-04-09	Merge pull request #2253 from amscanne:nogo	gVisor bot
	PiperOrigin-RevId: 305807868
2020-04-09	Convert int and bool socket options to use GetSockOptInt and GetSockOptBool	Andrei Vagin
	PiperOrigin-RevId: 305699233
2020-04-08	Remove lostcancel warnings.	Adin Scannell
	Updates #2243
2020-04-08	Fix all printf formatting errors.	Adin Scannell
	Updates #2243
2020-04-03	Refactor software GSO code.	Bhasker Hariharan
	Software GSO implementation currently has a complicated code path with implicit assumptions that all packets to WritePackets carry same Data and it does this to avoid allocations on the path etc. But this makes it hard to reuse the WritePackets API. This change breaks all such assumptions by introducing a new Vectorised View API ReadToVV which can be used to cleanly split a VV into multiple independent VVs. Further this change also makes packet buffers linkable to form an intrusive list. This allows us to get rid of the array of packet buffers that are passed in the WritePackets API call and replace it with a list of packet buffers. While this code does introduce some more allocations in the benchmarks it doesn't cause any degradation. Updates #231 PiperOrigin-RevId: 304731742
2020-03-26	Support owner matching for iptables.	Nayana Bidari
	This feature will match UID and GID of the packet creator, for locally generated packets. This match is only valid in the OUTPUT and POSTROUTING chains. Forwarded packets do not have any socket associated with them. Packets from kernel threads do have a socket, but usually no owner.
2020-03-25	Fix data-race in endpoint.Readiness	Bhasker Hariharan
	PiperOrigin-RevId: 302924789
2020-03-24	Add support for setting TCP segment hash.	Bhasker Hariharan
	This allows the link layer endpoints to consistently hash a TCP segment to a single underlying queue in case a link layer endpoint does support multiple underlying queues. Updates #231 PiperOrigin-RevId: 302760664
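As an illustration of hashing a flow to a queue, the sketch below hashes a connection's addresses and ports and reduces the result modulo the queue count. The `flow` type, the choice of FNV-1a, and `queueIndex` are all assumptions made for this example; the commit does not say which hash function or fields netstack uses.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// flow identifies a TCP connection (IPv4 only, for brevity).
type flow struct {
	srcAddr, dstAddr [4]byte
	srcPort, dstPort uint16
}

// hash returns a value that is identical for every segment of the same flow.
// FNV-1a is an arbitrary choice for this sketch.
func (f flow) hash() uint32 {
	var ports [4]byte
	binary.BigEndian.PutUint16(ports[0:2], f.srcPort)
	binary.BigEndian.PutUint16(ports[2:4], f.dstPort)
	h := fnv.New32a()
	h.Write(f.srcAddr[:])
	h.Write(f.dstAddr[:])
	h.Write(ports[:])
	return h.Sum32()
}

// queueIndex maps a hash onto one of numQueues underlying queues.
func queueIndex(hash uint32, numQueues int) int {
	return int(hash % uint32(numQueues))
}

func main() {
	f := flow{srcAddr: [4]byte{10, 0, 0, 1}, dstAddr: [4]byte{10, 0, 0, 2}, srcPort: 4321, dstPort: 80}
	fmt.Println(queueIndex(f.hash(), 4) == queueIndex(f.hash(), 4))
}
```

Because every segment of a connection carries the same hash, all of them land on the same queue, which keeps the segments of one flow together.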
2020-03-24	Move tcpip.PacketBuffer and IPTables to stack package.	Bhasker Hariharan
	This is a precursor to being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-03-20	Remove unused variable `sndNxtList`.	Ting-Yu Wang
	PiperOrigin-RevId: 302110328
2020-03-19	Remove redundant dep in BUILD	Jay Zhuang
	PiperOrigin-RevId: 301859066
2020-03-19	Address comments on workMu removal change.	Bhasker Hariharan
	Updates #231, #357 PiperOrigin-RevId: 301833669
2020-03-19	Remove workMu from tcpip.Endpoint.	Bhasker Hariharan
	workMu is removed and e.mu is now a mutex that supports TryLock. The packet processing path tries to lock the mutex and if it is locked it will just queue the packet and move on. The endpoint.UnlockUser() will process any backlog of packets before unlocking the socket. This simplifies the locking inside tcp endpoints a lot. Further, the endpoint.LockUser() implements spinning as long as the lock is not held by another syscall goroutine. This ensures low latency, as not spinning leads to the task thread being put to sleep if the lock is held by the packet dispatch path. This is suboptimal, as the lower layer rarely holds the lock for long, so implementing spinning here helps. If the lock is held by another task goroutine then we just proceed to call LockUser() and the task could be put to sleep. The protocol goroutines themselves just call e.mu.Lock() and block if the lock is currently not available. Updates #231, #357 PiperOrigin-RevId: 301808349
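The queue-or-process pattern described here can be sketched with the standard library's `sync.Mutex.TryLock` (available from Go 1.18). This is a simplified illustration with hypothetical names (`endpoint`, `deliver`, `unlock`); it leaves out the spinning in LockUser() and is not the netstack code.

```go
package main

import (
	"fmt"
	"sync"
)

type endpoint struct {
	mu        sync.Mutex // held while the endpoint is being processed
	queueMu   sync.Mutex // protects backlog
	backlog   []string
	processed []string // protected by mu
}

// unlock drains the backlog and then releases mu. The emptiness check and the
// release both happen under queueMu, so a packet cannot be queued in between
// and left stranded.
func (e *endpoint) unlock() {
	for {
		e.queueMu.Lock()
		if len(e.backlog) == 0 {
			e.mu.Unlock()
			e.queueMu.Unlock()
			return
		}
		pending := e.backlog
		e.backlog = nil
		e.queueMu.Unlock()
		e.processed = append(e.processed, pending...)
	}
}

// deliver is the packet path: always queue, then process only if the lock is
// free. If the lock is busy, its holder drains the queue in unlock.
func (e *endpoint) deliver(pkt string) {
	e.queueMu.Lock()
	e.backlog = append(e.backlog, pkt)
	e.queueMu.Unlock()
	if e.mu.TryLock() {
		e.unlock()
	}
}

func main() {
	e := &endpoint{}
	e.mu.Lock()    // a syscall holds the endpoint
	e.deliver("a") // queued, not processed
	e.unlock()     // the syscall drains the backlog on its way out
	e.deliver("b") // lock is free, processed immediately
	fmt.Println(e.processed)
}
```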
2020-03-18	Store segment transmit count.	Ian Gudger
	This will aid in segment reordering detection. Updates #691 PiperOrigin-RevId: 301692638
2020-03-11	Implement heap.Interface on pointer receiver	Tamir Duberstein
	PiperOrigin-RevId: 300467253
2020-03-11	Fix race condition (*tcp.endpoint).Close	Tamir Duberstein
	Atomically close the endpoint. Before this change, it was possible for multiple callers to perform duplicate work. PiperOrigin-RevId: 300462110
2020-03-11	Fix memory leak in danglingEndpoints.	Bhasker Hariharan
	Endpoints which were being terminated in an ERROR state or were moved to CLOSED by the worker goroutine do not run cleanupLocked() as that should already be run by the worker termination. But when making that change we made the mistake of not removing the endpoint from the danglingEndpoints which is normally done in cleanupLocked(). As a result these endpoints are leaked since a reference is held to them in the danglingEndpoints array forever till Stack is torn down. PiperOrigin-RevId: 300438426
2020-03-06	shutdown(s, SHUT_WR) in TIME-WAIT returns ENOTCONN	Eyal Soha
	From RFC 793 s3.9 p61 Event Processing: CLOSE Call during TIME-WAIT: return with "error: connection closing" Fixes #1603 PiperOrigin-RevId: 299401353
2020-03-05	Use a pool of arrays to avoid slice headers from escaping in TCP options pool.	Ian Gudger
	By putting slices into the pool, the slice header escapes. This can be avoided by not putting the slice header into the pool. This removes an allocation from the TCP segment send path. PiperOrigin-RevId: 299215480
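The idea can be sketched with `sync.Pool` holding pointers to fixed-size arrays; callers slice the array locally when they need a slice. The names `optionsPool`, `getOptions`, and `putOptions` are hypothetical. The 40-byte size follows from the TCP header format (60-byte maximum header minus the 20-byte fixed part).

```go
package main

import (
	"fmt"
	"sync"
)

// maxOptionSize is the largest possible TCP options field.
const maxOptionSize = 40

// optionsPool stores *[maxOptionSize]byte. A pointer fits in an interface
// value without a separate allocation, unlike a slice header.
var optionsPool = sync.Pool{
	New: func() interface{} { return new([maxOptionSize]byte) },
}

func getOptions() *[maxOptionSize]byte {
	return optionsPool.Get().(*[maxOptionSize]byte)
}

func putOptions(b *[maxOptionSize]byte) {
	optionsPool.Put(b)
}

func main() {
	buf := getOptions()
	opts := buf[:0]            // slice the pooled array locally
	opts = append(opts, 1, 1)  // two NOP options
	fmt.Println(len(opts), cap(opts))
	putOptions(buf)
}
```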
2020-03-03	Avoid memory leaks	Tamir Duberstein
	Properly discard segments from the segment heap. PiperOrigin-RevId: 298704074
2020-03-03	Fix datarace on TransportEndpointInfo.ID and clean up semantics.	Ian Gudger
	Ensures that all access to TransportEndpointInfo.ID is either:
	* In a function ending in a Locked suffix.
	* While holding the appropriate mutex.
	This primarily affects the checkV4Mapped method on affected endpoints, which has been renamed to checkV4MappedLocked. Also document the method and change its argument to be a value instead of a pointer, which had caused some awkwardness. This race was possible in the udp and icmp endpoints between Connect and uses of TransportEndpointInfo.ID, including in both itself and Bind. The tcp endpoint did not suffer from this bug, but benefited from better documentation. Updates #357 PiperOrigin-RevId: 298682913
2020-03-02	Fix data-race when reading/writing e.amss.	Bhasker Hariharan
	PiperOrigin-RevId: 298451319
2020-02-27	Fix a race in TCP endpoint teardown and teardown the stack in tcp_test.	Ian Gudger
	Call stack.Close on stacks when we are done with them in tcp_test. This avoids leaking resources and reduces the test's flakiness when race/gotsan is enabled. It also provides test coverage for the race also fixed in this change, which can be reliably triggered with the stack.Close change (and without the other changes) when race/gotsan is enabled. The race was possible when calling Abort (via stack.Close) on an endpoint processing a SYN segment as part of a passive connect. Updates #1564 PiperOrigin-RevId: 297685432
2020-02-27	Internal change.	Nayana Bidari
	PiperOrigin-RevId: 297638665
2020-02-25	Deflake TestCurrentConnectedIncrement.	Bhasker Hariharan
	TestCurrentConnectedIncrement fails consistently under gotsan because the sleep before checking metrics is exactly the same as the TIME-WAIT duration. Under gotsan things can be slow enough that the increment test is done before the protocol goroutine has run and done its cleanup after the TIME-WAIT timer expires. Increasing the sleep from 1s to 1.2s makes the test pass consistently. PiperOrigin-RevId: 297160181
2020-02-24	Add support for tearing down protocol dispatchers and TIME_WAIT endpoints.	Ian Gudger
	Protocol dispatchers were previously leaked. Bypassing TIME_WAIT is required to test this change. Also fix a race when a socket in SYN-RCVD is closed. This is also required to test this change. PiperOrigin-RevId: 296922548