gvisor - Container Runtime Sandbox

Age	Commit message	Author
2020-03-26	Support owner matching for iptables.	Nayana Bidari
	This feature will match UID and GID of the packet creator, for locally generated packets. This match is only valid in the OUTPUT and POSTROUTING chains. Forwarded packets do not have any socket associated with them. Packets from kernel threads do have a socket, but usually no owner.
2020-03-25	Fix data-race in endpoint.Readiness	Bhasker Hariharan
	PiperOrigin-RevId: 302924789
2020-03-24	Add support for setting TCP segment hash.	Bhasker Hariharan
	This allows the link layer endpoints to consistently hash a TCP segment to a single underlying queue when a link layer endpoint supports multiple underlying queues. Updates #231 PiperOrigin-RevId: 302760664
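	A minimal sketch of the idea of consistent per-flow hashing, not the code from this change. The `flow` type, the `queueIndex` function, and the choice of FNV-1a are all assumptions made for illustration; the point is only that every segment of one connection maps to the same queue.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// flow is a hypothetical IPv4 4-tuple identifying one TCP connection.
type flow struct {
	srcAddr, dstAddr [4]byte
	srcPort, dstPort uint16
}

// queueIndex hashes the 4-tuple and reduces it to a queue number.
// The same flow always yields the same index for a given numQueues.
// numQueues must be greater than zero.
func queueIndex(f flow, numQueues int) int {
	h := fnv.New32a()
	h.Write(f.srcAddr[:])
	h.Write(f.dstAddr[:])
	var ports [4]byte
	binary.BigEndian.PutUint16(ports[0:2], f.srcPort)
	binary.BigEndian.PutUint16(ports[2:4], f.dstPort)
	h.Write(ports[:])
	return int(h.Sum32() % uint32(numQueues))
}

func main() {
	f := flow{
		srcAddr: [4]byte{10, 0, 0, 1},
		dstAddr: [4]byte{10, 0, 0, 2},
		srcPort: 1234,
		dstPort: 80,
	}
	// Two segments of the same flow land on the same queue.
	fmt.Println(queueIndex(f, 4) == queueIndex(f, 4))
}
```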
2020-03-24	Move tcpip.PacketBuffer and IPTables to stack package.	Bhasker Hariharan
	This is a precursor to being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-03-20	Remove unused variable `sndNxtList`.	Ting-Yu Wang
	PiperOrigin-RevId: 302110328
2020-03-19	Remove redundant dep in BUILD	Jay Zhuang
	PiperOrigin-RevId: 301859066
2020-03-19	Address comments on workMu removal change.	Bhasker Hariharan
	Updates #231, #357 PiperOrigin-RevId: 301833669
2020-03-19	Remove workMu from tcpip.Endpoint.	Bhasker Hariharan
	workMu is removed and e.mu is now a mutex that supports TryLock. The packet processing path tries to lock the mutex and if it's locked it will just queue the packet and move on. The endpoint.UnlockUser() will process any backlog of packets before unlocking the socket. This simplifies the locking inside tcp endpoints a lot. Further, the endpoint.LockUser() implements spinning as long as the lock is not held by another syscall goroutine. This ensures low latency as not spinning leads to the task thread being put to sleep if the lock is held by the packet dispatch path. This is suboptimal as the lower layer rarely holds the lock for long so implementing spinning here helps. If the lock is held by another task goroutine then we just proceed to call LockUser() and the task could be put to sleep. The protocol goroutines themselves just call e.mu.Lock() and block if the lock is currently not available. Updates #231, #357 PiperOrigin-RevId: 301808349
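	The try-lock-or-queue pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the netstack implementation: it uses the standard library `sync.Mutex.TryLock` (Go 1.18+), a hypothetical `endpoint` type with string "packets", and it omits the spinning in LockUser entirely.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical stand-in for a TCP endpoint.
type endpoint struct {
	mu        sync.Mutex // main endpoint lock; guards processed
	backlogMu sync.Mutex // guards backlog only
	backlog   []string
	processed []string
}

func (e *endpoint) enqueue(pkt string) {
	e.backlogMu.Lock()
	e.backlog = append(e.backlog, pkt)
	e.backlogMu.Unlock()
}

func (e *endpoint) takeBacklog() []string {
	e.backlogMu.Lock()
	defer e.backlogMu.Unlock()
	b := e.backlog
	e.backlog = nil
	return b
}

func (e *endpoint) backlogEmpty() bool {
	e.backlogMu.Lock()
	defer e.backlogMu.Unlock()
	return len(e.backlog) == 0
}

// LockUser is what a syscall path would call (no spinning in this sketch).
func (e *endpoint) LockUser() { e.mu.Lock() }

// UnlockUser processes any queued packets before releasing the lock.
// It re-checks the backlog after unlocking so that a packet queued
// between the drain and the unlock is not left stranded.
func (e *endpoint) UnlockUser() {
	for {
		e.processed = append(e.processed, e.takeBacklog()...)
		e.mu.Unlock()
		if e.backlogEmpty() || !e.mu.TryLock() {
			return
		}
	}
}

// deliver is the packet path: queue first, then process only if the
// lock is free. It never blocks on e.mu.
func (e *endpoint) deliver(pkt string) {
	e.enqueue(pkt)
	if e.mu.TryLock() {
		e.UnlockUser()
	}
}

func main() {
	e := &endpoint{}
	e.LockUser()
	e.deliver("a")                // lock is held: packet is only queued
	fmt.Println(len(e.processed)) // 0
	e.UnlockUser()                // backlog is drained before unlocking
	fmt.Println(e.processed)      // [a]
	e.deliver("b")                // lock is free: processed immediately
	fmt.Println(e.processed)      // [a b]
}
```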
2020-03-18	Store segment transmit count.	Ian Gudger
	This will aid in segment reordering detection. Updates #691 PiperOrigin-RevId: 301692638
2020-03-11	Implement heap.Interface on pointer receiver	Tamir Duberstein
	PiperOrigin-RevId: 300467253
2020-03-11	Fix race condition (*tcp.endpoint).Close	Tamir Duberstein
	Atomically close the endpoint. Before this change, it was possible for multiple callers to perform duplicate work. PiperOrigin-RevId: 300462110
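	One common way to make a close operation run at most once is a compare-and-swap on a flag. The sketch below assumes a hypothetical `endpoint` with a `closed` flag and a `cleanups` counter; it shows the general technique and is not taken from this change.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// endpoint is a hypothetical type used only for this illustration.
type endpoint struct {
	closed   uint32 // 0 = open, 1 = closed; accessed atomically
	cleanups int32  // counts how many times cleanup actually ran
}

// Close performs cleanup exactly once, no matter how many callers race.
func (e *endpoint) Close() {
	if !atomic.CompareAndSwapUint32(&e.closed, 0, 1) {
		return // another caller already won; nothing left to do
	}
	atomic.AddInt32(&e.cleanups, 1)
}

func main() {
	e := &endpoint{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			e.Close()
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt32(&e.cleanups)) // 1
}
```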
2020-03-11	Fix memory leak in danglingEndpoints.	Bhasker Hariharan
	Endpoints which were being terminated in an ERROR state or were moved to CLOSED by the worker goroutine do not run cleanupLocked() as that should already be run by the worker termination. But when making that change we made the mistake of not removing the endpoint from the danglingEndpoints which is normally done in cleanupLocked(). As a result these endpoints are leaked since a reference is held to them in the danglingEndpoints array forever till Stack is torn down. PiperOrigin-RevId: 300438426
2020-03-06	shutdown(s, SHUT_WR) in TIME-WAIT returns ENOTCONN	Eyal Soha
	From RFC 793 s3.9 p61 Event Processing: CLOSE Call during TIME-WAIT: return with "error: connection closing" Fixes #1603 PiperOrigin-RevId: 299401353
2020-03-05	Use a pool of arrays to avoid slice headers from escaping in TCP options pool.	Ian Gudger
	By putting slices into the pool, the slice header escapes. This can be avoided by not putting the slice header into the pool. This removes an allocation from the TCP segment send path. PiperOrigin-RevId: 299215480
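	A rough sketch of pooling array pointers rather than slices. The names `optionsPool`, `getOptions`, and `putOptions` are hypothetical; the size of 40 bytes is the maximum TCP options length (a 60-byte maximum header minus the 20-byte fixed header). Storing a `*[N]byte` in the pool's interface value avoids allocating a slice header on each Put.

```go
package main

import (
	"fmt"
	"sync"
)

// maxOptionSize is the largest possible TCP options area in bytes.
const maxOptionSize = 40

// optionsPool holds pointers to fixed-size arrays. A pointer fits in an
// interface value directly, so Put does not allocate a slice header.
var optionsPool = sync.Pool{
	New: func() interface{} { return new([maxOptionSize]byte) },
}

func getOptions() *[maxOptionSize]byte {
	return optionsPool.Get().(*[maxOptionSize]byte)
}

func putOptions(b *[maxOptionSize]byte) {
	optionsPool.Put(b)
}

func main() {
	buf := getOptions()
	// Slice the array locally; the slice header stays on the stack.
	opts := append(buf[:0], 1, 1) // two NOP options as sample content
	fmt.Println(len(opts), cap(opts))
	putOptions(buf) // return the array pointer, not the slice
}
```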
2020-03-03	Avoid memory leaks	Tamir Duberstein
	Properly discard segments from the segment heap. PiperOrigin-RevId: 298704074
2020-03-03	Fix datarace on TransportEndpointInfo.ID and clean up semantics.	Ian Gudger
	Ensures that all access to TransportEndpointInfo.ID is either: * In a function ending in a Locked suffix. * While holding the appropriate mutex. This primarily affects the checkV4Mapped method on affected endpoints, which has been renamed to checkV4MappedLocked. Also document the method and change its argument to be a value instead of a pointer which had caused some awkwardness. This race was possible in the udp and icmp endpoints between Connect and uses of TransportEndpointInfo.ID including in both itself and Bind. The tcp endpoint did not suffer from this bug, but benefited from better documentation. Updates #357 PiperOrigin-RevId: 298682913
2020-03-02	Fix data-race when reading/writing e.amss.	Bhasker Hariharan
	PiperOrigin-RevId: 298451319
2020-02-27	Fix a race in TCP endpoint teardown and teardown the stack in tcp_test.	Ian Gudger
	Call stack.Close on stacks when we are done with them in tcp_test. This avoids leaking resources and reduces the test's flakiness when race/gotsan is enabled. It also provides test coverage for the race also fixed in this change, which can be reliably triggered with the stack.Close change (and without the other changes) when race/gotsan is enabled. The race was possible when calling Abort (via stack.Close) on an endpoint processing a SYN segment as part of a passive connect. Updates #1564 PiperOrigin-RevId: 297685432
2020-02-27	Internal change.	Nayana Bidari
	PiperOrigin-RevId: 297638665
2020-02-25	Deflake TestCurrentConnectedIncrement.	Bhasker Hariharan
	TestCurrentConnectedIncrement fails consistently under gotsan because the sleep before checking metrics is exactly the same as the TIME-WAIT duration. Under gotsan things can be slow enough that the increment test is done before the protocol goroutine is run after the TIME-WAIT timer expires and does its cleanup. Increasing the sleep from 1s to 1.2s makes the test pass consistently. PiperOrigin-RevId: 297160181
2020-02-24	Add support for tearing down protocol dispatchers and TIME_WAIT endpoints.	Ian Gudger
	Protocol dispatchers were previously leaked. Bypassing TIME_WAIT is required to test this change. Also fix a race when a socket in SYN-RCVD is closed. This is also required to test this change. PiperOrigin-RevId: 296922548
2020-02-18	Enable IPV6_RECVTCLASS socket option for datagram sockets	gVisor bot
	Added the ability to get/set the IP_RECVTCLASS socket option on UDP endpoints. If enabled, the traffic class from the incoming Network Header is passed as ancillary data in the ControlMessages. Adding Get/SetSockOptBool to decrease the overhead of getting/setting simple options. (This was absorbed in a CL that will be landing before this one). Test: * Added unit test to udp_test.go that tests getting/setting as well as verifying that we receive expected TOS from incoming packet. * Added a syscall test for verifying getting/setting * Removed test skip for existing syscall test to enable end to end test. PiperOrigin-RevId: 295840218
2020-02-13	Internal change.	gVisor bot
	PiperOrigin-RevId: 294952610
2020-02-05	Add notes to relevant tests.	Adin Scannell
	These were out-of-band notes that can help provide additional context and simplify automated imports. PiperOrigin-RevId: 293525915
2020-02-05	recv() on a closed TCP socket returns ENOTCONN	Eyal Soha
	From RFC 793 s3.9 p58 Event Processing: If RECEIVE Call arrives in CLOSED state and the user has access to such a connection, the return should be "error: connection does not exist" Fixes #1598 PiperOrigin-RevId: 293494287
2020-02-04	Add socket connection stress test.	Ian Gudger
	Tests 65k connection attempts on common types of sockets to check for port leaks. Also fixes a bug where dual-stack sockets wouldn't properly re-queue segments received while closing. PiperOrigin-RevId: 293241166
2020-01-31	Fix method comment to match method name.	Ian Gudger
	PiperOrigin-RevId: 292624867
2020-01-31	Use multicast Ethernet address for multicast NDP	Ghanan Gowripalan
	As per RFC 2464 section 7, an IPv6 packet with a multicast destination address is transmitted to the mapped Ethernet multicast address. Test: - ipv6.TestLinkResolution - stack_test.TestDADResolve - stack_test.TestRouterSolicitation PiperOrigin-RevId: 292610529
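	The RFC 2464 section 7 mapping places the two octets 33:33 in front of the last four octets of the IPv6 multicast address. A small sketch follows; the function name `ethernetMulticast` and its array-based signature are illustrative choices, not netstack's API.

```go
package main

import "fmt"

// ethernetMulticast maps an IPv6 multicast address to its Ethernet
// multicast address per RFC 2464 section 7: 33:33 followed by the
// low-order 32 bits of the IPv6 address. The second return value is
// false if addr is not a multicast address (first octet is not 0xff).
func ethernetMulticast(addr [16]byte) ([6]byte, bool) {
	if addr[0] != 0xff {
		return [6]byte{}, false
	}
	return [6]byte{0x33, 0x33, addr[12], addr[13], addr[14], addr[15]}, true
}

func main() {
	// ff02::1, the all-nodes link-local multicast address.
	allNodes := [16]byte{0: 0xff, 1: 0x02, 15: 0x01}
	mac, ok := ethernetMulticast(allNodes)
	fmt.Printf("%x %v\n", mac, ok) // 333300000001 true
}
```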
2020-01-30	Fix for panic in endpoint.Close().	Bhasker Hariharan
	When sending a RST on shutdown we need to double check the state after acquiring the work mutex as the endpoint could have transitioned out of a connected state between the time we checked it and the time we acquired the workMutex. I added two tests but sadly neither reproduces the panic. I am going to leave the tests in as they are good to have anyway. PiperOrigin-RevId: 292393800
2020-01-29	Add support for TCP_DEFER_ACCEPT.	Bhasker Hariharan
	PiperOrigin-RevId: 292233574
2020-01-27	Refactor to hide C from channel.Endpoint.	Ting-Yu Wang
	This is to aid later implementation for /dev/net/tun device. PiperOrigin-RevId: 291746025
2020-01-27	Standardize on tools directory.	Adin Scannell
	PiperOrigin-RevId: 291745021
2020-01-21	Add a new TCP stat for current open connections.	Mithun Iyer
	Such a stat accounts for all connections that are currently established and not yet transitioned to close state. Also fix bug in double increment of CurrentEstablished stat. Fixes #1579 PiperOrigin-RevId: 290827365
2020-01-21	Merge pull request #1558 from kevinGC:iptables-write-input-drop	gVisor bot
	PiperOrigin-RevId: 290793754
2020-01-17	Filter out received packets with a local source IP address.	Eyal Soha
	CERT Advisory CA-96.21 III. Solution advises that devices drop packets which could not have correctly arrived on the wire, such as receiving a packet where the source IP address is owned by the device that sent it. Fixes #1507 PiperOrigin-RevId: 290378240
2020-01-15	Bugfix to terminate the protocol loop on StateError.	Bhasker Hariharan
	The change to introduce worker goroutines can cause the endpoint to transition to StateError and we should terminate the loop rather than let the endpoint transition to a CLOSED state as we do in case the endpoint enters TIME-WAIT/CLOSED. Moving to a closed state would cause the actual error to not be propagated to any read() calls etc. PiperOrigin-RevId: 289923568
2020-01-14	Changes TCP packet dispatch to use a pool of goroutines.	Bhasker Hariharan
	All inbound segments for connections in ESTABLISHED state are delivered to the endpoint's queue but for every segment delivered we also queue the endpoint for processing to a selected processor. This ensures that when there are a large number of connections in ESTABLISHED state the inbound packets are all handled by a small number of goroutines and significantly reduces the amount of work the goscheduler has to perform. We let connections in other states follow the current path where the endpoint's goroutine directly handles the segments. Updates #231 PiperOrigin-RevId: 289728325
2020-01-14	Implement {g,s}etsockopt(IP_RECVTOS) for UDP sockets	Tamir Duberstein
	PiperOrigin-RevId: 289718534
2020-01-13	Fix test building.	Kevin Krakauer

2020-01-13	Allow dual stack sockets to operate on AF_INET	Tamir Duberstein
	Fixes #1490 Fixes #1495 PiperOrigin-RevId: 289523250
2020-01-10	panic fix in retransmitTimerExpired.	Bhasker Hariharan
	This is a band-aid fix for now to prevent panics. PiperOrigin-RevId: 289078453
2020-01-09	New sync package.	Ian Gudger
	* Rename syncutil to sync. * Add aliases to sync types. * Replace existing usage of standard library sync package. This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations. Updates #1472 PiperOrigin-RevId: 289033387
2020-01-09	Merge pull request #1523 from majek:fix-1522-silly-window-rx	gVisor bot
	PiperOrigin-RevId: 289019953
2020-01-09	Change BindToDeviceOption to store NICID	Eyal Soha
	This makes it possible to call the sockopt from go even when the NIC has no name. PiperOrigin-RevId: 288955236
2020-01-08	Introduce tcpip.SockOptBool	Tamir Duberstein
	...and port V6OnlyOption to it. PiperOrigin-RevId: 288789451
2020-01-08	Combine various Create*NIC methods into CreateNICWithOptions.	Bert Muthalaly
	PiperOrigin-RevId: 288779416
2020-01-08	Rename tcpip.SockOpt{,Int}	Tamir Duberstein
	PiperOrigin-RevId: 288772878
2020-01-08	Fix #1522 - implement silly window syndrome protection on rx side	Marek Majkowski
	Before, each small read() that raised the window, either from zero or above the aMSS threshold, would generate an ACK. In a classic silly-window-syndrome scenario, we can imagine a pessimistic case where small read()s generate a stream of ACKs. This PR fixes that, essentially treating window size < aMSS as zero. We send an ACK at exactly the moment the window increases to >= aMSS or half of the receive buffer size (whichever is smaller).
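	The rule above can be written as a small predicate. This is a sketch of the stated condition only; the function names and the plain `int` window sizes are assumptions, and the real receiver tracks more state than this.

```go
package main

import "fmt"

// ackThreshold returns the smaller of aMSS and half the receive buffer.
func ackThreshold(aMSS, rcvBufSize int) int {
	if half := rcvBufSize / 2; half < aMSS {
		return half
	}
	return aMSS
}

// shouldSendWindowUpdate reports whether a read() that grew the receive
// window from oldWnd to newWnd should trigger an ACK. A window below the
// threshold is treated as zero, so an ACK is sent only at the moment the
// window crosses the threshold from below.
func shouldSendWindowUpdate(oldWnd, newWnd, aMSS, rcvBufSize int) bool {
	t := ackThreshold(aMSS, rcvBufSize)
	return oldWnd < t && newWnd >= t
}

func main() {
	const aMSS, rcvBuf = 1460, 65536
	fmt.Println(shouldSendWindowUpdate(3, 100, aMSS, rcvBuf))     // false: still a silly window
	fmt.Println(shouldSendWindowUpdate(1000, 1460, aMSS, rcvBuf)) // true: crossed aMSS
	fmt.Println(shouldSendWindowUpdate(2000, 3000, aMSS, rcvBuf)) // false: already open
}
```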
2020-01-07	#1398 - send ACK when available buffer space gets larger than 1 MSS	Marek Majkowski
	When receiving data, netstack avoids sending spurious acks. When the user does recv(), should netstack send an ack telling the sender that the window was increased? It depends. Before this patch, netstack _will_ send the ack in the case when window was zero or window >> scale was zero. Basically - when recv space increased from zero. This is not working right with silly-window-avoidance on the sender side. Some network stacks refuse to transmit segments that will fill the window but are below MSS. Before this patch, this confuses netstack. On one hand, if the window was like 3 bytes, netstack will _not_ send an ack if the window increases. On the other hand, the sending party will refuse to transmit a 3-byte packet. This patch changes that, making netstack send an ACK when the available buffer size increases to or above 1*MSS. This will inform the other party that the buffer is large enough, and hopefully uncork it. Signed-off-by: Marek Majkowski <marek@cloudflare.com>
2019-12-26	Automated rollback of changelist 287029703	gVisor bot
	PiperOrigin-RevId: 287217899