gvisor - Container Runtime Sandbox

Age	Commit message (Collapse)	Author
2019-06-21	Fix the logic for sending zero window updates.	Bhasker Hariharan
	Today we have the logic split in two places between endpoint Read() and the worker goroutine which actually sends a zero window. This change makes it so that when a zero window ACK is sent we set a flag in the endpoint which can be read by the endpoint to decide if it should notify the worker to send a nonZeroWindow update. The worker now does not do the check again but instead sends an ACK and flips the flag right away. Similarly today when SO_RECVBUF is set the SetSockOpt call has logic to decide if a zero window update is required. Rather than do that we move the logic to the worker goroutine and it can check the zeroWindow flag and send an update if required. PiperOrigin-RevId: 254505447
2019-06-13	Add support for TCP receive buffer auto tuning.	Bhasker Hariharan
	The implementation is similar to linux where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accomodate for the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the linux receive buffer auto- tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283
2019-06-13	Update canonical repository.	Adin Scannell
	This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620
2019-06-12	Minor BUILD file cleanup.	Adin Scannell
	PiperOrigin-RevId: 252918338
2019-06-12	Add support for TCP_CONGESTION socket option.	Bhasker Hariharan
	This CL also cleans up the error returned for setting congestion control which was incorrectly returning EINVAL instead of ENOENT. PiperOrigin-RevId: 252889093
2019-06-10	Fixes to listen backlog handling.	Bhasker Hariharan
	Changes netstack to confirm to current linux behaviour where if the backlog is full then we drop the SYN and do not send a SYN-ACK. Similarly we allow upto backlog connections to be in SYN-RCVD state as long as the backlog is not full. We also now drop a SYN if syn cookies are in use and the backlog for the listening endpoint is full. Added new tests to confirm the behaviour. Also reverted the change to increase the backlog in TcpPortReuseMultiThread syscall test. Fixes #236 PiperOrigin-RevId: 252500462
2019-06-06	Track and export socket state.	Rahat Mahmood
	This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850
2019-06-05	netstack/tcp: fix calculating a number of outstanding packets	Andrei Vagin
	In case of GSO, a segment can container more than one packet and we need to use the pCount() helper to get a number of packets. PiperOrigin-RevId: 251743020
2019-06-04	Fix data race in synRcvdState.	Bhasker Hariharan
	When checking the length of the acceptedChan we should hold the endpoint mutex otherwise a syn received while the listening socket is being closed can result in a data race where the cleanupLocked routine sets acceptedChan to nil while a handshake goroutine in progress could try and check it at the same time. PiperOrigin-RevId: 251537697
2019-06-03	Delete debug log lines left by mistake.	Bhasker Hariharan
	Updates #236 PiperOrigin-RevId: 251337915
2019-05-31	Disable certain tests that are flaky under race detector.	Bhasker Hariharan
	PiperOrigin-RevId: 250976665
2019-05-31	Change segment queue limit to be of fixed size.	Bhasker Hariharan
	Netstack sets the unprocessed segment queue size to match the receive buffer size. This is not required as this queue only needs to hold enough for a short duration before the endpoint goroutine can process it. Updates #230 PiperOrigin-RevId: 250976323
2019-05-30	Fixes to TCP listen behavior.	Bhasker Hariharan
	Netstack listen loop can get stuck if cookies are in-use and the app is slow to accept incoming connections. Further we continue to complete handshake for a connection even if the backlog is full. This creates a problem when a lots of connections come in rapidly and we end up with lots of completed connections just hanging around to be delivered. These fixes change netstack behaviour to mirror what linux does as described here in the following article http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html Now when cookies are not in-use Netstack will silently drop the ACK to a SYN-ACK and not complete the handshake if the backlog is full. This will result in the connection staying in a half-complete state. Eventually the sender will retransmit the ACK and if backlog has space we will transition to a connected state and deliver the endpoint. Similarly when cookies are in use we do not try and create an endpoint unless there is space in the accept queue to accept the newly created endpoint. If there is no space then we again silently drop the ACK as we can just recreate it when the ACK is retransmitted by the peer. We also now use the backlog to cap the size of the SYN-RCVD queue for a given endpoint. So at any time there can be N connections in the backlog and N in a SYN-RCVD state if the application is not accepting connections. Any new SYNs will be dropped. This CL also fixes another small bug where we mark a new endpoint which has not completed handshake as connected. We should wait till handshake successfully completes before marking it connected. Updates #236 PiperOrigin-RevId: 250717817
2019-05-24	Remove unused wakers	Tamir Duberstein
	These wakers are uselessly allocated and passed around; nothing ever listens for notifications on them. The code here appears to be vestigial, so removing it and allowing a nil waker to be passed seems appropriate. PiperOrigin-RevId: 249879320 Change-Id: Icd209fb77cc0dd4e5c49d7a9f2adc32bf88b4b71
2019-05-22	UDP and TCP raw socket support.	Kevin Krakauer
	PiperOrigin-RevId: 249511348 Change-Id: I34539092cc85032d9473ff4dd308fc29dc9bfd6b
2019-05-08	Set the FilesytemType in MountSource from the Filesystem.	Nicolas Lacasse
	And stop storing the Filesystem in the MountSource. This allows us to decouple the MountSource filesystem type from the name of the filesystem. PiperOrigin-RevId: 247292982 Change-Id: I49cbcce3c17883b7aa918ba76203dfd6d1b03cc8
2019-05-05	Fix raw socket behavior and tests.	Kevin Krakauer
	Some behavior was broken due to the difficulty of running automated raw socket tests. Change-Id: I152ca53916bb24a0208f2dc1c4f5bc87f4724ff6 PiperOrigin-RevId: 246747067
2019-05-03	Support IPv4 fragmentation in netstack	Googler
	Testing: Unit tests and also large ping in Fuchsia OS PiperOrigin-RevId: 246563592 Change-Id: Ia12ab619f64f4be2c8d346ce81341a91724aef95
2019-05-03	Fix transport/raw copybara export	Tamir Duberstein
	- include packet_list.go - exclude state.go (by renaming to include an underscore) Also rename raw.go to endpoint.go for consistency. PiperOrigin-RevId: 246547912 Change-Id: I19c8331c794ba683a940cc96a8be6497b53ff24d
2019-05-03	Implement support for SACK based recovery(RFC 6675).	Bhasker Hariharan
	PiperOrigin-RevId: 246536003 Change-Id: I118b745f45040be9c70cb6a1028acdb06c78d8c9
2019-05-02	Support reception of multicast data on more than one socket	Chris Kuiper
	This requires two changes: 1) Support for more than one socket to join a given multicast group. 2) Duplicate delivery of incoming multicast packets to all sockets listening for it. In addition, I tweaked the code (and added a test) to disallow duplicates IP_ADD_MEMBERSHIP calls for the same group and NIC. This is how Linux does it. PiperOrigin-RevId: 246437315 Change-Id: Icad8300b4a8c3f501d9b4cd283bd3beabef88b72
2019-04-29	Change copyright notice to "The gVisor Authors"	Michael Pratt
	Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" \| xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29	Allow and document bug ids in gVisor codebase.	Nicolas Lacasse
	PiperOrigin-RevId: 245818639 Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-26	Make raw sockets a toggleable feature disabled by default.	Kevin Krakauer
	PiperOrigin-RevId: 245511019 Change-Id: Ia9562a301b46458988a6a1f0bbd5f07cbfcb0615
2019-04-19	tcpip/transport/tcp: read side only shutdown of an endpoint	Ben Burkert
	Support shutdown on only the read side of an endpoint. Reads performed after a call to Shutdown with only the ShutdownRead flag will return ErrClosedForReceive without data. Break out the shutdown(2) with SHUT_RD syscall test into to two tests. The first tests that no packets are sent when shutting down the read side of a socket. The second tests that, after shutting down the read side of a socket, unread data can still be read, or an EOF if there is no more data to read. Change-Id: I9d7c0a06937909cbb466b7591544a4bcaebb11ce PiperOrigin-RevId: 244459430
2019-04-18	tcpip/transport/udp: add Forwarder type	Ben Burkert
	Add a UDP forwarder for intercepting and forwarding UDP sessions. Change-Id: I2d83c900c1931adfc59a532dd4f6b33a0db406c9 PiperOrigin-RevId: 244293576
2019-04-18	netstack: use a proper network protocol to set gso.L3HdrLen	Andrei Vagin
	It is possible to create a listening socket which will accept IPv4 and IPv6 connections. In this case, we set IPv6ProtocolNumber for all accepted endpoints, even if they handle IPv4 connections. This means that we can't use endpoint.netProto to set gso.L3HdrLen. PiperOrigin-RevId: 244227948 Change-Id: I5e1863596cb9f3d216febacdb7dc75651882eef1
2019-04-09	Add TCP checksum verification.	Bhasker Hariharan
	PiperOrigin-RevId: 242704699 Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f
2019-04-02	Add a raw socket transport endpoint and use it for raw ICMP sockets.	Kevin Krakauer
	Having raw socket code together will make it easier to add support for other raw network protocols. Currently, only ICMP uses the raw endpoint. However, adding support for other protocols such as UDP shouldn't be much more difficult than adding a few switch cases. PiperOrigin-RevId: 241564875 Change-Id: I77e03adafe4ce0fd29ba2d5dfdc547d2ae8f25bf
2019-03-29	Fix incorrect checksums in TCP and UDP tests.	Bhasker Hariharan
	PiperOrigin-RevId: 241025361 Change-Id: I292e7aea9a4b294b11e4f736e107010d9524586b
2019-03-28	Fix Panic in SACKScoreboard.Delete.	Bhasker Hariharan
	The panic was caused by modifying the tree while iterating which invalidated the iterator. Also fixes another bug in SACKScoreboard.Insert() which was causing blocks to be merged incorrectly. PiperOrigin-RevId: 240895053 Change-Id: Ia72b8244297962df5c04283346da5226434740af
2019-03-28	netstack/fdbased: add generic segmentation offload (GSO) support	Andrei Vagin
	The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 240809103 Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-26	netstack: Don't exclude length when a pseudo-header checksum is calculated	Andrei Vagin
	This is a preparation for GSO changes (cl/234508902). RELNOTES[gofers]: Refactor checksum code to include length, which it already did, but in a convoluted way. Should be a no-op. PiperOrigin-RevId: 240460794 Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2
2019-03-20	netstack: adjust the sequence number after trimming the packet	Andrei Vagin
	PiperOrigin-RevId: 239417224 Change-Id: I14a9adc31a6330a79a6156c105969cd5f1f63d20
2019-03-19	netstack: reduce MSS from SYN to account tcp options	Andrei Vagin
	See: https://tools.ietf.org/html/rfc6691#section-2 PiperOrigin-RevId: 239305632 Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-14	Remove duplicate TCP flag definitions	Tamir Duberstein
	PiperOrigin-RevId: 238467634 Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9
2019-03-13	Remove unused function.	Kevin Krakauer
	PiperOrigin-RevId: 238336475 Change-Id: I8131e04699028246ebc233953ebb3feca5673940
2019-03-08	Validate multicast addresses in multicast group operations.	Ian Gudger
	PiperOrigin-RevId: 237559843 Change-Id: I93a9d83a08cd3d49d5fc7fcad5b0710d0aa04aaa
2019-03-08	Implement IP_MULTICAST_LOOP.	Ian Gudger
	IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default route are looped back. In order to implement this switch, support for sending and looping back multicast packets on the default route had to be implemented. For now we only support IPv4 multicast. PiperOrigin-RevId: 237534603 Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c
2019-03-05	Add new retransmissions and recovery related metrics.	Bhasker Hariharan
	PiperOrigin-RevId: 236945145 Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b
2019-03-05	Remove unused commit() function argument to Bind.	Kevin Krakauer
	PiperOrigin-RevId: 236926132 Change-Id: I5cf103f22766e6e65a581de780c7bb9ca0fa3181
2019-02-27	Ping support via IPv4 raw sockets.	Kevin Krakauer
	Broadly, this change: * Enables sockets to be created via `socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)`. * Passes the network-layer (IP) header up the stack to the transport endpoint, which can pass it up to the socket layer. This allows a raw socket to return the entire IP packet to users. * Adds functions to stack.TransportProtocol, stack.Stack, stack.transportDemuxer that enable incoming packets to be delivered to raw endpoints. New raw sockets of other protocols (not ICMP) just need to register with the stack. * Enables ping.endpoint to return IP headers when created via SOCK_RAW. PiperOrigin-RevId: 235993280 Change-Id: I60ed994f5ff18b2cbd79f063a7fdf15d093d845a
2019-02-25	Add a SACK scoreboard to TCP endpoints.	Bhasker Hariharan
	This change does not make use of SACK information but adds support to track SACK information and store it in the endpoint. The actual SACK based recovery will be in a separate CL. Part of commits to add RFC 6675 support to Netstack. PiperOrigin-RevId: 235612264 Change-Id: I261f94844d7bad5abda803152ce6cc6125a467ff
2019-02-22	Rename ping endpoints to icmp endpoints.	Kevin Krakauer
	PiperOrigin-RevId: 235248572 Change-Id: I5b0538b6feb365a98712c2a2d56d856fe80a8a09
2019-02-20	Implement Broadcast support	Amanda Tait
	This change adds support for the SO_BROADCAST socket option in gVisor Netstack. This support includes getsockopt()/setsockopt() functionality for both UDP and TCP endpoints (the latter being a NOOP), dispatching broadcast messages up and down the stack, and route finding/creation for broadcast packets. Finally, a suite of tests have been implemented, exercising this functionality through the Linux syscall API. PiperOrigin-RevId: 234850781 Change-Id: If3e666666917d39f55083741c78314a06defb26c
2019-02-15	Implement IP_MULTICAST_IF.	Ian Gudger
	This allows setting a default send interface for IPv4 multicast. IPv6 support will come later. PiperOrigin-RevId: 234251379 Change-Id: I65922341cd8b8880f690fae3eeb7ddfa47c8c173
2019-02-15	Move SO_TIMESTAMP from different transport endpoints to epsocket.	Kevin Krakauer
	SO_TIMESTAMP is reimplemented in ping and UDP sockets (and needs to be added for TCP), but can just be implemented in epsocket for simplicity. This will also make SIOCGSTAMP easier to implement. PiperOrigin-RevId: 234179300 Change-Id: Ib5ea0b1261dc218c1a8b15a65775de0050fe3230
2019-02-14	Internal change.	Googler
	PiperOrigin-RevId: 234011346 Change-Id: Ic69375ddb3794dd0d3d6e62ee4dc60fdf4baf2c7
2019-02-11	Do not drop packets w/ missing TCP timestamps.	Bhasker Hariharan
	RFC7323 recommends that if the timestamp option was negotiated then all packets should carry a TCP Timestamp and any packets that do not should be dropped. Netstack implemented this behaviour. Linux OTOH does not and will accept such packets. This change makes Netstack behaviour compatible with Linux. Also now that we allow such packets, we do need to update RTO calculations based on these packets even if timestamp option is enabled. PiperOrigin-RevId: 233432268 Change-Id: I9f4742ae6b63930ac3b5e37d8c238761e6a4b29f
2019-02-07	Plumb IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP to netstack.	Ian Gudger
	Also includes a few fixes for IPv4 multicast support. IPv6 support is coming in a followup CL. PiperOrigin-RevId: 233008638 Change-Id: If7dae6222fef43fda48033f0292af77832d95e82