gvisor - Container Runtime Sandbox

Age	Commit message	Author
2019-10-07	Rename epsocket to netstack.	Kevin Krakauer
	PiperOrigin-RevId: 273365058
2019-09-27	Implement SO_BINDTODEVICE sockopt	gVisor bot
	PiperOrigin-RevId: 271644926
2019-09-26	Make raw socket tests pass in environments with or without CAP_NET_RAW.	Kevin Krakauer
	PiperOrigin-RevId: 271442321
2019-09-23	netstack: convert more socket options to {Set,Get}SockOptInt	Andrei Vagin
	PiperOrigin-RevId: 270763208
2019-09-23	internal BUILD file cleanup.	gVisor bot
	PiperOrigin-RevId: 270680704
2019-09-19	Remove defer from hot path and ensure Atomic is applied consistently.	Adin Scannell
	PiperOrigin-RevId: 270114317
2019-09-12	Implement splice methods for pipes and sockets.	Adin Scannell
	This also allows the tee(2) implementation to be enabled, since dup can now be properly supported via WriteTo. Note that this change necessitated some minor restructuring of the fs.FileOperations splice methods. If the *fs.File is passed through directly, then only public API methods are accessible, which will deadlock immediately since the locking is already done by fs.Splice. Instead, we pass through an abstract io.Reader or io.Writer, which elide locks and use the underlying fs.FileOperations directly. PiperOrigin-RevId: 268805207
2019-09-12	Remove go_test from go_stateify and go_marshal	Michael Pratt
	They are no-ops, so the standard rule works fine. PiperOrigin-RevId: 268776264
2019-08-30	Return correct buffer size for ioctl(socket, FIONREAD)	Fabricio Voznika
	Ioctl was returning just the buffer size from epsocket.endpoint and it was not considering data from epsocket.SocketOperations that was read from the endpoint, but not yet sent to the caller. PiperOrigin-RevId: 266485461
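The accounting described above can be sketched as follows. This is only an illustration: `endpoint`, `socketOps`, and `readableBytes` are hypothetical names, not the gVisor types, and summing the two byte counts is an assumption about what the corrected size looks like.

```go
package main

import "fmt"

// endpoint stands in for the transport endpoint's receive queue (hypothetical).
type endpoint struct {
	queued []byte
}

// socketOps stands in for the socket layer, which may hold data that was
// already read from the endpoint but not yet delivered to the caller.
type socketOps struct {
	ep      *endpoint
	pending []byte
}

// readableBytes counts both the data still queued in the endpoint and the
// data held by the socket layer, which is what a FIONREAD-style query needs.
func (s *socketOps) readableBytes() int {
	return len(s.pending) + len(s.ep.queued)
}

func main() {
	s := &socketOps{
		ep:      &endpoint{queued: make([]byte, 5)},
		pending: make([]byte, 3),
	}
	fmt.Println(s.readableBytes())
}
```

Counting only `ep.queued` here would under-report by the three pending bytes, which is the kind of mismatch the commit describes.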
2019-08-29	Implement /proc/net/udp.	Rahat Mahmood
	PiperOrigin-RevId: 266229756
2019-08-21	Use tcpip.Subnet in tcpip.Route	Tamir Duberstein
	This is the first step in replacing some of the redundant types with the standard library equivalents. PiperOrigin-RevId: 264706552
2019-08-19	hostinet: fix parsing route netlink message	Jianfeng Tan
	We wrongly parsed the output interface as the gateway address. The fix is straightforward. Fixes #638 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: Ia4bab31f3c238b0278ea57ab22590fad00eaf061 COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/684 from tanjianfeng:fix-638 b940e810367ad1273519bfa594f4371bdd293e83 PiperOrigin-RevId: 264211336
2019-08-19	Read iptables via sockopts.	Kevin Krakauer
	PiperOrigin-RevId: 264180125
2019-08-16	netstack: disconnect an unix socket only if the address family is AF_UNSPEC	Andrei Vagin
	Linux allows calling connect for ANY and the zero port. PiperOrigin-RevId: 263892534
2019-08-14	Replace uintptr with int64 when returning lengths	Tamir Duberstein
	This is in accordance with newer parts of the standard library. PiperOrigin-RevId: 263449916
2019-08-14	Improve SendMsg performance.	Bhasker Hariharan
	SendMsg before this change would copy all the data over into a new slice even if the underlying socket could only accept a small amount of data. This is really inefficient with non-blocking sockets and under high throughput, where large writes could get ErrWouldBlock, or if there was, say, a timeout associated with the sendmsg() syscall.
	With this change we delay copying bytes in until they are needed and only copy what can potentially be sent or held in the socket buffer, reducing the need to repeatedly copy data over.
	Also a minor fix to change state to FIN-WAIT-1 when shutdown(..., SHUT_WR) is called instead of when we transmit the actual FIN. Otherwise the socket could remain in CONNECTED state even though the user has called shutdown() on the socket.
	Updates #627
	PiperOrigin-RevId: 263430505
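A minimal sketch of the "copy only what fits" idea from this commit. The `sendBuffer` type, its `limit` field, and `errWouldBlock` are hypothetical stand-ins, not netstack's actual types or error values.

```go
package main

import (
	"errors"
	"fmt"
)

// errWouldBlock is a hypothetical stand-in for a would-block error.
var errWouldBlock = errors.New("operation would block")

// sendBuffer is a hypothetical bounded socket send buffer.
type sendBuffer struct {
	data  []byte
	limit int // maximum number of bytes the buffer may hold
}

// write copies only as many payload bytes as currently fit in the buffer.
// The rest of the payload is never copied, so a large write against a
// nearly full buffer costs a small copy rather than a full one.
func (b *sendBuffer) write(payload []byte) (int, error) {
	free := b.limit - len(b.data)
	if free <= 0 {
		return 0, errWouldBlock
	}
	if free > len(payload) {
		free = len(payload)
	}
	b.data = append(b.data, payload[:free]...)
	return free, nil
}

func main() {
	b := &sendBuffer{limit: 4}
	n, err := b.write([]byte("hello world"))
	fmt.Println(n, err)
	n, err = b.write([]byte("more"))
	fmt.Println(n, err)
}
```

The caller keeps ownership of the uncopied tail and can retry with it once the buffer drains.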
2019-08-09	netlink: return an error in nlmsgerr	Andrei Vagin
	Now if a process sends an unsupported netlink request, an error is returned from the send system call. The Linux kernel works differently in this case. It returns errors in the nlmsgerr netlink message. Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com PiperOrigin-RevId: 262690453
2019-08-08	Return a well-defined socket address type from socket functions.	Rahat Mahmood
	Previously we were representing socket addresses as an interface{}, which allowed any type which could be binary.Marshal()ed to be used as a socket address. This is fine when the address is passed to userspace via the linux ABI, but is problematic when used from within the sentry such as by networking procfs files. PiperOrigin-RevId: 262460640
2019-08-08	netstack: Don't start endpoint goroutines too soon on restore.	Rahat Mahmood
	Endpoint protocol goroutines were previously started as part of loading the endpoint. This is potentially too soon, as resources used by these goroutines may not have been loaded. Protocol goroutines may perform meaningful work as soon as they're started (e.g. incoming connect), which can cause them to indirectly access resources that haven't been loaded yet. This CL defers resuming all protocol goroutines until the end of restore. PiperOrigin-RevId: 262409429
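The deferral described above can be sketched as a queue of start functions that only runs once every object has been loaded. The `restore` type and its methods are hypothetical and only illustrate the ordering. Waiting for the goroutines in `finish` is a simplification for the example, since real protocol goroutines would keep running.

```go
package main

import (
	"fmt"
	"sync"
)

// restore is a hypothetical restore context.
type restore struct {
	loaded  map[string]bool
	pending []func() // start functions deferred until finish
}

// load marks an object as loaded and queues its goroutine body
// instead of starting it immediately.
func (r *restore) load(name string, start func()) {
	r.loaded[name] = true
	r.pending = append(r.pending, start)
}

// finish starts every deferred goroutine. By this point all objects
// are loaded, so the goroutines can safely touch each other's state.
func (r *restore) finish() {
	var wg sync.WaitGroup
	for _, start := range r.pending {
		wg.Add(1)
		go func(f func()) {
			defer wg.Done()
			f()
		}(start)
	}
	r.pending = nil
	wg.Wait() // demo only: lets main observe the results
}

func main() {
	r := &restore{loaded: map[string]bool{}}
	var sawRoutes bool
	// The endpoint's goroutine depends on "routes", which is loaded later.
	r.load("endpoint", func() { sawRoutes = r.loaded["routes"] })
	r.load("routes", func() {})
	r.finish()
	fmt.Println(sawRoutes)
}
```

Had `load` started the endpoint's goroutine right away, it could have run before "routes" was loaded.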
2019-08-08	Merge pull request #653 from xiaobo55x:dev	gVisor bot
	PiperOrigin-RevId: 262402929
2019-08-05	Change syscall.EPOLLET to unix.EPOLLET	Haibo Xu
	syscall.EPOLLET has been defined with different values on amd64 and arm64 (-0x80000000 on amd64, and 0x80000000 on arm64), while unix.EPOLLET unifies this value to 0x80000000 (golang/go#5328). ref #63 Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: Id97d075c4e79d86a2ea3227ffbef02d8b00ffbb8
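The two values quoted in this commit share the same 32-bit pattern but differ once widened, which is why a single unsigned definition is easier to use in event masks. The sketch below uses locally defined constants (hypothetical names carrying the values quoted above) rather than importing `syscall` or `golang.org/x/sys/unix`, so it builds on any platform.

```go
package main

import "fmt"

const (
	// Value the commit reports for syscall.EPOLLET on amd64.
	epollETSigned = -0x80000000
	// Value the commit reports for unix.EPOLLET.
	epollETUnsigned = 0x80000000
)

// hasEdgeTrigger reports whether the edge-triggered bit is set
// in a 32-bit epoll event mask.
func hasEdgeTrigger(events uint32) bool {
	return events&epollETUnsigned != 0
}

func main() {
	signed := int32(epollETSigned)

	// Same bits when reinterpreted as an unsigned 32-bit mask.
	fmt.Println(uint32(signed) == epollETUnsigned)

	// Different values once widened to 64 bits (sign extension).
	fmt.Println(int64(signed) == int64(epollETUnsigned))

	fmt.Println(hasEdgeTrigger(uint32(signed)))
}
```

Note that `uint32(epollETSigned)` written directly on the constant would not compile, because Go rejects constant conversions that overflow. The conversion has to go through a variable, as above.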
2019-08-02	Plumbing for iptables sockopts.	Kevin Krakauer
	PiperOrigin-RevId: 261413396
2019-08-02	Automated rollback of changelist 261191548	Rahat Mahmood
	PiperOrigin-RevId: 261373749
2019-08-01	Implement getsockopt(TCP_INFO).	Rahat Mahmood
	Export some readily-available fields for TCP_INFO and stub out the rest. PiperOrigin-RevId: 261191548
2019-07-31	Basic support for 'ip route'	Ian Lewis
	Implements support for RTM_GETROUTE requests for netlink sockets. Fixes #507 PiperOrigin-RevId: 261051045
2019-07-24	Add support for a subnet prefix length on interface network addresses	Chris Kuiper
	This allows the user code to add a network address with a subnet prefix length. The prefix length value is stored in the network endpoint and provided back to the user in the ProtocolAddress type. PiperOrigin-RevId: 259807693
2019-07-18	net/tcp/setsockopt: implement setsockopt(fd, SOL_TCP, TCP_INQ)	Andrei Vagin
	PiperOrigin-RevId: 258859507
2019-07-17	Add AF_UNIX, SOCK_RAW sockets, which exist for some reason.	Kevin Krakauer
	tcpdump creates these. PiperOrigin-RevId: 258611829
2019-07-15	Support /proc/net/dev	Jianfeng Tan
	This proc file reports the stats of interfaces. We can use the ifconfig command to check the result. Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: Ia7c1e637f5c76c30791ffda68ee61e861b6ef827 COPYBARA_INTEGRATE_REVIEW=https://gvisor-review.googlesource.com/c/gvisor/+/18282/ PiperOrigin-RevId: 258303936
2019-07-12	Add IPPROTO_RAW, which allows raw sockets to write IP headers.	Kevin Krakauer
	iptables also relies on IPPROTO_RAW in a way. It opens such a socket to manipulate the kernel's tables, but it doesn't actually use any of the functionality. Blegh. PiperOrigin-RevId: 257903078
2019-07-12	Stub out support for TCP_MAXSEG.	Bhasker Hariharan
	Adds support to set/get the TCP_MAXSEG value but does not really change the segment sizes emitted by netstack or alter the MSS advertised by the endpoint. This is currently being added only to unblock iperf3 on gVisor. Plumbing this correctly requires a bit more work which will come in separate CLs. PiperOrigin-RevId: 257859112
2019-07-03	netstack/udp: connect with the AF_UNSPEC address family means disconnect	Andrei Vagin
	PiperOrigin-RevId: 256433283
2019-07-02	Remove map from fd_map, change to fd_table.	Adin Scannell
	This renames FDMap to FDTable and drops the kernel.FD type, which had an entire package to itself and didn't serve much use (it was freely cast between types, and served as more of an annoyance than providing any protection.) Based on BenchmarkFDLookupAndDecRef-12, we can expect 5-10 ns per lookup operation, and 10-15 ns per concurrent lookup operation of savings. This also fixes two tangential usage issues with the FDMap. Namely, non-atomic use of NewFDFrom and associated calls to Remove (that are both racy and fail to drop the reference on the underlying file.) PiperOrigin-RevId: 256285890
2019-07-01	Fix unix/transport.queue reference leaks.	Ian Gudger
	Fix two leaks for connectionless Unix sockets:
	* Double connect: Subsequent connects would leak a reference on the previously connected endpoint.
	* Close unconnected: Sockets which were not connected at the time of closure would leak a reference on their receiver.
	PiperOrigin-RevId: 256070451
2019-06-28	Add finalizer on AtomicRefCount to check for leaks.	Ian Gudger
	PiperOrigin-RevId: 255711454
2019-06-27	Complete pipe support on overlayfs	Fabricio Voznika
	Get/Set pipe size and ioctl support were missing from overlayfs. It required moving the pipe.Sizer interface to fs so that overlay could get access. Fixes #318 PiperOrigin-RevId: 255511125
2019-06-27	Fix various spelling issues in the documentation	Michael Pratt
	Addresses obvious typos, in the documentation only. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65 PiperOrigin-RevId: 255477779
2019-06-18	gvisor/fs: don't update file.offset for sockets, pipes, etc	Andrei Vagin
	sockets, pipes and other non-seekable file descriptors don't use file.offset, so we don't need to update it. With this change, we will be able to call file operations without locking the file.mu mutex. This is already used for pipes in the splice system call. PiperOrigin-RevId: 253746644
2019-06-13	Add support for TCP receive buffer auto tuning.	Bhasker Hariharan
	The implementation is similar to Linux, where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accommodate the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the Linux receive buffer auto-tuning is available at https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283
2019-06-13	Plumb context through more layers of filesystem.	Ian Gudger
	All functions which allocate objects containing AtomicRefCounts will soon need a context. PiperOrigin-RevId: 253147709
2019-06-13	Implement getsockopt() SO_DOMAIN, SO_PROTOCOL and SO_TYPE.	Rahat Mahmood
	SO_TYPE was already implemented for everything but netlink sockets. PiperOrigin-RevId: 253138157
2019-06-13	Update canonical repository.	Adin Scannell
	This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620
2019-06-12	Add support for TCP_CONGESTION socket option.	Bhasker Hariharan
	This CL also cleans up the error returned for setting congestion control which was incorrectly returning EINVAL instead of ENOENT. PiperOrigin-RevId: 252889093
2019-06-10	Store more information in the kernel socket table.	Rahat Mahmood
	Store enough information in the kernel socket table to distinguish between different types of sockets. Previously we were only storing the socket family, but this isn't enough to classify sockets. For example, TCPv4 and UDPv4 sockets are both AF_INET, and ICMP sockets are SOCK_DGRAM sockets with a particular protocol. Instead of creating more sub-tables, flatten the socket table and provide a filtering mechanism based on the socket entry. Also generate and store a socket entry index ("sl" in linux) which allows us to output entries in a stable order from procfs. PiperOrigin-RevId: 252495895
2019-06-06	Use common definition of SockType.	Rahat Mahmood
	SockType isn't specific to unix domain sockets, and the current definition basically mirrors the linux ABI's definition. PiperOrigin-RevId: 251956740
2019-06-06	Track and export socket state.	Rahat Mahmood
	This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850
2019-06-03	gvisor/sock/unix: pass creds when a message is sent between unconnected sockets	Andrei Vagin
	and don't report a sender address if it doesn't have one PiperOrigin-RevId: 251371284
2019-05-30	Fixes to TCP listen behavior.	Bhasker Hariharan
	Netstack listen loop can get stuck if cookies are in use and the app is slow to accept incoming connections. Further, we continue to complete the handshake for a connection even if the backlog is full. This creates a problem when lots of connections come in rapidly and we end up with lots of completed connections just hanging around to be delivered.
	These fixes change netstack behaviour to mirror what Linux does, as described in the following article: http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html
	Now when cookies are not in use, Netstack will silently drop the ACK to a SYN-ACK and not complete the handshake if the backlog is full. This will result in the connection staying in a half-complete state. Eventually the sender will retransmit the ACK, and if the backlog has space we will transition to a connected state and deliver the endpoint.
	Similarly, when cookies are in use we do not try to create an endpoint unless there is space in the accept queue to accept the newly created endpoint. If there is no space then we again silently drop the ACK, as we can just recreate it when the ACK is retransmitted by the peer.
	We also now use the backlog to cap the size of the SYN-RCVD queue for a given endpoint. So at any time there can be N connections in the backlog and N in a SYN-RCVD state if the application is not accepting connections. Any new SYNs will be dropped.
	This CL also fixes another small bug where we mark a new endpoint which has not completed the handshake as connected. We should wait until the handshake successfully completes before marking it connected.
	Updates #236
	PiperOrigin-RevId: 250717817
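The backlog rules described in this commit (without cookies) can be sketched as a toy state machine. The `listener` type, its fields, and the string peer identifiers are all hypothetical. The sketch models only the two queue limits, not real TCP segments, timers, or SYN cookies.

```go
package main

import "fmt"

// listener is a hypothetical model of a listening endpoint.
type listener struct {
	backlog     int
	synRcvd     map[string]bool // peers whose handshake is half-complete
	acceptQueue []string        // completed connections awaiting accept
}

func newListener(backlog int) *listener {
	return &listener{backlog: backlog, synRcvd: map[string]bool{}}
}

// handleSYN admits a new half-open connection unless the SYN-RCVD
// queue already holds backlog entries, in which case the SYN is dropped.
func (l *listener) handleSYN(peer string) bool {
	if len(l.synRcvd) >= l.backlog {
		return false
	}
	l.synRcvd[peer] = true
	return true
}

// handleACK completes the handshake only if the accept queue has space.
// Otherwise the ACK is silently dropped and the connection stays
// half-open until the peer retransmits.
func (l *listener) handleACK(peer string) bool {
	if !l.synRcvd[peer] || len(l.acceptQueue) >= l.backlog {
		return false
	}
	delete(l.synRcvd, peer)
	l.acceptQueue = append(l.acceptQueue, peer)
	return true
}

// accept hands the oldest completed connection to the application.
func (l *listener) accept() (string, bool) {
	if len(l.acceptQueue) == 0 {
		return "", false
	}
	peer := l.acceptQueue[0]
	l.acceptQueue = l.acceptQueue[1:]
	return peer, true
}

func main() {
	l := newListener(1)
	fmt.Println(l.handleSYN("a"), l.handleACK("a")) // a completes
	fmt.Println(l.handleSYN("b"), l.handleACK("b")) // b stays half-open
	fmt.Println(l.handleSYN("c"))                   // SYN-RCVD queue is full
	l.accept()
	fmt.Println(l.handleACK("b")) // retransmitted ACK now succeeds
}
```

With a backlog of N, this model holds at most N completed and N half-open connections, matching the limits stated in the commit message.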
2019-05-30	gvisor: socket() returns EPROTONOSUPPORT if protocol is not supported	Andrei Vagin
	PiperOrigin-RevId: 250426407
2019-05-22	UDP and TCP raw socket support.	Kevin Krakauer
	PiperOrigin-RevId: 249511348 Change-Id: I34539092cc85032d9473ff4dd308fc29dc9bfd6b