gvisor - Container Runtime Sandbox

Age	Commit message	Author
2020-12-17	[netstack] Implement MSG_ERRQUEUE flag for recvmsg(2).	Ayush Ranjan
	Introduces the per-socket error queue and the necessary cmsg mechanisms. PiperOrigin-RevId: 348028508
2020-12-17	Remove duplicate `return`	Tamir Duberstein
	PiperOrigin-RevId: 347974624
2020-12-16	Cleanup locking in multicast group protocol tests	Ghanan Gowripalan
	Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 347928471
2020-12-16	Automated rollback of changelist 346565589	gVisor bot
	PiperOrigin-RevId: 347911316
2020-12-16	Merge pull request #4880 from lubinszARM:pr_tlbi_02	gVisor bot
	PiperOrigin-RevId: 347890782
2020-12-16	Add support to count the number of packets SACKed.	Nayana Bidari
	sacked_out is required in RACK to check the number of duplicate acknowledgements when updating the reorder window. If there is no reordering and the value of sacked_out is greater than the classic threshold value of 3, then the reorder window is set to zero. sacked_out is calculated by counting the number of segments SACKed in the ACK and is reduced when a cumulative ACK is received that covers the SACK blocks. It is set to zero when the connection enters recovery. PiperOrigin-RevId: 347872246
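The sacked_out bookkeeping described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the netstack code: the `sackState` type, its method names, and the integer window units are all hypothetical, and the comparison follows the commit message's wording ("greater than" the threshold of 3).

```go
package main

import "fmt"

// dupAckThreshold is the classic duplicate-ACK threshold named in the
// commit message.
const dupAckThreshold = 3

// sackState is a hypothetical, stripped-down stand-in for TCP sender state.
type sackState struct {
	sackedOut   int  // segments SACKed but not yet cumulatively ACKed
	reorderSeen bool // whether reordering has been observed
}

// onSACK counts segments newly reported in SACK blocks of an ACK.
func (s *sackState) onSACK(newlySACKed int) { s.sackedOut += newlySACKed }

// onCumulativeACK reduces the count by the SACKed segments that the
// cumulative ACK now covers.
func (s *sackState) onCumulativeACK(covered int) {
	s.sackedOut -= covered
	if s.sackedOut < 0 {
		s.sackedOut = 0
	}
}

// enterRecovery resets the count when the connection enters recovery.
func (s *sackState) enterRecovery() { s.sackedOut = 0 }

// reorderWindow returns zero when no reordering has been seen and
// sackedOut exceeds the threshold; otherwise it returns base unchanged.
func (s *sackState) reorderWindow(base int) int {
	if !s.reorderSeen && s.sackedOut > dupAckThreshold {
		return 0
	}
	return base
}

func main() {
	s := &sackState{}
	s.onSACK(4)
	fmt.Println(s.reorderWindow(10))
	s.onCumulativeACK(2)
	fmt.Println(s.reorderWindow(10))
}
```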
2020-12-16	Ensure correctness of saved receive window	Mithun Iyer
	When the scaled receive window size exceeds 65535 (max uint16), we advertise the scaled value as 65535, but we do not adjust the saved receive window value when doing so. This makes our current window calculation logic incorrect, because the saved receive window value differs from what was advertised. Fixes #4903 PiperOrigin-RevId: 347771340
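The arithmetic behind the fix above can be illustrated with a small sketch. The `advertise` function and its parameters are hypothetical names, not the netstack API; the point is only that the saved window should be derived from the clamped 16-bit field, not from the unclamped value.

```go
package main

import (
	"fmt"
	"math"
)

// advertise returns the 16-bit window field to place in the TCP header and
// the unscaled window that this field actually represents to the peer.
// Saving the second value keeps local bookkeeping consistent with what was
// advertised, including when the field is clamped to 65535.
func advertise(rcvWnd uint32, wndScale uint8) (field uint16, saved uint32) {
	scaled := rcvWnd >> wndScale
	if scaled > math.MaxUint16 {
		scaled = math.MaxUint16
	}
	return uint16(scaled), scaled << wndScale
}

func main() {
	field, saved := advertise(1<<20, 2)
	fmt.Println(field, saved)
}
```

With a 1 MiB window and a scale of 2, the scaled value (262144) does not fit in 16 bits, so the field is clamped to 65535 and the saved window becomes 65535 << 2 = 262140 instead of the original 1048576.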
2020-12-15	Validate router alert's data length	Ghanan Gowripalan
	RFC 2711 specifies that the router alert's length field is always 2 so we should make sure only 2 bytes are read from a router alert option's data field. Test: header.TestIPv6OptionsExtHdrIterErr PiperOrigin-RevId: 347727876
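A length check of the kind described above might look like the following sketch. The function name, constant, and error value are assumptions made for illustration and are not taken from the gVisor header package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// routerAlertDataLen is the fixed data length of the router alert option
// (2 bytes, per the commit message's reading of RFC 2711).
const routerAlertDataLen = 2

// errBadRouterAlertLen is a hypothetical error for a malformed option.
var errBadRouterAlertLen = errors.New("router alert option data must be exactly 2 bytes")

// parseRouterAlert validates the data length before reading the 16-bit
// value, so that no more and no fewer than 2 bytes are consumed.
func parseRouterAlert(data []byte) (uint16, error) {
	if len(data) != routerAlertDataLen {
		return 0, errBadRouterAlertLen
	}
	return binary.BigEndian.Uint16(data), nil
}

func main() {
	v, err := parseRouterAlert([]byte{0x00, 0x00})
	fmt.Println(v, err)

	_, err = parseRouterAlert([]byte{0x00, 0x00, 0x00})
	fmt.Println(err)
}
```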
2020-12-15	Internal change.	Andrei Vagin
	PiperOrigin-RevId: 347720083
2020-12-15	Don't split enabled flag across multicast group state	Ghanan Gowripalan
	Startblock: has LGTM from asfez and then add reviewer brunodalbo PiperOrigin-RevId: 347716242
2020-12-15	Implement command SEM_INFO and SEM_STAT for semctl.	Jing Chen
	PiperOrigin-RevId: 347711998
2020-12-15	Change violation mode to an enum	Chong Cai
	PiperOrigin-RevId: 347706953
2020-12-15	[syzkaller] Avoid AIOContext from resurrecting after being marked dead.	Ayush Ranjan
	syzkaller reported the closing of a nil channel. This is only possible when the AIOContext was destroyed twice. Some scenarios that could lead to this: - It died, then someone called aioCtx.Prepare() on it and killed it again, which could cause the double destroy. The context could have been destroyed between the call to LookupAIOContext() and Prepare(). - aioManager was destroyed but did not update the contexts map, so Lookup could still return a dead AIOContext and then someone could call Prepare on it and kill it again. This change adds a check in aioCtx.Prepare() for the context being dead, which prevents a dead context from resurrecting. It also refactors the code to destroy the aioContext consistently. Earlier we were not munmapping the aioContexts that were destroyed upon aioManager destruction. Reported-by: syzbot+ef6a588d0ce6059991d2@syzkaller.appspotmail.com PiperOrigin-RevId: 347704347
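The guard described above can be sketched in isolation. This is a simplified model under stated assumptions: the `aioContext` type, its fields, and the idempotent `destroy` are hypothetical and only mirror the idea of refusing to prepare new work on a dead context.

```go
package main

import (
	"fmt"
	"sync"
)

// aioContext is a hypothetical, reduced model of an AIO context.
type aioContext struct {
	mu          sync.Mutex
	dead        bool
	outstanding int
	done        chan struct{}
}

func newAIOContext() *aioContext {
	return &aioContext{done: make(chan struct{})}
}

// prepare reserves a slot for a new request. It refuses to do so once the
// context is dead, so a dead context cannot be brought back to life.
func (c *aioContext) prepare() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.dead {
		return false
	}
	c.outstanding++
	return true
}

// destroy marks the context dead and closes its channel at most once.
func (c *aioContext) destroy() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.dead {
		return
	}
	c.dead = true
	close(c.done)
}

func main() {
	c := newAIOContext()
	fmt.Println(c.prepare())
	c.destroy()
	fmt.Println(c.prepare())
	c.destroy() // second destroy is a no-op in this sketch
}
```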
2020-12-15	[netstack] Make recvmsg(2) call to host in hostinet even if dst is empty.	Ayush Ranjan
	We want to make the recvmsg syscall to the host regardless of whether dst is empty so that: - Host can populate the control messages if necessary. - Host can return sender address. - Host can return appropriate errors. Earlier, because we were using the IOSequence.CopyOutFrom() API, the usermem package did not even call the Reader function if the destination was empty (as an optimization). PiperOrigin-RevId: 347684566
2020-12-15	Internal change.	gVisor bot
	PiperOrigin-RevId: 347671070
2020-12-15	Merge pull request #4722 from zhlhahaha:2010	gVisor bot
	PiperOrigin-RevId: 347660920
2020-12-15	Fix error code for connect in raw sockets.	Nayana Bidari
	PiperOrigin-RevId: 347650354
2020-12-15	Fix a data race in packetEPs	Ting-Yu Wang
	packetEPs may get into a state where `len < cap`, causing append() to modify the original slice storage. Reported-by: syzbot+978dd0e9c2600ab7a76b@syzkaller.appspotmail.com PiperOrigin-RevId: 347634351
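The slice behaviour behind this race is general Go semantics and can be reproduced in a few lines. The sketch below is not the packetEPs code; the function names are made up, and the full slice expression shown is just one common way to force append() to copy, not necessarily what the commit does.

```go
package main

import "fmt"

// sharedAppend appends twice to a slice with spare capacity. Both results
// share the same backing array, so the second append overwrites the
// element written by the first.
func sharedAppend() (int, int) {
	base := make([]int, 2, 4) // len < cap
	a := append(base, 10)
	b := append(base, 20)
	return a[2], b[2]
}

// isolatedAppend limits capacity with a full slice expression, so each
// append must allocate a fresh backing array.
func isolatedAppend() (int, int) {
	base := make([]int, 2, 4)
	capped := base[:len(base):len(base)] // cap == len
	a := append(capped, 10)
	b := append(capped, 20)
	return a[2], b[2]
}

func main() {
	fmt.Println(sharedAppend())   // 20 20
	fmt.Println(isolatedAppend()) // 10 20
}
```

When two goroutines perform such appends concurrently on the same slice, the shared write becomes a data race instead of a merely surprising result.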
2020-12-14	Update containerd/cgroups	Fabricio Voznika
	PiperOrigin-RevId: 347532687
2020-12-14	[netstack] Update raw socket and hostinet control message parsing.	Ayush Ranjan
	There are surprisingly few syscall tests that run with hostinet. For example, running the following command returns only two results: `bazel query test/syscalls:all | grep hostnet`. I think that, as a result, hostinet was left behind as our control messages evolved. Update it to support all control messages that netstack supports. This change also updates sentry's control message parsing logic to bring it up to date with all the control messages we support. PiperOrigin-RevId: 347508892
2020-12-14	Move SO_LINGER option to socketops.	Nayana Bidari
	PiperOrigin-RevId: 347437786
2020-12-14	Do not check for reference leaks after saving.	Dean Deng
	We should not assert that all resources are dropped after saving. PiperOrigin-RevId: 347420131
2020-12-14	Move SO_ERROR and SO_OOBINLINE option to socketops.	Nayana Bidari
	The SO_OOBINLINE option is set and read as a boolean value, which is the same as Linux. As we currently do not support disabling this option, we always return it as true. PiperOrigin-RevId: 347413905
2020-12-12	Reduce the memory overhead in IP fragment management	Toshi Kikuchi
	- Deep-copy pkt.Data and hold it instead of shallow-copy (vv.Clone). This allows the pkt's backing array, which includes the header portion, to be freed. - Remove fragHeap. The fragments are now held in holes struct instead. - Stop reserving the initial capacity of holes slice. PiperOrigin-RevId: 347198744
2020-12-12	Introduce IPv6 extension header serialization facilities	Bruno Dal Bo
	Adds IPv6 extension header serializer and Hop by Hop options serializer. Add RouterAlert option serializer and use it in MLD. Fixed #4996 Startblock: has LGTM from marinaciocea and then add reviewer ghanan PiperOrigin-RevId: 347174537
2020-12-11	Internal change.	gVisor bot
	PiperOrigin-RevId: 347091372
2020-12-11	Make fixes to vfs2 leak checking.	Dean Deng
	PiperOrigin-RevId: 347089828
2020-12-11	Add runsc symbolize command.	Dean Deng
	This command takes instruction pointers from stdin and converts them into their corresponding file names and line/column numbers in the runsc source code. The inputs are not interpreted as actual addresses, but as synthetic values that are exposed through /sys/kernel/debug/kcov. One can extract coverage information from kcov and translate those values into locations in the source code by running symbolize on the same runsc binary. This will allow us to generate syzkaller coverage reports. PiperOrigin-RevId: 347089624
2020-12-11	Fix panic when IPv4 address is used in sendmsg for IPv6 sockets	Nayana Bidari
	We no longer rely on errors for getsockopt options (which have boolean values). This causes an issue in sendmsg, where we used to return an error for the IPV6_V6ONLY option. Fix the panic by returning an error (for sockets other than TCP and UDP) if the address does not match the type (AF_INET/AF_INET6) of the socket. PiperOrigin-RevId: 347063838
2020-12-11	Remove existing nogo exceptions.	Adin Scannell
	PiperOrigin-RevId: 347047550
2020-12-11	[netstack] Decouple tcpip.ControlMessages from the IP control messages.	Ayush Ranjan
	tcpip.ControlMessages cannot contain Linux-specific structures, which makes it painful to convert back and forth from Linux to tcpip and back to Linux when passing around control messages in hostinet and raw sockets. Now we convert to the Linux version of the control message as soon as we are out of tcpip. PiperOrigin-RevId: 347027065
2020-12-11	Make semctl IPC_INFO cmd return the index of highest used entry.	Jing Chen
	PiperOrigin-RevId: 346973338
2020-12-10	Change merkle root file name to avoid collision	Chong Cai
	PiperOrigin-RevId: 346923826
2020-12-10	Disable host reassembly for fragments.	Bhasker Hariharan
	The fdbased endpoint was enabling fragment reassembly on the host AF_PACKET socket to ensure that fragments are delivered in order to the right dispatcher. But this prevents fragments from being delivered to gVisor at all and makes testing of gVisor's fragment reassembly code impossible. The potential impact of this change is minimal, since IP fragmentation is not really that prevalent. In cases where we do get fragments, we may deliver them out of order to the TCP layer, as multiple network dispatchers may process the fragments and deliver a reassembled fragment after the next packet has been delivered to the TCP endpoint. While this is not desirable, I believe the impact is minimal due to the low prevalence of fragmentation. Also removed the PktType and Hatype fields when binding the socket, as these are not used when binding. It's just confusing to have them specified. See: https://man7.org/linux/man-pages/man7/packet.7.html "Fields used for binding are sll_family (should be AF_PACKET), sll_protocol, and sll_ifindex." Fixes #5055 PiperOrigin-RevId: 346919439
2020-12-10	Use specified source address for IGMP/MLD packets	Ghanan Gowripalan
	This change also considers interfaces and network endpoints enabled up to the point that all work to disable them is complete. This was needed so that protocols can perform shutdown work while being disabled (e.g. sending a packet, which requires the endpoint to be enabled to obtain a source address). Bug #4682, #4861 Fixes #4888 Startblock: has LGTM from peterjohnston and then add reviewer brunodalbo PiperOrigin-RevId: 346869702
2020-12-09	Add support for IP_RECVORIGDSTADDR IP option.	Bhasker Hariharan
	Fixes #5004 PiperOrigin-RevId: 346643745
2020-12-09	Add //pkg/sync:generic_atomicptrmap.	Jamie Liu
	AtomicPtrMap is a generic concurrent map from arbitrary keys to arbitrary pointer values. Benchmarks:
	name	time/op
	StoreDelete/RWMutexMap-12	335ns ± 1%
	StoreDelete/SyncMap-12	705ns ± 3%
	StoreDelete/AtomicPtrMap-12	287ns ± 4%
	StoreDelete/AtomicPtrMapSharded-12	289ns ± 1%
	LoadOrStoreDelete/RWMutexMap-12	342ns ± 2%
	LoadOrStoreDelete/SyncMap-12	662ns ± 2%
	LoadOrStoreDelete/AtomicPtrMap-12	290ns ± 7%
	LoadOrStoreDelete/AtomicPtrMapSharded-12	293ns ± 2%
	LookupPositive/RWMutexMap-12	101ns ±26%
	LookupPositive/SyncMap-12	202ns ± 2%
	LookupPositive/AtomicPtrMap-12	71.1ns ± 2%
	LookupPositive/AtomicPtrMapSharded-12	73.2ns ± 1%
	LookupNegative/RWMutexMap-12	119ns ± 1%
	LookupNegative/SyncMap-12	154ns ± 1%
	LookupNegative/AtomicPtrMap-12	84.7ns ± 3%
	LookupNegative/AtomicPtrMapSharded-12	86.8ns ± 1%
	Concurrent/FixedKeys_1PercentWrites_RWMutexMap-12	1.32µs ± 2%
	Concurrent/FixedKeys_1PercentWrites_SyncMap-12	52.7ns ±10%
	Concurrent/FixedKeys_1PercentWrites_AtomicPtrMap-12	31.8ns ±20%
	Concurrent/FixedKeys_1PercentWrites_AtomicPtrMapSharded-12	24.0ns ±15%
	Concurrent/FixedKeys_10PercentWrites_RWMutexMap-12	860ns ± 3%
	Concurrent/FixedKeys_10PercentWrites_SyncMap-12	68.8ns ±20%
	Concurrent/FixedKeys_10PercentWrites_AtomicPtrMap-12	98.6ns ± 7%
	Concurrent/FixedKeys_10PercentWrites_AtomicPtrMapSharded-12	42.0ns ±25%
	Concurrent/FixedKeys_50PercentWrites_RWMutexMap-12	1.17µs ± 3%
	Concurrent/FixedKeys_50PercentWrites_SyncMap-12	136ns ±34%
	Concurrent/FixedKeys_50PercentWrites_AtomicPtrMap-12	286ns ± 3%
	Concurrent/FixedKeys_50PercentWrites_AtomicPtrMapSharded-12	115ns ±35%
	Concurrent/ChangingKeys_1PercentWrites_RWMutexMap-12	1.27µs ± 2%
	Concurrent/ChangingKeys_1PercentWrites_SyncMap-12	5.01µs ± 3%
	Concurrent/ChangingKeys_1PercentWrites_AtomicPtrMap-12	38.1ns ± 3%
	Concurrent/ChangingKeys_1PercentWrites_AtomicPtrMapSharded-12	22.6ns ± 2%
	Concurrent/ChangingKeys_10PercentWrites_RWMutexMap-12	1.08µs ± 2%
	Concurrent/ChangingKeys_10PercentWrites_SyncMap-12	5.97µs ± 1%
	Concurrent/ChangingKeys_10PercentWrites_AtomicPtrMap-12	390ns ± 2%
	Concurrent/ChangingKeys_10PercentWrites_AtomicPtrMapSharded-12	93.6ns ± 1%
	Concurrent/ChangingKeys_50PercentWrites_RWMutexMap-12	1.77µs ± 2%
	Concurrent/ChangingKeys_50PercentWrites_SyncMap-12	8.07µs ± 2%
	Concurrent/ChangingKeys_50PercentWrites_AtomicPtrMap-12	1.61µs ± 2%
	Concurrent/ChangingKeys_50PercentWrites_AtomicPtrMapSharded-12	386ns ± 1%
	Updates #231 PiperOrigin-RevId: 346614776
2020-12-09	[netstack] Make tcpip.Error savable.	Ayush Ranjan
	Earlier we could not save tcpip.Error objects in structs because, upon restore, the constant's address changes in netstack's error translation map. Translating the error would then panic, because the map is keyed on the address of the tcpip.Error instead of the error itself. The translation map now uses the error message as the key instead of the address. Added relevant synchronization mechanisms to protect the structure and initialize it upon restore. PiperOrigin-RevId: 346590485
2020-12-09	Do not perform IGMP/MLD on loopback interfaces	Ghanan Gowripalan
	The loopback interface will never have any neighbouring nodes so advertising its interest in multicast groups is unnecessary. Bug #4682, #4861 Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 346587604
2020-12-09	Cap UDP payload size to length informed in UDP header	Bruno Dal Bo
	startblock: has LGTM from peterjohnston and then add reviewer ghanan,tamird PiperOrigin-RevId: 346565589
2020-12-09	Prepare for supporting cross compilation.	Andrei Vagin
	PiperOrigin-RevId: 346496532
2020-12-09	export MountTempDirectory	Zeling Feng
	PiperOrigin-RevId: 346487763
2020-12-07	Fix error handling on fusefs mount.	Rahat Mahmood
	Don't propagate arbitrary Go errors up from fusefs, because errors that don't map to an errno result in a sentry panic. Reported-by: syzbot+697cb635346e456fddfc@syzkaller.appspotmail.com PiperOrigin-RevId: 346220306
2020-12-07	Export IGMP stats	Arthur Sfez
	PiperOrigin-RevId: 346197760
2020-12-07	Remove stale comment	Sam Balana
	Removes comment lines about MaxUnsolicitedReportDelay. This is already documented in the comment for GenericMulticastProtocolOptions. PiperOrigin-RevId: 346185053
2020-12-07	Merge pull request #4908 from lubinszARM:pr_kvm_ext_dabt	gVisor bot
	PiperOrigin-RevId: 346143528
2020-12-07	Merge pull request #4874 from zhlhahaha:2022	gVisor bot
	PiperOrigin-RevId: 346134026
2020-12-07	Remove p9.fidRef.openedMu	Michael Pratt
	openedMu has lock ordering violations. Most locks go through OpenedFlag(), which is usually taken after renameMu and opMu. On the other hand, Tlopen takes openedMu before renameMu and opMu (via safelyRead). Resolving this violation is simple: just drop openedMu. The opened and openFlags fields are already protected by opMu in most cases, renameMu (for write) in one case (via safelyGlobal), and only in doWalk by neither. This is a bit ugly because opMu is supposed to be a "semantic" lock, but it works. I'm open to other suggestions. Note that doWalk has a race condition where a FID may open after the open check but before actually walking. This race existed before this change as well; it is not clear if it is problematic. PiperOrigin-RevId: 346108483
2020-12-07	Support icmpv6 transport protocol	Peter Johnston
	PiperOrigin-RevId: 346101076
2020-12-05	Fix zero receive window advertisements.	Mithun Iyer
	With the recent changes in db36d948fa63ce950d94a5e8e9ebc37956543661, we try to balance the receive window advertisements between payload length and segment overhead length. This works fine when segment sizes are much larger than the overhead, but not otherwise. In cases where the segment length is smaller than the segment overhead, we may end up not advertising a zero receive window for a long time and end up tail-dropping segments. This is especially pronounced when application socket reads are slow or stopped. In this change we do not grow the right edge of the receive window for smaller segment sizes, similar to Linux. Also, we keep track of the socket buffer usage and let the window grow if the application is actively reading data. Fixes #4903 PiperOrigin-RevId: 345832012