summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2021-04-13Fix listener close, client connect raceMithun Iyer
Fix a race where the ACK completing the handshake can be dropped by a closing listener without RST to the peer. The listener close would reset the accepted queue and that causes the connecting endpoint in SYNRCVD state to drop the ACK thinking the queue if filled up. PiperOrigin-RevId: 368165509
2021-04-12Add DecRef for verity FDs that were missingChong Cai
Some FileDescriptions in verity fs were opened but DecRef() were missing after used. This could result in a ref leak. PiperOrigin-RevId: 368096759
2021-04-12Don't grab TaskSet mu recursively when reading task state.Rahat Mahmood
Reported-by: syzbot+a6ef0f95a2c9e7da26f3@syzkaller.appspotmail.com Reported-by: syzbot+2eaf8a9f115edec468fe@syzkaller.appspotmail.com PiperOrigin-RevId: 368093861
2021-04-12[op] Use faster go_marshal methods in netfilter.Ayush Ranjan
Use MarshalUnsafe for packed types as it is faster than MarshalBytes. PiperOrigin-RevId: 368076368
2021-04-12Drop locks before calling waiterQueue.NotifyTamir Duberstein
Holding this lock can cause the user's callback to deadlock if it attempts to inspect the accept queue. PiperOrigin-RevId: 368068334
2021-04-10Use the SecureRNG to generate listener noncesTamir Duberstein
Some other cleanup while I'm here: - Remove unused arguments - Handle some unhandled errors - Remove redundant casts - Remove redundant parens - Avoid shadowing `hash` package name PiperOrigin-RevId: 367816161
2021-04-10Don't store accepted endpoints in a channelTamir Duberstein
Use a linked list with cached length and capacity. The current channel is already composed with a mutex and condition variable, and is never used for its channel-like properties. Channels also require eager allocation equal to their capacity, which a linked list does not. PiperOrigin-RevId: 367766626
2021-04-09iptables: support postrouting hook and SNAT targetToshi Kikuchi
The current SNAT implementation has several limitations: - SNAT source port has to be specified. It is not optional. - SNAT source port range is not supported. - SNAT for UDP is a one-way translation. No response packets are handled (because conntrack doesn't support UDP currently). - SNAT and REDIRECT can't work on the same connection. Fixes #5489 PiperOrigin-RevId: 367750325
2021-04-09Return integrity failure only if enabledChong Cai
If the parent is not enabled in verity stepLocked(), failure to find the child dentry could just mean an incorrect path. PiperOrigin-RevId: 367733412
2021-04-09Merge pull request #5767 from avagin:mxcsrgVisor bot
PiperOrigin-RevId: 367730917
2021-04-09Move maxListenBacklog check to sentryMithun Iyer
Move maxListenBacklog check to the caller of endpoint Listen so that it is applicable to Unix domain sockets as well. This was changed in cl/366935921. Reported-by: syzbot+a35ae7cdfdde0c41cf7a@syzkaller.appspotmail.com PiperOrigin-RevId: 367728052
2021-04-09Rename IsV6LinkLocalAddress to IsV6LinkLocalUnicastAddressGhanan Gowripalan
To match the V4 variant. PiperOrigin-RevId: 367691981
2021-04-09Remove duplicate accept queue fullness checkTamir Duberstein
Both code paths perform this check; extract it and remove the comment that suggests it is unique to one of the paths. PiperOrigin-RevId: 367666160
2021-04-09Propagate SYN handling errorTamir Duberstein
Both callers of this function still drop this error on the floor, but progress is progress. Updates #4690. PiperOrigin-RevId: 367604788
2021-04-08Set root dentry and hash for verity before verifyChong Cai
Set root dentry and root hash in verity fs before we verify the root directory if a root hash is provided. These are used during verification. PiperOrigin-RevId: 367547346
2021-04-08Set parent after child is verifiedChong Cai
We should only set parent after child is verified. Also, if the parent is set before verified, destroyLocked() will try to grab parent.dirMu, which may cause deadlock. PiperOrigin-RevId: 367543655
2021-04-08Merge pull request #5736 from lubinszARM:pr_bblu_tlb_asidgVisor bot
PiperOrigin-RevId: 367523491
2021-04-08Do not forward link-local packetsGhanan Gowripalan
As per RFC 3927 section 7 and RFC 4291 section 2.5.6. Test: forward_test.TestMulticastForwarding PiperOrigin-RevId: 367519336
2021-04-08Add Children in merkletree generateChong Cai
This field was missing and should be provided. PiperOrigin-RevId: 367474481
2021-04-08Join all routers group when forwarding is enabledGhanan Gowripalan
See comments inline code for rationale. Test: ip_test.TestJoinLeaveAllRoutersGroup PiperOrigin-RevId: 367449434
2021-04-06Do not perform MLD for certain multicast scopesGhanan Gowripalan
...as per RFC 2710 section 5 page 10. Test: ipv6_test.TestMLDSkipProtocol PiperOrigin-RevId: 367031126
2021-04-05Update gofer dentry permissions only when needed.Ayush Ranjan
Without this change, we ask the gofer server to update the permissions whenever the UID, GID or size is updated via SetStat. Consequently, we don not generate inotify events when the permissions actually change due to SGID bit getting cleared. With this change, we will update the permissions only when needed and generate inotify events. PiperOrigin-RevId: 366946842
2021-04-05Fix listen backlog handling to be in parity with LinuxMithun Iyer
- Change the accept queue full condition for a listening endpoint to only honor completed (and delivered) connections. - Use syncookies if the number of incomplete connections is beyond listen backlog. This also cleans up the SynThreshold option code as that is no longer used with this change. - Added a new stack option to unconditionally generate syncookies. Similar to sysctl -w net.ipv4.tcp_syncookies=2 on Linux. - Enable keeping of incomplete connections beyond listen backlog. - Drop incoming SYNs only if the accept queue is filled up. - Drop incoming ACKs that complete handshakes when accept queue is full - Enable the stack to accept one more connection than programmed by listen backlog. - Handle backlog argument being zero, negative for listen, as Linux. - Add syscall and packetimpact tests to reflect the changes above. - Remove TCPConnectBacklog test which is polling for completed connections on the client side which is not reflective of whether the accept queue is filled up by the test. The modified syscall test in this CL addresses testing of connecting sockets. Fixes #3153 PiperOrigin-RevId: 366935921
2021-04-05Report task CPU usage through the cpuacct cgroup controller.Rahat Mahmood
PiperOrigin-RevId: 366923274
2021-04-05Allow default control values to be set for cgroupfs.Rahat Mahmood
PiperOrigin-RevId: 366891806
2021-04-05Allow user mount for verity fsChong Cai
Allow user mounting a verity fs on an existing mount by specifying mount flags root_hash and lower_path. PiperOrigin-RevId: 366843846
2021-04-05Fail tests when container returns non-zero statusFabricio Voznika
PiperOrigin-RevId: 366839955
2021-04-02Implement cgroupfs.Rahat Mahmood
A skeleton implementation of cgroupfs. It supports trivial cpu and memory controllers with no support for hierarchies. PiperOrigin-RevId: 366561126
2021-04-02Internal change.gVisor bot
PiperOrigin-RevId: 366462448
2021-04-01Internal changesBhasker Hariharan
PiperOrigin-RevId: 366344805
2021-04-01platform/kvm/x86: restore mxcsr when switching from guest to sentryAndrei Vagin
Goruntime sets mxcsr once and never changes it. Reported-by: syzbot+ec55cea6e57ec083b7a6@syzkaller.appspotmail.com Fixes: #5754
2021-03-29[syserror] Split usermem packageZach Koopmans
Split usermem package to help remove syserror dependency in go_marshal. New hostarch package contains code not dependent on syserror. PiperOrigin-RevId: 365651233
2021-03-29Merge pull request #5728 from zhlhahaha:2091gVisor bot
PiperOrigin-RevId: 365613394
2021-03-29[perf] Reduce contention in ptrace.threadPool.lookupOrCreate().Ayush Ranjan
lookupOrCreate is called from subprocess.switchToApp() and subprocess.syscall(). lookupOrCreate() looks for a thread already created for the current TID. If a thread exists (common case), it returns immediately. Otherwise it creates a new one. This change switches to using a sync.RWMutex. The initial thread existence lookup is now done only with the read lock. So multiple successful lookups can occur concurrently. Only when a new thread is created will it acquire the lock for writing and update the map (which is not the common case). Discovered in mutex profiles from the various ptrace benchmarks. Example: https://gvisor.dev/profile/gvisor-buildkite/fd14bfad-b30f-44dc-859b-80ebac50beb4/843827db-da50-4dc9-a2ea-ecf734dde2d5/tmp/profile/ptrace/BenchmarkFio/operation.write/blockSize.4K/filesystem.tmpfs/benchmarks/fio/mutex.pprof/flamegraph PiperOrigin-RevId: 365612094
2021-03-26arm64 ring0: don't use inner-sharable to invalidate tlbRobin Luk
It is enough to invalidate the tlb of local vcpu in switch(). TLBI with inner-sharable will invalidate the tlb in other vcpu. Arm64 hardware supports at least 256 pcid, so I think it's ok to set the length of pcid pool to 128. Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2021-03-25Use seqfile.SeqHandles correctly in VFS1 /proc/net/.Jamie Liu
Before this change: ``` $ docker run --runtime=runsc --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024 #1: read(128) = 128 #2: read(1024) = EOF $ docker run --runtime=runsc-vfs2 --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024 #1: read(128) = 128 #2: read(1024) = 256 ``` After this change: ``` $ docker run --runtime=runsc --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024 #1: read(128) = 128 #2: read(1024) = 256 $ docker run --runtime=runsc-vfs2 --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024 #1: read(128) = 128 #2: read(1024) = 256 ``` Fixes #5732 PiperOrigin-RevId: 365178386
2021-03-25Lock TaskSet mutex for writing in ptraceClone().Jamie Liu
This is necessary since ptraceClone() mutates tracer.ptraceTracees. PiperOrigin-RevId: 365152396
2021-03-25Fix comments errorHoward Zhang
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-03-25Fix nogo test errorHoward Zhang
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-03-24Add POLLRDNORM/POLLWRNORM support.Bhasker Hariharan
On Linux these are meant to be equivalent to POLLIN/POLLOUT. Rather than hack these on in sys_poll etc it felt cleaner to just cleanup the call sites to notify for both events. This is what linux does as well. Fixes #5544 PiperOrigin-RevId: 364859977
2021-03-24Fix data race in fdbased when accessing fanoutID.Bhasker Hariharan
PiperOrigin-RevId: 364859173
2021-03-24Unexpose immutable fields in stack.RouteNick Brown
This change sets the inner `routeInfo` struct to be a named private member and replaces direct access with access through getters. Note that direct access to the fields of `routeInfo` is still possible through the `RouteInfo` struct. Fixes #4902 PiperOrigin-RevId: 364822872
2021-03-23Merge pull request #5677 from avagin:kvm-mmiogVisor bot
PiperOrigin-RevId: 364728696
2021-03-23Move the code that manages floating-point state to a separate packageAndrei Vagin
This change is inspired by Adin's cl/355256448. PiperOrigin-RevId: 364695931
2021-03-23setgid directory support in goferfsKevin Krakauer
Also adds support for clearing the setuid bit when appropriate (writing, truncating, changing size, changing UID, or changing GID). VFS2 only. PiperOrigin-RevId: 364661835
2021-03-23Use constant (TestInitialSequenceNumber) instead of integer (789) in tests.Nayana Bidari
PiperOrigin-RevId: 364596526
2021-03-23Explicitly allow martian loopback packetsGhanan Gowripalan
...instead of opting out of them. Loopback traffic should be stack-local but gVisor has some clients that depend on the ability to receive loopback traffic that originated from outside of the stack. Because of this, we guard this change behind IP protocol options. A previous change provided the facility to deny these martian loopback packets but this change requires client to opt-in to accepting martian loopback packets as accepting martian loopback packets are not meant to be accepted, as per RFC 1122 section 3.2.1.3.g: (g) { 127, <any> } Internal host loopback address. Addresses of this form MUST NOT appear outside a host. PiperOrigin-RevId: 364581174
2021-03-22Fix logs for packetimpact tests cleanupZeling Feng
- Don't cleanup containers in Network.Cleanup, otherwise containers will be killed and removed several times. - Don't set AutoRemove for containers. This will prevent the confusing 'removal already in progress' messages. Fixes #3795 PiperOrigin-RevId: 364404414
2021-03-22Return tcpip.Error from (*Stack).GetMainNICAddressGhanan Gowripalan
PiperOrigin-RevId: 364381970
2021-03-22Avoid calling sync on each write in writethrough mode.Nicolas Lacasse
PiperOrigin-RevId: 364370595