path: root/pkg
Age / Commit message / Author
2021-04-09iptables: support postrouting hook and SNAT targetToshi Kikuchi
The current SNAT implementation has several limitations:
- SNAT source port has to be specified. It is not optional.
- SNAT source port range is not supported.
- SNAT for UDP is a one-way translation. No response packets are handled (because conntrack doesn't support UDP currently).
- SNAT and REDIRECT can't work on the same connection.
Fixes #5489
PiperOrigin-RevId: 367750325
2021-04-09Return integrity failure only if enabledChong Cai
If the parent is not enabled in verity stepLocked(), failure to find the child dentry could just mean an incorrect path. PiperOrigin-RevId: 367733412
2021-04-09Merge pull request #5767 from avagin:mxcsrgVisor bot
PiperOrigin-RevId: 367730917
2021-04-09Move maxListenBacklog check to sentryMithun Iyer
Move maxListenBacklog check to the caller of endpoint Listen so that it is applicable to Unix domain sockets as well. This was changed in cl/366935921. Reported-by: syzbot+a35ae7cdfdde0c41cf7a@syzkaller.appspotmail.com PiperOrigin-RevId: 367728052
2021-04-09Rename IsV6LinkLocalAddress to IsV6LinkLocalUnicastAddressGhanan Gowripalan
To match the V4 variant. PiperOrigin-RevId: 367691981
2021-04-09Remove duplicate accept queue fullness checkTamir Duberstein
Both code paths perform this check; extract it and remove the comment that suggests it is unique to one of the paths. PiperOrigin-RevId: 367666160
2021-04-09Propagate SYN handling errorTamir Duberstein
Both callers of this function still drop this error on the floor, but progress is progress. Updates #4690. PiperOrigin-RevId: 367604788
2021-04-08Set root dentry and hash for verity before verifyChong Cai
Set root dentry and root hash in verity fs before we verify the root directory if a root hash is provided. These are used during verification. PiperOrigin-RevId: 367547346
2021-04-08Set parent after child is verifiedChong Cai
We should only set parent after child is verified. Also, if the parent is set before the child is verified, destroyLocked() will try to grab parent.dirMu, which may cause deadlock. PiperOrigin-RevId: 367543655
2021-04-08Merge pull request #5736 from lubinszARM:pr_bblu_tlb_asidgVisor bot
PiperOrigin-RevId: 367523491
2021-04-08Do not forward link-local packetsGhanan Gowripalan
As per RFC 3927 section 7 and RFC 4291 section 2.5.6. Test: forward_test.TestMulticastForwarding PiperOrigin-RevId: 367519336
2021-04-08Add Children in merkletree generateChong Cai
This field was missing and should be provided. PiperOrigin-RevId: 367474481
2021-04-08Join all routers group when forwarding is enabledGhanan Gowripalan
See the inline code comments for rationale. Test: ip_test.TestJoinLeaveAllRoutersGroup PiperOrigin-RevId: 367449434
2021-04-06Do not perform MLD for certain multicast scopesGhanan Gowripalan
...as per RFC 2710 section 5 page 10. Test: ipv6_test.TestMLDSkipProtocol PiperOrigin-RevId: 367031126
2021-04-05Update gofer dentry permissions only when needed.Ayush Ranjan
Without this change, we ask the gofer server to update the permissions whenever the UID, GID or size is updated via SetStat. Consequently, we do not generate inotify events when the permissions actually change due to the SGID bit getting cleared. With this change, we will update the permissions only when needed and generate inotify events. PiperOrigin-RevId: 366946842
2021-04-05Fix listen backlog handling to be in parity with LinuxMithun Iyer
- Change the accept queue full condition for a listening endpoint to only honor completed (and delivered) connections.
- Use syncookies if the number of incomplete connections is beyond listen backlog. This also cleans up the SynThreshold option code as that is no longer used with this change.
- Added a new stack option to unconditionally generate syncookies. Similar to sysctl -w net.ipv4.tcp_syncookies=2 on Linux.
- Enable keeping of incomplete connections beyond listen backlog.
- Drop incoming SYNs only if the accept queue is filled up.
- Drop incoming ACKs that complete handshakes when accept queue is full
- Enable the stack to accept one more connection than programmed by listen backlog.
- Handle backlog argument being zero, negative for listen, as Linux.
- Add syscall and packetimpact tests to reflect the changes above.
- Remove TCPConnectBacklog test which is polling for completed connections on the client side which is not reflective of whether the accept queue is filled up by the test. The modified syscall test in this CL addresses testing of connecting sockets.
Fixes #3153
PiperOrigin-RevId: 366935921
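Two of the points above, accepting one more connection than the backlog and handling a zero or negative backlog as Linux does, can be sketched as follows. This is only an illustration of the Linux-style arithmetic. The names `clampBacklog` and `acceptQueueFull` and the maximum of 1024 are assumptions, not gVisor identifiers or values.

```go
package main

import "fmt"

// clampBacklog is a hypothetical helper. Linux compares the backlog as an
// unsigned value against the system maximum, so a negative backlog ends up
// clamped to that maximum.
func clampBacklog(backlog, maxBacklog int) int {
	if backlog < 0 || backlog > maxBacklog {
		return maxBacklog
	}
	return backlog
}

// acceptQueueFull reports whether the queue of completed connections is
// full. The strict comparison allows backlog+1 queued connections.
func acceptQueueFull(queued, backlog int) bool {
	return queued > backlog
}

func main() {
	backlog := clampBacklog(0, 1024)
	fmt.Println(acceptQueueFull(0, backlog)) // false: one connection still fits
	fmt.Println(acceptQueueFull(1, backlog)) // true
	fmt.Println(clampBacklog(-1, 1024))      // 1024
}
```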
2021-04-05Report task CPU usage through the cpuacct cgroup controller.Rahat Mahmood
PiperOrigin-RevId: 366923274
2021-04-05Allow default control values to be set for cgroupfs.Rahat Mahmood
PiperOrigin-RevId: 366891806
2021-04-05Allow user mount for verity fsChong Cai
Allow user mounting a verity fs on an existing mount by specifying mount flags root_hash and lower_path. PiperOrigin-RevId: 366843846
2021-04-05Fail tests when container returns non-zero statusFabricio Voznika
PiperOrigin-RevId: 366839955
2021-04-02Implement cgroupfs.Rahat Mahmood
A skeleton implementation of cgroupfs. It supports trivial cpu and memory controllers with no support for hierarchies. PiperOrigin-RevId: 366561126
2021-04-02Internal change.gVisor bot
PiperOrigin-RevId: 366462448
2021-04-01Internal changesBhasker Hariharan
PiperOrigin-RevId: 366344805
2021-04-01platform/kvm/x86: restore mxcsr when switching from guest to sentryAndrei Vagin
The Go runtime sets mxcsr once and never changes it. Reported-by: syzbot+ec55cea6e57ec083b7a6@syzkaller.appspotmail.com Fixes: #5754
2021-03-29[syserror] Split usermem packageZach Koopmans
Split usermem package to help remove syserror dependency in go_marshal. New hostarch package contains code not dependent on syserror. PiperOrigin-RevId: 365651233
2021-03-29Merge pull request #5728 from zhlhahaha:2091gVisor bot
PiperOrigin-RevId: 365613394
2021-03-29[perf] Reduce contention in ptrace.threadPool.lookupOrCreate().Ayush Ranjan
lookupOrCreate is called from subprocess.switchToApp() and subprocess.syscall(). lookupOrCreate() looks for a thread already created for the current TID. If a thread exists (common case), it returns immediately. Otherwise it creates a new one. This change switches to using a sync.RWMutex. The initial thread existence lookup is now done only with the read lock. So multiple successful lookups can occur concurrently. Only when a new thread is created will it acquire the lock for writing and update the map (which is not the common case). Discovered in mutex profiles from the various ptrace benchmarks. Example: https://gvisor.dev/profile/gvisor-buildkite/fd14bfad-b30f-44dc-859b-80ebac50beb4/843827db-da50-4dc9-a2ea-ecf734dde2d5/tmp/profile/ptrace/BenchmarkFio/operation.write/blockSize.4K/filesystem.tmpfs/benchmarks/fio/mutex.pprof/flamegraph PiperOrigin-RevId: 365612094
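The locking pattern described above, a read-locked fast path with a write lock only on creation, looks roughly like this. It is a generic sketch with hypothetical types and a hypothetical `create` callback, not the ptrace platform code. The re-check under the write lock is a general safeguard in this sketch and may not be needed in the original, where each TID is looked up by its own thread.

```go
package main

import (
	"fmt"
	"sync"
)

// thread is a hypothetical stand-in for the per-TID thread object.
type thread struct {
	tid int32
}

// threadPool caches one thread per TID.
type threadPool struct {
	mu      sync.RWMutex
	threads map[int32]*thread
}

func newThreadPool() *threadPool {
	return &threadPool{threads: make(map[int32]*thread)}
}

// lookupOrCreate returns the thread for tid, creating it on first use.
func (p *threadPool) lookupOrCreate(tid int32, create func() *thread) *thread {
	// Fast path (common case): concurrent lookups share the read lock.
	p.mu.RLock()
	t, ok := p.threads[tid]
	p.mu.RUnlock()
	if ok {
		return t
	}

	// Slow path: take the write lock only when a thread must be created.
	p.mu.Lock()
	defer p.mu.Unlock()
	// Re-check, since another goroutine may have inserted the entry
	// between releasing the read lock and acquiring the write lock.
	if t, ok := p.threads[tid]; ok {
		return t
	}
	t = create()
	p.threads[tid] = t
	return t
}

func main() {
	p := newThreadPool()
	a := p.lookupOrCreate(1, func() *thread { return &thread{tid: 1} })
	b := p.lookupOrCreate(1, func() *thread { return &thread{tid: 1} })
	fmt.Println(a == b) // true: the second call hits the fast path
}
```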
2021-03-26arm64 ring0: don't use inner-sharable to invalidate tlbRobin Luk
It is enough to invalidate the TLB of the local vCPU in switch(). TLBI with inner-shareable will also invalidate the TLB of other vCPUs. Arm64 hardware supports at least 256 pcid, so I think it's ok to set the length of pcid pool to 128. Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2021-03-25Use seqfile.SeqHandles correctly in VFS1 /proc/net/.Jamie Liu
Before this change:
```
$ docker run --runtime=runsc --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024
#1: read(128) = 128
#2: read(1024) = EOF
$ docker run --runtime=runsc-vfs2 --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024
#1: read(128) = 128
#2: read(1024) = 256
```
After this change:
```
$ docker run --runtime=runsc --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024
#1: read(128) = 128
#2: read(1024) = 256
$ docker run --runtime=runsc-vfs2 --rm -it -v ~/tmp:/hosttmp ubuntu:focal /hosttmp/issue5732 --bytes1=128 --bytes2=1024
#1: read(128) = 128
#2: read(1024) = 256
```
Fixes #5732
PiperOrigin-RevId: 365178386
2021-03-25Lock TaskSet mutex for writing in ptraceClone().Jamie Liu
This is necessary since ptraceClone() mutates tracer.ptraceTracees. PiperOrigin-RevId: 365152396
2021-03-25Fix comments errorHoward Zhang
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-03-25Fix nogo test errorHoward Zhang
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-03-24Add POLLRDNORM/POLLWRNORM support.Bhasker Hariharan
On Linux these are meant to be equivalent to POLLIN/POLLOUT. Rather than hack these on in sys_poll etc., it felt cleaner to just clean up the call sites to notify for both events. This is what Linux does as well. Fixes #5544 PiperOrigin-RevId: 364859977
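The idea of notifying for both events at each call site can be sketched with combined event masks. The bit values below are the generic Linux poll(2) values. The constant and function names are hypothetical and chosen only for this illustration.

```go
package main

import "fmt"

// Generic Linux poll(2) event bits.
const (
	pollIn     uint32 = 0x0001
	pollOut    uint32 = 0x0004
	pollRdNorm uint32 = 0x0040
	pollWrNorm uint32 = 0x0100

	// Hypothetical combined masks: a call site that signals readability or
	// writability raises both the classic bit and its *NORM twin.
	readableEvents = pollIn | pollRdNorm
	writableEvents = pollOut | pollWrNorm
)

// wakes reports whether a waiter with the given interest mask is woken by
// the given set of ready events.
func wakes(interest, ready uint32) bool {
	return interest&ready != 0
}

func main() {
	fmt.Println(wakes(pollRdNorm, readableEvents)) // true
	fmt.Println(wakes(pollIn, readableEvents))     // true
	fmt.Println(wakes(pollWrNorm, readableEvents)) // false
}
```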
2021-03-24Fix data race in fdbased when accessing fanoutID.Bhasker Hariharan
PiperOrigin-RevId: 364859173
2021-03-24Unexpose immutable fields in stack.RouteNick Brown
This change sets the inner `routeInfo` struct to be a named private member and replaces direct access with access through getters. Note that direct access to the fields of `routeInfo` is still possible through the `RouteInfo` struct. Fixes #4902 PiperOrigin-RevId: 364822872
2021-03-23Merge pull request #5677 from avagin:kvm-mmiogVisor bot
PiperOrigin-RevId: 364728696
2021-03-23Move the code that manages floating-point state to a separate packageAndrei Vagin
This change is inspired by Adin's cl/355256448. PiperOrigin-RevId: 364695931
2021-03-23setgid directory support in goferfsKevin Krakauer
Also adds support for clearing the setuid bit when appropriate (writing, truncating, changing size, changing UID, or changing GID). VFS2 only. PiperOrigin-RevId: 364661835
2021-03-23Use constant (TestInitialSequenceNumber) instead of integer (789) in tests.Nayana Bidari
PiperOrigin-RevId: 364596526
2021-03-23Explicitly allow martian loopback packetsGhanan Gowripalan
...instead of opting out of them. Loopback traffic should be stack-local but gVisor has some clients that depend on the ability to receive loopback traffic that originated from outside of the stack. Because of this, we guard this change behind IP protocol options. A previous change provided the facility to deny these martian loopback packets, but this change requires clients to opt in to accepting them, since martian loopback packets are not meant to be accepted, as per RFC 1122 section 3.2.1.3.g: (g) { 127, <any> } Internal host loopback address. Addresses of this form MUST NOT appear outside a host. PiperOrigin-RevId: 364581174
2021-03-22Fix logs for packetimpact tests cleanupZeling Feng
- Don't clean up containers in Network.Cleanup, otherwise containers will be killed and removed several times.
- Don't set AutoRemove for containers. This will prevent the confusing 'removal already in progress' messages.
Fixes #3795
PiperOrigin-RevId: 364404414
2021-03-22Return tcpip.Error from (*Stack).GetMainNICAddressGhanan Gowripalan
PiperOrigin-RevId: 364381970
2021-03-22Avoid calling sync on each write in writethrough mode.Nicolas Lacasse
PiperOrigin-RevId: 364370595
2021-03-18Translate syserror when validating partial IO errorsFabricio Voznika
syserror allows packages to register translators for errors. These translators should be called prior to checking if the error is valid, otherwise it may not account for possible errors that can be returned from different packages, e.g. safecopy.BusError => syserror.EFAULT. Second attempt, it passes tests now :-) PiperOrigin-RevId: 363714508
2021-03-17Do not use martian loopback packets in testsGhanan Gowripalan
Transport demuxer and UDP tests should not use a loopback address as the source address for packets injected into the stack as martian loopback packets will be dropped in a later change. PiperOrigin-RevId: 363479681
2021-03-17Drop loopback traffic from outside of the stackGhanan Gowripalan
Loopback traffic should be stack-local but gVisor has some clients that depend on the ability to receive loopback traffic that originated from outside of the stack. Because of this, we guard this change behind IP protocol options. Test: integration_test.TestExternalLoopbackTraffic PiperOrigin-RevId: 363461242
2021-03-16kvm: prefault a floating point state before restoring itAndrei Vagin
If physical pages of a memory region are not mapped yet, the kernel will trigger KVM_EXIT_MMIO and we will map physical pages in bluepillHandler(). An instruction that triggered a fault will not be re-executed; it will be emulated in the kernel, but the kernel can't emulate complex instructions like xsave and xrstor. We can touch the memory with simple instructions to work around this problem.
2021-03-16Fix tcp_fin_retransmission_netstack_testZeling Feng
Netstack does not check ACK number for FIN-ACK packets and goes into TIMEWAIT unconditionally. Fixing the state machine will give us back the retransmission of FIN. PiperOrigin-RevId: 363301883
2021-03-16Fix a race with synRcvdCount and acceptMithun Iyer
There is a race in handling new incoming connections on a listening endpoint that causes the endpoint to reply to more incoming SYNs than what is permitted by the listen backlog. The race occurs when there is a successful passive connection handshake and the synRcvdCount counter is decremented, followed by the endpoint being delivered to the accept queue. In the window of time between synRcvdCount decrementing and the endpoint being enqueued for accept, new incoming SYNs can be handled without honoring the listen backlog value, as the backlog could be perceived as not full. Fixes #5637 PiperOrigin-RevId: 363279372
2021-03-16setgid directory support in overlayfsKevin Krakauer
PiperOrigin-RevId: 363276495