summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2021-06-10Fix lock ordering issue when enumerating cgroup tasks.Rahat Mahmood
The control files enumerating tasks and threads residing in cgroupfs incorrectly locks cgroupfs.filesystem.tasksMu before kernel.TaskSet.mu. The contents of these control files are inherently racy anyways, so use a snapshot of the tasks in the cgroup and drop tasksMu before resolving pids/tids (which acquires TaskSet.mu). PiperOrigin-RevId: 378767060
2021-06-10Set RLimits during `runsc exec`Fabricio Voznika
PiperOrigin-RevId: 378726430
2021-06-10Add /proc/sys/vm/max_map_countFabricio Voznika
Set it to int32 max because gVisor doesn't have a limit. Fixes #2337 PiperOrigin-RevId: 378722230
2021-06-10Parse mmap protection and flags in straceFabricio Voznika
PiperOrigin-RevId: 378712518
2021-06-10Report task exit in /proc/[pid]/{stat,status} before task goroutine exit.Jamie Liu
Between when runExitNotify.execute() returns nil (indicating that the task goroutine should exit) and when Task.run() advances Task.gosched.State to TaskGoroutineNonexistent (indicating that the task goroutine is exiting), there is a race window in which the Task is waitable (since TaskSet.mu is unlocked and Task.exitParentNotified is true) but will be reported by /proc/[pid]/status as running. Close the window by checking Task.exitState before task goroutine exit. PiperOrigin-RevId: 378711484
2021-06-10[op] Move SignalInfo to abi/linux package.Ayush Ranjan
Fixes #214 PiperOrigin-RevId: 378680466
2021-06-10Merge pull request #6103 from sudo-sturbia:semaphore-errgVisor bot
PiperOrigin-RevId: 378607458
2021-06-10[op] Move SignalStack to abi/linux package.Ayush Ranjan
Updates #214 PiperOrigin-RevId: 378594929
2021-06-09[op] Move SignalAct to abi/linux package.Ayush Ranjan
There were also other duplicate definitions of the same struct that I have now removed. Updates #214 PiperOrigin-RevId: 378579954
2021-06-09Change TODO bug to a more specific issueKevin Krakauer
This lets us close a tracking bug that's too widely-scoped to be reasonably finished. PiperOrigin-RevId: 378563203
2021-06-09Decommit huge-page-aligned regions during reclaim under manual zeroing.Jamie Liu
PiperOrigin-RevId: 378546551
2021-06-09Merge pull request #6128 from kevinGC:fanout-pidgVisor bot
PiperOrigin-RevId: 378506076
2021-06-09Change TODO to NOTE.Nicolas Lacasse
It's in VFS1 code, so we probably will not do it. PiperOrigin-RevId: 378474174
2021-06-09Avoid fanout group collisions with best effortKevin Krakauer
Running multiple instances of netstack in the same network namespace can cause collisions when enabling packet fanout for fdbased endpoints. The only bulletproof fix is to run in different network namespaces, but by using `getpid()` instead of 0 as the fanout ID starting point we can avoid collisions in the common case, particularly when testing/experimenting. Addresses #6124
2021-06-07Exclusively lock IPv6 EP when modifying addressesGhanan Gowripalan
...as address add/removal updates multicast group memberships and NDP state. This partially reverts the change made to the IPv6 endpoint in https://github.com/google/gvisor/commit/ebebb3059f7c5dbe42af85715f1c51c. PiperOrigin-RevId: 378061726
2021-06-07Remove unsupported syscall event for setsockopt(*, SOL_SOCKET, SO_OOBINLINE).Nicolas Lacasse
Netstack behaves as if SO_OOBINLINE is always set, and was logging an unsupported syscall event if the app tries to disable it. We don't have a real use case for TCP urgent mechanisms (and RFC6093 says apps SHOULD NOT use it). This CL keeps the current behavior, but removes the unsupported syscall event. Fixes #6123 PiperOrigin-RevId: 378026059
2021-06-07cgroupfs: don't add a task in the root cgroup if it is already there.Andrei Vagin
PiperOrigin-RevId: 377975013
2021-06-07Implement RENAME_NOREPLACE for all VFS2 filesystem implementations.Jamie Liu
PiperOrigin-RevId: 377966969
2021-06-05Use the NIC packets arrived at when filteringGhanan Gowripalan
As per https://linux.die.net/man/8/iptables, ``` Parameters -i, --in-interface [!] name Name of an interface via which a packet was received (only for packets entering the INPUT, FORWARD and PREROUTING chains). ``` Before this change, iptables would use the NIC that a packet was delivered to after forwarding a packet locally (when forwarding is enabled) instead of the NIC the packet arrived at. Updates #170, #3549. Test: iptables_test.TestInputHookWithLocalForwarding PiperOrigin-RevId: 377714971
2021-06-04Honor data and FIN from the ACK completing handshakeMithun Iyer
If the ACK completing the handshake has FIN or data, requeue the segment for further processing by the newly established endpoint. Otherwise, the segments would have to be retransmitted by the peer to be processed by the established endpoint. Doing this, keeps the behavior in parity with Linux. This also addresses a test flake with TCPNonBlockingConnectClose where the ACK (completing the handshake) and multiple retransmitted FINACKs from the peer could be dropped by the listener, when using syncookies and the accept queue is full. The handshake could eventually get completed with a retransmitted FINACK, without actual processing of FIN. This can cause the poll with POLLRDHUP on the accepted socket to sometimes time out before the next FINACK retransmission. PiperOrigin-RevId: 377651695
2021-06-03Add additional mmap seccomp ruleFabricio Voznika
HostFileMapper.RegenerateMappings calls mmap with MAP_SHARED|MAP_FIXED and these were not allowed. Closes #6116 PiperOrigin-RevId: 377428463
2021-06-03Implement stringer for ExitStatusTamir Duberstein
PiperOrigin-RevId: 377370807
2021-06-03Initialize metrics at initTamir Duberstein
Avoids a race condition at kernel initialization. Updates #6057. PiperOrigin-RevId: 377357723
2021-06-01Ensure full shutdown of endpoint on notifyCloseMithun Iyer
Address a race with non-blocking connect and socket close, causing the FIN (because of socket close) to not be sent out, even after completing the handshake. The race occurs with this sequence: (1) endpoint Connect starts handshake, sending out SYN (2) handshake complete() releases endpoint lock, waiting on sleeper.Fetch() (3) endpoint Close acquires endpoint lock, does not enqueue FIN (as the endpoint is not yet connected) and asserts notifyClose (4) SYNACK from peer gets enqueued asserting newSegmentWaker (5) handshake complete() re-aqcuires lock, first processes newSegmentWaker event, transitions to ESTABLISHED and proceeds to protocolMainLoop() (6) protocolMainLoop() exits while processing notifyClose When the execution follows the above sequence, no FIN is sent to the peer. This causes the listener side to have a half-open connection sitting in the accept queue. Fix this by ensuring that the protocolMainLoop() performs clean shutdown when the endpoint state is still ESTABLISHED. This would not be a bug, if during handshake complete(), sleeper.Fetch() prioritized notificationWaker over newSegmentWaker. In that case, the handshake would not have completed in (5) above. Fixes #6067 PiperOrigin-RevId: 376994395
2021-06-01Move sync generics to their own packagesTamir Duberstein
The presence of multiple packages in a single directory sometimes confuses `go mod`, producing output like: go: downloading gvisor.dev/gvisor v0.0.0-20210601174640-77dc0f5bc94d $GOMODCACHE/gvisor.dev/gvisor@v0.0.0-20210601174640-77dc0f5bc94d/pkg/linewriter/linewriter.go:21:2: found packages sync (aliases.go) and seqatomic (generic_atomicptr_unsafe.go) in $GOMODCACHE/gvisor.dev/gvisor@v0.0.0-20210601174640-77dc0f5bc94d/pkg/sync imports.go:67:2: found packages tcp (accept.go) and rcv (rcv_test.go) in $GOMODCACHE/gvisor.dev/gvisor@v0.0.0-20210601174640-77dc0f5bc94d/pkg/tcpip/transport/tcp PiperOrigin-RevId: 376956213
2021-06-01vfs: Don't allow to mount anything on top of detached mountsAndrei Vagin
PiperOrigin-RevId: 376932659
2021-06-01Ignore RST received for a TCP listenerMithun Iyer
The current implementation has a bug where TCP listener does not ignore RSTs from the peer. While handling RST+ACK from the peer, this bug can complete handshakes that use syncookies. This results in half-open connection delivered to the accept queue. Fixes #6076 PiperOrigin-RevId: 376868749
2021-05-31Update comments on ambient caps to point to bugIan Lewis
PiperOrigin-RevId: 376747671
2021-05-31Use syserror.ENOSPC for system-wide semaphore limits.Zyad A. Ali
semget(2) man page specifies that ENOSPC should be used if "the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded."
2021-05-28Clean up warningsTamir Duberstein
- Typos - Unused arguments - Useless conversions PiperOrigin-RevId: 376362730
2021-05-27Fix test_app task-treeFabricio Voznika
Executing `select {}` to wait forever triggers Go runtime deadlock detection and kills the child, causing the number actual processes be less than expected. PiperOrigin-RevId: 376298799
2021-05-27Support SO_BINDTODEVICE in ICMP socketsSam Balana
Adds support for the SO_BINDTODEVICE socket option in ICMP sockets with an accompanying packetimpact test to exercise use of this socket option. Adds a unit test to exercise the NIC selection logic introduced by this change. The remaining unit tests for ICMP sockets need to be added in a subsequent CL. See https://gvisor.dev/issues/5623 for the list of remaining unit tests. Adds a "timeout" field to PacketimpactTestInfo, necessary due to the long runtime of the newly added packetimpact test. Fixes #5678 Fixes #4896 Updates #5623 Updates #5681 Updates #5763 Updates #5956 Updates #5966 Updates #5967 PiperOrigin-RevId: 376271581
2021-05-27nanosleep has to store the finish time in the restart blockAndrei Vagin
nanosleep has to count time that a thread spent in the stopped state. PiperOrigin-RevId: 376258641
2021-05-27Speed up TestBindToDeviceDistributionKevin Krakauer
Testing only TestBindToDeviceDistribution decreased from 24s to 11s, and with TSAN from 186s to 21s. Note: using `t.Parallel()` actually slows the test down. PiperOrigin-RevId: 376251420
2021-05-27Merge pull request #6059 from lubinszARM:pr_arm64_bouncegVisor bot
PiperOrigin-RevId: 376233013
2021-05-27Use fake clocks in all testsTamir Duberstein
...except TCP tests and NDP tests that mutate globals. These will be undertaken later. Updates #5940. PiperOrigin-RevId: 376145608
2021-05-27Avoid warningsTamir Duberstein
- Don't shadow package name - Don't defer in a loop - Remove unnecessary type conversion PiperOrigin-RevId: 376137822
2021-05-26Use the stack RNG everywhereTamir Duberstein
...except in tests. Note this replaces some uses of a cryptographic RNG with a plain RNG. PiperOrigin-RevId: 376070666
2021-05-26Clarify commentTamir Duberstein
PiperOrigin-RevId: 376022495
2021-05-26Add verity getdents testsChong Cai
PiperOrigin-RevId: 376001603
2021-05-26Move presence methods from segment to TCPFlagsTamir Duberstein
PiperOrigin-RevId: 376001032
2021-05-26Alias most local importTamir Duberstein
PiperOrigin-RevId: 375977977
2021-05-26Spawn dequeing task via the clockTamir Duberstein
...and use manual clocks in forwarding and link resolution tests. Fixes #5141. Fixes #6012. PiperOrigin-RevId: 375939167
2021-05-26Use the stack clock everywhereTamir Duberstein
Updates #5939. Updates #6012. RELNOTES: n/a PiperOrigin-RevId: 375931554
2021-05-25Initialize Kernel.Timekeeper before network NSTamir Duberstein
PiperOrigin-RevId: 375843579
2021-05-25Use specific fmt verbs (avoid %v)Tamir Duberstein
Remove useless conversions. Avoid unhandled errors. PiperOrigin-RevId: 375834275
2021-05-25Merge pull request #6064 from sudo-sturbia:misspellinggVisor bot
PiperOrigin-RevId: 375789776
2021-05-25setgid directories for VFS1 tmpfs, overlayfs, and goferfsKevin Krakauer
PiperOrigin-RevId: 375780659
2021-05-25Use opaque types to represent timeTamir Duberstein
Introduce tcpip.MonotonicTime; replace int64 in tcpip.Clock method returns with time.Time and MonotonicTime to improve type safety and ensure that monotonic clock readings are never compared to wall clock readings. PiperOrigin-RevId: 375775907
2021-05-25Use the stack RNGTamir Duberstein
Remove redundant interface. PiperOrigin-RevId: 375756254