summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2020-12-03Make `stack.Route` thread safePeter Johnston
Currently we rely on the user to take the lock on the endpoint that owns the route, in order to modify it safely. We can instead move `Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and thread-safe access to other fields of `Route`. PiperOrigin-RevId: 345461586
2020-12-03Implement `fcntl` options `F_GETSIG` and `F_SETSIG`.Etienne Perot
These options allow overriding the signal that gets sent to the process when I/O operations are available on the file descriptor, rather than the default `SIGIO` signal. Doing so also populates `siginfo` to contain extra information about which file descriptor caused the event (`si_fd`) and what events happened on it (`si_band`). The logic around which FD is populated within `si_fd` matches Linux's, which means it has some weird edge cases where that value may not actually refer to a file descriptor that is still valid. This CL also ports extra S/R logic regarding async handler in VFS2. Without this, async I/O handlers aren't properly re-registered after S/R. PiperOrigin-RevId: 345436598
2020-12-03Support partitions for other tests.Adin Scannell
PiperOrigin-RevId: 345399936
2020-12-02Remove FileReadWriteSeeker from vfs.Jamie Liu
Previous experience has shown that these types of wrappers tends to create two kinds of problems: hidden allocations (e.g. each call to FileReadWriteSeeker.Read/Write allocates a usermem.BytesIO on the heap) and hidden lock ordering problems (e.g. VFS1 splice deadlocks). Since this is only needed by fsimpl/verity, move it there. PiperOrigin-RevId: 345377830
2020-12-02Do not unconditionally allocate in kernel.FDTable.setAll().Jamie Liu
`slice := *(*[]unsafe.Pointer)(...)` makes a copy of the slice header, which then escapes because of the conditional `atomic.StorePointer(&f.slice, &slice)` from table expansion. This occurs even when the table doesn't expand, and when it can't (e.g. `close()` => `f.setAll(nil)`). Fix this by avoiding the copy until after table expansion. Before this CL: ``` TEXT pkg/sentry/kernel/kernel.(*FDTable).setAll(SB) pkg/sentry/kernel/fd_table_unsafe.go fd_table_unsafe.go:119 0x7f00005f50e0 64488b0c25f8ffffff MOVQ FS:0xfffffff8, CX fd_table_unsafe.go:119 0x7f00005f50e9 483b6110 CMPQ 0x10(CX), SP fd_table_unsafe.go:119 0x7f00005f50ed 0f864d040000 JBE 0x7f00005f5540 fd_table_unsafe.go:119 0x7f00005f50f3 4883c480 ADDQ $-0x80, SP fd_table_unsafe.go:119 0x7f00005f50f7 48896c2478 MOVQ BP, 0x78(SP) fd_table_unsafe.go:119 0x7f00005f50fc 488d6c2478 LEAQ 0x78(SP), BP fd_table_unsafe.go:120 0x7f00005f5101 488b8424a8000000 MOVQ 0xa8(SP), AX fd_table_unsafe.go:120 0x7f00005f5109 4885c0 TESTQ AX, AX fd_table_unsafe.go:120 0x7f00005f510c 7411 JE 0x7f00005f511f fd_table_unsafe.go:120 0x7f00005f510e 488b8c24b0000000 MOVQ 0xb0(SP), CX fd_table_unsafe.go:120 0x7f00005f5116 4885c9 TESTQ CX, CX fd_table_unsafe.go:120 0x7f00005f5119 0f8500040000 JNE 0x7f00005f551f fd_table_unsafe.go:124 0x7f00005f511f 488d05da115700 LEAQ 0x5711da(IP), AX fd_table_unsafe.go:124 0x7f00005f5126 48890424 MOVQ AX, 0(SP) fd_table_unsafe.go:124 0x7f00005f512a e8d19fa1ff CALL runtime.newobject(SB) fd_table_unsafe.go:124 0x7f00005f512f 488b7c2408 MOVQ 0x8(SP), DI fd_table_unsafe.go:124 0x7f00005f5134 488b842488000000 MOVQ 0x88(SP), AX fd_table_unsafe.go:124 0x7f00005f513c 488b4820 MOVQ 0x20(AX), CX fd_table_unsafe.go:124 0x7f00005f5140 488b5108 MOVQ 0x8(CX), DX fd_table_unsafe.go:124 0x7f00005f5144 488b19 MOVQ 0(CX), BX fd_table_unsafe.go:124 0x7f00005f5147 488b4910 MOVQ 0x10(CX), CX fd_table_unsafe.go:124 0x7f00005f514b 48895708 MOVQ DX, 0x8(DI) fd_table_unsafe.go:124 0x7f00005f514f 48894f10 MOVQ CX, 0x10(DI) fd_table_unsafe.go:124 0x7f00005f5153 833df6e1120100 CMPL $0x0, runtime.writeBarrier(SB) fd_table_unsafe.go:124 0x7f00005f515a 660f1f440000 NOPW 0(AX)(AX*1) fd_table_unsafe.go:124 0x7f00005f5160 0f8589030000 JNE 0x7f00005f54ef fd_table_unsafe.go:124 0x7f00005f5166 48891f MOVQ BX, 0(DI) fd_table_unsafe.go:124 0x7f00005f5169 48897c2470 MOVQ DI, 0x70(SP) fd_table_unsafe.go:127 0x7f00005f516e 8bb424a0000000 MOVL 0xa0(SP), SI fd_table_unsafe.go:127 0x7f00005f5175 39d6 CMPL DX, SI fd_table_unsafe.go:127 0x7f00005f5177 0f8c5f030000 JL 0x7f00005f54dc ... ``` After this CL: ``` TEXT pkg/sentry/kernel/kernel.(*FDTable).setAll(SB) pkg/sentry/kernel/fd_table_unsafe.go fd_table_unsafe.go:119 0x7f00005f50e0 64488b0c25f8ffffff MOVQ FS:0xfffffff8, CX fd_table_unsafe.go:119 0x7f00005f50e9 488d4424e8 LEAQ -0x18(SP), AX fd_table_unsafe.go:119 0x7f00005f50ee 483b4110 CMPQ 0x10(CX), AX fd_table_unsafe.go:119 0x7f00005f50f2 0f868e040000 JBE 0x7f00005f5586 fd_table_unsafe.go:119 0x7f00005f50f8 4881ec98000000 SUBQ $0x98, SP fd_table_unsafe.go:119 0x7f00005f50ff 4889ac2490000000 MOVQ BP, 0x90(SP) fd_table_unsafe.go:119 0x7f00005f5107 488dac2490000000 LEAQ 0x90(SP), BP fd_table_unsafe.go:120 0x7f00005f510f 488b9424c0000000 MOVQ 0xc0(SP), DX fd_table_unsafe.go:120 0x7f00005f5117 660f1f840000000000 NOPW 0(AX)(AX*1) fd_table_unsafe.go:120 0x7f00005f5120 4885d2 TESTQ DX, DX fd_table_unsafe.go:120 0x7f00005f5123 0f8406040000 JE 0x7f00005f552f fd_table_unsafe.go:120 0x7f00005f5129 488b9c24c8000000 MOVQ 0xc8(SP), BX fd_table_unsafe.go:120 0x7f00005f5131 4885db TESTQ BX, BX fd_table_unsafe.go:120 0x7f00005f5134 0f852b040000 JNE 0x7f00005f5565 fd_table_unsafe.go:124 0x7f00005f513a 488bb424a0000000 MOVQ 0xa0(SP), SI fd_table_unsafe.go:124 0x7f00005f5142 488b7e20 MOVQ 0x20(SI), DI fd_table_unsafe.go:127 0x7f00005f5146 4c8b4708 MOVQ 0x8(DI), R8 fd_table_unsafe.go:127 0x7f00005f514a 448b8c24b8000000 MOVL 0xb8(SP), R9 fd_table_unsafe.go:127 0x7f00005f5152 4539c1 CMPL R8, R9 fd_table_unsafe.go:127 0x7f00005f5155 0f8d4a020000 JGE 0x7f00005f53a5 ... ``` PiperOrigin-RevId: 345363242
2020-12-02Make testutil.RandomID safe for concurrent usesZeling Feng
testutil.RandomID was using Rand.Read which is not safe for concurrent use. It caused name conflicts in packetimpact tests when they are run in parallel. Adding a mutex to protect the Rand.Read operation. PiperOrigin-RevId: 345360062
2020-12-02Consolidate most synchronization primitive linknames in the sync package.Jamie Liu
PiperOrigin-RevId: 345359823
2020-12-02Extract ICMPv4/v6 specific stats to their own typesArthur Sfez
This change lets us split the v4 stats from the v6 stats, which will be useful when adding stats for each network endpoint. PiperOrigin-RevId: 345322615
2020-12-02Abandon reassembly of a packet if fragments overlapArthur Sfez
However, receiving duplicated fragments will not cause reassembly to fail. This is what Linux does too: https://github.com/torvalds/linux/blob/38525c6/net/ipv4/inet_fragment.c#L355 PiperOrigin-RevId: 345309546
2020-12-02[netstack] Refactor common utils out of netstack to socket package.Ayush Ranjan
Moved AddressAndFamily() and ConvertAddress() to socket package from netstack. This helps because these utilities are used by sibling netstack packages. Such sibling dependencies can later cause circular dependencies. Common utils shared between siblings should be moved up to the parent. PiperOrigin-RevId: 345275571
2020-12-02Clean up verity tests.Dean Deng
Refactor some utilities and rename some others for clarity. PiperOrigin-RevId: 345247836
2020-12-02[netstack] Add back EndpointInfo struct in tcp.Ayush Ranjan
This was removed in an earlier commit. This should remain as it allows to add tcp-only state to be exposed. PiperOrigin-RevId: 345246155
2020-12-02Add /proc/sys/kernel/sem.Jing Chen
PiperOrigin-RevId: 345178956
2020-12-01Deflake stack_test.TestRouterSolicitationGhanan Gowripalan
...by using the fake clock. TestRouterSolicitation no longer runs its sub-tests in parallel now that the sub-tests are not long-running - the fake clock simulates time moving forward. PiperOrigin-RevId: 345165794
2020-12-01Correctly lock when listing neighbor entriesGhanan Gowripalan
PiperOrigin-RevId: 345162450
2020-12-01Track join count in multicast group protocol stateGhanan Gowripalan
Before this change, the join count and the state for IGMP/MLD was held across different types which required multiple locks to be held when accessing a multicast group's state. Bug #4682, #4861 Fixes #4916 PiperOrigin-RevId: 345019091
2020-11-30Fix typo in ptrace documentation.Dean Deng
PiperOrigin-RevId: 344958513
2020-11-30Do not start a ContainerExec twiceZeling Feng
ContainerExecStart and ContainerExecAttach both call the /exec/id/start API endpoint. PiperOrigin-RevId: 344946627
2020-11-30Fix deadlock in UDP handleControlPacket path.Bhasker Hariharan
Fixing the sendto deadlock exposed yet another deadlock where a lock inversion occurs on the handleControlPacket path where e.mu and demuxer.epsByNIC.mu are acquired in reverse order from say when RegisterTransportEndpoint is called in endpoint.Connect(). This fix sidesteps the issue by just making endpoint.state an atomic and gets rid of the need to acquire e.mu in e.HandleControlPacket. PiperOrigin-RevId: 344939895
2020-11-30Add more fragment reassembly testsToshi Kikuchi
These tests check if a maximum-sized (64k) packet is reassembled without receiving a fragment with MF flag set to zero. PiperOrigin-RevId: 344913172
2020-11-30Perform IGMP/MLD when the NIC is enabled/disabledGhanan Gowripalan
Test: ip_test.TestMGPWithNICLifecycle Bug #4682, #4861 PiperOrigin-RevId: 344888091
2020-11-30Ensure containerd is used from installed location.Adin Scannell
Currently, if containerd is installed locally via tools/installers/containerd, then it will not necessarily be used if containerd is installed in the system path. This means that the existing containerd tests are all likely broken. Also, use libbtrfs-dev instead of btrfs-tools, which is not actually required. PiperOrigin-RevId: 344879109
2020-11-27Don't add a temporary address to send DAD/RS packetsGhanan Gowripalan
Bug #4803 PiperOrigin-RevId: 344553664
2020-11-26[netstack] Add SOL_TCP options to SocketOptions.Ayush Ranjan
Ports the following options: - TCP_NODELAY - TCP_CORK - TCP_QUICKACK Also deletes the {Get/Set}SockOptBool interface methods from all implementations PiperOrigin-RevId: 344378824
2020-11-25[netstack] Add SOL_IP and SOL_IPV6 options to SocketOptions.Ayush Ranjan
We will use SocketOptions for all kinds of options, not just SOL_SOCKET options because (1) it is consistent with Linux which defines all option variables on the top level socket struct, (2) avoid code complexity. Appropriate checks have been added for matching option level to the endpoint type. Ported the following options to this new utility: - IP_MULTICAST_LOOP - IP_RECVTOS - IPV6_RECVTCLASS - IP_PKTINFO - IP_HDRINCL - IPV6_V6ONLY Changes in behavior (these are consistent with what Linux does AFAICT): - Now IP_MULTICAST_LOOP can be set for TCP (earlier it was a noop) but does not affect the endpoint itself. - We can now getsockopt IP_HDRINCL (earlier we would get an error). - Now we return ErrUnknownProtocolOption if SOL_IP or SOL_IPV6 options are used on unix sockets. - Now we return ErrUnknownProtocolOption if SOL_IPV6 options are used on non AF_INET6 endpoints. This change additionally makes the following modifications: - Add State() uint32 to commonEndpoint because both tcpip.Endpoint and transport.Endpoint interfaces have it. It proves to be quite useful. - Gets rid of SocketOptionsHandler.IsListening(). It was an anomaly as it was not a handler. It is now implemented on netstack itself. - Gets rid of tcp.endpoint.EndpointInfo and directly embeds stack.TransportEndpointInfo. There was an unnecessary level of embedding which served no purpose. - Removes some checks dual_stack_test.go that used the errors from GetSockOptBool(tcpip.V6OnlyOption) to confirm some state. This is not consistent with the new design and also seemed to be testing the implementation instead of behavior. PiperOrigin-RevId: 344354051
2020-11-25Support listener-side MLDv1Ghanan Gowripalan
...as defined by RFC 2710. Querier (router)-side MLDv1 is not yet supported. The core state machine is shared with IGMPv2. This is guarded behind a flag (ipv6.Options.MLDEnabled). Tests: ip_test.TestMGP* Bug #4861 PiperOrigin-RevId: 344344095
2020-11-25Make stack.Route safe to access concurrentlyGhanan Gowripalan
Multiple goroutines may use the same stack.Route concurrently so the stack.Route should make sure that any functions called on it are thread-safe. Fixes #4073 PiperOrigin-RevId: 344320491
2020-11-24Correctly lock when removing neighbor entriesSam Balana
Fix a panic when two entries in Failed state are removed at the same time. PiperOrigin-RevId: 344143777
2020-11-24Report correct pointer value for "bad next header" ICMP errorJulian Elischer
Because the code handles a bad header as "payload" right up to the last moment we need to make sure payload handling does not remove the error information. Fixes #4909 PiperOrigin-RevId: 344141690
2020-11-24Track number of packets queued to Failed neighborsSam Balana
Add a NIC-specific neighbor table statistic so we can determine how many packets have been queued to Failed neighbors, indicating an unhealthy local network. This change assists us to debug in-field issues where subsequent traffic to a neighbor fails. Fixes #4819 PiperOrigin-RevId: 344131119
2020-11-24Extract IGMPv2 core state machineGhanan Gowripalan
The IGMPv2 core state machine can be shared with MLDv1 since they are almost identical, ignoring specific addresses, constants and packets. Bug #4682, #4861 PiperOrigin-RevId: 344102615
2020-11-24Remove outdated TODO.Dean Deng
The bug has been fixed. PiperOrigin-RevId: 344088206
2020-11-24Deduplicate code in ipv6.protocolGhanan Gowripalan
PiperOrigin-RevId: 344009602
2020-11-23Use time.Duration for IGMP Max Response Time fieldGhanan Gowripalan
Bug #4682 PiperOrigin-RevId: 343993297
2020-11-23Don't evict gofer.dentries with inotify watches before saving.Jamie Liu
PiperOrigin-RevId: 343959348
2020-11-23Fix link against runtime.goyield.Adin Scannell
This function does not exist in Go 1.13. We need to add an adaptor to build against Go 1.13, which is the default Ubuntu version. PiperOrigin-RevId: 343929132
2020-11-23Fail gracefully if Docker is not configured with ipv6.Adin Scannell
PiperOrigin-RevId: 343927315
2020-11-20Refactor verity test for readabilityChong Cai
1. Add getD/getDentry methods to avoid long casting line in each test 2. Factor all calls to vfs.OpenAt/UnlinkAt/RenameAt on lower filesystem to their own method (for both lower file and lower Merkle file) so the tests are more readable 3. Add descriptive test names for delete/remove tests PiperOrigin-RevId: 343540202
2020-11-19Perform IGMPv2 when joining IPv4 multicast groupsRyan Heacock
Added headers, stats, checksum parsing capabilities from RFC 2236 describing IGMPv2. IGMPv2 state machine is implemented for each condition, sending and receiving IGMP Membership Reports and Leave Group messages with backwards compatibility with IGMPv1 routers. Test: * Implemented igmp header parser and checksum calculator in header/igmp_test.go * ipv4/igmp_test.go tests incoming and outgoing IGMP messages and pathways. * Added unit test coverage for IGMPv2 RFC behavior + IGMPv1 backwards compatibility in ipv4/igmp_test.go. Fixes #4682 PiperOrigin-RevId: 343408809
2020-11-19Remove racy stringification of socket fds from /proc/net/*.Rahat Mahmood
PiperOrigin-RevId: 343398191
2020-11-19Add a helpful message in stuck task logs.Dean Deng
This also makes the formatting nicer; the caller will add ":\n" to the end of the message. PiperOrigin-RevId: 343397099
2020-11-19Add types to parse MLD messagesGhanan Gowripalan
Preparing for upcoming CLs that add MLD functionality. Bug #4861 Test: header.TestMLD PiperOrigin-RevId: 343391556
2020-11-19Fix possible panic due to bad data.Julian Elischer
Found by a Fuzzer. Reported-by: syzbot+619fa10be366d553ef7f@syzkaller.appspotmail.com PiperOrigin-RevId: 343379575
2020-11-19Propagate IP address prefix from host to netstackFabricio Voznika
Closes #4022 PiperOrigin-RevId: 343378647
2020-11-19Require sync.Mutex to lock and unlock from the same goroutineMichael Pratt
We would like to track locks ordering to detect ordering violations. Detecting violations is much simpler if mutexes must be unlocked by the same goroutine that locked them. Thus, as a first step to tracking lock ordering, add this lock/unlock requirement to gVisor's sync.Mutex. This is more strict than the Go standard library's sync.Mutex, but initial testing indicates only a single lock that is used across goroutines. The new sync.CrossGoroutineMutex relaxes the requirement (but will not provide lock order checking). Due to the additional overhead, enforcement is only enabled with the "checklocks" build tag. Build with this tag using: bazel build --define=gotags=checklocks ... From my spot-checking, this has no changed inlining properties when disabled. Updates #4804 PiperOrigin-RevId: 343370200
2020-11-19Don't hold AddressEndpoints for multicast addressesGhanan Gowripalan
Group addressable endpoints can simply check if it has joined the multicast group without maintaining address endpoints. This also helps remove the dependency on AddressableEndpoint from GroupAddressableEndpoint. Now that group addresses are not tracked with address endpoints, we can avoid accidentally obtaining a route with a multicast local address. PiperOrigin-RevId: 343336912
2020-11-19Remove unused NoChecksumOptionBruno Dal Bo
Migration to unified socket options left this behind. PiperOrigin-RevId: 343305434
2020-11-19Fix some code not using NewPacketBuffer for creating a PacketBuffer.Ting-Yu Wang
PiperOrigin-RevId: 343299993
2020-11-18[vfs] kernfs: Do not panic if destroyed dentry is cached.Ayush Ranjan
If a kernfs user does not cache dentries, then cacheLocked will destroy the dentry. The current DecRef implementation will be racy in this case as the following can happen: - Goroutine 1 calls DecRef and decreases ref count from 1 to 0. - Goroutine 2 acquires d.fs.mu for reading and calls IncRef and increasing the ref count from 0 to 1. - Goroutine 2 releases d.fs.mu and calls DecRef again decreasing ref count from 1 to 0. - Goroutine 1 now acquires d.fs.mu and calls cacheLocked which destroys the dentry. - Goroutine 2 now acquires d.fs.mu and calls cacheLocked to find that the dentry is already destroyed! Earlier we would panic in this case, we could instead just return instead of adding complexity to handle this race. This is similar to what the gofer client does. We do not want to lock d.fs.mu in the case that the filesystem caches dentries (common case as procfs and sysfs do this) to prevent congestion due to lock contention. PiperOrigin-RevId: 343229496
2020-11-18[netstack] Move SO_KEEPALIVE and SO_ACCEPTCONN option to SocketOptions.Ayush Ranjan
PiperOrigin-RevId: 343217712