summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2020-12-12Reduce the memory overhead in IP fragment managementToshi Kikuchi
- Deep-copy pkt.Data and hold it instead of shallow-copy (vv.Clone). This allows the pkt's backing array, which includes the header portion, to be freed. - Remove fragHeap. The fragments are now held in holes struct instead. - Stop reserving the initial capacity of holes slice. PiperOrigin-RevId: 347198744
2020-12-12Introduce IPv6 extension header serialization facilitiesBruno Dal Bo
Adds IPv6 extension header serializer and Hop by Hop options serializer. Add RouterAlert option serializer and use it in MLD. Fixed #4996 Startblock: has LGTM from marinaciocea and then add reviewer ghanan PiperOrigin-RevId: 347174537
2020-12-11Internal change.gVisor bot
PiperOrigin-RevId: 347091372
2020-12-11Make fixes to vfs2 leak checking.Dean Deng
PiperOrigin-RevId: 347089828
2020-12-11Add runsc symbolize command.Dean Deng
This command takes instruction pointers from stdin and converts them into their corresponding file names and line/column numbers in the runsc source code. The inputs are not interpreted as actual addresses, but as synthetic values that are exposed through /sys/kernel/debug/kcov. One can extract coverage information from kcov and translate those values into locations in the source code by running symbolize on the same runsc binary. This will allow us to generate syzkaller coverage reports. PiperOrigin-RevId: 347089624
2020-12-11Fix panic when IPv4 address is used in sendmsg for IPv6 socketsNayana Bidari
We do not rely on error for getsockopt options(which have boolean values) anymore. This will cause issue in sendmsg where we used to return error for IPV6_V6Only option. Fix the panic by returning error (for sockets other than TCP and UDP) if the address does not match the type(AF_INET/AF_INET6) of the socket. PiperOrigin-RevId: 347063838
2020-12-11Remove existing nogo exceptions.Adin Scannell
PiperOrigin-RevId: 347047550
2020-12-11[netstack] Decouple tcpip.ControlMessages from the IP control messges.Ayush Ranjan
tcpip.ControlMessages can not contain Linux specific structures which makes it painful to convert back and forth from Linux to tcpip back to Linux when passing around control messages in hostinet and raw sockets. Now we convert to the Linux version of the control message as soon as we are out of tcpip. PiperOrigin-RevId: 347027065
2020-12-11Make semctl IPC_INFO cmd return the index of highest used entry.Jing Chen
PiperOrigin-RevId: 346973338
2020-12-10Change merkle root file name to avoid collisionChong Cai
PiperOrigin-RevId: 346923826
2020-12-10Disable host reassembly for fragments.Bhasker Hariharan
fdbased endpoint was enabling fragment reassembly on the host AF_PACKET socket to ensure that fragments are delivered inorder to the right dispatcher. But this prevents fragments from being delivered to gvisor at all and makes testing of gvisor's fragment reassembly code impossible. The potential impact from this is minimal since IP Fragmentation is not really that prevelant and in cases where we do get fragments we may deliver the fragment out of order to the TCP layer as multiple network dispatchers may process the fragments and deliver a reassembled fragment after the next packet has been delivered to the TCP endpoint. While not desirable I believe the impact from this is minimal due to low prevalence of fragmentation. Also removed PktType and Hatype fields when binding the socket as these are not used when binding. Its just confusing to have them specified. See: https://man7.org/linux/man-pages/man7/packet.7.html "Fields used for binding are sll_family (should be AF_PACKET), sll_protocol, and sll_ifindex." Fixes #5055 PiperOrigin-RevId: 346919439
2020-12-10Use specified source address for IGMP/MLD packetsGhanan Gowripalan
This change also considers interfaces and network endpoints enabled up up to the point all work to disable them are complete. This was needed so that protocols can perform shutdown work while being disabled (e.g. sending a packet which requires the endpoint to be enabled to obtain a source address). Bug #4682, #4861 Fixes #4888 Startblock: has LGTM from peterjohnston and then add reviewer brunodalbo PiperOrigin-RevId: 346869702
2020-12-09Add support for IP_RECVORIGDSTADDR IP option.Bhasker Hariharan
Fixes #5004 PiperOrigin-RevId: 346643745
2020-12-09Add //pkg/sync:generic_atomicptrmap.Jamie Liu
AtomicPtrMap is a generic concurrent map from arbitrary keys to arbitrary pointer values. Benchmarks: name time/op StoreDelete/RWMutexMap-12 335ns ± 1% StoreDelete/SyncMap-12 705ns ± 3% StoreDelete/AtomicPtrMap-12 287ns ± 4% StoreDelete/AtomicPtrMapSharded-12 289ns ± 1% LoadOrStoreDelete/RWMutexMap-12 342ns ± 2% LoadOrStoreDelete/SyncMap-12 662ns ± 2% LoadOrStoreDelete/AtomicPtrMap-12 290ns ± 7% LoadOrStoreDelete/AtomicPtrMapSharded-12 293ns ± 2% LookupPositive/RWMutexMap-12 101ns ±26% LookupPositive/SyncMap-12 202ns ± 2% LookupPositive/AtomicPtrMap-12 71.1ns ± 2% LookupPositive/AtomicPtrMapSharded-12 73.2ns ± 1% LookupNegative/RWMutexMap-12 119ns ± 1% LookupNegative/SyncMap-12 154ns ± 1% LookupNegative/AtomicPtrMap-12 84.7ns ± 3% LookupNegative/AtomicPtrMapSharded-12 86.8ns ± 1% Concurrent/FixedKeys_1PercentWrites_RWMutexMap-12 1.32µs ± 2% Concurrent/FixedKeys_1PercentWrites_SyncMap-12 52.7ns ±10% Concurrent/FixedKeys_1PercentWrites_AtomicPtrMap-12 31.8ns ±20% Concurrent/FixedKeys_1PercentWrites_AtomicPtrMapSharded-12 24.0ns ±15% Concurrent/FixedKeys_10PercentWrites_RWMutexMap-12 860ns ± 3% Concurrent/FixedKeys_10PercentWrites_SyncMap-12 68.8ns ±20% Concurrent/FixedKeys_10PercentWrites_AtomicPtrMap-12 98.6ns ± 7% Concurrent/FixedKeys_10PercentWrites_AtomicPtrMapSharded-12 42.0ns ±25% Concurrent/FixedKeys_50PercentWrites_RWMutexMap-12 1.17µs ± 3% Concurrent/FixedKeys_50PercentWrites_SyncMap-12 136ns ±34% Concurrent/FixedKeys_50PercentWrites_AtomicPtrMap-12 286ns ± 3% Concurrent/FixedKeys_50PercentWrites_AtomicPtrMapSharded-12 115ns ±35% Concurrent/ChangingKeys_1PercentWrites_RWMutexMap-12 1.27µs ± 2% Concurrent/ChangingKeys_1PercentWrites_SyncMap-12 5.01µs ± 3% Concurrent/ChangingKeys_1PercentWrites_AtomicPtrMap-12 38.1ns ± 3% Concurrent/ChangingKeys_1PercentWrites_AtomicPtrMapSharded-12 22.6ns ± 2% Concurrent/ChangingKeys_10PercentWrites_RWMutexMap-12 1.08µs ± 2% Concurrent/ChangingKeys_10PercentWrites_SyncMap-12 5.97µs ± 1% Concurrent/ChangingKeys_10PercentWrites_AtomicPtrMap-12 390ns ± 2% Concurrent/ChangingKeys_10PercentWrites_AtomicPtrMapSharded-12 93.6ns ± 1% Concurrent/ChangingKeys_50PercentWrites_RWMutexMap-12 1.77µs ± 2% Concurrent/ChangingKeys_50PercentWrites_SyncMap-12 8.07µs ± 2% Concurrent/ChangingKeys_50PercentWrites_AtomicPtrMap-12 1.61µs ± 2% Concurrent/ChangingKeys_50PercentWrites_AtomicPtrMapSharded-12 386ns ± 1% Updates #231 PiperOrigin-RevId: 346614776
2020-12-09[netstack] Make tcpip.Error savable.Ayush Ranjan
Earlier we could not save tcpip.Error objects in structs because upon restore the constant's address changes in netstack's error translation map and translating the error would panic because the map is based on the address of the tcpip.Error instead of the error itself. Now I made that translations map use the error message as key instead of the address. Added relevant synchronization mechanisms to protect the structure and initialize it upon restore. PiperOrigin-RevId: 346590485
2020-12-09Do not perform IGMP/MLD on loopback interfacesGhanan Gowripalan
The loopback interface will never have any neighbouring nodes so advertising its interest in multicast groups is unnecessary. Bug #4682, #4861 Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 346587604
2020-12-09Cap UDP payload size to length informed in UDP headerBruno Dal Bo
startblock: has LGTM from peterjohnston and then add reviewer ghanan,tamird PiperOrigin-RevId: 346565589
2020-12-09Prepare for supporting cross compilation.Andrei Vagin
PiperOrigin-RevId: 346496532
2020-12-09export MountTempDirectoryZeling Feng
PiperOrigin-RevId: 346487763
2020-12-07Fix error handling on fusefs mount.Rahat Mahmood
Don't propagate arbitrary golang errors up from fusefs because errors that don't map to an errno result in a sentry panic. Reported-by: syzbot+697cb635346e456fddfc@syzkaller.appspotmail.com PiperOrigin-RevId: 346220306
2020-12-07Export IGMP statsArthur Sfez
PiperOrigin-RevId: 346197760
2020-12-07Remove stale commentSam Balana
Removes comment lines about MaxUnsolicitedReportDelay. This is already documented in the comment for GenericMulticastProtocolOptions. PiperOrigin-RevId: 346185053
2020-12-07Merge pull request #4908 from lubinszARM:pr_kvm_ext_dabtgVisor bot
PiperOrigin-RevId: 346143528
2020-12-07Merge pull request #4874 from zhlhahaha:2022gVisor bot
PiperOrigin-RevId: 346134026
2020-12-07Remove p9.fidRef.openedMuMichael Pratt
openedMu has lock ordering violations. Most locks go through OpenedFlag(), which is usually taken after renameMu and opMu. On the other hand, Tlopen takes openedMu before renameMu and opMu (via safelyRead). Resolving this violation is simple: just drop openedMu. The opened and openFlags fields are already protected by opMu in most cases, renameMu (for write) in one case (via safelyGlobal), and only in doWalk by neither. This is a bit ugly because opMu is supposed to be a "semantic" lock, but it works. I'm open to other suggestions. Note that doWalk has a race condition where a FID may open after the open check but before actually walking. This race existed before this change as well; it is not clear if it is problematic. PiperOrigin-RevId: 346108483
2020-12-07Support icmpv6 transport protocolPeter Johnston
PiperOrigin-RevId: 346101076
2020-12-05Fix zero receive window advertisements.Mithun Iyer
With the recent changes db36d948fa63ce950d94a5e8e9ebc37956543661, we try to balance the receive window advertisements between payload lengths vs segment overhead length. This works fine when segment size are much higher than the overhead, but not otherwise. In cases where the segment length is smaller than the segment overhead, we may end up not advertising zero receive window for long time and end up tail-dropping segments. This is especially pronounced when application socket reads are slow or stopped. In this change we do not grow the right edge of the receive window for smaller segment sizes similar to Linux. Also, we keep track of the socket buffer usage and let the window grow if the application is actively reading data. Fixes #4903 PiperOrigin-RevId: 345832012
2020-12-04Remove stack.ReadOnlyAddressableEndpointStateGhanan Gowripalan
Startblock: has LGTM from asfez and then add reviewer tamird PiperOrigin-RevId: 345815146
2020-12-04Overlay runsc regular file mounts with regular files.Jamie Liu
Fixes #4991 PiperOrigin-RevId: 345800333
2020-12-04Allow use of SeqAtomic with pointer-containing types.Jamie Liu
Per runtime.memmove, pointers are always copied atomically, as this is required by the GC. (Also, the init() safety check doesn't work because it gets renamed to <prefix>init() by template instantiation.) PiperOrigin-RevId: 345800302
2020-12-04Introduce IPv4 options serializer and add RouterAlert to IGMPBruno Dal Bo
PiperOrigin-RevId: 345701623
2020-12-04Avoid fallocate(FALLOC_FL_PUNCH_HOLE) when ManualZeroing is in effect.Jamie Liu
PiperOrigin-RevId: 345696124
2020-12-04Require sync.RWMutex to lock and unlock from the same goroutineMichael Pratt
This is the RWMutex equivalent to the preceding sync.Mutex CL. Updates #4804 PiperOrigin-RevId: 345681051
2020-12-03Implement command IPC_INFO for semctl.Jing Chen
PiperOrigin-RevId: 345589628
2020-12-03Update containerd to 1.3.9Fabricio Voznika
PiperOrigin-RevId: 345564927
2020-12-03Internal change.gVisor bot
PiperOrigin-RevId: 345538979
2020-12-03Make `stack.Route` thread safePeter Johnston
Currently we rely on the user to take the lock on the endpoint that owns the route, in order to modify it safely. We can instead move `Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and thread-safe access to other fields of `Route`. PiperOrigin-RevId: 345461586
2020-12-03Implement `fcntl` options `F_GETSIG` and `F_SETSIG`.Etienne Perot
These options allow overriding the signal that gets sent to the process when I/O operations are available on the file descriptor, rather than the default `SIGIO` signal. Doing so also populates `siginfo` to contain extra information about which file descriptor caused the event (`si_fd`) and what events happened on it (`si_band`). The logic around which FD is populated within `si_fd` matches Linux's, which means it has some weird edge cases where that value may not actually refer to a file descriptor that is still valid. This CL also ports extra S/R logic regarding async handler in VFS2. Without this, async I/O handlers aren't properly re-registered after S/R. PiperOrigin-RevId: 345436598
2020-12-03Support partitions for other tests.Adin Scannell
PiperOrigin-RevId: 345399936
2020-12-02Remove FileReadWriteSeeker from vfs.Jamie Liu
Previous experience has shown that these types of wrappers tends to create two kinds of problems: hidden allocations (e.g. each call to FileReadWriteSeeker.Read/Write allocates a usermem.BytesIO on the heap) and hidden lock ordering problems (e.g. VFS1 splice deadlocks). Since this is only needed by fsimpl/verity, move it there. PiperOrigin-RevId: 345377830
2020-12-02Do not unconditionally allocate in kernel.FDTable.setAll().Jamie Liu
`slice := *(*[]unsafe.Pointer)(...)` makes a copy of the slice header, which then escapes because of the conditional `atomic.StorePointer(&f.slice, &slice)` from table expansion. This occurs even when the table doesn't expand, and when it can't (e.g. `close()` => `f.setAll(nil)`). Fix this by avoiding the copy until after table expansion. Before this CL: ``` TEXT pkg/sentry/kernel/kernel.(*FDTable).setAll(SB) pkg/sentry/kernel/fd_table_unsafe.go fd_table_unsafe.go:119 0x7f00005f50e0 64488b0c25f8ffffff MOVQ FS:0xfffffff8, CX fd_table_unsafe.go:119 0x7f00005f50e9 483b6110 CMPQ 0x10(CX), SP fd_table_unsafe.go:119 0x7f00005f50ed 0f864d040000 JBE 0x7f00005f5540 fd_table_unsafe.go:119 0x7f00005f50f3 4883c480 ADDQ $-0x80, SP fd_table_unsafe.go:119 0x7f00005f50f7 48896c2478 MOVQ BP, 0x78(SP) fd_table_unsafe.go:119 0x7f00005f50fc 488d6c2478 LEAQ 0x78(SP), BP fd_table_unsafe.go:120 0x7f00005f5101 488b8424a8000000 MOVQ 0xa8(SP), AX fd_table_unsafe.go:120 0x7f00005f5109 4885c0 TESTQ AX, AX fd_table_unsafe.go:120 0x7f00005f510c 7411 JE 0x7f00005f511f fd_table_unsafe.go:120 0x7f00005f510e 488b8c24b0000000 MOVQ 0xb0(SP), CX fd_table_unsafe.go:120 0x7f00005f5116 4885c9 TESTQ CX, CX fd_table_unsafe.go:120 0x7f00005f5119 0f8500040000 JNE 0x7f00005f551f fd_table_unsafe.go:124 0x7f00005f511f 488d05da115700 LEAQ 0x5711da(IP), AX fd_table_unsafe.go:124 0x7f00005f5126 48890424 MOVQ AX, 0(SP) fd_table_unsafe.go:124 0x7f00005f512a e8d19fa1ff CALL runtime.newobject(SB) fd_table_unsafe.go:124 0x7f00005f512f 488b7c2408 MOVQ 0x8(SP), DI fd_table_unsafe.go:124 0x7f00005f5134 488b842488000000 MOVQ 0x88(SP), AX fd_table_unsafe.go:124 0x7f00005f513c 488b4820 MOVQ 0x20(AX), CX fd_table_unsafe.go:124 0x7f00005f5140 488b5108 MOVQ 0x8(CX), DX fd_table_unsafe.go:124 0x7f00005f5144 488b19 MOVQ 0(CX), BX fd_table_unsafe.go:124 0x7f00005f5147 488b4910 MOVQ 0x10(CX), CX fd_table_unsafe.go:124 0x7f00005f514b 48895708 MOVQ DX, 0x8(DI) fd_table_unsafe.go:124 0x7f00005f514f 48894f10 MOVQ CX, 0x10(DI) fd_table_unsafe.go:124 0x7f00005f5153 833df6e1120100 CMPL $0x0, runtime.writeBarrier(SB) fd_table_unsafe.go:124 0x7f00005f515a 660f1f440000 NOPW 0(AX)(AX*1) fd_table_unsafe.go:124 0x7f00005f5160 0f8589030000 JNE 0x7f00005f54ef fd_table_unsafe.go:124 0x7f00005f5166 48891f MOVQ BX, 0(DI) fd_table_unsafe.go:124 0x7f00005f5169 48897c2470 MOVQ DI, 0x70(SP) fd_table_unsafe.go:127 0x7f00005f516e 8bb424a0000000 MOVL 0xa0(SP), SI fd_table_unsafe.go:127 0x7f00005f5175 39d6 CMPL DX, SI fd_table_unsafe.go:127 0x7f00005f5177 0f8c5f030000 JL 0x7f00005f54dc ... ``` After this CL: ``` TEXT pkg/sentry/kernel/kernel.(*FDTable).setAll(SB) pkg/sentry/kernel/fd_table_unsafe.go fd_table_unsafe.go:119 0x7f00005f50e0 64488b0c25f8ffffff MOVQ FS:0xfffffff8, CX fd_table_unsafe.go:119 0x7f00005f50e9 488d4424e8 LEAQ -0x18(SP), AX fd_table_unsafe.go:119 0x7f00005f50ee 483b4110 CMPQ 0x10(CX), AX fd_table_unsafe.go:119 0x7f00005f50f2 0f868e040000 JBE 0x7f00005f5586 fd_table_unsafe.go:119 0x7f00005f50f8 4881ec98000000 SUBQ $0x98, SP fd_table_unsafe.go:119 0x7f00005f50ff 4889ac2490000000 MOVQ BP, 0x90(SP) fd_table_unsafe.go:119 0x7f00005f5107 488dac2490000000 LEAQ 0x90(SP), BP fd_table_unsafe.go:120 0x7f00005f510f 488b9424c0000000 MOVQ 0xc0(SP), DX fd_table_unsafe.go:120 0x7f00005f5117 660f1f840000000000 NOPW 0(AX)(AX*1) fd_table_unsafe.go:120 0x7f00005f5120 4885d2 TESTQ DX, DX fd_table_unsafe.go:120 0x7f00005f5123 0f8406040000 JE 0x7f00005f552f fd_table_unsafe.go:120 0x7f00005f5129 488b9c24c8000000 MOVQ 0xc8(SP), BX fd_table_unsafe.go:120 0x7f00005f5131 4885db TESTQ BX, BX fd_table_unsafe.go:120 0x7f00005f5134 0f852b040000 JNE 0x7f00005f5565 fd_table_unsafe.go:124 0x7f00005f513a 488bb424a0000000 MOVQ 0xa0(SP), SI fd_table_unsafe.go:124 0x7f00005f5142 488b7e20 MOVQ 0x20(SI), DI fd_table_unsafe.go:127 0x7f00005f5146 4c8b4708 MOVQ 0x8(DI), R8 fd_table_unsafe.go:127 0x7f00005f514a 448b8c24b8000000 MOVL 0xb8(SP), R9 fd_table_unsafe.go:127 0x7f00005f5152 4539c1 CMPL R8, R9 fd_table_unsafe.go:127 0x7f00005f5155 0f8d4a020000 JGE 0x7f00005f53a5 ... ``` PiperOrigin-RevId: 345363242
2020-12-02Make testutil.RandomID safe for concurrent usesZeling Feng
testutil.RandomID was using Rand.Read which is not safe for concurrent use. It caused name conflicts in packetimpact tests when they are run in parallel. Adding a mutex to protect the Rand.Read operation. PiperOrigin-RevId: 345360062
2020-12-02Consolidate most synchronization primitive linknames in the sync package.Jamie Liu
PiperOrigin-RevId: 345359823
2020-12-02Extract ICMPv4/v6 specific stats to their own typesArthur Sfez
This change lets us split the v4 stats from the v6 stats, which will be useful when adding stats for each network endpoint. PiperOrigin-RevId: 345322615
2020-12-02Abandon reassembly of a packet if fragments overlapArthur Sfez
However, receiving duplicated fragments will not cause reassembly to fail. This is what Linux does too: https://github.com/torvalds/linux/blob/38525c6/net/ipv4/inet_fragment.c#L355 PiperOrigin-RevId: 345309546
2020-12-02[netstack] Refactor common utils out of netstack to socket package.Ayush Ranjan
Moved AddressAndFamily() and ConvertAddress() to socket package from netstack. This helps because these utilities are used by sibling netstack packages. Such sibling dependencies can later cause circular dependencies. Common utils shared between siblings should be moved up to the parent. PiperOrigin-RevId: 345275571
2020-12-02Clean up verity tests.Dean Deng
Refactor some utilities and rename some others for clarity. PiperOrigin-RevId: 345247836
2020-12-02[netstack] Add back EndpointInfo struct in tcp.Ayush Ranjan
This was removed in an earlier commit. This should remain as it allows to add tcp-only state to be exposed. PiperOrigin-RevId: 345246155
2020-12-02Add /proc/sys/kernel/sem.Jing Chen
PiperOrigin-RevId: 345178956
2020-12-01Deflake stack_test.TestRouterSolicitationGhanan Gowripalan
...by using the fake clock. TestRouterSolicitation no longer runs its sub-tests in parallel now that the sub-tests are not long-running - the fake clock simulates time moving forward. PiperOrigin-RevId: 345165794