path: root/pkg
Age | Commit message | Author
2021-11-08 | Add reference counting to packet buffers. | Lucas Manning
PiperOrigin-RevId: 408426639
2021-11-08 | Replace references of ConnectableEndpoint with BoundEndpoint. | Ayush Ranjan
PiperOrigin-RevId: 408366542
2021-11-08 | Simplify {Un}MarshalUnsafeSlice method signatures. | Ayush Ranjan
Earlier this function was returning (int, error) much like the Copy{In/Out} methods. The returned error was always nil. The returned int was never used. Instead make it return the shifted buffer, which is more useful. Updates #6450 PiperOrigin-RevId: 408268327
2021-11-05 | Make {Un}Marshal{Bytes/Unsafe} return remaining buffer. | Ayush Ranjan
Change marshal.Marshallable method signatures to return the remaining buffer. This makes it easier to implement these methods manually. Without this, we would have to manually do buffer shifting, which is error prone. tools/go_marshal/test:benchmark test does not show change in performance. Additionally fixed some marshalling bugs in fsimpl/fuse. Updated multiple callpoints to get rid of redundant slice indexing work and simplified code using this new signature. Updates #6450 PiperOrigin-RevId: 407857019
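The benefit of returning the remaining buffer can be illustrated with a minimal sketch. The `header` type and its fields below are hypothetical and are not gVisor types; only the idea that each method hands back the unconsumed tail of the slice is taken from the commit message.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// header is a hypothetical two-field struct used only for illustration.
type header struct {
	ID  uint32
	Len uint16
}

// MarshalBytes writes h into dst and returns the unwritten tail of dst.
// It panics if dst is shorter than 6 bytes.
func (h *header) MarshalBytes(dst []byte) []byte {
	binary.LittleEndian.PutUint32(dst, h.ID)
	dst = dst[4:]
	binary.LittleEndian.PutUint16(dst, h.Len)
	return dst[2:]
}

// UnmarshalBytes reads h from src and returns the unread tail of src.
func (h *header) UnmarshalBytes(src []byte) []byte {
	h.ID = binary.LittleEndian.Uint32(src)
	src = src[4:]
	h.Len = binary.LittleEndian.Uint16(src)
	return src[2:]
}

func main() {
	buf := make([]byte, 12)
	a, b := header{ID: 1, Len: 2}, header{ID: 3, Len: 4}

	// Calls chain without any manual offset arithmetic at the call site.
	rest := b.MarshalBytes(a.MarshalBytes(buf))
	fmt.Println("bytes left:", len(rest))

	var x, y header
	y.UnmarshalBytes(x.UnmarshalBytes(buf))
	fmt.Println(x, y)
}
```

Because every call returns the tail, a hand-written implementation for a larger struct can be composed from calls on its fields without tracking offsets.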
2021-11-05 | Fix unfair comparison to unbuffered channels in sleep_test.go. | Jamie Liu
Consider the following benchmark, which is equivalent to BenchmarkSleeperWaitOnSingleSelect with names changed to more closely reflect the behavior of BenchmarkGoWaitOnSingleSelect:

    var (
            empty     Sleeper
            emptyCond Waker
            full      Sleeper
            fullCond  Waker
    )
    empty.AddWaker(&emptyCond)
    full.AddWaker(&fullCond)
    go func() {
            for i := 0; i < b.N; i++ {
                    empty.Fetch(true)
                    fullCond.Assert()
            }
    }()
    for i := 0; i < b.N; i++ {
            emptyCond.Assert()
            full.Fetch(true)
    }

The unfairness arises because runtime.chansend and runtime.chanrecv don't actually work this way. If runtime.chansend blocks, it has already enqueued the element to be sent on runtime.hchan.sendq, which runtime.chanrecv dequeues before calling goready(); in sleep-like terms, by the time empty.Fetch() returns, fullCond.Assert() has already happened and been fetched by the other goroutine. The same property applies to runtime.chanrecv/runtime.hchan.recvq. This property has no correspondence to the actual usage of the sleep package, so change the channel benchmarks to explicitly exchange control using buffered channels instead. Also remove some stale comments and align the syncevent benchmarks with the sleep ones.

    BenchmarkSleeperWaitOnSingleSelect
    BenchmarkSleeperWaitOnSingleSelect-12    2118603     472.5 ns/op
    BenchmarkGoWaitOnSingleSelect
    BenchmarkGoWaitOnSingleSelect-12         2224262     517.7 ns/op
    BenchmarkSleeperWaitOnMultiSelect
    BenchmarkSleeperWaitOnMultiSelect-12     2630569     459.8 ns/op
    BenchmarkGoWaitOnMultiSelect
    BenchmarkGoWaitOnMultiSelect-12           807918    1312 ns/op
    BenchmarkWaiterPingPong
    BenchmarkWaiterPingPong-12               2955579     385.8 ns/op
    BenchmarkSleeperPingPong
    BenchmarkSleeperPingPong-12              2454367     474.3 ns/op
    BenchmarkChannelPingPong
    BenchmarkChannelPingPong-12              2302662     513.5 ns/op
    BenchmarkWaiterPingPongMulti
    BenchmarkWaiterPingPongMulti-12          3023676     388.8 ns/op
    BenchmarkSleeperPingPongMulti
    BenchmarkSleeperPingPongMulti-12         2574064     471.5 ns/op
    BenchmarkChannelPingPongMulti
    BenchmarkChannelPingPongMulti-12         1000000    1088 ns/op

PiperOrigin-RevId: 407760956
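As a rough illustration of "explicitly exchanging control using buffered channels", the sketch below bounces a token between two goroutines over channels of capacity 1. The function name `pingPong` is hypothetical and this is not the code from sleep_test.go. With a buffer of one, a send completes without a receiver already waiting, so the sender only blocks on its own receive.

```go
package main

import "fmt"

// pingPong hands control back and forth n times between the caller and a
// helper goroutine and returns the number of completed round trips.
func pingPong(n int) int {
	// Capacity 1: at most one token is ever outstanding per direction,
	// so sends never block.
	ping := make(chan struct{}, 1)
	pong := make(chan struct{}, 1)

	go func() {
		for i := 0; i < n; i++ {
			<-ping
			pong <- struct{}{}
		}
	}()

	trips := 0
	for i := 0; i < n; i++ {
		ping <- struct{}{}
		<-pong
		trips++
	}
	return trips
}

func main() {
	fmt.Println(pingPong(1000))
}
```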
2021-11-04 | Remove id from sleep.Sleeper API. | Adin Scannell
In a subsequent change, the Sleeper API will be plumbed through and used for arbitrary task wakeups. This requires a non-static association of Wakers and Sleepers, which means that a fixed ID no longer works. This is a relatively simple change that removes the ID from the Waker association, and simply uses the Waker pointer itself. That change also makes minor improvements to the tests to ensure that the benchmarks are more representative by removing goroutine start from the hot path (and uses Wakers for required synchronization), adds assertion checks to AddWaker, and clears relevant fields during Done (to allow assertions to pass). PiperOrigin-RevId: 407719630
2021-11-04 | [syserr] Move ConvertIntr function to linuxerr package | Zach Koopmans
Move ConvertIntr out of syserr package and delete an unused function. PiperOrigin-RevId: 407676258
2021-11-04 | [syserr] Reverse dependency for tcpip.Error | Zach Koopmans
PiperOrigin-RevId: 407638912
2021-11-02 | Merge pull request #6805 from bradfitz:bradfitz/mipsle | gVisor bot
PiperOrigin-RevId: 407188968
2021-11-02 | Extract tcb & lastUsed to its own lock | Ghanan Gowripalan
These fields do not need to synchronize reads/writes with the rest of the connection. PiperOrigin-RevId: 407183693
2021-11-02 | Properly reap NATed connections | Ghanan Gowripalan
This change fixes a bug when reaping tuples of NAT-ed connections. Previously when reaping a tuple, the other direction's tuple ID was calculated by taking the reaping tuple's ID and inverting it. This works when a connection is not NATed but doesn't work when NAT is performed as the other direction's tuple may use different addresses. PiperOrigin-RevId: 407160930
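The failure mode can be shown with a small sketch. The `tupleID` and `conn` types and the addresses below are hypothetical simplifications, not the conntrack structures; they only show why inverting a tuple ID stops working once destination NAT rewrites an address.

```go
package main

import "fmt"

// tupleID is a hypothetical, simplified identifier for one direction.
type tupleID struct{ src, dst string }

// invert swaps source and destination.
func (t tupleID) invert() tupleID { return tupleID{src: t.dst, dst: t.src} }

// conn stores both directions explicitly instead of deriving one from the other.
type conn struct{ original, reply tupleID }

// other returns the opposite direction's tuple ID for t.
func (c *conn) other(t tupleID) tupleID {
	if t == c.original {
		return c.reply
	}
	return c.original
}

func main() {
	// Assumed scenario: the client talks to a public address that is
	// rewritten to a private backend address.
	c := conn{
		original: tupleID{src: "client:1000", dst: "public:80"},
		reply:    tupleID{src: "private:8080", dst: "client:1000"},
	}
	fmt.Println(c.original.invert() == c.reply) // inversion misses the reply tuple
	fmt.Println(c.other(c.original) == c.reply) // explicit lookup finds it
}
```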
2021-11-02 | Allow SetAttr and Allocate for deleted files | Fabricio Voznika
It's safe to call SetAttr and Allocate on fsgofer because the file path is not used to open the file, if needed. Fixes #3654 PiperOrigin-RevId: 407149393
2021-11-01 | Allow partial packets in ICMP errors when NATing | Ghanan Gowripalan
An ICMP error may not hold the full packet that triggered the ICMP response. As long as the IP header and the transport header are parsable, we should be able to successfully NAT as that is all that we need to identify the connection. PiperOrigin-RevId: 406966048
2021-11-01 | Handle UMOUNT_NOFOLLOW in VFS2 umount(2). | Ayush Ranjan
Reported-by: syzbot+f9ecb181a4b3abdde9b9@syzkaller.appspotmail.com Reported-by: syzbot+8c5cb9d7a044a91a513b@syzkaller.appspotmail.com PiperOrigin-RevId: 406951359
2021-11-01 | Move ThreadGroupIDFromContext to kernel/auth. | Adin Scannell
This function doesn't belong in the global context package. Move to a more suitable package to break the dependency cycle. PiperOrigin-RevId: 406942122
2021-11-01 | pkg/atomicbitops: support 32-bit GOARCH value "mipsle" | Brad Fitzpatrick
mips was supported, but mipsle had been forgotten. Fixes google/gvisor#6804
2021-10-29 | [syserr] Convert all linuxerr returns to error type. | Zach Koopmans
Change the linuxerr.ErrorFromErrno to return an error type and not a *errors.Error type. The latter results in problems comparing to nil as <nil><nil> != <nil><*errors.Error>. In a follow-up, there will be a change to remove *errors.Error.Errno(), which will also encourage users to not use Errnos to reference linuxerr. PiperOrigin-RevId: 406444419
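The nil comparison problem is standard Go behavior: an interface value holding a nil pointer of a concrete type is not equal to nil. The sketch below uses a hypothetical `myErr` type as a stand-in for `*errors.Error`; the function names are also made up for illustration.

```go
package main

import "fmt"

// myErr is a hypothetical concrete error type.
type myErr struct{ msg string }

func (e *myErr) Error() string { return e.msg }

// lookupPtr returns the concrete pointer type. Its nil is a typed nil.
func lookupPtr(ok bool) *myErr {
	if ok {
		return nil
	}
	return &myErr{msg: "failed"}
}

// lookupErr returns the error interface. Its nil is a true nil interface.
func lookupErr(ok bool) error {
	if ok {
		return nil
	}
	return &myErr{msg: "failed"}
}

func main() {
	// The interface now holds (type=*myErr, value=nil), which is not nil.
	var err error = lookupPtr(true)
	fmt.Println(err == nil)

	// Returning the interface type directly keeps the comparison intuitive.
	err = lookupErr(true)
	fmt.Println(err == nil)
}
```

Declaring the return type as `error` means the success path produces a nil interface, so `err != nil` checks at call sites behave as expected.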
2021-10-28 | NAT ICMPv6 errors | Ghanan Gowripalan
...so a NAT-ed connection's socket can handle ICMP errors. Updates #5916. PiperOrigin-RevId: 406270868
2021-10-28 | Use Task blocking timer for nanosleep(2). | Jamie Liu
kernel/time.Timer allocation is expensive and not sync.Poolable (since time.Timer only supports notification through a channel, requiring a goroutine to receive from the channel, and sync.Pool doesn't invoke any kind of cleanup on discarded items in the pool so it would leak timer goroutines). Using the existing Task.blockingTimer for nanosleep(), and applicable cases in clock_nanosleep(), at least avoids Timer allocation in common cases. PiperOrigin-RevId: 406248394
2021-10-27 | Replace bespoke WaitGroupErr with errgroup | Tamir Duberstein
PiperOrigin-RevId: 406027220
2021-10-27 | Synchronize access to cpuset controller bitmaps. | Rahat Mahmood
Reported-by: syzbot+39d434b96cf7c29a66ad@syzkaller.appspotmail.com Reported-by: syzbot+7c38bce6353d91facca3@syzkaller.appspotmail.com PiperOrigin-RevId: 406024052
2021-10-27 | Reduce eventFD notifications on transmit. | Bhasker Hariharan
When transmitting packets we only need to notify if the peer is not already processing packets. The sharedData region is used to enable/disable notifications: the peer disables notifications when it's actively processing packets and enables them just before it goes to sleep waiting on packets. This allows more efficient transmit as the sharedmem endpoint does not need to notify on eventFD and incur an expensive host system call when the peer is already awake. PiperOrigin-RevId: 406018843
2021-10-27 | rename tcp_conntrack inbound/outbound to reply/original | Kevin Krakauer
Connection tracking is agnostic to whether the packet is inbound or outbound. It cares who initiated the connection. The naming can get confusing as conntrack can track connections originating from any host. Part of resolving #6736. PiperOrigin-RevId: 405997540
2021-10-27 | NAT ICMPv4 errors | Ghanan Gowripalan
...so a NAT-ed connection's socket can handle ICMP errors. Updates #5916. PiperOrigin-RevId: 405970089
2021-10-27 | Record counts of packets with unknown L3/L4 numbers | Nick Brown
Previously, we recorded a single aggregated count. These per-protocol counts can help us debug field issues when frames are dropped for this reason. PiperOrigin-RevId: 405913911
2021-10-26 | Simplify vfs.NewDisconnectedMount signature and callpoints. | Ayush Ranjan
vfs.NewDisconnectedMount has no error paths. It's much prettier without the error return value. Also simplify MountDisconnected, which would immediately drop the refs taken by NewDisconnectedMount. Instead make it directly call newMount. PiperOrigin-RevId: 405767966
2021-10-26 | Validate an icmp header before accessing it | Andrei Vagin
A header can't be smaller than header.ICMPv4MinimumSize. Reported-by: syzbot+57b68b14b4f6a58bf985@syzkaller.appspotmail.com PiperOrigin-RevId: 405748438
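A minimal sketch of this kind of length check follows. The constant value of 8 bytes (the fixed ICMPv4 header) and the helper name `icmpTypeCode` are assumptions made for illustration; this is not the gVisor code path.

```go
package main

import "fmt"

// icmpv4MinimumSize is an assumed value: the 8-byte fixed ICMPv4 header.
const icmpv4MinimumSize = 8

// icmpTypeCode returns the type and code fields of an ICMPv4 header, or
// ok=false if b is too short to contain a complete header.
func icmpTypeCode(b []byte) (typ, code uint8, ok bool) {
	if len(b) < icmpv4MinimumSize {
		return 0, 0, false
	}
	return b[0], b[1], true
}

func main() {
	// A truncated buffer is rejected instead of being indexed out of range.
	_, _, ok := icmpTypeCode([]byte{3})
	fmt.Println(ok)

	typ, code, ok := icmpTypeCode([]byte{3, 1, 0, 0, 0, 0, 0, 0})
	fmt.Println(typ, code, ok)
}
```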
2021-10-26 | platform/kvm: map vdso and vvar into a guest address space | Andrei Vagin
Right now, each vdso call triggers vmexit. VDSO and VVAR pages are mapped with VM_IO and get_user_pages fails for such vma-s. KVM was not able to handle this case up to the v4.8 kernel. This problem was fixed by add6a0cd1c5ba ("KVM: MMU: try to fix up page faults before giving up"). For some unknown reasons, it still doesn't work in case of nested virtualization. Before: BenchmarkKernelVDSO-6 252519 4598 ns/op After: BenchmarkKernelVDSO-6 34431957 34.91 ns/op PiperOrigin-RevId: 405715941
2021-10-26 | Obtain ref on root dentry in mqfs.GetFilesystem. | Ayush Ranjan
As documented in FilesystemType.GetFilesystem, a reference should be taken on the returned dentry and filesystem by the GetFilesystem implementation. mqfs did not do that. Additionally clean up and clarify ref counting of dentry, filesystem and mount in mqfs. Reported-by: syzbot+a2c54bfb6e1525228e5f@syzkaller.appspotmail.com Reported-by: syzbot+ccd305cdab11cfebbfff@syzkaller.appspotmail.com PiperOrigin-RevId: 405700565
2021-10-26 | Move attestation definitions to standalone package | Chong Cai
PiperOrigin-RevId: 405698863
2021-10-26 | Change Notify() to use unix.RawSyscall. | Bhasker Hariharan
eventfd.Notify() uses unix.Write which will eventually call unix.Syscall which will yield the current go processor resulting in the Go scheduler parking the current goroutine till the syscall returns. But in most cases where Notify() is called there is no reason to yield as the caller probably wants to continue doing something right afterwards. Like in the case of the sharedmem endpoint which may still have more packets to write. PiperOrigin-RevId: 405693801
2021-10-26 | Ensure statfs::f_namelen is set by VFS2 gofer statfs/fstatfs. | Jamie Liu
VFS1 discards the value of f_namelen returned by the filesystem and returns NAME_MAX unconditionally instead, so it doesn't run into this. Also set f_frsize for completeness. PiperOrigin-RevId: 405579707
2021-10-25 | Do not leak non-permission mode bits in mq_open(2). | Ayush Ranjan
As caught by syzkaller, we were leaking non-permission bits while passing the user generated mode. DynamicBytesFile panics in this case. Reported-by: syzbot+5abe52d47d56a5a98c89@syzkaller.appspotmail.com PiperOrigin-RevId: 405481392
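Stripping non-permission bits from a user-supplied mode is usually a single mask operation. The sketch below assumes the permission bits are the low nine bits (octal 0777); `permMask` and `permBits` are hypothetical names, not identifiers from the gVisor source.

```go
package main

import "fmt"

// permMask covers the owner, group and other rwx bits.
const permMask = 0777

// permBits drops file-type, setuid, setgid and sticky bits from mode and
// keeps only the permission bits.
func permBits(mode uint32) uint32 {
	return mode & permMask
}

func main() {
	// 0104755 is a regular file (0100000) with setuid (04000) and 0755.
	fmt.Printf("%o\n", permBits(0104755))
}
```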
2021-10-25 | Add support for containerd 1.5 | Fabricio Voznika
"cri.runtimeoptions.v1" moved to "runtimeoptions.v1" and containerd configuration format version 2 is required. Updates #6449 PiperOrigin-RevId: 405474653
2021-10-23 | initialize hostFeatureSet from init function | Jing Chen
2021-10-23 | fix the failed test target //pkg/cpuid:cpuid_test on arm64. | Jing Chen
2021-10-21 | Merge pull request #6345 from sudo-sturbia:mq/syscalls | gVisor bot
PiperOrigin-RevId: 404901660
2021-10-21 | Add an integration test for istio like redirect. | Bhasker Hariharan
Updates #6441,#6317 PiperOrigin-RevId: 404872327
2021-10-20 | Report correct error when restore fails | Fabricio Voznika
When file corruption is detected, report vfs.ErrCorruption to distinguish corruption error from other restore errors. Updates #1035 PiperOrigin-RevId: 404588445
2021-10-19 | Always parse Transport headers | Ghanan Gowripalan
...including ICMP headers before delivering them to the TransportDispatcher. Updates #3810. PiperOrigin-RevId: 404404002
2021-10-19 | Fix typo in FIXME | Fabricio Voznika
PiperOrigin-RevId: 404400399
2021-10-19 | Do not return non-nil *lisafs.Inode to doCreateAt on error. | Ayush Ranjan
lisafs.ClientFile.MkdirAt is allowed to return a non-nil Inode and a non-nil error on an RPC error. The caller must not use the returned (invalid) Inode on error. But a code path in the gofer client does end up using it. More specifically, when the Mkdir RPC fails and we end up creating a synthetic dentry for a mountpoint, we end up returning the (invalid) non-nil Inode to filesystem.doCreateAt implementation which thinks that a remote file was created. But that non-nil Inode is actually invalid because the RPC failed. Things go downhill from there. Update client to not use childDirInode if RPC failed. PiperOrigin-RevId: 404396573
2021-10-19 | Continue reaping bucket after reaping a tuple | Ghanan Gowripalan
Reaping an expired tuple removes it from its bucket so we need to grab the succeeding tuple in the bucket before reaping the expired tuple. Before this change, only the first expired tuple in a bucket was reaped per reaper run on the bucket. This change just allows more connections to be reaped. PiperOrigin-RevId: 404392925
2021-10-19 | Stub cpuset cgroup control files. | Rahat Mahmood
PiperOrigin-RevId: 404382475
2021-10-18 | conntrack: update state of un-NATted connections | Kevin Krakauer
This prevents reaping connections unnecessarily early. This change both moves the state update to the beginning of handlePacket and fixes a bug where un-finalized connections could become un-reapable. Fixes #6748 PiperOrigin-RevId: 404141012
2021-10-18 | conntrack: use tcpip.Clock instead of time.Time | Kevin Krakauer
- We should be using a monotonic clock
- This will make future testing easier
Updates #6748. PiperOrigin-RevId: 404072318
2021-10-18 | Report ramdiskfs usage correctly | Fabricio Voznika
Updates #1035 PiperOrigin-RevId: 404072231
2021-10-18 | Support distinction for RWMutex and read-only locks. | Adin Scannell
Fixes #6590 PiperOrigin-RevId: 404007524
2021-10-15 | Satisfy nogo | Ghanan Gowripalan
PiperOrigin-RevId: 403479257
2021-10-15 | Implement WriteRawPacket for pipe | Tony Gong
Implement WriteRawPacket for pipe by calling `DeliverNetworkPacket` on the other end with empty values for the route and protocol number, relying on the `NetworkDispatcher` to decapsulate the link layer header from the raw packet itself. PiperOrigin-RevId: 403461448