summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2020-02-04Remove argument from vfs.MountNamespace.DecRef()Fabricio Voznika
Updates #1035 PiperOrigin-RevId: 293194631
2020-02-04VFS2 gofer clientJamie Liu
Updates #1198 Opening host pipes (by spinning in fdpipe) and host sockets is not yet complete, and will be done in a future CL. Major differences from VFS1 gofer client (sentry/fs/gofer), with varying levels of backportability: - "Cache policies" are replaced by InteropMode, which control the behavior of timestamps in addition to caching. Under InteropModeExclusive (analogous to cacheAll) and InteropModeWritethrough (analogous to cacheAllWritethrough), client timestamps are *not* written back to the server (it is not possible in 9P or Linux for clients to set ctime, so writing back client-authoritative timestamps results in incoherence between atime/mtime and ctime). Under InteropModeShared (analogous to cacheRemoteRevalidating), client timestamps are not used at all (remote filesystem clocks are authoritative). cacheNone is translated to InteropModeShared + new option filesystemOptions.specialRegularFiles. - Under InteropModeShared, "unstable attribute" reloading for permission checks, lookup, and revalidation are fused, which is feasible in VFS2 since gofer.filesystem controls path resolution. This results in a ~33% reduction in RPCs for filesystem operations compared to cacheRemoteRevalidating. For example, consider stat("/foo/bar/baz") where "/foo/bar/baz" fails revalidation, resulting in the instantiation of a new dentry: VFS1 RPCs: getattr("/") // fs.MountNamespace.FindLink() => fs.Inode.CheckPermission() => gofer.inodeOperations.check() => gofer.inodeOperations.UnstableAttr() walkgetattr("/", "foo") = fid1 // fs.Dirent.walk() => gofer.session.Revalidate() => gofer.cachePolicy.Revalidate() clunk(fid1) getattr("/foo") // CheckPermission walkgetattr("/foo", "bar") = fid2 // Revalidate clunk(fid2) getattr("/foo/bar") // CheckPermission walkgetattr("/foo/bar", "baz") = fid3 // Revalidate clunk(fid3) walkgetattr("/foo/bar", "baz") = fid4 // fs.Dirent.walk() => gofer.inodeOperations.Lookup getattr("/foo/bar/baz") // linux.stat() => gofer.inodeOperations.UnstableAttr() VFS2 RPCs: getattr("/") // gofer.filesystem.walkExistingLocked() walkgetattr("/", "foo") = fid1 // gofer.filesystem.stepExistingLocked() clunk(fid1) // No getattr: walkgetattr already updated metadata for permission check walkgetattr("/foo", "bar") = fid2 clunk(fid2) walkgetattr("/foo/bar", "baz") = fid3 // No clunk: fid3 used for new gofer.dentry // No getattr: walkgetattr already updated metadata for stat() - gofer.filesystem.unlinkAt() does not require instantiation of a dentry that represents the file to be deleted. Updates #898. - gofer.regularFileFD.OnClose() skips Tflushf for regular files under InteropModeExclusive, as it's nonsensical to request a remote file flush without flushing locally-buffered writes to that remote file first. - Symlink targets are cached when InteropModeShared is not in effect. - p9.QID.Path (which is already required to be unique for each file within a server, and is accordingly already synthesized from device/inode numbers in all known gofers) is used as-is for inode numbers, rather than being mapped along with attr.RDev in the client to yet another synthetic inode number. - Relevant parts of fsutil.CachingInodeOperations are inlined directly into gofer package code. This avoids having to duplicate part of its functionality in fsutil.HostMappable. PiperOrigin-RevId: 293190213
2020-02-04Add support for sentry internal pipe for gofer mountsFabricio Voznika
Internal pipes are supported similarly to how internal UDS is done. It is also controlled by the same flag. Fixes #1102 PiperOrigin-RevId: 293150045
2020-02-03seccomp: allow to filter syscalls by instruction pointerAndrei Vagin
PiperOrigin-RevId: 293029446
2020-01-31Fix method comment to match method name.Ian Gudger
PiperOrigin-RevId: 292624867
2020-01-31Implement file locks for regular tmpfs files in VFSv2.Dean Deng
Add a file lock implementation that can be embedded into various filesystem implementations. Updates #1480 PiperOrigin-RevId: 292614758
2020-01-31Use multicast Ethernet address for multicast NDPGhanan Gowripalan
As per RFC 2464 section 7, an IPv6 packet with a multicast destination address is transmitted to the mapped Ethernet multicast address. Test: - ipv6.TestLinkResolution - stack_test.TestDADResolve - stack_test.TestRouterSolicitation PiperOrigin-RevId: 292610529
2020-01-31Extract multicast IP to Ethernet address mappingGhanan Gowripalan
Test: header.TestEthernetAddressFromMulticastIPAddress PiperOrigin-RevId: 292604649
2020-01-31Internal change.gVisor bot
PiperOrigin-RevId: 292587459
2020-01-30Fix for panic in endpoint.Close().Bhasker Hariharan
When sending a RST on shutdown we need to double check the state after acquiring the work mutex as the endpoint could have transitioned out of a connected state from the time we checked it and we acquired the workMutex. I added two tests but sadly neither reproduce the panic. I am going to leave the tests in as they are good to have anyway. PiperOrigin-RevId: 292393800
2020-01-30Merge pull request #1288 from lubinszARM:pr_ring0_6gVisor bot
PiperOrigin-RevId: 292369598
2020-01-30Enforce splice offset limitsMichael Pratt
Splice must not allow negative offsets. Writes also must not allow offset + size to overflow int64. Reads are similarly broken, but not just in splice (b/148095030). Reported-by: syzbot+0e1ff0b95fb2859b4190@syzkaller.appspotmail.com PiperOrigin-RevId: 292361208
2020-01-30Do not include the Source Link Layer option with an unspecified source addressGhanan Gowripalan
When sending NDP messages with an unspecified source address, the Source Link Layer address must not be included. Test: stack_test.TestDADResolve PiperOrigin-RevId: 292341334
2020-01-29Do not spawn a goroutine when calling stack.NDPDispatcher's methodsGhanan Gowripalan
Do not start a new goroutine when calling stack.NDPDispatcher.OnDuplicateAddressDetectionStatus. PiperOrigin-RevId: 292268574
2020-01-29Add support for TCP_DEFER_ACCEPT.Bhasker Hariharan
PiperOrigin-RevId: 292233574
2020-01-29Add plumbing for file locks in VFS2.Dean Deng
Updates #1480 PiperOrigin-RevId: 292180192
2020-01-29sentry: rename SetRSEQInterruptedIP to SetOldRSeqInterruptedIP for arm64Andrei Vagin
For amd64, this has been done on cl/288342928. PiperOrigin-RevId: 292170856
2020-01-29Add //pkg/sentry/devices/memdev.Jamie Liu
PiperOrigin-RevId: 292165063
2020-01-29supporting sError in guest kernel on Arm64Bin Lu
For test case 'TestBounce', we use KVM_SET_VCPU_EVENTS to trigger sError to leave guest. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-01-28Prevent arbitrary size allocation when sending UDS messages.Dean Deng
Currently, Send() will copy data into a new byte slice without regard to the original size. Size checks should be performed before the allocation takes place. Note that for the sake of performance, we avoid putting the buffer allocation into the critical section. As a result, the size checks need to be performed again within Enqueue() in case the limit has changed. PiperOrigin-RevId: 292058147
2020-01-28Add support for WritableSource in DynamicBytesFileDescriptionImplFabricio Voznika
WritableSource is a convenience interface used for files that can be written to, e.g. /proc/net/ipv4/tpc_sack. It reads max of 4KB and only from offset 0 which should cover most cases. It can be extended as neeed. Updates #1195 PiperOrigin-RevId: 292056924
2020-01-28Changes missing in last submitFabricio Voznika
Updates #1487 Updates #1623 PiperOrigin-RevId: 292040835
2020-01-28Update link address for senders of Neighbor SolicitationsGhanan Gowripalan
Update link address for senders of NDP Neighbor Solicitations when the NS contains an NDP Source Link Layer Address option. Tests: - ipv6.TestNeighorSolicitationWithSourceLinkLayerOption - ipv6.TestNeighorSolicitationWithInvalidSourceLinkLayerOption PiperOrigin-RevId: 292028553
2020-01-28Add vfs.FileDescription to FD tableFabricio Voznika
FD table now holds both VFS1 and VFS2 types and uses the correct one based on what's set. Parts of this CL are just initial changes (e.g. sys_read.go, runsc/main.go) to serve as a template for the remaining changes. Updates #1487 Updates #1623 PiperOrigin-RevId: 292023223
2020-01-28Add //pkg/sentry/fsimpl/devtmpfs.Jamie Liu
PiperOrigin-RevId: 292021389
2020-01-28Include the NDP Source Link Layer option when sending DAD messagesGhanan Gowripalan
Test: stack_test.TestDADResolve PiperOrigin-RevId: 292003124
2020-01-28fs/splice: don't report partial errors for special filesAndrei Vagin
Special files can have additional requirements for granularity. For example, read from eventfd returns EINVAL if a size is less 8 bytes. Reported-by: syzbot+3905f5493bec08eb7b02@syzkaller.appspotmail.com PiperOrigin-RevId: 292002926
2020-01-28Add VFS2 support for epoll.Jamie Liu
PiperOrigin-RevId: 291997879
2020-01-28netlink: add support for RTM_F_LOOKUP_TABLEJianfeng Tan
Test command: $ ip route get 1.1.1.1 Fixes: #1099 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/1121 from tanjianfeng:fix-1099 e6919f3d4ede5aa51a48b3d2be0d7a4b482dd53d PiperOrigin-RevId: 291990716
2020-01-28Implement an anon_inode equivalent for VFS2.Jamie Liu
PiperOrigin-RevId: 291986033
2020-01-28Check sigsetsize in rt_sigactionMichael Pratt
This isn't in the libc wrapper, but it is in the syscall itself. Discovered by @xiaobo55x in #1625. PiperOrigin-RevId: 291973931
2020-01-27Add a type to represent the NDP Source Link Layer Address optionGhanan Gowripalan
Tests: - header.TestNDPSourceLinkLayerAddressOptionEthernetAddress - header.TestNDPSourceLinkLayerAddressOptionSerialize - header.TestNDPOptionsIterCheck - header.TestNDPOptionsIter PiperOrigin-RevId: 291856429
2020-01-27Merge pull request #1561 from zhangningdlut:chris_ttygVisor bot
PiperOrigin-RevId: 291821850
2020-01-27Cleanup glog and add real caller information.Adin Scannell
In general, we've learned that logging must be avoided at all costs in the hot path. It's unlikely that the optimizations here were significant in any case, since buffer would certainly escape. This also adds a test to ensure that the caller identification works as expected, and so that logging can be benchmarked. Original: BenchmarkGoogleLogging-6 1222255 949 ns/op With this change: BenchmarkGoogleLogging-6 517323 2346 ns/op Fixes #184 PiperOrigin-RevId: 291815420
2020-01-27Update package locations.Adin Scannell
Because the abi will depend on the core types for marshalling (usermem, context, safemem, safecopy), these need to be flattened from the sentry directory. These packages contain no sentry-specific details. PiperOrigin-RevId: 291811289
2020-01-27Merge pull request #1676 from majek:marek/FIX-1632-expose-NewPacketConngVisor bot
PiperOrigin-RevId: 291803499
2020-01-27Fix licenses.Adin Scannell
The preferred Copyright holder is "The gVisor Authors". PiperOrigin-RevId: 291786657
2020-01-27Update ChecksumVVWithoffset to use unrolled version.Bhasker Hariharan
Fixes #1656 PiperOrigin-RevId: 291777279
2020-01-27Update bug number for supporting extended attribute namespaces.Dean Deng
PiperOrigin-RevId: 291774815
2020-01-27Refactor to hide C from channel.Endpoint.Ting-Yu Wang
This is to aid later implementation for /dev/net/tun device. PiperOrigin-RevId: 291746025
2020-01-27Standardize on tools directory.Adin Scannell
PiperOrigin-RevId: 291745021
2020-01-27Expose gonet.NewPacketConn, for parity with gonet.NewConn APIMarek Majkowski
gonet.Conn can be created with both gonet.NewConn and gonet.Dial. gonet.PacketConn was created only by gonet.DialUDP. This prevented us from being able to use PacketConn in udp.NewForwarder() context. This simple constructor - NewPacketConn, allows user to create correct structure from that context.
2020-01-27Replace calculateChecksum w/ the unrolled version.Bhasker Hariharan
Fixes #1656 PiperOrigin-RevId: 291703760
2020-01-26Unroll checksum computation loop.Bhasker Hariharan
Checksum computation is one of the most expensive bits of packet processing. Manual unrolling of the loop provides significant improvement in checksum speed. Updates #1656 BenchmarkChecksum/checksum_64-12 49834124 23.6 ns/op BenchmarkChecksum/checksum_128-12 27111997 44.1 ns/op BenchmarkChecksum/checksum_256-12 11416683 91.5 ns/op BenchmarkChecksum/checksum_512-12 6375298 174 ns/op BenchmarkChecksum/checksum_1024-12 3403852 338 ns/op BenchmarkChecksum/checksum_1500-12 2343576 493 ns/op BenchmarkChecksum/checksum_2048-12 1730521 656 ns/op BenchmarkChecksum/checksum_4096-12 920469 1327 ns/op BenchmarkChecksum/checksum_8192-12 445885 2637 ns/op BenchmarkChecksum/checksum_16384-12 226342 5268 ns/op BenchmarkChecksum/checksum_32767-12 114210 10503 ns/op BenchmarkChecksum/checksum_32768-12 99138 10610 ns/op BenchmarkChecksum/checksum_65535-12 53438 21158 ns/op BenchmarkChecksum/checksum_65536-12 52993 21067 ns/op BenchmarkUnrolledChecksum/checksum_64-12 61035639 19.1 ns/op BenchmarkUnrolledChecksum/checksum_128-12 36067015 33.6 ns/op BenchmarkUnrolledChecksum/checksum_256-12 19731220 60.4 ns/op BenchmarkUnrolledChecksum/checksum_512-12 9091291 116 ns/op BenchmarkUnrolledChecksum/checksum_1024-12 4976406 226 ns/op BenchmarkUnrolledChecksum/checksum_1500-12 3685224 328 ns/op BenchmarkUnrolledChecksum/checksum_2048-12 2579108 447 ns/op BenchmarkUnrolledChecksum/checksum_4096-12 1350475 887 ns/op BenchmarkUnrolledChecksum/checksum_8192-12 658248 1780 ns/op BenchmarkUnrolledChecksum/checksum_16384-12 335869 3534 ns/op BenchmarkUnrolledChecksum/checksum_32767-12 168650 7095 ns/op BenchmarkUnrolledChecksum/checksum_32768-12 168075 7098 ns/op BenchmarkUnrolledChecksum/checksum_65535-12 75085 14277 ns/op BenchmarkUnrolledChecksum/checksum_65536-12 75921 14127 ns/op PiperOrigin-RevId: 291643290
2020-01-24Add support for device special files to VFS2 tmpfs.Jamie Liu
PiperOrigin-RevId: 291471892
2020-01-24Lock the NIC when checking if an address is tentativeGhanan Gowripalan
PiperOrigin-RevId: 291426657
2020-01-24Add anonymous device number allocation to VFS2.Jamie Liu
Note that in VFS2, filesystem device numbers are per-vfs.FilesystemImpl rather than global, avoiding the need for a "registry" type to handle save/restore. (This is more consistent with Linux anyway: compare e.g. mm/shmem.c:shmem_mount() => fs/super.c:mount_nodev() => (indirectly) set_anon_super().) PiperOrigin-RevId: 291425193
2020-01-24Increase timeouts for NDP tests' async eventsGhanan Gowripalan
Increase the timeout to 1s when waiting for async NDP events to help reduce flakiness. This will not significantly increase test times as the async events continue to receive an event on a channel. The increased timeout allows more time for an event to be sent on the channel as the previous timeout of 100ms caused some flakes. Test: Existing tests pass PiperOrigin-RevId: 291420936
2020-01-24Ignore external SIGURGMichael Pratt
Go 1.14+ sends SIGURG to Ms to attempt asynchronous preemption of a G. Since it can't guarantee that a SIGURG is only related to preemption, it continues to forward them to signal.Notify (see runtime.sighandler). We should ignore these signals, as applications shouldn't receive them. Note that this means that truly external SIGURG can no longer be sent to the application (as with SIGCHLD). PiperOrigin-RevId: 291415357
2020-01-23Remove epoll entry from map when dropping it.Nicolas Lacasse
This pattern (delete from map when dropping) is also used in epoll.RemoveEntry, and seems like generally a good idea. PiperOrigin-RevId: 291268208