path: root/pkg
Age | Commit message | Author
2020-08-19 | Remove weak references from unix sockets. | Dean Deng
The abstract socket namespace no longer holds any references on sockets. Instead, TryIncRef() is used when a socket is being retrieved in BoundEndpoint(). Abstract sockets are now responsible for removing themselves from the namespace they are in, when they are destroyed. Updates #1486. PiperOrigin-RevId: 327064173
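The "try to take a reference" pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not gVisor's actual refs code: the `refCount` type, its field, and the `destroy` callback are hypothetical names chosen for the example. The idea is that a lookup only succeeds if the object has not already dropped to zero references.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCount is a hypothetical stand-in for a reference-counted socket.
type refCount struct {
	n int64
}

// newRefCount returns a counter holding one initial reference.
func newRefCount() *refCount { return &refCount{n: 1} }

// TryIncRef takes a reference only if the count is still positive.
// It returns false once the object has started being destroyed.
func (r *refCount) TryIncRef() bool {
	for {
		v := atomic.LoadInt64(&r.n)
		if v <= 0 {
			return false
		}
		if atomic.CompareAndSwapInt64(&r.n, v, v+1) {
			return true
		}
	}
}

// DecRef drops a reference and calls destroy when the count reaches zero.
// In the scheme described above, destroy is where the object would remove
// itself from the namespace that indexes it.
func (r *refCount) DecRef(destroy func()) {
	if atomic.AddInt64(&r.n, -1) == 0 {
		destroy()
	}
}

func main() {
	r := newRefCount()
	fmt.Println(r.TryIncRef()) // a live object can be referenced
	r.DecRef(func() {})
	r.DecRef(func() { fmt.Println("destroyed") })
	fmt.Println(r.TryIncRef()) // a destroyed object cannot
}
```

Because the namespace holds no reference of its own, a stale entry is harmless: a lookup that races with destruction simply sees `TryIncRef` fail.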
2020-08-19 | Add a unit test for out of order IP reassembly | Arthur Sfez
PiperOrigin-RevId: 327042869
2020-08-19 | [vfs] Return EIO when opening /dev/tty. | Ayush Ranjan
This is in compliance with VFS1. See pkg/sentry/fs/dev/tty.go in the struct ttyInodeOperations. Fixes the failure of python runtime test_ioctl. Updates #3515 PiperOrigin-RevId: 327042758
2020-08-19 | Don't support address ranges | Ghanan Gowripalan
Previously the netstack supported assignment of a range of addresses. This feature is not used so remove it. PiperOrigin-RevId: 326791119
2020-08-19 | Use a single NetworkEndpoint per NIC per protocol | Ghanan Gowripalan
The NetworkEndpoint does not need to be created for each address. Most of the work the NetworkEndpoint does is address agnostic. PiperOrigin-RevId: 326759605
2020-08-14 | Merge pull request #3375 from kevinGC:ipt-test-early-return | gVisor bot
PiperOrigin-RevId: 326693922
2020-08-14 | Give the ICMP Code its own type | Julian Elischer
This is a preparatory commit for a larger commit working on ICMP generation in error cases. This is removal of technical debt and cleanup in the gvisor code as part of gvisor issue 2211. Updates #2211. PiperOrigin-RevId: 326615389
2020-08-13 | [vfs2][gofer] Fix file creation flags sent to gofer. | Ayush Ranjan
Fixes php runtime test ext/standard/tests/file/readfile_basic.phpt
Fixes #3516
fsgofers only want the access mode in the OpenFlags passed to Create(). If more flags are supplied (like O_APPEND in this case), read/write from that fd will fail with EBADF. See runsc/fsgofer/fsgofer.go:WriteAt()
VFS2 was providing more than just access modes. So filtering the flags using p9.OpenFlagsModeMask == linux.O_ACCMODE fixes the issue.
Gofer in VFS1 also only extracts the access mode flags while making the create RPC. See pkg/sentry/fs/gofer/path.go:Create()
Even in VFS2, when we open a handle, we extract out only the access mode flags + O_TRUNC. See third_party/gvisor/pkg/sentry/fsimpl/gofer/handle.go:openHandle()
Added a test for this. PiperOrigin-RevId: 326574829
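The flag filtering this commit describes amounts to masking the open flags down to the access mode bits. A minimal sketch, assuming the standard Linux values for the flags (hard-coded here so the example builds on any platform); the `accessMode` helper is a hypothetical name and not part of the p9 or gofer packages:

```go
package main

import "fmt"

// Linux open(2) flag values, copied in as constants for illustration.
const (
	oRDONLY  = 0x0
	oWRONLY  = 0x1
	oRDWR    = 0x2
	oACCMODE = 0x3   // mask covering the three access modes
	oAPPEND  = 0x400 // an example of a flag outside the mask
)

// accessMode is a hypothetical helper that keeps only the access mode
// bits, discarding flags such as O_APPEND.
func accessMode(flags uint32) uint32 {
	return flags & oACCMODE
}

func main() {
	flags := uint32(oWRONLY | oAPPEND)
	fmt.Printf("%#x -> %#x\n", flags, accessMode(flags)) // prints "0x401 -> 0x1"
}
```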
2020-08-13 | Use the user supplied MSS for accepted connections | Ghanan Gowripalan
This change supports using the user supplied MSS (TCP_MAXSEG socket option) for new socket connections created from a listening TCP socket. Note that the user supplied MSS will only be used if it is not greater than the maximum possible MSS for a TCP connection's route. If it is greater than the maximum possible MSS, the MSS will be capped at that maximum value. Test: tcp_test.TestUserSuppliedMSSOnListenAccept PiperOrigin-RevId: 326567442
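The capping rule stated in this commit message can be written as a small function. This is only a sketch of the rule as described; `effectiveMSS` and its parameter names are hypothetical and do not come from the netstack code:

```go
package main

import "fmt"

// effectiveMSS applies the rule from the commit message: use the user
// supplied MSS unless it exceeds the maximum MSS for the route, in which
// case cap it at that maximum. Names here are illustrative only.
func effectiveMSS(userMSS, routeMaxMSS uint16) uint16 {
	if userMSS > routeMaxMSS {
		return routeMaxMSS
	}
	return userMSS
}

func main() {
	fmt.Println(effectiveMSS(536, 1460))  // user value fits: 536
	fmt.Println(effectiveMSS(9000, 1460)) // user value too large: 1460
}
```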
2020-08-13 | Merge pull request #3476 from zhlhahaha:1930 | gVisor bot
PiperOrigin-RevId: 326563255
2020-08-13 | Migrate to PacketHeader API for PacketBuffer. | Ting-Yu Wang
Formerly, when a packet was constructed or parsed, all headers were set by the client code. This almost always involved prepending to the pk.Header buffer or trimming the pk.Data portion. This is known to be prone to bugs, due to the complexity and number of invariants that must be maintained across netstack. In the new PacketHeader API, the client calls the Push()/Consume() methods to construct/parse an outgoing/incoming packet. All invariants, such as slicing and trimming, are maintained by the API itself. NewPacketBuffer() is introduced to create a new PacketBuffer. The zero value is no longer valid. PacketBuffer now assumes the packet is a concatenation of the following portions:
* LinkHeader
* NetworkHeader
* TransportHeader
* Data
Any of them could be empty, or zero-length. PiperOrigin-RevId: 326507688
2020-08-13 | Ensure TCP TIME-WAIT is not terminated prematurely. | Bhasker Hariharan
Netstack's TIME-WAIT state for a TCP socket could be terminated prematurely if the socket entered TIME-WAIT using shutdown(..., SHUT_RDWR) and then was closed using close(). This fixes that bug and updates the tests to verify that Netstack correctly honors TIME-WAIT under such conditions. Fixes #3106 PiperOrigin-RevId: 326456443
2020-08-12 | Add reference leak checking to vfs2 tmpfs.inode. | Dean Deng
Updates #1486. PiperOrigin-RevId: 326354750
2020-08-12 | [vfs2][gofer] Return appropriate errors when opening and creating files. | Ayush Ranjan
Fixes php test ext/standard/tests/file/touch_variation5.phpt on vfs2. Updates #3516 Also spotted a bug with O_EXCL, where we did not return EEXIST when we tried to open the root of the filesystem with O_EXCL | O_CREAT. Added some more tests for open() corner cases. PiperOrigin-RevId: 326346863
2020-08-12 | ip6tables: ABI structs and constants | Kevin Krakauer
Part of #3549. PiperOrigin-RevId: 326329028
2020-08-12 | Merge pull request #3605 from lubinszARM:pr_helloworld_thunderx2 | gVisor bot
PiperOrigin-RevId: 326326710
2020-08-12 | Merge pull request #3250 from craig08:fuse-getattr | gVisor bot
PiperOrigin-RevId: 326313858
2020-08-12 | Redirect TODO | Fabricio Voznika
Fixes #2923 PiperOrigin-RevId: 326296589
2020-08-12 | Release fd references on aio callback cancellation. | Dean Deng
Discovered by reference leak checker on tmpfs.inode. PiperOrigin-RevId: 326294755
2020-08-12 | Fix race in vfs.FileDescription.statusFlag | Fabricio Voznika
PiperOrigin-RevId: 326270643
2020-08-12 | Running hello-world on Thunderx2 with kvm | Bin Lu
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-08-12 | enable seccomp test on arm64 | Howard Zhang
Syscalls on ARM64 differ from those on x86_64, so use different syscall rules for each arch. The audit numbers also differ between archs; use LINUX_AUDIT_ARCH to get the correct audit number. Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-08-11 | Fix-up issue comment. | Adin Scannell
PiperOrigin-RevId: 326129258
2020-08-11 | Eliminate one allocation per send/recv for non-flipcall transport. | Fazlul Shahriar
Ported from https://github.com/hugelgupf/p9/pull/44.
name              old time/op    new time/op    delta
SendRecvLegacy-6  61.5µs ± 6%    60.1µs ±11%    ~        (p=0.063 n=9+9)
SendRecv-6        40.7µs ± 2%    39.8µs ± 5%    -2.27%   (p=0.035 n=10+10)
name              old alloc/op   new alloc/op   delta
SendRecvLegacy-6  769B ± 0%      705B ± 0%      -8.37%   (p=0.000 n=8+10)
SendRecv-6        320B ± 0%      256B ± 0%      -20.00%  (p=0.000 n=10+10)
name              old allocs/op  new allocs/op  delta
SendRecvLegacy-6  25.0 ± 0%      23.0 ± 0%      -8.00%   (p=0.000 n=10+10)
SendRecv-6        14.0 ± 0%      12.0 ± 0%      -14.29%  (p=0.000 n=10+10)
PiperOrigin-RevId: 326127979
2020-08-11 | Mark integration tests as passing in VFS2 except CheckpointRestore. | Zach Koopmans
Mark all tests as passing for VFS2 in image_test and integration_test. There's no way to do negative lookahead/lookbehind in Go test regexes, so check whether the test uses VFS2 and skip CheckpointRestore if it does. PiperOrigin-RevId: 326050915
2020-08-10 | Set the NetworkProtocolNumber of all PacketBuffers. | Kevin Krakauer
NetworkEndpoints set the number on outgoing packets in Write() and NetworkProtocols set them on incoming packets in Parse(). Needed for #3549. PiperOrigin-RevId: 325938745
2020-08-10 | Implement FUSE_GETATTR | Craig Chi
FUSE_GETATTR is called when a stat(2), fstat(2), or lstat(2) is issued from the VFS2 layer to a FUSE filesystem. Fixes #3175
2020-08-10 | Speed up iptables tests | Kevin Krakauer
//test/iptables:iptables_test runs 30 seconds faster on my machine.
* Using contexts instead of many smaller timeouts makes the tests less likely to flake and removes unnecessary complexity.
* We also use context to properly shut down concurrent goroutines and the test container.
* Container logs are always logged.
2020-08-10 | Populate IPPacketInfo with destination address | Ghanan Gowripalan
IPPacketInfo.DestinationAddr should hold the destination of the IP packet, not the source. This change fixes that bug. PiperOrigin-RevId: 325910766
2020-08-10 | ip6tables: move target-specific code to targets.go | Kevin Krakauer
This is purely moving code, no changes. netfilter.go is cluttered and targets.go is a good place for this. #3549 PiperOrigin-RevId: 325879965
2020-08-10 | Run GC before sandbox exit when leak checking is enabled. | Dean Deng
Running garbage collection enqueues all finalizers, which are used by the refs/refs_vfs2 packages to detect reference leaks. Note that even with GC, there is no guarantee that all finalizers will be run before the program exits. This is a best effort attempt to activate leak checks as much as possible. Updates #3545. PiperOrigin-RevId: 325834438
2020-08-08 | Use unicast source for ICMP echo replies | Ghanan Gowripalan
Packets MUST NOT use a non-unicast source address for ICMP Echo Replies. Test: integration_test.TestPingMulticastBroadcast PiperOrigin-RevId: 325634380
2020-08-07 | Add context.FullStateChanged() | Andrei Vagin
It indicates that the Sentry has changed the state of the thread and that subsequent calls to PullFullState() must do nothing. PiperOrigin-RevId: 325567415
2020-08-07 | [vfs2] Fix tmpfs mounting. | Ayush Ranjan
Earlier we were using NLink to decide if /tmp is empty or not. However, NLink at best tells us about the number of subdirectories (via the ".." entries). NLink = n + 2 for n subdirectories. But it does not tell us if the directory is empty. There still might be non-directory files. We could also not rely on NLink because host overlayfs always returned 1. VFS1 uses Readdir to decide if the directory is empty. Used a similar approach. We now use IterDirents to decide if the "/tmp" directory is empty. Fixes #3369 PiperOrigin-RevId: 325554234
2020-08-07 | Don't hold gofer.filesystem.renameMu during dentry destruction. | Jamie Liu
PiperOrigin-RevId: 325546629
2020-08-07 | Merge pull request #3069 from lubinszARM:pr_serr_injection2 | gVisor bot
PiperOrigin-RevId: 325546308
2020-08-07 | Mark dropped pages unevictable in fsimpl/gofer.dentry.destroyLocked. | Jamie Liu
PiperOrigin-RevId: 325531657
2020-08-07 | Fix panic during Address Resolution of neighbor entry created by NS | Sam Balana
When a Neighbor Solicitation is received, a neighbor entry is created with the remote host's link layer address, but without a link layer address resolver. If the host decides to send a packet addressed to the IP address of that neighbor entry, Address Resolution starts with a nil pointer to the link layer address resolver. This causes the netstack to panic and crash. This change ensures that when a packet is sent in that situation, the link layer address resolver will be set before Address Resolution begins.
Tests: pkg/tcpip/stack:stack_test
+ TestEntryUnknownToStaleToProbeToReachable
- TestNeighborCacheEntryNoLinkAddress
Updates #1889 Updates #1894 Updates #1895 Updates #1947 Updates #1948 Updates #1949 Updates #1950 PiperOrigin-RevId: 325516471
2020-08-07 | Port Ruby benchmark. | Zach Koopmans
PiperOrigin-RevId: 325500772
2020-08-07 | tcp: change the limit of TCP_LINGER2 | Andrei Vagin
It was changed in the Linux kernel: commit f0628c524fd188c3f9418e12478dfdfadacba815 Date: Fri Apr 24 16:06:16 2020 +0800 net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX PiperOrigin-RevId: 325493859
2020-08-07 | Support separate read/write handles in fsimpl/gofer.dentry. | Jamie Liu
PiperOrigin-RevId: 325490674
2020-08-07 | Try to update atime and mtime on VFS2 gofer files on dentry eviction. | Jamie Liu
PiperOrigin-RevId: 325388385
2020-08-06 | Add LinkAt support to gofer | Fabricio Voznika
Updates #1198 PiperOrigin-RevId: 325350818
2020-08-06 | Add reference counting utility to VFS2. | Dean Deng
The utility has several differences from the VFS1 equivalent:
- There are no weak references, which have a significant overhead.
- In order to print useful debug messages with the type of the reference-counted object, we use a generic Refs object with the owner type as a template parameter. In vfs1, this was accomplished by storing a type name and caller stack directly in the ref count, which increases the struct size by 6x. (Note that the caller stack was needed because fs types like Dirent were shared by all fs implementations; in vfs2, each impl has its own data structures, so this is no longer necessary.)
Updates #1486. PiperOrigin-RevId: 325271469
2020-08-06 | Only register /dev/net/tun if supported. | Dean Deng
PiperOrigin-RevId: 325266487
2020-08-06 | Join IPv4 all-systems group on NIC enable | Ghanan Gowripalan
Test:
- stack_test.TestJoinLeaveMulticastOnNICEnableDisable
- integration_test.TestIncomingMulticastAndBroadcast
PiperOrigin-RevId: 325185259
2020-08-05 | Add loss recovery option for TCP. | Nayana Bidari
/proc/sys/net/ipv4/tcp_recovery is used to enable RACK loss recovery in TCP. PiperOrigin-RevId: 325157807
2020-08-05 | Correctly decrement link counts in tmpfs rename operations. | Dean Deng
When a directory is replaced by a rename operation, its link count should reach zero. We were missing the link from `dir/.` PiperOrigin-RevId: 325141730
2020-08-05 | Support receiving broadcast IPv4 packets | Ghanan Gowripalan
Test: integration_test.TestIncomingSubnetBroadcast PiperOrigin-RevId: 325135617
2020-08-05 | Release extra memfd reference. | Dean Deng
PiperOrigin-RevId: 325122849