path: root/pkg
Age | Commit message | Author
2019-08-22unix: return ECONNRESET if peer closed with data not readJianfeng Tan
For SOCK_STREAM unix sockets, we return ECONNRESET if the peer closed with data not read. We explicitly set a flag when closing one end, to differentiate a close from a plain shutdown (where zero is returned). Fixes: #735 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
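The receive-side outcomes described in this commit and the one below (data, reset, end of stream, would block) can be pictured with a small model. This is only a sketch under stated assumptions: `streamEndpoint`, its fields, and the sentinel errors are hypothetical names, not the sentry's types, and the ordering of the checks is an assumption rather than a statement about the actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the sentry's error values.
var (
	errWouldBlock = errors.New("operation would block")
	errConnReset  = errors.New("connection reset by peer")
)

// streamEndpoint is a toy model of one end of a unix stream socket.
type streamEndpoint struct {
	recvQueue        []byte // data sent by the peer that we have not read yet
	peerShutdown     bool   // peer shut down its write side (or closed)
	peerClosedUnread bool   // flag set when the peer closed with data not read
}

// recv copies queued data into buf. With an empty queue it reports, in this
// assumed order: a reset, end of stream (0, nil), or "would block".
func (e *streamEndpoint) recv(buf []byte) (int, error) {
	if len(e.recvQueue) > 0 {
		n := copy(buf, e.recvQueue)
		e.recvQueue = e.recvQueue[n:]
		return n, nil
	}
	switch {
	case e.peerClosedUnread:
		return 0, errConnReset
	case e.peerShutdown:
		return 0, nil
	default:
		return 0, errWouldBlock
	}
}

func main() {
	buf := make([]byte, 8)

	idle := &streamEndpoint{}
	_, err := idle.recv(buf)
	fmt.Println("no data, peer open:", err)

	shut := &streamEndpoint{peerShutdown: true}
	n, err := shut.recv(buf)
	fmt.Println("peer shut down:", n, err)

	reset := &streamEndpoint{peerShutdown: true, peerClosedUnread: true}
	_, err = reset.recv(buf)
	fmt.Println("peer closed with unread data:", err)
}
```

The point of the separate flag is that shutdown and close both end the stream, but only a close with unread data should surface as a reset.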
2019-08-22unix: return zero if peer is closedJianfeng Tan
Previously, recvmsg() on a unix stream socket whose peer was closed would never return, with a goroutine call trace like this:
  ...
  2 in gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).block at pkg/sentry/kernel/task_block.go:124
  3 in gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).BlockWithDeadline at pkg/sentry/kernel/task_block.go:69
  4 in gvisor.dev/gvisor/pkg/sentry/socket/unix.(*SocketOperations).RecvMsg at pkg/sentry/socket/unix/unix.go:612
  5 in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.recvFrom at pkg/sentry/syscalls/linux/sys_socket.go:885
  6 in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.RecvFrom at pkg/sentry/syscalls/linux/sys_socket.go:910
  ...
The cause is that the ErrClosedForReceive returned by unix/transport.queue is turned into nil in unix.(*EndpointReader).ReadToBlocks():
  err.ToError()
As a result, in unix.(*SocketOperations).RecvMsg():
  n == 0 and err == nil
We must differentiate this from the other case, where there is no data to read and ErrWouldBlock is returned, and return 0 immediately.
Fixes: #734
Reported-by: chenglang.hy <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-08-21Support binding to multicast and broadcast addressesChris Kuiper
This fixes the issue of not being able to bind to either a multicast or broadcast address as well as to send and receive data from it. The way to solve this is to treat these addresses similar to the ANY address and register their transport endpoint ID with the global stack's demuxer rather than the NIC's. That way there is no need to require an endpoint with that multicast or broadcast address. The stack's demuxer is in fact the only correct one to use, because neither broadcast- nor multicast-bound sockets care which NIC a packet was received on (for multicast a join is still needed to receive packets on a NIC). I also took the liberty of refactoring udp_test.go to consolidate a lot of duplicate code and make it easier to create repetitive tests that test the same feature for a variety of packet and socket types. For this purpose I created a "flowType" that represents two things: 1) the type of packet being sent or received and 2) the type of socket used for the test. E.g., a "multicastV4in6" flow represents a V4-mapped multicast packet run through a V6-dual socket. This allows writing significantly simpler tests. A nice example is testTTL(). PiperOrigin-RevId: 264766909
2019-08-21Use tcpip.Subnet in tcpip.RouteTamir Duberstein
This is the first step in replacing some of the redundant types with the standard library equivalents. PiperOrigin-RevId: 264706552
2019-08-20Add tcpip.Route.String and tcpip.AddressMask.PrefixChris Kuiper
PiperOrigin-RevId: 264544163
2019-08-19Document RWF_HIPRI not implemented for preadv2/pwritev2.Zach Koopmans
Document limitation of no reasonable implementation for RWF_HIPRI flag (High Priority Read/Write for block-based file systems). PiperOrigin-RevId: 264237589
2019-08-19Internal change.gVisor bot
PiperOrigin-RevId: 264218306
2019-08-19hostinet: fix parsing route netlink messageJianfeng Tan
We wrongly parsed the output interface as the gateway address. The fix is straightforward. Fixes #638 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: Ia4bab31f3c238b0278ea57ab22590fad00eaf061 COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/684 from tanjianfeng:fix-638 b940e810367ad1273519bfa594f4371bdd293e83 PiperOrigin-RevId: 264211336
2019-08-19Read iptables via sockopts.Kevin Krakauer
PiperOrigin-RevId: 264180125
2019-08-16netstack: disconnect an unix socket only if the address family is AF_UNSPECAndrei Vagin
Linux allows calling connect() with the ANY address and port zero. PiperOrigin-RevId: 263892534
2019-08-16procfs: Migrate seqfile implementations.Ayush Ranjan
Migrates all (except 3) seqfile implementations to the vfs.DynamicBytesSource interface. There should not be any change in functionality due to this migration itself. Please note that the following seqfile implementations have not been migrated:
- /proc/filesystems in proc/filesystems.go
- /proc/[pid]/mountinfo in proc/mounts.go
- /proc/[pid]/mounts in proc/mounts.go
This is because these depend on pending changes in /pkg/sentry/vfs.
PiperOrigin-RevId: 263880719
2019-08-16ptrace: detect if a stub process exited unexpectedlyAndrei Vagin
PiperOrigin-RevId: 263880577
2019-08-16Add subnet checking to NIC.findEndpoint and consolidate with NIC.getRefChris Kuiper
This adds the same logic to NIC.findEndpoint that is already done in NIC.getRef. Since this makes the two functions very similar they were combined into one with the originals being wrappers. PiperOrigin-RevId: 263864708
2019-08-16vfs: Remove vfs.DefaultDirectoryFD from embedding vfs.DefaultFD.Ayush Ranjan
This fixes the implementation ambiguity issues when a filesystem implementation embeds vfs.DefaultDirectoryFD to its directory FD along with an internal common fileDescription utility. For similar reasons also removes FileDescriptionDefaultImpl from DynamicBytesFileDescriptionImpl. PiperOrigin-RevId: 263795513
2019-08-15Document source and versioning of the TCPInfo struct.Rahat Mahmood
PiperOrigin-RevId: 263637194
2019-08-15Don't dereference errors passed to panic()Tamir Duberstein
These errors are always pointers; there's no sense in dereferencing them in the panic call. Changed one false positive for clarity. PiperOrigin-RevId: 263611579
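A minimal sketch of the pattern, assuming a pointer-based error type: `Error`, `ErrNoRoute`, `mustSucceed`, and `recovered` are hypothetical names invented for this illustration. Passing the pointer itself to panic() keeps its identity, so a recover() site can still compare it against the package-level sentinel.

```go
package main

import "fmt"

// Error is a hypothetical stand-in for an error type that is always used
// through a pointer.
type Error struct{ msg string }

func (e *Error) String() string { return e.msg }

// ErrNoRoute is a hypothetical package-level sentinel.
var ErrNoRoute = &Error{msg: "no route"}

// mustSucceed panics with the pointer itself rather than with *err.
func mustSucceed(err *Error) {
	if err != nil {
		panic(err)
	}
}

// recovered runs f and returns whatever value it panicked with, or nil.
func recovered(f func()) (v interface{}) {
	defer func() { v = recover() }()
	f()
	return nil
}

func main() {
	v := recovered(func() { mustSucceed(ErrNoRoute) })
	fmt.Println("same sentinel:", v == ErrNoRoute)
	fmt.Println("message:", v)
}
```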
2019-08-15netstack: move resumption logic into *_state.goTamir Duberstein
13a98df rearranged some of this code in a way that broke compilation of the netstack-only export at github.com/google/netstack because *_state.go files are not included in that export. This commit moves resumption logic back into *_state.go, fixing the compilation breakage. PiperOrigin-RevId: 263601629
2019-08-14Replace uintptr with int64 when returning lengthsTamir Duberstein
This is in accordance with newer parts of the standard library. PiperOrigin-RevId: 263449916
2019-08-14Add tcpip.AddressWithPrefix.StringTamir Duberstein
PiperOrigin-RevId: 263436592
2019-08-14Improve SendMsg performance.Bhasker Hariharan
Before this change, SendMsg copied all the data into a new slice even if the underlying socket could accept only a small amount of it. This is inefficient with non-blocking sockets under high throughput, where large writes can get ErrWouldBlock, or when a timeout is associated with the sendmsg() syscall. With this change we delay copying bytes until they are needed and copy only what can potentially be sent or held in the socket buffer, reducing the need to repeatedly copy data. Also includes a minor fix to change state to FIN-WAIT-1 when shutdown(..., SHUT_WR) is called instead of when we transmit the actual FIN. Otherwise the socket could remain in CONNECTED state even though the user has called shutdown() on the socket. Updates #627 PiperOrigin-RevId: 263430505
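The deferred-copy idea can be sketched as follows. All names here (`payloader`, `userBuffer`, `endpoint`, `errWouldBlock`) are hypothetical and chosen for illustration; the real interfaces in the commit may differ. The endpoint first works out how much room it has, and only then asks the caller's payload for that many bytes, so a full buffer costs no copy at all.

```go
package main

import (
	"errors"
	"fmt"
)

// errWouldBlock is a hypothetical stand-in for ErrWouldBlock.
var errWouldBlock = errors.New("operation would block")

// payloader hands out at most n bytes on request, so the copy happens only
// once the endpoint knows how much it can accept.
type payloader interface {
	Payload(n int) []byte
}

// userBuffer models the caller's data and counts how many bytes were copied.
type userBuffer struct {
	data   []byte
	copied int
}

func (u *userBuffer) Payload(n int) []byte {
	if n > len(u.data) {
		n = len(u.data)
	}
	out := make([]byte, n)
	copy(out, u.data[:n])
	u.data = u.data[n:]
	u.copied += n
	return out
}

// endpoint models a socket with a bounded send buffer.
type endpoint struct {
	sendBuf  []byte
	capacity int
}

// write copies only as much as the send buffer can hold right now.
func (e *endpoint) write(p payloader) (int, error) {
	avail := e.capacity - len(e.sendBuf)
	if avail <= 0 {
		return 0, errWouldBlock // nothing was copied
	}
	v := p.Payload(avail)
	e.sendBuf = append(e.sendBuf, v...)
	return len(v), nil
}

func main() {
	src := &userBuffer{data: []byte("0123456789")}
	ep := &endpoint{capacity: 4}

	n, err := ep.write(src)
	fmt.Println("first write:", n, err, "bytes copied so far:", src.copied)

	n, err = ep.write(src)
	fmt.Println("second write:", n, err, "bytes copied so far:", src.copied)
}
```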
2019-08-13Add vfs.DynamicBytesFileDescriptionImpl.Jamie Liu
This replaces fs/proc/seqfile for vfs2-based filesystems. PiperOrigin-RevId: 263254647
2019-08-13Fix file mode check in pipeOperationsFabricio Voznika
PiperOrigin-RevId: 263203441
2019-08-13Add note to name logging mentioning trace logging should be enabled to debug.Ian Gudger
PiperOrigin-RevId: 263194584
2019-08-13gonet: Replace NewPacketConn with DialUDP.Ian Gudger
This better matches the standard library and allows creating connected PacketConns. PiperOrigin-RevId: 263187462
2019-08-12Handle ENOSPC with a partial write.Nicolas Lacasse
Similar to the EPIPE case, we can return the number of bytes written before ENOSPC was encountered. If the app tries to write more, we can return ENOSPC on the next write. PiperOrigin-RevId: 263041648
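A small sketch of this behaviour, assuming a file with a fixed space budget: `boundedFile` and `errNoSpace` are hypothetical names, and the method deliberately models write(2)-style semantics (a short count with no error) rather than Go's `io.Writer` contract.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSpace is a hypothetical stand-in for ENOSPC.
var errNoSpace = errors.New("no space left on device")

// boundedFile models a file that can hold at most limit bytes.
type boundedFile struct {
	data  []byte
	limit int
}

// write stores as much of p as fits. A partial write reports the byte count
// with no error; ENOSPC is reported only when nothing at all can be written.
func (f *boundedFile) write(p []byte) (int, error) {
	room := f.limit - len(f.data)
	if room < 0 {
		room = 0
	}
	if len(p) > 0 && room == 0 {
		return 0, errNoSpace
	}
	if len(p) > room {
		p = p[:room]
	}
	f.data = append(f.data, p...)
	return len(p), nil
}

func main() {
	f := &boundedFile{limit: 5}

	n, err := f.write([]byte("hello world"))
	fmt.Println("first write:", n, err)

	n, err = f.write([]byte("!"))
	fmt.Println("next write:", n, err)
}
```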
2019-08-12Compute size of struct tcp_info instead of hardcoding it.Rahat Mahmood
PiperOrigin-RevId: 263040624
2019-08-12Fix netstack build error on non-AMD64.Ian Gudger
This stub had the wrong function signature. PiperOrigin-RevId: 262992682
2019-08-09netlink: return an error in nlmsgerrAndrei Vagin
Currently, if a process sends an unsupported netlink request, an error is returned from the send system call. The Linux kernel works differently in this case: it returns errors in an nlmsgerr netlink message. Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com PiperOrigin-RevId: 262690453
2019-08-09Add congestion control states to sender.Bhasker Hariharan
This change just introduces different congestion control states and ensures the sender.state is updated to reflect the current state of the connection. It is not used for any decisions yet but this is required before algorithms like Eiffel/PRR can be implemented. Fixes #394 PiperOrigin-RevId: 262638292
2019-08-09Add initial ptrace stub and syscall support for arm64.Haibo Xu
Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: I1dbd23bb240cca71d0cc30fc75ca5be28cb4c37c PiperOrigin-RevId: 262619519
2019-08-09ext: Move to pkg/sentry/fsimpl.Ayush Ranjan
fsimpl is the keeper of all filesystem implementations in VFS2. PiperOrigin-RevId: 262617869
2019-08-08ext: Benchmark tests.Ayush Ranjan
Added benchmark tests which emulate memfs benchmarks.

Stat benchmarks
BenchmarkVFS2Ext4fsStat/1-12           10000000     145 ns/op
BenchmarkVFS2Ext4fsStat/2-12           10000000     170 ns/op
BenchmarkVFS2Ext4fsStat/3-12           10000000     202 ns/op
BenchmarkVFS2Ext4fsStat/8-12            3000000     374 ns/op
BenchmarkVFS2Ext4fsStat/64-12            500000    2159 ns/op
BenchmarkVFS2Ext4fsStat/100-12           300000    3459 ns/op
BenchmarkVFS1TmpfsStat/1-12             5000000     348 ns/op
BenchmarkVFS1TmpfsStat/2-12             3000000     487 ns/op
BenchmarkVFS1TmpfsStat/3-12             2000000     655 ns/op
BenchmarkVFS1TmpfsStat/8-12             1000000    1365 ns/op
BenchmarkVFS1TmpfsStat/64-12             200000    9565 ns/op
BenchmarkVFS1TmpfsStat/100-12            100000   15158 ns/op
BenchmarkVFS2MemfsStat/1-12            10000000     133 ns/op
BenchmarkVFS2MemfsStat/2-12            10000000     155 ns/op
BenchmarkVFS2MemfsStat/3-12            10000000     182 ns/op
BenchmarkVFS2MemfsStat/8-12             5000000     310 ns/op
BenchmarkVFS2MemfsStat/64-12            1000000    1659 ns/op
BenchmarkVFS2MemfsStat/100-12            500000    2787 ns/op

Mount Stat benchmarks
BenchmarkVFS2ExtfsMountStat/1-12        5000000     245 ns/op
BenchmarkVFS2ExtfsMountStat/2-12        5000000     266 ns/op
BenchmarkVFS2ExtfsMountStat/3-12        5000000     304 ns/op
BenchmarkVFS2ExtfsMountStat/8-12        3000000     456 ns/op
BenchmarkVFS2ExtfsMountStat/64-12        500000    2308 ns/op
BenchmarkVFS2ExtfsMountStat/100-12       300000    3482 ns/op
BenchmarkVFS1TmpfsMountStat/1-12        3000000     488 ns/op
BenchmarkVFS1TmpfsMountStat/2-12        2000000     658 ns/op
BenchmarkVFS1TmpfsMountStat/3-12        2000000     806 ns/op
BenchmarkVFS1TmpfsMountStat/8-12        1000000    1514 ns/op
BenchmarkVFS1TmpfsMountStat/64-12        100000   10037 ns/op
BenchmarkVFS1TmpfsMountStat/100-12       100000   15280 ns/op
BenchmarkVFS2MemfsMountStat/1-12       10000000     212 ns/op
BenchmarkVFS2MemfsMountStat/2-12        5000000     232 ns/op
BenchmarkVFS2MemfsMountStat/3-12        5000000     264 ns/op
BenchmarkVFS2MemfsMountStat/8-12        3000000     390 ns/op
BenchmarkVFS2MemfsMountStat/64-12       1000000    1813 ns/op
BenchmarkVFS2MemfsMountStat/100-12       500000    2812 ns/op

PiperOrigin-RevId: 262477158
2019-08-08Return a well-defined socket address type from socket functions.Rahat Mahmood
Previously we were representing socket addresses as an interface{}, which allowed any type which could be binary.Marshal()ed to be used as a socket address. This is fine when the address is passed to userspace via the linux ABI, but is problematic when used from within the sentry such as by networking procfs files. PiperOrigin-RevId: 262460640
2019-08-08netstack: Don't start endpoint goroutines too soon on restore.Rahat Mahmood
Endpoint protocol goroutines were previously started as part of loading the endpoint. This is potentially too soon, as resources used by these goroutines may not have been loaded. Protocol goroutines may perform meaningful work as soon as they're started (ex: incoming connect) which can cause them to indirectly access resources that haven't been loaded yet. This CL defers resuming all protocol goroutines until the end of restore. PiperOrigin-RevId: 262409429
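The deferral described above can be sketched as a two-phase restore: loading an endpoint only records how to resume it, and the goroutines are started once everything has been loaded. `restoreContext`, `deferResume`, and `resumeAll` are hypothetical names for this illustration, and the wait at the end exists only so the demo terminates (real protocol goroutines would keep running).

```go
package main

import (
	"fmt"
	"sync"
)

// restoreContext collects resume callbacks while state is being loaded.
type restoreContext struct {
	mu      sync.Mutex
	pending []func()
}

// deferResume is called while loading an endpoint; it starts nothing.
func (r *restoreContext) deferResume(f func()) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.pending = append(r.pending, f)
}

// resumeAll is called once at the end of restore. It starts one goroutine
// per recorded callback and, for this demo only, waits for them to finish.
func (r *restoreContext) resumeAll() {
	r.mu.Lock()
	pending := r.pending
	r.pending = nil
	r.mu.Unlock()

	var wg sync.WaitGroup
	for _, f := range pending {
		wg.Add(1)
		go func(f func()) {
			defer wg.Done()
			f()
		}(f)
	}
	wg.Wait()
}

func main() {
	var rc restoreContext
	routes := map[string]string{} // a resource that finishes loading later

	for _, name := range []string{"ep1", "ep2"} {
		name := name
		rc.deferResume(func() {
			fmt.Println(name, "sees default route:", routes["default"])
		})
	}

	routes["default"] = "10.0.0.1" // the rest of the state is loaded
	rc.resumeAll()                 // only now do the goroutines run
}
```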
2019-08-08Merge pull request #653 from xiaobo55x:devgVisor bot
PiperOrigin-RevId: 262402929
2019-08-08memfs fixes.Jamie Liu
- Unexport Filesystem/Dentry/Inode.
- Support SEEK_CUR in directoryFD.Seek().
- Hold Filesystem.mu before touching directoryFD.off in directoryFD.Seek().
- Remove deleted Dentries from their parent directory.childLists.
- Remove invalid FIXMEs.
PiperOrigin-RevId: 262400633
2019-08-07ext: Seek unit tests.Ayush Ranjan
PiperOrigin-RevId: 262264674
2019-08-07ext: StatAt unit tests.Ayush Ranjan
PiperOrigin-RevId: 262249166
2019-08-07ext: Read unit tests.Ayush Ranjan
PiperOrigin-RevId: 262242410
2019-08-07ext: IterDirent unit tests.Ayush Ranjan
PiperOrigin-RevId: 262226761
2019-08-07ext: vfs.FileDescriptionImpl and vfs.FilesystemImpl implementations.Ayush Ranjan
- This also gets rid of pipes for now because pipe does not have vfs2 specific support yet.
- Added file path resolution logic.
- Fixes testing infrastructure.
- Does not include unit tests yet.
PiperOrigin-RevId: 262213950
2019-08-07Set target address in ARP ReplyTamir Duberstein
PiperOrigin-RevId: 262163794
2019-08-06Fix for a panic due to writing to a closed accept channel.Bhasker Hariharan
This can happen because endpoint.Close() closes the accept channel first and then drains/resets any accepted but not delivered connections. However, some connections may be connected but not yet delivered because the channel was full, and closing the channel causes their pending writes to panic with a write to a closed channel. The correct solution is to abort any connections in SYN-RCVD state and drain/abort all completed connections before closing the accept channel. PiperOrigin-RevId: 261951132
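The underlying Go hazard is that sending on a closed channel panics. The sketch below is a simplified stand-in, not the commit's actual fix: it uses a mutex-guarded `closed` flag so that no delivery can race with the close, and it drains queued connections before closing the channel. `listener`, `deliver`, and `close` are hypothetical names, and connections are modelled as plain integers.

```go
package main

import (
	"fmt"
	"sync"
)

// listener models a listening endpoint with a bounded accept queue.
type listener struct {
	mu       sync.Mutex
	closed   bool
	acceptCh chan int
}

// deliver queues a completed connection. It returns false when the listener
// is closed or the queue is full; the caller would then abort the connection.
func (l *listener) deliver(conn int) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.closed {
		return false // never send on a channel that may be closed
	}
	select {
	case l.acceptCh <- conn:
		return true
	default:
		return false
	}
}

// close marks the listener closed, drains connections that were never
// accepted (so they can be aborted), and only then closes the channel.
func (l *listener) close() (aborted []int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.closed {
		return nil
	}
	l.closed = true
	for {
		select {
		case c := <-l.acceptCh:
			aborted = append(aborted, c)
		default:
			close(l.acceptCh)
			return aborted
		}
	}
}

func main() {
	l := &listener{acceptCh: make(chan int, 2)}
	fmt.Println("delivered:", l.deliver(1), l.deliver(2), l.deliver(3))
	fmt.Println("aborted on close:", l.close())
	fmt.Println("deliver after close:", l.deliver(4))
}
```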
2019-08-06Require pread/pwrite for splice file offsetsMichael Pratt
If there is an offset, the file must support pread/pwrite. See fs/splice.c:do_splice. PiperOrigin-RevId: 261944932
2019-08-05Change syscall.EPOLLET to unix.EPOLLETHaibo Xu
syscall.EPOLLET is defined with different values on amd64 and arm64 (-0x80000000 on amd64 and 0x80000000 on arm64), while unix.EPOLLET unifies this value to 0x80000000 (golang/go#5328). ref #63 Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: Id97d075c4e79d86a2ea3227ffbef02d8b00ffbb8
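The two values quoted above are the same 32-bit pattern under different signedness, which is why the difference matters for typed Go code rather than for the kernel. The constants below are local copies of the values from the commit message, named hypothetically, so the example does not depend on any platform-specific package.

```go
package main

import "fmt"

const (
	epolletSigned   = -0x80000000 // value the commit reports on amd64
	epolletUnsigned = 0x80000000  // unified value reported for unix.EPOLLET
)

func main() {
	var s int32 = epolletSigned
	var u uint32 = epolletUnsigned

	// Both constants set only the top bit of a 32-bit word.
	fmt.Printf("signed as bits:   %#x\n", uint32(s))
	fmt.Printf("unsigned as bits: %#x\n", u)
	fmt.Println("same bit pattern:", uint32(s) == u)

	// The following would not compile, because the negative constant
	// overflows uint32:
	//   var events uint32 = epolletSigned
}
```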
2019-08-02Plumbing for iptables sockopts.Kevin Krakauer
PiperOrigin-RevId: 261413396
2019-08-02Job control: controlling TTYs and foreground process groups.Kevin Krakauer
(Don't worry, this is mostly tests.)
Implemented the following ioctls:
- TIOCSCTTY - set controlling TTY
- TIOCNOTTY - remove controlling TTY, maybe signal some other processes
- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
- TIOCSPGRP - set foreground process group. Also enables tcsetpgrp().
Next steps are to actually turn terminal-generated control characters (e.g. ^C) into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when appropriate.
PiperOrigin-RevId: 261387276
2019-08-02Automated rollback of changelist 261191548Rahat Mahmood
PiperOrigin-RevId: 261373749
2019-08-02Remove kernel.mounts.Nicolas Lacasse
We can get the mount namespace from the CreateProcessArgs in all cases where we need it. This also gets rid of kernel.Destroy method, since the only thing it was doing was DecRefing the mounts. Removing the need to call kernel.SetRootMountNamespace also allowed for some more simplifications in the container fs setup code. PiperOrigin-RevId: 261357060
2019-08-01Drop reference on fs.Inode if Mount goes wrong.Nicolas Lacasse
PiperOrigin-RevId: 261203674