summaryrefslogtreecommitdiffhomepage
path: root/pkg/abi/linux
AgeCommit message (Collapse)Author
2020-09-16Rename marshal.Task to marshal.CopyContext.Rahat Mahmood
CopyContext is a better name for the interface because from go-marshal's perspective, the interface has nothing to do with a task. A kernel.Task happens to implement the interface, but so can other things like MemoryManager and IO sequences. PiperOrigin-RevId: 331959678
2020-09-15Enable automated marshalling for the syscall package.Rahat Mahmood
PiperOrigin-RevId: 331940975
2020-09-15Add support for OCI seccomp filters in the sandbox.Ian Lewis
OCI configuration includes support for specifying seccomp filters. In runc, these filter configurations are converted into seccomp BPF programs and loaded into the kernel via libseccomp. runsc needs to be a static binary so, for runsc, we cannot rely on a C library and need to implement the functionality in Go. The generator added here implements basic support for taking OCI seccomp configuration and converting it into a seccomp BPF program with the same behavior as a program generated by libseccomp. - New conditional operations were added to pkg/seccomp to support operations available in OCI. - AllowAny and AllowValue were renamed to MatchAny and EqualTo to better reflect that syscalls matching the conditionals result in the provided action not simply SCMP_RET_ALLOW. - BuildProgram in pkg/seccomp no longer panics if provided an empty list of rules. It now builds a program with the architecture sanity check only. - ProgramBuilder now allows adding labels that are unused. However, backwards jumps are still not permitted. Fixes #510 PiperOrigin-RevId: 331938697
2020-09-15Implement gvisor verity fs ioctl with GETFLAGSChong Cai
PiperOrigin-RevId: 331905347
2020-09-11Move the 'marshal' and 'primitive' packages to the 'pkg' directory.Rahat Mahmood
PiperOrigin-RevId: 331256608
2020-09-08Implement ioctl with enable veritygVisor bot
ioctl with FS_IOC_ENABLE_VERITY is added to verity file system to enable a file as verity file. For a file, a Merkle tree is built with its data. For a directory, a Merkle tree is built with the root hashes of its children. PiperOrigin-RevId: 330604368
2020-09-01Refactor tty codebase to use master-replica terminology.Ayush Ranjan
Updates #2972 PiperOrigin-RevId: 329584905
2020-09-01[go-marshal] Enable auto-marshalling for fs/tty.Ayush Ranjan
PiperOrigin-RevId: 329564614
2020-08-28Implement StatFS for various VFS2 filesystems.Rahat Mahmood
This mainly involved enabling kernfs' client filesystems to provide a StatFS implementation. Fixes #3411, #3515. PiperOrigin-RevId: 329009864
2020-08-27ip6tables: (de)serialize ip6tables structsKevin Krakauer
More implementation+testing to follow. #3549. PiperOrigin-RevId: 328770160
2020-08-26tmpfs: Allow xattrs in the trusted namespace if creds has CAP_SYS_ADMIN.Nicolas Lacasse
This is needed to support the overlay opaque attribute. PiperOrigin-RevId: 328552985
2020-08-25Return non-zero size for tmpfs statfs(2).Jamie Liu
This does not implement accepting or enforcing any size limit, which will be more complex and has performance implications; it just returns a fixed non-zero size. Updates #1936 PiperOrigin-RevId: 328428588
2020-08-25Expose basic coverage information to userspace through kcov interface.Dean Deng
In Linux, a kernel configuration is set that compiles the kernel with a custom function that is called at the beginning of every basic block, which updates the memory-mapped coverage information. The Go coverage tool does not allow us to inject arbitrary instructions into basic blocks, but it does provide data that we can convert to a kcov-like format and transfer them to userspace through a memory mapping. Note that this is not a strict implementation of kcov, which is especially tricky to do because we do not have the same coverage tools available in Go that that are available for the actual Linux kernel. In Linux, a kernel configuration is set that compiles the kernel with a custom function that is called at the beginning of every basic block to write program counters to the kcov memory mapping. In Go, however, coverage tools only give us a count of basic blocks as they are executed. Every time we return to userspace, we collect the coverage information and write out PCs for each block that was executed, providing userspace with the illusion that the kcov data is always up to date. For convenience, we also generate a unique synthetic PC for each block instead of using actual PCs. Finally, we do not provide thread-specific coverage data (each kcov instance only contains PCs executed by the thread owning it); instead, we will supply data for any file specified by -- instrumentation_filter. Also, fix issue in nogo that was causing pkg/coverage:coverage_nogo compilation to fail. PiperOrigin-RevId: 328426526
2020-08-25[go-marshal] Enable auto-marshalling for host tty.Ayush Ranjan
PiperOrigin-RevId: 328415633
2020-08-12ip6tables: ABI structs and constantsKevin Krakauer
Part of #3549. PiperOrigin-RevId: 326329028
2020-08-10Implement FUSE_GETATTRCraig Chi
FUSE_GETATTR is called when a stat(2), fstat(2), or lstat(2) is issued from VFS2 layer to a FUSE filesystem. Fixes #3175
2020-08-04Internal change.gVisor bot
PiperOrigin-RevId: 324819246
2020-07-31iptables: support SO_ORIGINAL_DSTKevin Krakauer
Envoy (#170) uses this to get the original destination of redirected packets.
2020-07-29Add FUSE_INITJinmou Li
This change allows the sentry to send FUSE_INIT request and process the reply. It adds the corresponding structs, employs the fuse device to send and read the message, and stores the results of negotiation in corresponding places (inside connection struct). It adds a CallAsync() function to the FUSE connection interface: - like Call(), but it's for requests that do not expect immediate response (init, release, interrupt etc.) - will block if the connection hasn't initialized, which is the same for Call()
2020-07-27Add device implementation for /dev/fuseRidwan Sharif
This PR adds the following: - [x] Marshall-able structs for fuse headers - [x] Data structures needed in /dev/fuse to communicate with the daemon server - [x] Implementation of the device interface - [x] Go unit tests This change adds the `/dev/fuse` implementation. `Connection` controls the communication between the server and the sentry. The FUSE server uses the `FileDescription` interface to interact with the Sentry. The Sentry implmenetation of fusefs, uses `Connection` and the Connection interface to interact with the Server. All communication messages are in the form of `go_marshal` backed structs defined in the ABI package. This change also adds some go unit tests that test (pretty basically) the interfaces and should be used as an example of an end to end FUSE operation. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/3083 from ridwanmsharif:ridwanmsharif/fuse-device-impl 69aa2ce970004938fe9f918168dfe57636ab856e PiperOrigin-RevId: 323428180
2020-07-24Enable automated marshalling for netstack.Ayush Ranjan
PiperOrigin-RevId: 322954792
2020-07-23Implement get/set_robust_list.Nicolas Lacasse
PiperOrigin-RevId: 322904430
2020-07-23Marshallable socket opitons.Ayush Ranjan
Socket option values are now required to implement marshal.Marshallable. Co-authored-by: Rahat Mahmood <rahat@google.com> PiperOrigin-RevId: 322831612
2020-07-22Support for receiving outbound packets in AF_PACKET.Bhasker Hariharan
Updates #173 PiperOrigin-RevId: 322665518
2020-07-15Fix minor bugs in a couple of interface IOCTLs.Bhasker Hariharan
gVisor incorrectly returns the wrong ARP type for SIOGIFHWADDR. This breaks tcpdump as it tries to interpret the packets incorrectly. Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which fails with an EINVAL since we don't implement it. For now change it to return EOPNOTSUPP to indicate that we don't support the query rather than return EINVAL. NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities and NIC flags. In Linux all 3 exist eg. ARPHRD types are stored in dev->type field while NIC capabilities are more like the device features which can be queried using SIOCETHTOOL but not modified and NIC Flags are fields that can be modified from user space. eg. NIC status (UP/DOWN/MULTICAST/BROADCAST) etc. Updates #2746 PiperOrigin-RevId: 321436525
2020-07-01Update preadv2/pwritev2 flag handling in vfs2.Dean Deng
We do not support RWF_SYNC/RWF_DSYNC and probably shouldn't silently accept them, since the user may incorrectly believe that we are synchronizing I/O. Remove the pwritev2 test verifying that we support these flags. gvisor.dev/issue/2601 is the tracking bug for deciding which RWF_.* flags we need and supporting them. Updates #2923, #2601. PiperOrigin-RevId: 319351286
2020-06-27Port GETOWN, SETOWN fcntls to vfs2.Dean Deng
Also make some fixes to vfs1's F_SETOWN. The fcntl test now entirely passes on vfs2. Fixes #2920. PiperOrigin-RevId: 318669529
2020-06-25Add FUSE character deviceRidwan Sharif
This change adds a FUSE character device backed by devtmpfs. This device will be used to establish a connection between the FUSE server daemon and fusefs. The FileDescriptionImpl methods will be implemented as we flesh out fusefs some more. The tests assert that the device can be opened and used.
2020-06-19Port fadvise64 to vfs2.Dean Deng
Like vfs1, we have a trivial implementation that ignores all valid advice. Updates #2923. PiperOrigin-RevId: 317349505
2020-06-16Port aio to VFS2.Nicolas Lacasse
In order to make sure all aio goroutines have stopped during S/R, a new WaitGroup was added to TaskSet, analagous to runningGoroutines. This WaitGroup is incremented with each aio goroutine, and waited on during kernel.Pause. The old VFS1 aio code was changed to use this new WaitGroup, rather than fs.Async. The only uses of fs.Async are now inode and mount Release operations, which do not call fs.Async recursively. This fixes a lock-ordering violation that can cause deadlocks. Updates #1035. PiperOrigin-RevId: 316689380
2020-06-10{S,G}etsockopt for TCP_KEEPCNT option.Nayana Bidari
TCP_KEEPCNT is used to set the maximum keepalive probes to be sent before dropping the connection. WANT_LGTM=jchacon PiperOrigin-RevId: 315758094
2020-05-29Internal change.gVisor bot
PiperOrigin-RevId: 313821986
2020-05-08iptables - filter packets using outgoing interface.gVisor bot
Enables commands with -o (--out-interface) for iptables rules. $ iptables -A OUTPUT -o eth0 -j ACCEPT PiperOrigin-RevId: 310642286
2020-05-07Allocate device numbers for VFS2 filesystems.Jamie Liu
Updates #1197, #1198, #1672 PiperOrigin-RevId: 310432006
2020-04-25Enable automated marshalling for signals and the arch package.Rahat Mahmood
PiperOrigin-RevId: 308472331
2020-04-23Enable automated marshalling for mempolicy syscalls.Rahat Mahmood
PiperOrigin-RevId: 308170679
2020-04-23Enable automated marshalling for epoll events.Rahat Mahmood
Ensure we use the correct architecture-specific defintion of epoll event, and use go-marshal for serialization. PiperOrigin-RevId: 308145677
2020-04-22Specify a memory file in platform.New().Andrei Vagin
PiperOrigin-RevId: 307941984
2020-03-26Support owner matching for iptables.Nayana Bidari
This feature will match UID and GID of the packet creator, for locally generated packets. This match is only valid in the OUTPUT and POSTROUTING chains. Forwarded packets do not have any socket associated with them. Packets from kernel threads do have a socket, but usually no owner.
2020-03-26Combine file mode and isDir argumentsFabricio Voznika
Updates #1035 PiperOrigin-RevId: 303021328
2020-03-16Merge pull request #1943 from kevinGC:ipt-filter-ipgVisor bot
PiperOrigin-RevId: 301197007
2020-03-14Plumb VFS2 imported fds into virtual filesystem.Dean Deng
- When setting up the virtual filesystem, mount a host.filesystem to contain all files that need to be imported. - Make read/preadv syscalls to the host in cases where preadv2 may not be supported yet (likewise for writing). - Make save/restore functions in kernel/kernel.go return early if vfs2 is enabled. PiperOrigin-RevId: 300922353
2020-03-11Merge pull request #1975 from nybidari:iptablesgVisor bot
PiperOrigin-RevId: 300362789
2020-03-09Enable thread local storage support on arm64.Haibo Xu
Linux use the task.thread.uw.tp_value field to store the TLS pointer on arm64 platform, and we use a similar way in gvisor to store it in the arch/State struct. Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: Ie76b5c6d109bc27ccfd594008a96753806db7764
2020-02-26iptables: filter by IP address (and range)Kevin Krakauer
Enables commands such as: $ iptables -A INPUT -d 127.0.0.1 -j ACCEPT $ iptables -t nat -A PREROUTING ! -d 127.0.0.1 -j REDIRECT Also adds a bunch of REDIRECT+destination tests.
2020-02-25Merge branch 'master' into iptablesnybidari
2020-02-25Add nat table support for iptables.Nayana Bidari
- commit the changes for the comments.
2020-02-25Port most syscalls to VFS2.Jamie Liu
pipe and pipe2 aren't ported, pending a slight rework of pipe FDs for VFS2. mount and umount2 aren't ported out of temporary laziness. access and faccessat need additional FSImpl methods to implement properly, but are stubbed to prevent googletest from CHECK-failing. Other syscalls require additional plumbing. Updates #1623 PiperOrigin-RevId: 297188448
2020-02-21Implement tap/tun device in vfs.Ting-Yu Wang
PiperOrigin-RevId: 296526279
2020-02-20Better strace logging for epoll syscalls.gVisor bot
Example: epoll_ctl(0x3 anon_inode:[eventpoll], EPOLL_CTL_ADD, 0x6 anon_inode:[eventfd], 0x7efe2fd92a80 {events=EPOLLIN|EPOLLOUT data=0x10203040506070a}) = 0x0 (4.411µs) epoll_wait(0x3 anon_inode:[eventpoll], 0x7efe2fd92b50 {{events=EPOLLOUT data=0x102030405060708}{events=EPOLLOUT data=0x102030405060708}{events=EPOLLOUT data=0x102030405060708}}, 0x3, 0xffffffff) = 0x3 (29.891µs) PiperOrigin-RevId: 296258146