path: root/pkg/sentry
Age | Commit message | Author
2019-03-26Implement memfd_create.Rahat Mahmood
Memfds are simply anonymous tmpfs files with no associated mounts. Also implementing file seals, which Linux only implements for memfds at the moment. PiperOrigin-RevId: 240450031 Change-Id: I31de78b950101ae8d7a13d0e93fe52d98ea06f2f
2019-03-25Call memmap.Mappable.Translate with more conservative usermem.AccessType.Jamie Liu
MM.insertPMAsLocked() passes vma.maxPerms to memmap.Mappable.Translate (although it unsets AccessType.Write if the vma is private). This somewhat simplifies handling of pmas, since it means only COW-break needs to replace existing pmas. However, it also means that a MAP_SHARED mapping of a file opened O_RDWR dirties the file, regardless of the mapping's permissions and whether or not the mapping is ever actually written to with I/O that ignores permissions (e.g. ptrace(PTRACE_POKEDATA)).
To fix this:
- Change the pma-getting path to request only the permissions that are required for the calling access.
- Change memmap.Mappable.Translate to take requested permissions, and return allowed permissions. This preserves the existing behavior in the common cases where the memmap.Mappable isn't fsutil.CachingInodeOperations and doesn't care if the translated platform.File pages are written to.
- Change the MM.getPMAsLocked path to support permission upgrading of pmas outside of copy-on-write.
PiperOrigin-RevId: 240196979 Change-Id: Ie0147c62c1fbc409467a6fa16269a413f3d7d571
2019-03-25epoll: use ilist:generic_list instead of ilist:ilistAndrei Vagin
ilist:generic_list works faster than ilist:ilist. Here is a benchmark test to measure performance of epoll_wait when readyList isn't empty. It shows about 30% better performance with these changes.
Benchmark                   Time(ns)  CPU(ns)  Iterations
Before: BM_EpollAllEvents   46725     46899    14286
After:  BM_EpollAllEvents   33167     33300    18919
PiperOrigin-RevId: 240185278 Change-Id: I3e33f9b214db13ab840b91613400525de5b58d18
2019-03-22lstat should resolve the final path component if it ends in a slash.Nicolas Lacasse
PiperOrigin-RevId: 239896221 Change-Id: I0949981fe50c57131c5631cdeb10b225648575c0
2019-03-22Implement PTRACE_SEIZE, PTRACE_INTERRUPT, and PTRACE_LISTEN.Jamie Liu
PiperOrigin-RevId: 239803092 Change-Id: I42d612ed6a889e011e8474538958c6de90c6fcab
2019-03-21Allow BP and OF to be called from user spaceYong He
Change the DPL from 0 to 3 for Breakpoint and Overflow, so that user space can trigger Breakpoint and Overflow as expected. Change-Id: Ibead65fb8c98b32b7737f316db93b3a8d9dcd648 PiperOrigin-RevId: 239736648
2019-03-21Replace manual pty copies to/from userspace with safemem operations.Kevin Krakauer
Also, changing queue.writeBuf from a buffer.Bytes to a [][]byte should reduce copying and reallocating of slices. PiperOrigin-RevId: 239713547 Change-Id: I6ee5ff19c3ee2662f1af5749cae7b73db0569e96
2019-03-21Clear msghdr flags on successful recvmsg.Ian Gudger
.NET sets these flags to -1 and then uses their result, expecting it to be zero. This change does not set actual flags (e.g. MSG_TRUNC), but setting them to zero is more correct than what we did before. PiperOrigin-RevId: 239657951 Change-Id: I89c5f84bc9b94a2cd8ff84e8ecfea09e01142030
2019-03-20gvisor: don't allocate a new credential object on forkAndrei Vagin
A credential object is immutable, so we don't need to copy it for a new task. PiperOrigin-RevId: 239519266 Change-Id: I0632f641fdea9554779ac25d84bee4231d0d18f2
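
The reasoning above can be illustrated with a minimal sketch. The `Credentials` and `Task` types below are hypothetical stand-ins, not gVisor's actual types: because the credential object is never modified in place, a forked task can share the parent's pointer, and a later change swaps in a modified copy instead.

```go
package main

import "fmt"

// Credentials is a hypothetical stand-in for a task's credential set.
// By convention it is never modified after construction.
type Credentials struct {
	UID, GID uint32
}

// Task is a hypothetical task that holds a pointer to shared credentials.
type Task struct {
	creds *Credentials
}

// Fork shares the parent's credential pointer instead of copying the object.
// This is safe only because Credentials values are treated as immutable.
func (t *Task) Fork() *Task {
	return &Task{creds: t.creds}
}

// SetUID never writes through the shared pointer. It builds a modified copy
// and points this task at it, so other tasks keep seeing the old value.
func (t *Task) SetUID(uid uint32) {
	c := *t.creds
	c.UID = uid
	t.creds = &c
}

func main() {
	parent := &Task{creds: &Credentials{UID: 0, GID: 0}}
	child := parent.Fork()
	fmt.Println("shared after fork:", child.creds == parent.creds)

	child.SetUID(1000)
	fmt.Println("parent UID:", parent.creds.UID, "child UID:", child.creds.UID)
}
```

The saving is one allocation per fork; the cost is that every mutation path has to copy first.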
2019-03-20Record sockets created during accept(2) for all families.Rahat Mahmood
Track new sockets created during accept(2) in the socket table for all families. Previously we were only doing this for unix domain sockets. PiperOrigin-RevId: 239475550 Change-Id: I16f009f24a06245bfd1d72ffd2175200f837c6ac
2019-03-19netstack: reduce MSS from SYN to account for TCP optionsAndrei Vagin
See: https://tools.ietf.org/html/rfc6691#section-2 PiperOrigin-RevId: 239305632 Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
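
As a worked example of the arithmetic RFC 6691 describes: the MSS announced in a SYN does not account for options, so the sender subtracts the length of the TCP options it places in each segment. The function name below is hypothetical and the sketch is not netstack's code.

```go
package main

import "fmt"

// maxPayload is a hypothetical helper. It returns how many bytes of data fit
// in one segment when the peer announced mss and every segment carries
// optLen bytes of TCP options.
func maxPayload(mss, optLen int) int {
	if optLen >= mss {
		return 0
	}
	return mss - optLen
}

func main() {
	// The TCP timestamps option is 10 bytes and is usually padded to 12.
	const timestampOptLen = 12
	fmt.Println(maxPayload(1460, timestampOptLen))
}
```

With an announced MSS of 1460 and timestamps enabled, each segment carries at most 1448 bytes of data.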
2019-03-19Fix data race in netlink send buffer sizeFabricio Voznika
PiperOrigin-RevId: 239221041 Change-Id: Icc19e32a00fa89167447ab2f45e90dcfd61bea04
2019-03-18Remove references to replaced child in Rename in ramfs/agentfsMichael Pratt
In the case of a rename replacing an existing destination inode, ramfs Rename failed to first remove the replaced inode. This caused:
1. A leak of a reference to the inode (making it live indefinitely).
2. For directories, a leak of the replaced directory's .. link to the parent. This would cause the parent's link count to incorrectly increase.
(2) is much simpler to test than (1), so that's what I've done. agentfs has a similar bug with link count only, so the Dirent layer informs the Inode if this is a replacing rename.
Fixes #133 PiperOrigin-RevId: 239105698 Change-Id: I4450af2462d8ae3339def812287213d2cbeebde0
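
The link-count effect in (2) can be sketched with a toy in-memory directory. The `Dir` type and its methods are hypothetical, cover only renames within a single directory, and are not the ramfs implementation.

```go
package main

import "fmt"

// Dir is a hypothetical in-memory directory. links starts at 2 (the entry in
// its parent plus its own "."), and grows by one for each child directory,
// because each child's ".." refers back to it.
type Dir struct {
	links    int
	children map[string]*Dir
}

func NewDir() *Dir {
	return &Dir{links: 2, children: make(map[string]*Dir)}
}

// Mkdir adds a child directory. The sketch assumes name is not already used.
func (d *Dir) Mkdir(name string) *Dir {
	c := NewDir()
	d.children[name] = c
	d.links++ // the child's ".." link
	return c
}

// Rename moves oldName to newName inside d. If newName already exists, the
// replaced directory is removed first and its ".." link is dropped.
// Skipping that step would leave d.links one too high.
func (d *Dir) Rename(oldName, newName string) bool {
	moved, ok := d.children[oldName]
	if !ok {
		return false
	}
	if oldName == newName {
		return true
	}
	if _, replacing := d.children[newName]; replacing {
		delete(d.children, newName)
		d.links--
	}
	delete(d.children, oldName)
	d.children[newName] = moved
	return true
}

func main() {
	root := NewDir()
	root.Mkdir("a")
	root.Mkdir("b")
	fmt.Println("links before:", root.links)
	root.Rename("a", "b")
	fmt.Println("links after:", root.links)
}
```

After the rename only one subdirectory remains, so the parent's link count drops from 4 to 3.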
2019-03-18Remove racy access to shm fields.Rahat Mahmood
PiperOrigin-RevId: 239016776 Change-Id: Ia7af4258e7c69b16a4630a6f3278aa8e6b627746
2019-03-14Decouple filemem from platform and move it to pgalloc.MemoryFile.Jamie Liu
This is in preparation for improved page cache reclaim, which requires greater integration between the page cache and page allocator. PiperOrigin-RevId: 238444706 Change-Id: Id24141b3678d96c7d7dc24baddd9be555bffafe4
2019-03-14Use WalkGetAttr in gofer.inodeOperations.Create.Jamie Liu
p9.Twalk.handle() with a non-empty path also stats the walked-to path anyway, so the preceding GetAttr is completely wasted. PiperOrigin-RevId: 238440645 Change-Id: I7fbc7536f46b8157639d0d1f491e6aaa9ab688a3
2019-03-13Allow filesystem.Mount to take an optional interface argument.Nicolas Lacasse
PiperOrigin-RevId: 238360231 Change-Id: I5eaf8d26f8892f77d71c7fbd6c5225ef471cedf1
2019-03-12Clarify the platform.File interface.Jamie Liu
- Redefine some memmap.Mappable, platform.File, and platform.Memory semantics in terms of File reference counts (no functional change).
- Make AddressSpace.MapFile take a platform.File instead of a raw FD, and replace platform.File.MapInto with platform.File.FD. This allows kvm.AddressSpace.MapFile to always use platform.File.MapInternal instead of maintaining its own (redundant) cache of file mappings in the sentry address space.
PiperOrigin-RevId: 238044504 Change-Id: Ib73a11e4275c0da0126d0194aa6c6017a9cef64f
2019-03-11kvm: minimum guest/host timekeeping delta.Adin Scannell
PiperOrigin-RevId: 237927368 Change-Id: I359badd1967bb118fe74eab3282c946c18937edc
2019-03-11Add profiling commands to runscFabricio Voznika
Example:
  runsc debug --root=<dir> \
    --profile-heap=/tmp/heap.prof \
    --profile-cpu=/tmp/cpu.prod --profile-delay=30 \
    <container ID>
PiperOrigin-RevId: 237848456 Change-Id: Icff3f20c1b157a84d0922599eaea327320dad773
2019-03-09Fix getsockopt(IP_MULTICAST_IF).Ian Gudger
getsockopt(IP_MULTICAST_IF) only supports struct in_addr. Also adds support for setsockopt(IP_MULTICAST_IF) with struct in_addr. PiperOrigin-RevId: 237620230 Change-Id: I75e7b5b3e08972164eb1906f43ddd67aedffc27c
2019-03-08Make IP_MULTICAST_LOOP and IP_MULTICAST_TTL allow setting int or char.Ian Gudger
This is the correct Linux behavior, and at least PHP depends on it. PiperOrigin-RevId: 237565639 Change-Id: I931af09c8ed99a842cf70d22bfe0b65e330c4137
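
A minimal sketch of the "int or char" rule, assuming a little-endian host: if the option buffer holds at least four bytes it is read as an int, otherwise a single byte is read as an unsigned char. The helper name is hypothetical, and rejecting an empty buffer is this sketch's own choice.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// parseIntOrChar is a hypothetical helper that decodes a socket option value
// supplied either as a 4-byte int or as a single unsigned char.
// Assumption: the int is little-endian, as on x86.
func parseIntOrChar(buf []byte) (int32, error) {
	if len(buf) == 0 {
		return 0, errors.New("empty option value")
	}
	if len(buf) >= 4 {
		return int32(binary.LittleEndian.Uint32(buf[:4])), nil
	}
	// Shorter than an int: interpret the first byte as an unsigned char.
	return int32(buf[0]), nil
}

func main() {
	fromChar, _ := parseIntOrChar([]byte{1})
	fromInt, _ := parseIntOrChar([]byte{2, 0, 0, 0})
	fmt.Println(fromChar, fromInt)
}
```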
2019-03-08Implement IP_MULTICAST_LOOP.Ian Gudger
IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default route are looped back. In order to implement this switch, support for sending and looping back multicast packets on the default route had to be implemented. For now we only support IPv4 multicast. PiperOrigin-RevId: 237534603 Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c
2019-03-06No need to check for negative uintptr.Nicolas Lacasse
Fixes #134 PiperOrigin-RevId: 237128306 Change-Id: I396e808484c18931fc5775970ec1f5ae231e1cb9
2019-03-05Priority-inheritance futex implementationFabricio Voznika
It is implemented without the priority inheritance part, given that gVisor defers scheduling decisions to the Go runtime and doesn't have control over it. PiperOrigin-RevId: 236989545 Change-Id: I714c8ca0798743ecf3167b14ffeb5cd834302560
2019-03-05Add new retransmissions and recovery related metrics.Bhasker Hariharan
PiperOrigin-RevId: 236945145 Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b
2019-03-05Remove unused commit() function argument to Bind.Kevin Krakauer
PiperOrigin-RevId: 236926132 Change-Id: I5cf103f22766e6e65a581de780c7bb9ca0fa3181
2019-03-04Make tmpfs respect MountNoATime now that fs.Handle is gone.Nicolas Lacasse
PiperOrigin-RevId: 236752802 Change-Id: I9e50600b2ae25d5f2ac632c4405a7a185bdc3c92
2019-03-01ptrace: drop old FIXMEAdin Scannell
The globalPool uses a sync.Once mechanism for initialization, and no cleanup is strictly required. It's not really feasible to have the platform implement a full creation -> destruction cycle (due to the way filters are assumed to be installed), so drop the FIXME. PiperOrigin-RevId: 236385278 Change-Id: I98ac660ed58cc688d8a07147d16074a3e8181314
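
For readers unfamiliar with the pattern, here is a minimal sketch of one-time initialization with `sync.Once` and no teardown path. The `pool` type, `getPool`, and the counter are hypothetical and are not the ptrace platform's code.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a hypothetical process-wide resource.
type pool struct {
	workers int
}

var (
	globalOnce sync.Once
	global     *pool
	initCalls  int // counts initializations, for demonstration only
)

// getPool initializes the global pool on first use. sync.Once guarantees the
// function passed to Do runs at most once, even with concurrent callers.
// There is deliberately no matching destroy step.
func getPool() *pool {
	globalOnce.Do(func() {
		initCalls++
		global = &pool{workers: 4}
	})
	return global
}

func main() {
	a, b := getPool(), getPool()
	fmt.Println(a == b, initCalls)
}
```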
2019-03-01DecRef replaced dirent in inode_overlay.Nicolas Lacasse
PiperOrigin-RevId: 236352158 Change-Id: Ide5104620999eaef6820917505e7299c7b0c5a03
2019-03-01Add semctl(GETPID) syscallFabricio Voznika
Also added unimplemented notification for semctl(2) commands. PiperOrigin-RevId: 236340672 Change-Id: I0795e3bd2e6d41d7936fabb731884df426a42478
2019-03-01Format capget/capset argumentsMichael Pratt
I0225 15:32:10.795034 4166 x:0] [ 6] E capget(0x7f477fdff8c8 {Version: 3, Pid: 0}, 0x7f477fdff8b0)
I0225 15:32:10.795059 4166 x:0] [ 6] X capget(0x7f477fdff8c8 {Version: 3, Pid: 0}, 0x7f477fdff8b0 {Permitted: CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP|CAP_MAC_OVERRIDE|CAP_MAC_ADMIN|CAP_SYSLOG|CAP_WAKE_ALARM|CAP_BLOCK_SUSPEND|CAP_AUDIT_READ, Inheritable: CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP|CAP_MAC_OVERRIDE|CAP_MAC_ADMIN|CAP_SYSLOG|CAP_WAKE_ALARM|CAP_BLOCK_SUSPEND|CAP_AUDIT_READ, Effective: 0x0}) = 0x0 (3.399µs)
I0225 15:32:10.795114 4166 x:0] [ 6] E capset(0x7f477fdff8c8 {Version: 3, Pid: 0}, 0x7f477fdff8b0 {Permitted: CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP|CAP_MAC_OVERRIDE|CAP_MAC_ADMIN|CAP_SYSLOG|CAP_WAKE_ALARM|CAP_BLOCK_SUSPEND|CAP_AUDIT_READ, Inheritable: CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP|CAP_MAC_OVERRIDE|CAP_MAC_ADMIN|CAP_SYSLOG|CAP_WAKE_ALARM|CAP_BLOCK_SUSPEND|CAP_AUDIT_READ, Effective: CAP_FOWNER})
I0225 15:32:10.795127 4166 x:0] [ 6] X capset(0x7f477fdff8c8 {Version: 3, Pid: 0}, 0x7f477fdff8b0 {Permitted: CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP|CAP_MAC_OVERRIDE|CAP_MAC_ADMIN|CAP_SYSLOG|CAP_WAKE_ALARM|CAP_BLOCK_SUSPEND|CAP_AUDIT_READ, Inheritable: CAP_CHOWN|CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_FOWNER|CAP_FSETID|CAP_KILL|CAP_SETGID|CAP_SETUID|CAP_SETPCAP|CAP_LINUX_IMMUTABLE|CAP_NET_BIND_SERVICE|CAP_NET_BROADCAST|CAP_NET_ADMIN|CAP_NET_RAW|CAP_IPC_LOCK|CAP_IPC_OWNER|CAP_SYS_MODULE|CAP_SYS_RAWIO|CAP_SYS_CHROOT|CAP_SYS_PTRACE|CAP_SYS_PACCT|CAP_SYS_ADMIN|CAP_SYS_BOOT|CAP_SYS_NICE|CAP_SYS_RESOURCE|CAP_SYS_TIME|CAP_SYS_TTY_CONFIG|CAP_MKNOD|CAP_LEASE|CAP_AUDIT_WRITE|CAP_AUDIT_CONTROL|CAP_SETFCAP|CAP_MAC_OVERRIDE|CAP_MAC_ADMIN|CAP_SYSLOG|CAP_WAKE_ALARM|CAP_BLOCK_SUSPEND|CAP_AUDIT_READ, Effective: CAP_FOWNER}) = 0x0 (3.062µs)
Not the most readable, but better than just a pointer.
PiperOrigin-RevId: 236338875 Change-Id: I4b83f778122ab98de3874e16f4258dae18da916b
2019-02-28Fix "-c dbg" build breakFabricio Voznika
Remove allocation from vCPU.die() to save stack space. Closes #131 PiperOrigin-RevId: 236238102 Change-Id: Iafca27a1a3a472d4cb11dcda9a2060e585139d11
2019-02-28Fix procfs bugsRuidong Cao
Current procfs has some bugs. After executing ls twice, many dirs come out with the same name, like "1" or ".". Files like "cpuinfo" disappear. Here the variable names is a slice with cap() > len(). Appending to it does not allocate new space, so sorting the result afterwards modifies the original slice. The same applies to m. Signed-off-by: Ruidong Cao <crdfrank@gmail.com> Change-Id: I83e5cd1c7968c6fe28c35ea4fee497488d4f9eef PiperOrigin-RevId: 236222270
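
The slice behavior behind this bug is easy to reproduce outside procfs. The sketch below is not the procfs code, and both function names are hypothetical: appending to a slice with spare capacity reuses the caller's backing array, so sorting the result reorders the caller's elements too. Copying into a fresh slice first avoids that.

```go
package main

import (
	"fmt"
	"sort"
)

// withDotsAliased appends to names and sorts the result. When
// cap(names) > len(names), append reuses the caller's backing array, so the
// sort also rearranges the elements the caller still sees.
func withDotsAliased(names []string) []string {
	out := append(names, ".", "..")
	sort.Strings(out)
	return out
}

// withDotsCopied copies names into a freshly allocated slice before
// appending and sorting, leaving the caller's slice untouched.
func withDotsCopied(names []string) []string {
	out := make([]string, 0, len(names)+2)
	out = append(out, names...)
	out = append(out, ".", "..")
	sort.Strings(out)
	return out
}

func main() {
	cached := make([]string, 0, 8) // spare capacity, as in the bug report
	cached = append(cached, "cpuinfo", "1")
	withDotsAliased(cached)
	fmt.Println("after aliased call:", cached)

	fresh := make([]string, 0, 8)
	fresh = append(fresh, "cpuinfo", "1")
	withDotsCopied(fresh)
	fmt.Println("after copied call: ", fresh)
}
```

After the aliased call the caller's two-element slice holds "." and ".." in place of "cpuinfo" and "1", which matches the symptoms described: duplicate "." entries and a missing "cpuinfo".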
2019-02-28Upgrade to Go 1.12Michael Pratt
PiperOrigin-RevId: 236218980 Change-Id: I82cb4aeb2a56524ee1324bfea2ad41dce26db354
2019-02-28Hold dataMu for writing in CachingInodeOperations.WriteOut.Jamie Liu
fsutil.SyncDirtyAll mutates the DirtySet. PiperOrigin-RevId: 236183349 Change-Id: I7e809d5b406ac843407e61eff17d81259a819b4f
2019-02-27Ping support via IPv4 raw sockets.Kevin Krakauer
Broadly, this change:
* Enables sockets to be created via `socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)`.
* Passes the network-layer (IP) header up the stack to the transport endpoint, which can pass it up to the socket layer. This allows a raw socket to return the entire IP packet to users.
* Adds functions to stack.TransportProtocol, stack.Stack, stack.transportDemuxer that enable incoming packets to be delivered to raw endpoints. New raw sockets of other protocols (not ICMP) just need to register with the stack.
* Enables ping.endpoint to return IP headers when created via SOCK_RAW.
PiperOrigin-RevId: 235993280 Change-Id: I60ed994f5ff18b2cbd79f063a7fdf15d093d845a
2019-02-27Allow overlay to merge Directories and SpecialDirectories.Nicolas Lacasse
Needed to mount inside /proc or /sys. PiperOrigin-RevId: 235936529 Change-Id: Iee6f2671721b1b9b58a3989705ea901322ec9206
2019-02-26Fix bad mergeFabricio Voznika
PiperOrigin-RevId: 235818534 Change-Id: I99f7e3fd1dc808b35f7a08b96b7c3226603ab808
2019-02-26FPE_INTOVF (integer overflow) should be 2, as in Linux.Ruidong Cao
Signed-off-by: Ruidong Cao <crdfrank@gmail.com> Change-Id: I03f8ab25cf29257b31f145cf43304525a93f3300 PiperOrigin-RevId: 235763203
2019-02-26Lazily allocate inotify map on inodeFabricio Voznika
PiperOrigin-RevId: 235735865 Change-Id: I84223eb18eb51da1fa9768feaae80387ff6bfed0
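
A minimal sketch of lazy map allocation, with a hypothetical `Watches` type standing in for the per-inode inotify state. The map stays nil until the first insertion, so objects that are never watched pay no allocation. Reading from a nil map is safe in Go, so lookups need no special case.

```go
package main

import (
	"fmt"
	"sync"
)

// Watches is a hypothetical per-inode watch table. The map is nil until the
// first watch is added.
type Watches struct {
	mu sync.Mutex
	ws map[uint64]string // watch ID -> owner name (illustrative payload)
}

// Add allocates the map on first use, then records the watch.
func (w *Watches) Add(id uint64, owner string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.ws == nil {
		w.ws = make(map[uint64]string)
	}
	w.ws[id] = owner
}

// Lookup works on a nil map too: indexing a nil map yields the zero value.
func (w *Watches) Lookup(id uint64) (string, bool) {
	w.mu.Lock()
	defer w.mu.Unlock()
	owner, ok := w.ws[id]
	return owner, ok
}

func main() {
	var w Watches
	_, ok := w.Lookup(1)
	fmt.Println("found before Add:", ok)
	w.Add(1, "task-a")
	owner, ok := w.Lookup(1)
	fmt.Println("found after Add:", ok, owner)
}
```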
2019-02-25Handle invalid offset in sendfile(2)Fabricio Voznika
PiperOrigin-RevId: 235578698 Change-Id: I608ff5e25eac97f6e1bda058511c1f82b0e3b736
2019-02-21Internal change.Googler
PiperOrigin-RevId: 235053594 Change-Id: Ie3d7b11843d0710184a2463886c7034e8f5305d1
2019-02-20Make some ptrace commands x86-onlyHaibo Xu
Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: I9751f859332d433ca772d6b9733f5a5a64398ec7 PiperOrigin-RevId: 234877624
2019-02-20Implement Broadcast supportAmanda Tait
This change adds support for the SO_BROADCAST socket option in gVisor Netstack. This support includes getsockopt()/setsockopt() functionality for both UDP and TCP endpoints (the latter being a NOOP), dispatching broadcast messages up and down the stack, and route finding/creation for broadcast packets. Finally, a suite of tests has been implemented, exercising this functionality through the Linux syscall API. PiperOrigin-RevId: 234850781 Change-Id: If3e666666917d39f55083741c78314a06defb26c
2019-02-19netstack: Add SIOCGSTAMP support.Kevin Krakauer
Ping sometimes uses this instead of SO_TIMESTAMP. PiperOrigin-RevId: 234699590 Change-Id: Ibec9c34fa0d443a931557a2b1b1ecd83effe7765
2019-02-19Set rax to syscall number on SECCOMP_RET_TRAP.Jamie Liu
PiperOrigin-RevId: 234690475 Change-Id: I1cbfb5aecd4697a4a26ec8524354aa8656cc3ba1
2019-02-19Fix clone(CLONE_NEWUSER).Jamie Liu
- Use new user namespace for namespace creation checks.
- Ensure userns is never nil since it's used by other namespaces.
PiperOrigin-RevId: 234673175 Change-Id: I4b9d9d1e63ce4e24362089793961a996f7540cd9
2019-02-19Break /proc/[pid]/{uid,gid}_map's dependence on seqfile.Jamie Liu
In addition to simplifying the implementation, this fixes two bugs:
- seqfile.NewSeqFile unconditionally creates an inode with mode 0444, but {uid,gid}_map have mode 0644.
- idMapSeqFile.Write implements fs.FileOperations.Write ... but it doesn't implement any other fs.FileOperations methods and is never used as fs.FileOperations. idMapSeqFile.GetFile() => seqfile.SeqFile.GetFile() uses seqfile.seqFileOperations instead, which rejects all writes.
PiperOrigin-RevId: 234638212 Change-Id: I4568f741ab07929273a009d7e468c8205a8541bc
2019-02-15Implement IP_MULTICAST_IF.Ian Gudger
This allows setting a default send interface for IPv4 multicast. IPv6 support will come later. PiperOrigin-RevId: 234251379 Change-Id: I65922341cd8b8880f690fae3eeb7ddfa47c8c173