summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry
AgeCommit message (Collapse)Author
2019-08-14Replace uinptr with int64 when returning lengthsTamir Duberstein
This is in accordance with newer parts of the standard library. PiperOrigin-RevId: 263449916
2019-08-14Improve SendMsg performance.Bhasker Hariharan
SendMsg before this change would copy all the data over into a new slice even if the underlying socket could only accept a small amount of data. This is really inefficient with non-blocking sockets and under high throughput where large writes could get ErrWouldBlock or if there was say a timeout associated with the sendmsg() syscall. With this change we delay copying bytes in till they are needed and only copy what can be potentially sent/held in the socket buffer. Reducing the need to repeatedly copy data over. Also a minor fix to change state FIN-WAIT-1 when shutdown(..., SHUT_WR) is called instead of when we transmit the actual FIN. Otherwise the socket could remain in CONNECTED state even though the user has called shutdown() on the socket. Updates #627 PiperOrigin-RevId: 263430505
2019-08-13Add vfs.DynamicBytesFileDescriptionImpl.Jamie Liu
This replaces fs/proc/seqfile for vfs2-based filesystems. PiperOrigin-RevId: 263254647
2019-08-13Fix file mode check in pipeOperationsFabricio Voznika
PiperOrigin-RevId: 263203441
2019-08-12Handle ENOSPC with a partial write.Nicolas Lacasse
Similar to the EPIPE case, we can return the number of bytes written before ENOSPC was encountered. If the app tries to write more, we can return ENOSPC on the next write. PiperOrigin-RevId: 263041648
2019-08-09netlink: return an error in nlmsgerrAndrei Vagin
Now if a process sends an unsupported netlink requests, an error is returned from the send system call. The linux kernel works differently in this case. It returns errors in the nlmsgerr netlink message. Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com PiperOrigin-RevId: 262690453
2019-08-09Add initial ptrace stub and syscall support for arm64.Haibo Xu
Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: I1dbd23bb240cca71d0cc30fc75ca5be28cb4c37c PiperOrigin-RevId: 262619519
2019-08-09ext: Move to pkg/sentry/fsimpl.Ayush Ranjan
fsimpl is the keeper of all filesystem implementations in VFS2. PiperOrigin-RevId: 262617869
2019-08-08ext: Benchmark tests.Ayush Ranjan
Added benchmark tests which emulate memfs benchmarks. Stat benchmarks BenchmarkVFS2Ext4fsStat/1-12 10000000 145 ns/op BenchmarkVFS2Ext4fsStat/2-12 10000000 170 ns/op BenchmarkVFS2Ext4fsStat/3-12 10000000 202 ns/op BenchmarkVFS2Ext4fsStat/8-12 3000000 374 ns/op BenchmarkVFS2Ext4fsStat/64-12 500000 2159 ns/op BenchmarkVFS2Ext4fsStat/100-12 300000 3459 ns/op BenchmarkVFS1TmpfsStat/1-12 5000000 348 ns/op BenchmarkVFS1TmpfsStat/2-12 3000000 487 ns/op BenchmarkVFS1TmpfsStat/3-12 2000000 655 ns/op BenchmarkVFS1TmpfsStat/8-12 1000000 1365 ns/op BenchmarkVFS1TmpfsStat/64-12 200000 9565 ns/op BenchmarkVFS1TmpfsStat/100-12 100000 15158 ns/op BenchmarkVFS2MemfsStat/1-12 10000000 133 ns/op BenchmarkVFS2MemfsStat/2-12 10000000 155 ns/op BenchmarkVFS2MemfsStat/3-12 10000000 182 ns/op BenchmarkVFS2MemfsStat/8-12 5000000 310 ns/op BenchmarkVFS2MemfsStat/64-12 1000000 1659 ns/op BenchmarkVFS2MemfsStat/100-12 500000 2787 ns/op Mount Stat benchmarks BenchmarkVFS2ExtfsMountStat/1-12 5000000 245 ns/op BenchmarkVFS2ExtfsMountStat/2-12 5000000 266 ns/op BenchmarkVFS2ExtfsMountStat/3-12 5000000 304 ns/op BenchmarkVFS2ExtfsMountStat/8-12 3000000 456 ns/op BenchmarkVFS2ExtfsMountStat/64-12 500000 2308 ns/op BenchmarkVFS2ExtfsMountStat/100-12 300000 3482 ns/op BenchmarkVFS1TmpfsMountStat/1-12 3000000 488 ns/op BenchmarkVFS1TmpfsMountStat/2-12 2000000 658 ns/op BenchmarkVFS1TmpfsMountStat/3-12 2000000 806 ns/op BenchmarkVFS1TmpfsMountStat/8-12 1000000 1514 ns/op BenchmarkVFS1TmpfsMountStat/64-12 100000 10037 ns/op BenchmarkVFS1TmpfsMountStat/100-12 100000 15280 ns/op BenchmarkVFS2MemfsMountStat/1-12 10000000 212 ns/op BenchmarkVFS2MemfsMountStat/2-12 5000000 232 ns/op BenchmarkVFS2MemfsMountStat/3-12 5000000 264 ns/op BenchmarkVFS2MemfsMountStat/8-12 3000000 390 ns/op BenchmarkVFS2MemfsMountStat/64-12 1000000 1813 ns/op BenchmarkVFS2MemfsMountStat/100-12 500000 2812 ns/op PiperOrigin-RevId: 262477158
2019-08-08Return a well-defined socket address type from socket funtions.Rahat Mahmood
Previously we were representing socket addresses as an interface{}, which allowed any type which could be binary.Marshal()ed to be used as a socket address. This is fine when the address is passed to userspace via the linux ABI, but is problematic when used from within the sentry such as by networking procfs files. PiperOrigin-RevId: 262460640
2019-08-08netstack: Don't start endpoint goroutines too soon on restore.Rahat Mahmood
Endpoint protocol goroutines were previously started as part of loading the endpoint. This is potentially too soon, as resources used by these goroutine may not have been loaded. Protocol goroutines may perform meaningful work as soon as they're started (ex: incoming connect) which can cause them to indirectly access resources that haven't been loaded yet. This CL defers resuming all protocol goroutines until the end of restore. PiperOrigin-RevId: 262409429
2019-08-08Merge pull request #653 from xiaobo55x:devgVisor bot
PiperOrigin-RevId: 262402929
2019-08-08memfs fixes.Jamie Liu
- Unexport Filesystem/Dentry/Inode. - Support SEEK_CUR in directoryFD.Seek(). - Hold Filesystem.mu before touching directoryFD.off in directoryFD.Seek(). - Remove deleted Dentries from their parent directory.childLists. - Remove invalid FIXMEs. PiperOrigin-RevId: 262400633
2019-08-07ext: Seek unit tests.Ayush Ranjan
PiperOrigin-RevId: 262264674
2019-08-07ext: StatAt unit tests.Ayush Ranjan
PiperOrigin-RevId: 262249166
2019-08-07ext: Read unit tests.Ayush Ranjan
PiperOrigin-RevId: 262242410
2019-08-07ext: IterDirent unit tests.Ayush Ranjan
PiperOrigin-RevId: 262226761
2019-08-07ext: vfs.FileDescriptionImpl and vfs.FilesystemImpl implementations.Ayush Ranjan
- This also gets rid of pipes for now because pipe does not have vfs2 specific support yet. - Added file path resolution logic. - Fixes testing infrastructure. - Does not include unit tests yet. PiperOrigin-RevId: 262213950
2019-08-06Require pread/pwrite for splice file offsetsMichael Pratt
If there is an offset, the file must support pread/pwrite. See fs/splice.c:do_splice. PiperOrigin-RevId: 261944932
2019-08-05Change syscall.EPOLLET to unix.EPOLLETHaibo Xu
syscall.EPOLLET has been defined with different values on amd64 and arm64(-0x80000000 on amd64, and 0x80000000 on arm64), while unix.EPOLLET has been unified this value to 0x80000000(golang/go#5328). ref #63 Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: Id97d075c4e79d86a2ea3227ffbef02d8b00ffbb8
2019-08-02Plumbing for iptables sockopts.Kevin Krakauer
PiperOrigin-RevId: 261413396
2019-08-02Job control: controlling TTYs and foreground process groups.Kevin Krakauer
(Don't worry, this is mostly tests.) Implemented the following ioctls: - TIOCSCTTY - set controlling TTY - TIOCNOTTY - remove controlling tty, maybe signal some other processes - TIOCGPGRP - get foreground process group. Also enables tcgetpgrp(). - TIOCSPGRP - set foreground process group. Also enabled tcsetpgrp(). Next steps are to actually turn terminal-generated control characters (e.g. C^c) into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when appropriate. PiperOrigin-RevId: 261387276
2019-08-02Automated rollback of changelist 261191548Rahat Mahmood
PiperOrigin-RevId: 261373749
2019-08-02Remove kernel.mounts.Nicolas Lacasse
We can get the mount namespace from the CreateProcessArgs in all cases where we need it. This also gets rid of kernel.Destroy method, since the only thing it was doing was DecRefing the mounts. Removing the need to call kernel.SetRootMountNamespace also allowed for some more simplifications in the container fs setup code. PiperOrigin-RevId: 261357060
2019-08-01Drop reference on fs.Inode if Mount goes wrong.Nicolas Lacasse
PiperOrigin-RevId: 261203674
2019-08-01tmpfs and ramfs Dirs should drop references on children in Release().Nicolas Lacasse
This is the source of many warnings like: AtomicRefCount 0x7f5ff84e3500 owned by "fs.Inode" garbage collected with ref count of 1 (want 0) PiperOrigin-RevId: 261197093
2019-08-01Implement getsockopt(TCP_INFO).Rahat Mahmood
Export some readily-available fields for TCP_INFO and stub out the rest. PiperOrigin-RevId: 261191548
2019-07-31Basic support for 'ip route'Ian Lewis
Implements support for RTM_GETROUTE requests for netlink sockets. Fixes #507 PiperOrigin-RevId: 261051045
2019-07-31Initialize kernel.unimplementedSyscallEmitter with a sync.Once.Nicolas Lacasse
This is initialized lazily on the first unimplemented syscall. Without the sync.Once, this is racy. PiperOrigin-RevId: 260971758
2019-07-30Cache pages in CachingInodeOperations.Read when memory evictions are delayed.Jamie Liu
PiperOrigin-RevId: 260851452
2019-07-30ext: Migrate from using fileReader custom interface to using io.Reader.Ayush Ranjan
It gets rid of holding state of the io.Reader offset (which is anyways held by the vfs.FileDescriptor struct. It is also odd using a io.Reader becuase we using io.ReaderAt to interact with the device. So making a io.ReaderAt wrapper makes more sense. Most importantly, it gets rid of the complexity of extracting the file reader from a regular file implementation and then using it. Now we can just use the regular file implementation as a reader which is more intuitive. PiperOrigin-RevId: 260846927
2019-07-30ext: block map file reader implementation.Ayush Ranjan
Also adds stress tests for block map reader and intensifies extent reader tests. PiperOrigin-RevId: 260838177
2019-07-30Merge pull request #607 from DarcySail:mastergVisor bot
PiperOrigin-RevId: 260783254
2019-07-30Add feature to launch Sentry from an open host FD.Zach Koopmans
Adds feature to launch from an open host FD instead of a binary_path. The FD should point to a valid executable and most likely be statically compiled. If the executable is not statically compiled, the loader will search along the interpreter paths, which must be able to be resolved in the Sandbox's file system or start will fail. PiperOrigin-RevId: 260756825
2019-07-29Migrate from using io.ReadSeeker to io.ReaderAt.Ayush Ranjan
This provides the following benefits: - We can now use pkg/fd package which does not take ownership of the file descriptor. So it does not close the fd when garbage collected. This reduces scope of errors from unexpected garbage collection of io.File. - It enforces the offset parameter in every read call. It does not affect the fd offset nor is it affected by it. Hence reducing scope of error of using stale offsets when reading. - We do not need to serialize the usage of any global file descriptor anymore. So this drops the mutual exclusion req hence reducing complexity and congestion. PiperOrigin-RevId: 260635174
2019-07-30Combine multiple epoll events copiesHang Su
Allocate a larger memory buffer and combine multiple copies into one copy, to reduce the number of copies from kernel memory to user memory. Signed-off-by: Hang Su <darcy.sh@antfin.com>
2019-07-29ext: extent reader implementation.Ayush Ranjan
PiperOrigin-RevId: 260629559
2019-07-29ext: inode implementations.Ayush Ranjan
PiperOrigin-RevId: 260624470
2019-07-29Rate limit the unimplemented syscall event handler.Nicolas Lacasse
This introduces two new types of Emitters: 1. MultiEmitter, which will forward events to other registered Emitters, and 2. RateLimitedEmitter, which will forward events to a wrapped Emitter, subject to given rate limits. The methods in the eventchannel package itself act like a multiEmitter, but is not actually an Emitter. Now we have a DefaultEmitter, and the methods in eventchannel simply forward calls to the DefaultEmitter. The unimplemented syscall handler now uses a RateLimetedEmitter that wraps the DefaultEmitter. PiperOrigin-RevId: 260612770
2019-07-26Merge pull request #452 from zhangningdlut:chris_test_pidnsgVisor bot
PiperOrigin-RevId: 260220279
2019-07-25Automated rollback of changelist 255679453Fabricio Voznika
PiperOrigin-RevId: 260047477
2019-07-24ext: filesystem boilerplate code.Ayush Ranjan
PiperOrigin-RevId: 259865366
2019-07-24ext: Add tests for root directory inode.Ayush Ranjan
PiperOrigin-RevId: 259856442
2019-07-24ext: testing environment setup with VFS2 support.Ayush Ranjan
PiperOrigin-RevId: 259835948
2019-07-24Add support for a subnet prefix length on interface network addressesChris Kuiper
This allows the user code to add a network address with a subnet prefix length. The prefix length value is stored in the network endpoint and provided back to the user in the ProtocolAddress type. PiperOrigin-RevId: 259807693
2019-07-24Use different pidns among different containerschris.zn
The different containers in a sandbox used only one pid namespace before. This results in that a container can see the processes in another container in the same sandbox. This patch use different pid namespace for different containers. Signed-off-by: chris.zn <chris.zn@antfin.com>
2019-07-23ext: Inode creation logic.Ayush Ranjan
PiperOrigin-RevId: 259666476
2019-07-23ext: Add ext2 and ext3 tiny images.Ayush Ranjan
PiperOrigin-RevId: 259657917
2019-07-23ext: Added extent tree building logic.Ayush Ranjan
PiperOrigin-RevId: 259628657
2019-07-23Give each container a distinct MountNamespace.Nicolas Lacasse
This keeps all container filesystem completely separate from eachother (including from the root container filesystem), and allows us to get rid of the "__runsc_containers__" directory. It also simplifies container startup/teardown as we don't have to muck around in the root container's filesystem. PiperOrigin-RevId: 259613346