Age | Commit message (Collapse) | Author |
|
SendMsg before this change would copy all the data over into a
new slice even if the underlying socket could only accept a
small amount of data. This is really inefficient with non-blocking
sockets and under high throughput where large writes could get
ErrWouldBlock or if there was say a timeout associated with the sendmsg()
syscall.
With this change we delay copying bytes in till they are needed and only
copy what can be potentially sent/held in the socket buffer. Reducing
the need to repeatedly copy data over.
Also a minor fix to change state FIN-WAIT-1 when shutdown(..., SHUT_WR) is called
instead of when we transmit the actual FIN. Otherwise the socket could remain in
CONNECTED state even though the user has called shutdown() on the socket.
Updates #627
PiperOrigin-RevId: 263430505
|
|
This replaces fs/proc/seqfile for vfs2-based filesystems.
PiperOrigin-RevId: 263254647
|
|
PiperOrigin-RevId: 263203441
|
|
Similar to the EPIPE case, we can return the number of bytes written before
ENOSPC was encountered. If the app tries to write more, we can return ENOSPC on
the next write.
PiperOrigin-RevId: 263041648
|
|
Now if a process sends an unsupported netlink requests,
an error is returned from the send system call.
The linux kernel works differently in this case. It returns errors in the
nlmsgerr netlink message.
Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com
PiperOrigin-RevId: 262690453
|
|
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I1dbd23bb240cca71d0cc30fc75ca5be28cb4c37c
PiperOrigin-RevId: 262619519
|
|
fsimpl is the keeper of all filesystem implementations in VFS2.
PiperOrigin-RevId: 262617869
|
|
Added benchmark tests which emulate memfs benchmarks.
Stat benchmarks
BenchmarkVFS2Ext4fsStat/1-12 10000000 145 ns/op
BenchmarkVFS2Ext4fsStat/2-12 10000000 170 ns/op
BenchmarkVFS2Ext4fsStat/3-12 10000000 202 ns/op
BenchmarkVFS2Ext4fsStat/8-12 3000000 374 ns/op
BenchmarkVFS2Ext4fsStat/64-12 500000 2159 ns/op
BenchmarkVFS2Ext4fsStat/100-12 300000 3459 ns/op
BenchmarkVFS1TmpfsStat/1-12 5000000 348 ns/op
BenchmarkVFS1TmpfsStat/2-12 3000000 487 ns/op
BenchmarkVFS1TmpfsStat/3-12 2000000 655 ns/op
BenchmarkVFS1TmpfsStat/8-12 1000000 1365 ns/op
BenchmarkVFS1TmpfsStat/64-12 200000 9565 ns/op
BenchmarkVFS1TmpfsStat/100-12 100000 15158 ns/op
BenchmarkVFS2MemfsStat/1-12 10000000 133 ns/op
BenchmarkVFS2MemfsStat/2-12 10000000 155 ns/op
BenchmarkVFS2MemfsStat/3-12 10000000 182 ns/op
BenchmarkVFS2MemfsStat/8-12 5000000 310 ns/op
BenchmarkVFS2MemfsStat/64-12 1000000 1659 ns/op
BenchmarkVFS2MemfsStat/100-12 500000 2787 ns/op
Mount Stat benchmarks
BenchmarkVFS2ExtfsMountStat/1-12 5000000 245 ns/op
BenchmarkVFS2ExtfsMountStat/2-12 5000000 266 ns/op
BenchmarkVFS2ExtfsMountStat/3-12 5000000 304 ns/op
BenchmarkVFS2ExtfsMountStat/8-12 3000000 456 ns/op
BenchmarkVFS2ExtfsMountStat/64-12 500000 2308 ns/op
BenchmarkVFS2ExtfsMountStat/100-12 300000 3482 ns/op
BenchmarkVFS1TmpfsMountStat/1-12 3000000 488 ns/op
BenchmarkVFS1TmpfsMountStat/2-12 2000000 658 ns/op
BenchmarkVFS1TmpfsMountStat/3-12 2000000 806 ns/op
BenchmarkVFS1TmpfsMountStat/8-12 1000000 1514 ns/op
BenchmarkVFS1TmpfsMountStat/64-12 100000 10037 ns/op
BenchmarkVFS1TmpfsMountStat/100-12 100000 15280 ns/op
BenchmarkVFS2MemfsMountStat/1-12 10000000 212 ns/op
BenchmarkVFS2MemfsMountStat/2-12 5000000 232 ns/op
BenchmarkVFS2MemfsMountStat/3-12 5000000 264 ns/op
BenchmarkVFS2MemfsMountStat/8-12 3000000 390 ns/op
BenchmarkVFS2MemfsMountStat/64-12 1000000 1813 ns/op
BenchmarkVFS2MemfsMountStat/100-12 500000 2812 ns/op
PiperOrigin-RevId: 262477158
|
|
Previously we were representing socket addresses as an interface{},
which allowed any type which could be binary.Marshal()ed to be used as
a socket address. This is fine when the address is passed to userspace
via the linux ABI, but is problematic when used from within the sentry
such as by networking procfs files.
PiperOrigin-RevId: 262460640
|
|
Endpoint protocol goroutines were previously started as part of
loading the endpoint. This is potentially too soon, as resources used
by these goroutine may not have been loaded. Protocol goroutines may
perform meaningful work as soon as they're started (ex: incoming
connect) which can cause them to indirectly access resources that
haven't been loaded yet.
This CL defers resuming all protocol goroutines until the end of
restore.
PiperOrigin-RevId: 262409429
|
|
PiperOrigin-RevId: 262402929
|
|
- Unexport Filesystem/Dentry/Inode.
- Support SEEK_CUR in directoryFD.Seek().
- Hold Filesystem.mu before touching directoryFD.off in
directoryFD.Seek().
- Remove deleted Dentries from their parent directory.childLists.
- Remove invalid FIXMEs.
PiperOrigin-RevId: 262400633
|
|
PiperOrigin-RevId: 262264674
|
|
PiperOrigin-RevId: 262249166
|
|
PiperOrigin-RevId: 262242410
|
|
PiperOrigin-RevId: 262226761
|
|
- This also gets rid of pipes for now because pipe does not have vfs2 specific
support yet.
- Added file path resolution logic.
- Fixes testing infrastructure.
- Does not include unit tests yet.
PiperOrigin-RevId: 262213950
|
|
If there is an offset, the file must support pread/pwrite. See
fs/splice.c:do_splice.
PiperOrigin-RevId: 261944932
|
|
syscall.EPOLLET has been defined with different values on amd64 and
arm64(-0x80000000 on amd64, and 0x80000000 on arm64), while unix.EPOLLET
has been unified this value to 0x80000000(golang/go#5328). ref #63
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: Id97d075c4e79d86a2ea3227ffbef02d8b00ffbb8
|
|
PiperOrigin-RevId: 261413396
|
|
(Don't worry, this is mostly tests.)
Implemented the following ioctls:
- TIOCSCTTY - set controlling TTY
- TIOCNOTTY - remove controlling tty, maybe signal some other processes
- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
- TIOCSPGRP - set foreground process group. Also enabled tcsetpgrp().
Next steps are to actually turn terminal-generated control characters (e.g. C^c)
into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when
appropriate.
PiperOrigin-RevId: 261387276
|
|
PiperOrigin-RevId: 261373749
|
|
We can get the mount namespace from the CreateProcessArgs in all cases where we
need it. This also gets rid of kernel.Destroy method, since the only thing it
was doing was DecRefing the mounts.
Removing the need to call kernel.SetRootMountNamespace also allowed for some
more simplifications in the container fs setup code.
PiperOrigin-RevId: 261357060
|
|
PiperOrigin-RevId: 261203674
|
|
This is the source of many warnings like:
AtomicRefCount 0x7f5ff84e3500 owned by "fs.Inode" garbage collected with ref count of 1 (want 0)
PiperOrigin-RevId: 261197093
|
|
Export some readily-available fields for TCP_INFO and stub out the rest.
PiperOrigin-RevId: 261191548
|
|
Implements support for RTM_GETROUTE requests for netlink sockets.
Fixes #507
PiperOrigin-RevId: 261051045
|
|
This is initialized lazily on the first unimplemented
syscall. Without the sync.Once, this is racy.
PiperOrigin-RevId: 260971758
|
|
PiperOrigin-RevId: 260851452
|
|
It gets rid of holding state of the io.Reader offset (which is anyways held by
the vfs.FileDescriptor struct. It is also odd using a io.Reader becuase we
using io.ReaderAt to interact with the device. So making a io.ReaderAt wrapper
makes more sense.
Most importantly, it gets rid of the complexity of extracting the file reader
from a regular file implementation and then using it. Now we can just use the
regular file implementation as a reader which is more intuitive.
PiperOrigin-RevId: 260846927
|
|
Also adds stress tests for block map reader and intensifies extent reader tests.
PiperOrigin-RevId: 260838177
|
|
PiperOrigin-RevId: 260783254
|
|
Adds feature to launch from an open host FD instead of a binary_path.
The FD should point to a valid executable and most likely be statically
compiled. If the executable is not statically compiled, the loader will
search along the interpreter paths, which must be able to be resolved in
the Sandbox's file system or start will fail.
PiperOrigin-RevId: 260756825
|
|
This provides the following benefits:
- We can now use pkg/fd package which does not take ownership
of the file descriptor. So it does not close the fd when garbage collected.
This reduces scope of errors from unexpected garbage collection of io.File.
- It enforces the offset parameter in every read call.
It does not affect the fd offset nor is it affected by it. Hence reducing
scope of error of using stale offsets when reading.
- We do not need to serialize the usage of any global file descriptor anymore.
So this drops the mutual exclusion req hence reducing complexity and
congestion.
PiperOrigin-RevId: 260635174
|
|
Allocate a larger memory buffer and combine multiple copies into one copy,
to reduce the number of copies from kernel memory to user memory.
Signed-off-by: Hang Su <darcy.sh@antfin.com>
|
|
PiperOrigin-RevId: 260629559
|
|
PiperOrigin-RevId: 260624470
|
|
This introduces two new types of Emitters:
1. MultiEmitter, which will forward events to other registered Emitters, and
2. RateLimitedEmitter, which will forward events to a wrapped Emitter, subject
to given rate limits.
The methods in the eventchannel package itself act like a multiEmitter, but is
not actually an Emitter. Now we have a DefaultEmitter, and the methods in
eventchannel simply forward calls to the DefaultEmitter.
The unimplemented syscall handler now uses a RateLimetedEmitter that wraps the
DefaultEmitter.
PiperOrigin-RevId: 260612770
|
|
PiperOrigin-RevId: 260220279
|
|
PiperOrigin-RevId: 260047477
|
|
PiperOrigin-RevId: 259865366
|
|
PiperOrigin-RevId: 259856442
|
|
PiperOrigin-RevId: 259835948
|
|
This allows the user code to add a network address with a subnet prefix length.
The prefix length value is stored in the network endpoint and provided back to
the user in the ProtocolAddress type.
PiperOrigin-RevId: 259807693
|
|
The different containers in a sandbox used only one pid
namespace before. This results in that a container can see
the processes in another container in the same sandbox.
This patch use different pid namespace for different containers.
Signed-off-by: chris.zn <chris.zn@antfin.com>
|
|
PiperOrigin-RevId: 259666476
|
|
PiperOrigin-RevId: 259657917
|
|
PiperOrigin-RevId: 259628657
|
|
This keeps all container filesystem completely separate from eachother
(including from the root container filesystem), and allows us to get rid of the
"__runsc_containers__" directory.
It also simplifies container startup/teardown as we don't have to muck around
in the root container's filesystem.
PiperOrigin-RevId: 259613346
|
|
PiperOrigin-RevId: 259427074
|