The limits for send/receive buffers of Unix domain sockets are controlled by the
following sysctls on Linux:
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_default
- net.core.wmem_max
Today in gVisor we do not expose these sysctls, but we do support setting the
equivalent in netstack via the stack.Options() method. However, AF_UNIX sockets
in gVisor can be used without netstack, with hostinet, or even without any
networking stack at all, which means that ideally these sysctls should live as
globals in gVisor. Rather than make this a big change, for now we hardcode the
limits in the AF_UNIX implementation itself, which is still better than before,
when SO_SNDBUF was hardcoded to 16 KiB. Further, we bump the initial limit to a
default value of 208 KiB to match Linux, up from the paltry 16 KiB we use
today.
Updates #5132
PiperOrigin-RevId: 356665498
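A minimal sketch of hardcoded buffer limits of this kind, not the actual gVisor code. Only the 208 KiB default comes from the message above; `minBufferSize`, `maxBufferSize`, and `clampBufferSize` are hypothetical names and values chosen for illustration.

```go
package main

// Sizes in bytes. defaultBufferSize follows the 208 KiB default named in the
// commit message; the minimum and maximum are assumed values, not gVisor's.
const (
	minBufferSize     = 4 << 10   // assumed lower bound
	defaultBufferSize = 208 << 10 // 208 KiB
	maxBufferSize     = 4 << 20   // assumed upper bound
)

// clampBufferSize returns the size to use for a requested SO_SNDBUF or
// SO_RCVBUF value, keeping it within [minBufferSize, maxBufferSize].
func clampBufferSize(requested int64) int64 {
	if requested < minBufferSize {
		return minBufferSize
	}
	if requested > maxBufferSize {
		return maxBufferSize
	}
	return requested
}
```

A new socket would start at `defaultBufferSize`, and a later setsockopt request would pass through `clampBufferSize` before being stored.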
|
|
PiperOrigin-RevId: 356536548
|
|
This makes it possible to add data to types that implement tcpip.Error.
ErrBadLinkEndpoint is removed as it is unused.
PiperOrigin-RevId: 354437314
|
|
open() has to return ENXIO in this case.
O_PATH isn't supported by vfs1.
PiperOrigin-RevId: 348820478
|
|
PiperOrigin-RevId: 331256608
|
|
Fixes an error where, when the receive buffer is larger than the host send
buffer for a host-backed Unix dgram socket, we would swallow the EOF from the
recvmsg syscall, causing read() to block forever.
PiperOrigin-RevId: 331192810
|
|
Updates #2972
PiperOrigin-RevId: 329584905
|
|
PiperOrigin-RevId: 328415633
|
|
This is needed to avoid circular dependencies between the vfs and kernel
packages.
PiperOrigin-RevId: 327355524
|
|
The context is passed to DecRef() and Release(), which is
needed for the SO_LINGER implementation.
PiperOrigin-RevId: 324672584
|
|
Now it calls pkt.Data.ToView() when writing the packet. This may require
copying when the packet is large, which makes the worst case even worse.
This is sent out as a separate preparation change because it requires syscall
filter changes. It will be followed by the change adopting the new
PacketHeader API.
PiperOrigin-RevId: 321447003
|
|
Updates #2972
PiperOrigin-RevId: 316942245
|
|
PiperOrigin-RevId: 315991648
|
|
When I do high-performance networking,
the value of wmem_max is often set very high,
especially for 10/25/50 Gigabit NICs.
I think this restriction may not be suitable.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
When the file closes, it attempts to write dirty cached
attributes to the file. This should not be done when the
mount is readonly.
PiperOrigin-RevId: 315585058
|
|
The FileDescription implementation for hostfs sockets uses the standard Unix
socket implementation (unix.SocketVFS2), but is also tied to a hostfs dentry.
Updates #1672, #1476
PiperOrigin-RevId: 308716426
|
|
Fixes #1477.
PiperOrigin-RevId: 308317511
|
|
PiperOrigin-RevId: 305807868
|
|
PiperOrigin-RevId: 305588941
|
|
This required minor restructuring of how system call tables were saved
and restored, but it makes way more sense this way.
Updates #2243
|
|
Using the host-defined file owner matches VFS1. It is more correct to use the
host-defined mode, since the cached value may become out of date. However,
kernfs.Inode.Mode() does not return an error--other filesystems on kernfs are
in-memory so retrieving mode should not fail. Therefore, if the host syscall
fails, we rely on a cached value instead.
Updates #1672.
PiperOrigin-RevId: 303220864
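The fallback described above can be sketched as follows. This is an illustration under assumptions: the `inode` type, its `statMode` hook, and the choice to refresh the cache on success are hypothetical and not taken from the kernfs or host packages.

```go
package main

import "errors"

// errHostStat stands in for a failed host syscall in this sketch.
var errHostStat = errors.New("host stat failed")

// inode is a hypothetical stand-in for a host-backed inode.
type inode struct {
	cachedMode uint32
	// statMode queries the host for the current mode; a real implementation
	// would wrap an fstat-style host syscall here.
	statMode func() (uint32, error)
}

// Mode cannot return an error, so a failed host query falls back to the
// last value that was observed successfully.
func (i *inode) Mode() uint32 {
	mode, err := i.statMode()
	if err != nil {
		return i.cachedMode
	}
	i.cachedMode = mode // refreshing on success is an assumption of this sketch
	return mode
}
```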
|
|
Refactor fs/host.TTYFileOperations so that the relevant functionality can be
shared with VFS2 (fsimpl/host.ttyFD).
Incorporate host.defaultFileFD into the default host.fileDescription. This way,
there is no need for a separate default_file.go. As in vfs1, the TTY file
implementation can be built on top of this default and override operations as
necessary (PRead/Read/PWrite/Write, Release, Ioctl).
Note that these changes still need to be plumbed into runsc, which refers to
imported TTYs in control/proc.go:ExecAsync.
Updates #1672.
PiperOrigin-RevId: 301718157
|
|
PiperOrigin-RevId: 301402181
|
|
- When setting up the virtual filesystem, mount a host.filesystem to contain
all files that need to be imported.
- Make read/preadv syscalls to the host in cases where preadv2 may not be
supported yet (likewise for writing).
- Make save/restore functions in kernel/kernel.go return early if vfs2 is
enabled.
PiperOrigin-RevId: 300922353
|
|
In VFS2, imported file descriptors are stored in a kernfs-based filesystem.
Upon calling ImportFD, the host fd can be accessed in two ways:
1. a FileDescription that can be added to the FDTable, and
2. a Dentry in the host.filesystem mount, which we will want to access through
magic symlinks in /proc/[pid]/fd/.
An implementation of the kernfs.Inode interface stores a unique host fd. This
inode can be inserted into file descriptions as well as dentries.
This change also plumbs in three FileDescriptionImpls corresponding to fds for
sockets, TTYs, and other files (only the latter is implemented here).
These implementations will mostly make corresponding syscalls to the host.
Where possible, the logic is ported over from pkg/sentry/fs/host.
Updates #1672
PiperOrigin-RevId: 299417263
|
|
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.
PiperOrigin-RevId: 291811289
|
|
PiperOrigin-RevId: 291745021
|
|
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.
This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.
Updates #1472
PiperOrigin-RevId: 289033387
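A rough sketch of the alias approach, assuming nothing about the real package beyond what the message states. The package is named `main` here only to keep the example self-contained, and the `counter` type is hypothetical.

```go
package main

import stdsync "sync"

// Aliases re-export the standard library types from a project-owned package,
// so the underlying implementation can later be swapped in one place.
type (
	Mutex     = stdsync.Mutex
	RWMutex   = stdsync.RWMutex
	WaitGroup = stdsync.WaitGroup
)

// counter shows that callers use the aliases exactly like the originals.
type counter struct {
	mu Mutex
	n  int
}

// inc increments the counter under the aliased mutex.
func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}
```

Because these are type aliases rather than new named types, existing code that passes a `*sync.Mutex` around keeps compiling after the import path is switched.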
|
|
The newfstatat() syscall is not supported on arm64, so we resort
to using the fstatat() syscall.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: Iea95550ea53bcf85c01f7b3b95da70ad0952177d
|
|
PiperOrigin-RevId: 275139066
|
|
PiperOrigin-RevId: 275114157
|
|
The gofer's CachingInodeOperations implementation contains an optimization for
the common open-read-close pattern when we have a host FD. In this case, the
host kernel will update the timestamp for us to a reasonably close time, so we
don't need an extra RPC to the gofer.
However, when the app explicitly sets the timestamps (via futimes or similar)
then we actually DO need to update the timestamps, because the host kernel
won't do it for us.
To fix this, a new boolean `forceSetTimestamps` was added to
CachingInodeOperations.SetMaskedAttributes. It is only set by
gofer.InodeOperations.SetTimestamps.
PiperOrigin-RevId: 272048146
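The decision described above reduces to a small predicate. The sketch below is a simplification: `needsTimestampRPC` is a hypothetical helper, not a function from the gofer package, and the real SetMaskedAttributes handles more attributes than timestamps.

```go
package main

// needsTimestampRPC reports whether timestamps must be sent to the gofer.
// haveHostFD and forceSetTimestamps mirror the two conditions in the commit
// message; the function itself is illustrative only.
func needsTimestampRPC(haveHostFD, forceSetTimestamps bool) bool {
	if forceSetTimestamps {
		// The application set the times explicitly (for example via
		// futimes), so the host kernel will not update them for us.
		return true
	}
	// With a host FD, the host kernel keeps the timestamps reasonably
	// current and the extra RPC can be skipped.
	return !haveHostFD
}
```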
|
|
How to reproduce:
$ echo "timeout 10 ls" > foo.sh
$ chmod +x foo.sh
$ ./foo.sh
(will hang here for 10 secs, and the output of ls does not show)
When the "ls" process writes to stdout, it receives the SIGTTOU signal and
hangs there until the "timeout" process times out and kills the "ls" process.
The expected result is: "ls" writes its output to the tty and terminates
immediately, then the "timeout" process receives SIGCHLD and terminates.
The reason for this failure is that we missed the check for TOSTOP (if
set, background processes receive the SIGTTOU signal when they
write).
We use drivers/tty/n_tty.c:n_tty_write() as a reference.
Fixes: #862
Reported-by: chris.zn <chris.zn@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Signed-off-by: chenglang.hy <chenglang.hy@antfin.com>
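A simplified sketch of the missing check, assuming the caller already knows whether the writer is in the foreground process group. `shouldSendSIGTTOU` is a hypothetical helper, the `tostop` value is the one from Linux's generic termios definitions (some architectures use a different bit), and the sketch omits further Linux conditions such as SIGTTOU being blocked or ignored.

```go
package main

// tostop is the TOSTOP local-mode flag (octal 0400 in the generic Linux
// termios definitions; assumed here, and different on a few architectures).
const tostop = 0x100

// shouldSendSIGTTOU reports whether a write to the tty should raise SIGTTOU.
// lflag is the terminal's c_lflag value.
func shouldSendSIGTTOU(lflag uint32, inForeground bool) bool {
	if lflag&tostop == 0 {
		// Without TOSTOP, background processes may write freely.
		return false
	}
	return !inForeground
}
```

In the reproduction above, "ls" runs in the background with TOSTOP unset, so this predicate returns false and the write proceeds.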
|
|
They are no-ops, so the standard rule works fine.
PiperOrigin-RevId: 268776264
|
|
PiperOrigin-RevId: 266177409
|
|
For a SOCK_STREAM Unix socket, we shall return ECONNRESET if the peer is
closed with data left unread.
We explicitly set a flag when closing one end, to differentiate this from
a plain shutdown (where zero shall be returned).
Fixes: #735
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
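One way to picture the flag is the toy model below. It is a sketch, not the gVisor transport code: the `endpoint` type and all of its fields and methods are hypothetical, and it assumes the Linux behavior that the reset is raised when the closing end still has unread data queued.

```go
package main

import "syscall"

// Errors as they would surface to the reader in this sketch.
var (
	errConnReset  error = syscall.ECONNRESET
	errWouldBlock error = syscall.EAGAIN
)

// endpoint is a hypothetical, heavily simplified stream endpoint.
type endpoint struct {
	peer      *endpoint
	recvQueue []byte // bytes received but not yet read
	peerReset bool   // peer closed while it still had unread data
	peerDone  bool   // peer closed or shut down its write side
}

// connect returns a connected pair of endpoints.
func connect() (*endpoint, *endpoint) {
	a, b := &endpoint{}, &endpoint{}
	a.peer, b.peer = b, a
	return a, b
}

// write queues data on the peer's receive queue.
func (e *endpoint) write(p []byte) { e.peer.recvQueue = append(e.peer.recvQueue, p...) }

// shutdownWrite only marks the stream as finished; the peer reads zero bytes.
func (e *endpoint) shutdownWrite() { e.peer.peerDone = true }

// close additionally flags a reset if this end discards unread data.
func (e *endpoint) close() {
	e.peer.peerDone = true
	if len(e.recvQueue) > 0 {
		e.peer.peerReset = true
	}
}

// read returns queued bytes first, then ECONNRESET, then zero for a shutdown.
func (e *endpoint) read(buf []byte) (int, error) {
	if len(e.recvQueue) > 0 {
		n := copy(buf, e.recvQueue)
		e.recvQueue = e.recvQueue[n:]
		return n, nil
	}
	if e.peerReset {
		return 0, errConnReset
	}
	if e.peerDone {
		return 0, nil
	}
	return 0, errWouldBlock
}
```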
|
|
This is in accordance with newer parts of the standard library.
PiperOrigin-RevId: 263449916
|
|
PiperOrigin-RevId: 255711454
|
|
Get/Set pipe size and ioctl support were missing from
overlayfs. It required moving the pipe.Sizer interface
to fs so that overlay could get access.
Fixes #318
PiperOrigin-RevId: 255511125
|
|
Currently, the overlay dirCache is only used for a single logical use of
getdents, i.e., it is discarded when the FD is closed or seeked back to
the beginning.
But the initial work of getting the directory contents can be quite
expensive (particularly sorting large directories), so we should keep it
as long as possible.
This is very similar to the readdirCache in fs/gofer.
Since the upper filesystem does not have to allow caching readdir
entries, the new CacheReaddir MountSourceOperations method controls this
behavior.
This caching should be trivially movable to all Inodes if desired,
though that adds an additional copy step for non-overlay Inodes.
(Overlay Inodes already do the extra copy).
PiperOrigin-RevId: 255477592
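The idea of keeping the sorted listing for as long as it stays valid can be sketched as below. The `dir` type and its fields are hypothetical, and the sketch leaves out the CacheReaddir opt-in as well as the extra copy step mentioned above.

```go
package main

import "sort"

// dir is a hypothetical directory inode that caches its sorted entry names.
type dir struct {
	entries map[string]struct{}
	cache   []string // sorted names; nil when stale
	builds  int      // how often the cache was rebuilt, for illustration
}

// readdir returns the sorted names, rebuilding the cache only when stale.
func (d *dir) readdir() []string {
	if d.cache == nil {
		names := make([]string, 0, len(d.entries))
		for name := range d.entries {
			names = append(names, name)
		}
		sort.Strings(names) // the expensive step for large directories
		d.cache = names
		d.builds++
	}
	return d.cache
}

// create adds an entry and invalidates the cached listing.
func (d *dir) create(name string) {
	d.entries[name] = struct{}{}
	d.cache = nil
}
```

Repeated getdents calls, even across separate file descriptors, then share one sort until the directory changes.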
|
|
All functions which allocate objects containing AtomicRefCounts will soon need
a context.
PiperOrigin-RevId: 253147709
|
|
This can be merged after:
https://github.com/google/gvisor-website/pull/77
or
https://github.com/google/gvisor-website/pull/78
PiperOrigin-RevId: 253132620
|
|
Store enough information in the kernel socket table to distinguish
between different types of sockets. Previously we were only storing
the socket family, but this isn't enough to classify sockets. For
example, TCPv4 and UDPv4 sockets are both AF_INET, and ICMP sockets
are SOCK_DGRAM sockets with a particular protocol.
Instead of creating more sub-tables, flatten the socket table and
provide a filtering mechanism based on the socket entry.
Also generate and store a socket entry index ("sl" in linux) which
allows us to output entries in a stable order from procfs.
PiperOrigin-RevId: 252495895
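A flattened table with a filter might look roughly like the sketch below. The `socketEntry` and `socketTable` types are hypothetical and do not mirror the kernel package; the numeric constants are the Linux ABI values for the named families, types, and protocols.

```go
package main

// Linux ABI values used by the example.
const (
	afInet      = 2  // AF_INET
	sockStream  = 1  // SOCK_STREAM
	sockDgram   = 2  // SOCK_DGRAM
	ipprotoICMP = 1  // IPPROTO_ICMP
	ipprotoTCP  = 6  // IPPROTO_TCP
	ipprotoUDP  = 17 // IPPROTO_UDP
)

// socketEntry is a hypothetical record with enough detail to classify a socket.
type socketEntry struct {
	ID       uint64 // stable index, comparable to the "sl" column
	Family   int
	Type     int
	Protocol int
}

// socketTable is a single flat table instead of one sub-table per family.
type socketTable struct {
	nextID  uint64
	entries []socketEntry
}

// add records a socket and returns its stable index.
func (s *socketTable) add(family, typ, protocol int) uint64 {
	id := s.nextID
	s.nextID++
	s.entries = append(s.entries, socketEntry{ID: id, Family: family, Type: typ, Protocol: protocol})
	return id
}

// filter returns the entries accepted by match, in insertion order.
func (s *socketTable) filter(match func(socketEntry) bool) []socketEntry {
	var out []socketEntry
	for _, e := range s.entries {
		if match(e) {
			out = append(out, e)
		}
	}
	return out
}
```

A procfs file for UDP would then call `filter` with a predicate on `Type` and `Protocol` rather than consulting a dedicated sub-table.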
|
|
SockType isn't specific to unix domain sockets, and the current
definition basically mirrors the linux ABI's definition.
PiperOrigin-RevId: 251956740
|
|
This is required to make the shutdown visible to peers outside the
sandbox.
The readClosed / writeClosed fields were dropped, as they were
preventing a shutdown socket from reading the remainder of queued bytes.
The host syscalls will return the appropriate errors for shutdown.
The control message tests have been split out of socket_unix.cc to make
the (few) remaining tests accessible to testing inherited host UDS,
which don't support sending control messages.
Updates #273
PiperOrigin-RevId: 251763060
|
|
This does not actually implement an efficient splice or sendfile. Rather, it
adds generic plumbing to the file internals so that this can be added. All
file implementations use the stub fileutil.NoSplice implementation, which
causes sendfile and splice to fall back to an internal copy.
A basic splice system call interface is added, along with a test.
PiperOrigin-RevId: 249335960
Change-Id: Ic5568be2af0a505c19e7aec66d5af2480ab0939b
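The stub-plus-fallback shape can be illustrated as follows. This is a sketch under assumptions: the `splicer` interface, `noSplice` stub, `memFile` type, and `sendfile` helper are all hypothetical and only imitate the role that fileutil.NoSplice plays in the message above.

```go
package main

import (
	"errors"
	"io"
)

// errNoSplice signals that a file has no efficient splice path.
var errNoSplice = errors.New("splice not supported")

// splicer is a hypothetical hook for an efficient transfer.
type splicer interface {
	spliceFrom(src io.Reader, n int64) (int64, error)
}

// noSplice is a stub that every file can embed until it gains a real path.
type noSplice struct{}

func (noSplice) spliceFrom(io.Reader, int64) (int64, error) { return 0, errNoSplice }

// file is what sendfile needs from a destination.
type file interface {
	io.Writer
	splicer
}

// memFile is an in-memory file used only to exercise the sketch.
type memFile struct {
	noSplice
	data []byte
}

func (f *memFile) Write(p []byte) (int, error) {
	f.data = append(f.data, p...)
	return len(p), nil
}

func (f *memFile) Read(p []byte) (int, error) {
	if len(f.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, f.data)
	f.data = f.data[n:]
	return n, nil
}

// sendfile tries the splice hook first and falls back to a plain copy.
func sendfile(dst file, src io.Reader, n int64) (int64, error) {
	written, err := dst.spliceFrom(src, n)
	if !errors.Is(err, errNoSplice) {
		return written, err
	}
	written, err = io.CopyN(dst, src, n)
	if err == io.EOF {
		err = nil // a short source is reported as a short count, not an error
	}
	return written, err
}
```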
|
|
This more directly matches what Linux does with unsupported
nodes.
PiperOrigin-RevId: 248780425
Change-Id: I17f3dd0b244f6dc4eb00e2e42344851b8367fbec
|
|
There is a lot of redundancy that we can simplify in the stat_times
test. This will make it easier to add new tests. However, the
simplification reveals that cached uattrs on goferfs don't properly
update ctime on rename.
PiperOrigin-RevId: 248773425
Change-Id: I52662728e1e9920981555881f9a85f9ce04041cf
|
|
Closes #225
PiperOrigin-RevId: 247508791
Change-Id: I04f47cf2770b30043e5a272aba4ba6e11d0476cc
|
|
Updates google/gvisor#206
PiperOrigin-RevId: 245880573
Change-Id: Ifa715e98d47f64b8a32b04ae9378d6cd6bd4025e
|