Age | Commit message (Collapse) | Author |
|
- Remove the pipe package's dependence on the buffer package, which becomes
unused as a result. The buffer package is currently intended to serve two use
cases, pipes and temporary buffers, and does neither optimally as a result;
this change facilitates retooling the buffer package to better serve the
latter.
- Pass callbacks taking safemem.BlockSeq to the internal pipe I/O methods,
which makes most callbacks trivial.
- Fix VFS1's splice() and tee() to immediately return if a pipe returns a
partial write.
PiperOrigin-RevId: 351911375
|
|
These are primarily simplification and lint mistakes. However, minor
fixes are also included and tests added where appropriate.
PiperOrigin-RevId: 351425971
|
|
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
PiperOrigin-RevId: 350223482
|
|
Syzkaller discovered this bug in pipefs by doing something quite strange:
creat(&(0x7f0000002a00)='./file1\x00', 0x0)
mount(&(0x7f0000000440)=ANY=[], &(0x7f00000002c0)='./file1\x00', &(0x7f0000000300)='devtmpfs\x00', 0x20000d, 0x0)
creat(&(0x7f0000000000)='./file1/file0\x00', 0x0)
This can be reproduced with:
touch mymount
mkfifo /dev/mypipe
mount -o ro -t devtmpfs devtmpfs mymount
echo 123 > mymount/mypipe
PiperOrigin-RevId: 349687714
|
|
open() has to return ENXIO in this case.
O_PATH isn't supported by vfs1.
PiperOrigin-RevId: 348820478
|
|
Introduces the per-socket error queue and the necessary cmsg mechanisms.
PiperOrigin-RevId: 348028508
|
|
PiperOrigin-RevId: 347711998
|
|
syzkaller reported the closing of a nil channel. This is only possible when the
AIOContext was destroyed twice.
Some scenarios that could lead to this:
- It died and then some called aioCtx.Prepare() on it and then killed it again
which could cause the double destroy. The context could have been destroyed
in between the call to LookupAIOContext() and Prepare().
- aioManager was destroyed but it did not update the contexts map. So
Lookup could still return a dead AIOContext and then someone could call
Prepare on it and kill it again.
So added a check in aioCtx.Prepare() for the context being dead. This will
prevent a dead context from resurrecting.
Also refactored code to destroy the aioContext consistently. Earlier we were not
munmapping the aioContexts that were destroyed upon aioManager destruction.
Reported-by: syzbot+ef6a588d0ce6059991d2@syzkaller.appspotmail.com
PiperOrigin-RevId: 347704347
|
|
PiperOrigin-RevId: 347047550
|
|
tcpip.ControlMessages can not contain Linux specific structures which makes it
painful to convert back and forth from Linux to tcpip back to Linux when passing
around control messages in hostinet and raw sockets.
Now we convert to the Linux version of the control message as soon as we are
out of tcpip.
PiperOrigin-RevId: 347027065
|
|
PiperOrigin-RevId: 346973338
|
|
Fixes #5004
PiperOrigin-RevId: 346643745
|
|
PiperOrigin-RevId: 345589628
|
|
These options allow overriding the signal that gets sent to the process when
I/O operations are available on the file descriptor, rather than the default
`SIGIO` signal. Doing so also populates `siginfo` to contain extra information
about which file descriptor caused the event (`si_fd`) and what events happened
on it (`si_band`). The logic around which FD is populated within `si_fd`
matches Linux's, which means it has some weird edge cases where that value may
not actually refer to a file descriptor that is still valid.
This CL also ports extra S/R logic regarding async handler in VFS2.
Without this, async I/O handlers aren't properly re-registered after S/R.
PiperOrigin-RevId: 345436598
|
|
As part of this, change Task.interrupted() to not drain Task.interruptChan, and
do so explicitly using new function Task.unsetInterrupted() instead.
PiperOrigin-RevId: 342768365
|
|
Closes #4746
PiperOrigin-RevId: 342747165
|
|
This reduces confusion with context.Context (which is also relevant to
kernel.Tasks) and is consistent with existing function kernel.LoadTaskImage().
PiperOrigin-RevId: 342167298
|
|
PiperOrigin-RevId: 341154192
|
|
Writes to pipes of size < PIPE_BUF are guaranteed to be atomic, so writes
larger than that will return EAGAIN if the pipe has capacity < PIPE_BUF.
Writes to eventfds will return EAGAIN if the write would cause the eventfd
value to go over the max.
In both such cases, calling Ready() on the FD will return true (because it is
possible to write), but specific kinds of writes will in fact return EAGAIN.
This CL fixes an infinite loop in splice and sendfile (VFS1 and VFS2) by
forcing skipping the readiness check for the outfile in send, splice, and tee.
PiperOrigin-RevId: 341102260
|
|
The default pipe size already matched linux, and is unchanged.
Furthermore `atomicIOBytes` is made a proper constant (as it is in Linux). We
were plumbing usermem.PageSize everywhere, so this is no functional change.
PiperOrigin-RevId: 340497006
|
|
PiperOrigin-RevId: 340389884
|
|
PiperOrigin-RevId: 339166854
|
|
- Check the sticky bit in overlay.filesystem.UnlinkAt(). Fixes
StickyTest.StickyBitPermDenied.
- When configuring a VFS2 overlay in runsc, copy the lower layer's root
owner/group/mode to the upper layer's root (as in the VFS1 equivalent,
boot.addOverlay()). This makes the overlay root owned by UID/GID 65534 with
mode 0755 rather than owned by UID/GID 0 with mode 01777. Fixes
CreateTest.CreateFailsOnUnpermittedDir, which assumes that the test cannot
create files in /.
- MknodTest.UnimplementedTypesReturnError assumes that the creation of device
special files is not supported. However, while the VFS2 gofer client still
doesn't support device special files, VFS2 tmpfs does, and in the overlay
test dimension mknod() targets a tmpfs upper layer. The test initially has
all capabilities, including CAP_MKNOD, so its creation of these files
succeeds. Constrain these tests to VFS1.
- Rename overlay.nonDirectoryFD to overlay.regularFileFD and only use it for
regular files, using the original FD for pipes and device special files. This
is more consistent with Linux (which gets the original inode_operations, and
therefore file_operations, for these file types from ovl_fill_inode() =>
init_special_inode()) and fixes remaining mknod and pipe tests.
- Read/write 1KB at a time in PipeTest.Streaming, rather than 4 bytes. This
isn't strictly necessary, but it makes the test less obnoxiously slow on
ptrace.
Fixes #4407
PiperOrigin-RevId: 337971042
|
|
Reported-by: syzbot+0268cc591c0f517a1de0@syzkaller.appspotmail.com
PiperOrigin-RevId: 337901664
|
|
This change makes the following changes:
- Unlocks MemoryFile.mu while calling mincore (checkCommitted) because mincore
can take a really long time. Accordingly looks up the segment in the tree
tree again and handles changes to the segment.
- MemoryFile.UpdateUsage() can now only be called at frequency at most 100Hz.
100 Hz = linux.CLOCKS_PER_SEC.
Co-authored-by: Jamie Liu <jamieliu@google.com>
PiperOrigin-RevId: 337865250
|
|
Control messages should be released on Read (which ignores the control message)
or zero-byte Send. Otherwise, open fds sent through the control messages will
be leaked.
PiperOrigin-RevId: 337110774
|
|
This fixes reference leaks related to accidentally forgetting to DecRef()
after calling one or the other.
PiperOrigin-RevId: 336918922
|
|
- sysinfo(2) does not actually require a fine-grained breakdown of memory
usage. Accordingly, instead of calling pgalloc.MemoryFile.UpdateUsage() to
update the sentry's fine-grained memory accounting snapshot, just use
pgalloc.MemoryFile.TotalUsage() (which is a single fstat(), and therefore far
cheaper).
- Use the number of threads in the root PID namespace (i.e. globally) rather
than in the task's PID namespace for consistency with Linux (which just reads
global variable nr_threads), and add a new method to kernel.PIDNamespace to
allow this to be read directly from an underlying map rather than requiring
the allocation and population of an intermediate slice.
PiperOrigin-RevId: 336353100
|
|
Reported-by: syzbot+bb82fb556d5d0a43f632@syzkaller.appspotmail.com
PiperOrigin-RevId: 336324720
|
|
cf. 2a36ab717e8f "rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ"
PiperOrigin-RevId: 336186795
|
|
Updates #267
PiperOrigin-RevId: 335713923
|
|
PiperOrigin-RevId: 335492800
|
|
PiperOrigin-RevId: 335051794
|
|
arm64 vfs2: Add support for io_submit/fallocate/
sendfile/newfstatat/readahead/fadvise64
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
This patch adds minor changes for Arm64 platform:
1, add SetRobustList/GetRobustList support for arm64 syscall module.
2, add newfstatat support for arm64 vfs2 syscall module.
3, add tls value in ProtoBuf.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
Use HandleIOErrorVFS2 instead of custom error handling.
PiperOrigin-RevId: 333227581
|
|
This is more consistent with Linux (see comment on MM.NewSharedAnonMappable()).
We don't do the same thing on VFS1 for reasons documented by the updated
comment.
PiperOrigin-RevId: 332514849
|
|
Linux defines this struct as:
struct sched_param {
int priority;
}
... in include/linux/sched.h.
PiperOrigin-RevId: 332473133
|
|
PiperOrigin-RevId: 331940975
|
|
Discovered by ayushranjan@:
VFS2 was employing the following algorithm for fetching ready events from an
epoll instance:
- Create a statically sized EpollEvent slice on the stack of size 16.
- Pass that to EpollInstance.ReadEvents() to populate.
- EpollInstance.ReadEvents() requeues level-triggered events that it returns
back into the ready queue.
- Write the results to usermem.
- If the number of results were = 16 then recall EpollInstance.ReadEvents() in
the hopes of getting more. But this will cause duplication of the "requeued"
ready level-triggered events.
So if the ready queue has >= 16 ready events, the EpollWait for loop will spin
until it fills the usermem with `maxEvents` events.
Fixes #3521
PiperOrigin-RevId: 331840527
|
|
PiperOrigin-RevId: 331648296
|
|
PiperOrigin-RevId: 331256608
|
|
PiperOrigin-RevId: 330554450
|
|
Fixes #3779.
PiperOrigin-RevId: 330057268
|
|
PiperOrigin-RevId: 329572337
|
|
Also, add corresponding EOF tests for splice/sendfile.
Discovered by syzkaller.
PiperOrigin-RevId: 328975990
|
|
Fixes *.sh Java runtime tests, where splice()-ing from a pipe to /dev/zero
would not actually empty the pipe.
There was no guarantee that the data would actually be consumed on a splice
operation unless the output file's implementation of Write/PWrite actually
called VFSPipeFD.CopyIn. Now, whatever bytes are "written" are consumed
regardless of whether CopyIn is called or not.
Furthermore, the number of bytes in the IOSequence for reads is now capped at
the amount of data actually available. Before, splicing to /dev/zero would
always return the requested splice size without taking the actual available
data into account.
This change also refactors the case where an input file is spliced into an
output pipe so that it follows a similar pattern, which is arguably cleaner
anyway.
Updates #3576.
PiperOrigin-RevId: 328843954
|
|
We now allow hard links to be created within gofer fs (see
github.com/google/gvisor/commit/f20e63e31b56784c596897e86f03441f9d05f567).
Update the inotify documentation accordingly.
PiperOrigin-RevId: 328177485
|
|
This lets us create "synthetic" mountpoint directories in ReadOnly mounts
during VFS setup.
Also add context.WithMountNamespace, as some filesystems (like overlay) require
a MountNamespace on ctx to handle vfs.Filesystem Operations.
PiperOrigin-RevId: 327874971
|