Age | Commit message | Author |
|
This change mainly aims to define the semantics of communication for the LISAFS
(LInux SAndbox Filesystem) protocol. This protocol aims to replace 9P and
intends to bring some performance benefits with it.
Some of the notable differences from the p9 package are:
- Now the server implementations own the handlers.
- As a result, there is no verbose interface like `p9.File` that all servers
need to implement. Different implementations can extend their File
implementations to varying degrees without imposing those extensions to other
server implementations that might not have anything to do with those features.
- If a server implementation adds a new RPC message, other implementations are
not compelled to support it.
I wrote a benchmark `BenchmarkSendRecv` in connection_test.go which competes
with p9's `BenchmarkSendRecvChannel`. Running these on an AMD Milan machine
shows that lisafs is **45%** faster.
**With 9P**
goos: linux
goarch: amd64
pkg: gvisor/pkg/p9/p9
cpu: AMD EPYC 7B13 64-Core Processor
BenchmarkSendRecvLegacy-256 82830 14053 ns/op 633 B/op 23 allocs/op
BenchmarkSendRecvChannel-256 776971 1551 ns/op 184 B/op 6 allocs/op
**With lisafs**
goos: linux
goarch: amd64
pkg: pkg/lisafs/connection_test
cpu: AMD EPYC 7B13 64-Core Processor
BenchmarkSendRecv-256 1399610 853.5 ns/op 48 B/op 2 allocs/op
Fixes #5464
PiperOrigin-RevId: 397803163
|
|
PiperOrigin-RevId: 394560866
|
|
The old implementation was mostly correct but error-prone, which made way for
the issue in question here. In its error path, it would leak the intermediate
file being walked. Each return/break needed explicit cleanup.
This change implements a cleaner way of cleaning up intermediate directories.
If the code were to evolve to be more complex, it would still work.
PiperOrigin-RevId: 392102826
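One way to picture the cleanup pattern described above is a single deferred function that releases whichever intermediate file is current, so no return path needs its own cleanup. This is a minimal sketch under assumptions, not the p9 code: `node`, `step`, `walk`, and the `opened` slice are hypothetical names introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a walked file that must be closed.
type node struct {
	name   string
	closed bool
}

func (n *node) Close() { n.closed = true }

// opened records every node handed out so that leaks can be detected.
var opened []*node

// step is a hypothetical single-component walk; it fails on the name "bad".
func step(parent *node, name string) (*node, error) {
	if name == "bad" {
		return nil, errors.New("walk failed at " + name)
	}
	n := &node{name: name}
	opened = append(opened, n)
	return n, nil
}

// walk resolves names one component at a time, starting from root.
func walk(root *node, names []string) (result *node, err error) {
	cur := root
	// One deferred cleanup covers every return path: close the current
	// intermediate unless it is the root or the node being returned.
	defer func() {
		if cur != root && cur != result {
			cur.Close()
		}
	}()
	for _, name := range names {
		next, stepErr := step(cur, name)
		if stepErr != nil {
			return nil, stepErr
		}
		if cur != root {
			cur.Close() // the previous intermediate is no longer needed
		}
		cur = next
	}
	return cur, nil
}

func main() {
	root := &node{name: "root"}
	_, err := walk(root, []string{"a", "b", "bad"})
	fmt.Println("error:", err)
	for _, n := range opened {
		fmt.Println(n.name, "closed:", n.closed)
	}
}
```

Because the cleanup lives in one place, adding further early returns to the loop would not reintroduce the leak.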
|
|
Convert remaining public errors (e.g. EINTR) from syserror to linuxerr.
PiperOrigin-RevId: 390471763
|
|
Change the p9 server to use *errors.Error defined in pkg linuxerr. Done
separate from the client so that we ensure different p9 server/client versions
work with each other.
PiperOrigin-RevId: 380084491
|
|
PiperOrigin-RevId: 371015541
|
|
PiperOrigin-RevId: 369686285
|
|
While using remote-validation, the vast majority of time spent during
FS operations is re-walking the path to check for modifications and
then closing the file given that in most cases it has not been
modified externally.
This change introduces a new 9P message called MultiGetAttr, which queries the
attributes of several files in bulk, in one shot. The returned attributes are
then used to update cached dentries before they are walked. File attributes
are updated for files that still exist. Dentries that have been deleted are
removed from the cache. And negative cache entries are removed if a new
file/directory was created externally. Similarly, synthetic dentries are
replaced if a file/directory is created externally.
The bulk update needs to be careful not to follow symlinks or cross mount
points, because the gofer doesn't know how to resolve symlinks or where
mount points are located. It also doesn't walk to the parent ("..") to
avoid deadlocks.
Here are the results:
Workload      VFS1      VFS2     Change
bazel action  115s      70s      28.8s
Stat/100      11,043us  7,623us  974us
Updates #1638
PiperOrigin-RevId: 369325957
|
|
Also adds support for clearing the setuid bit when appropriate (writing,
truncating, changing size, changing UID, or changing GID).
VFS2 only.
PiperOrigin-RevId: 364661835
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink-related functionality
is not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
returns it and not unix.Stat_t.
Updates #214
PiperOrigin-RevId: 360701387
|
|
sync.WaitGroup.Add(positive delta) is illegal if the WaitGroup counter is zero
and WaitGroup.Wait() may be called concurrently. This is problematic for
p9.connState.pendingWg, which counts inflight requests (so transitions from
zero are normal) and is waited-upon when receiving from the underlying Unix
domain socket returns an error, e.g. during connection shutdown. (Even if the
socket has been closed, new requests can still be concurrently received via
flipcall channels.)
PiperOrigin-RevId: 359416057
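The constraint above can be sidestepped by counting inflight requests under a mutex instead of with sync.WaitGroup. The following is a sketch of one generic alternative, assuming a mutex plus condition variable; it is not claimed to be what this commit does, and `inflight`, `begin`, `end`, and `wait` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// inflight counts requests in progress. Every transition happens under mu,
// so begin may race with wait even while the count is zero.
type inflight struct {
	mu   sync.Mutex
	cond *sync.Cond
	n    int
	done bool // set once wait has started; later requests are rejected
}

func newInflight() *inflight {
	f := &inflight{}
	f.cond = sync.NewCond(&f.mu)
	return f
}

// begin registers a request. It returns false if shutdown has begun.
func (f *inflight) begin() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.done {
		return false
	}
	f.n++
	return true
}

// end marks a request as finished and wakes waiters when none remain.
func (f *inflight) end() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.n--
	if f.n == 0 {
		f.cond.Broadcast()
	}
}

// wait blocks new requests and returns once all inflight requests finish.
func (f *inflight) wait() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.done = true
	for f.n > 0 {
		f.cond.Wait()
	}
}

func main() {
	f := newInflight()
	if f.begin() {
		go f.end() // request handler finishing concurrently
	}
	f.wait()
	fmt.Println("accepted after shutdown:", f.begin())
}
```

The trade-off in this sketch is an extra lock acquisition per request in exchange for well-defined behavior when a request arrives during shutdown.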
|
|
These are primarily simplification and lint mistakes. However, minor
fixes are also included and tests added where appropriate.
PiperOrigin-RevId: 351425971
|
|
PiperOrigin-RevId: 347047550
|
|
openedMu has lock ordering violations. Most locks go through OpenedFlag(),
which is usually taken after renameMu and opMu. On the other hand, Tlopen takes
openedMu before renameMu and opMu (via safelyRead).
Resolving this violation is simple: just drop openedMu. The opened and
openFlags fields are already protected by opMu in most cases, renameMu (for
write) in one case (via safelyGlobal), and only in doWalk by neither.
This is a bit ugly because opMu is supposed to be a "semantic" lock, but it
works. I'm open to other suggestions.
Note that doWalk has a race condition where a FID may open after the open check
but before actually walking. This race existed before this change as well; it
is not clear if it is problematic.
PiperOrigin-RevId: 346108483
|
|
They were returning io.ErrShortWrite, but that is not handled at higher levels
and resulted in a panic.
We can just return the short write directly from the p9 call without
ErrShortWrite.
PiperOrigin-RevId: 342960441
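As a rough illustration of returning a short write directly, the sketch below has the write path report the number of bytes accepted with a nil error and leaves the decision to retry to the caller. The types and function names (`limitedSink`, `write`, `writeAll`) are hypothetical and are not taken from the p9 package.

```go
package main

import "fmt"

// limitedSink is a hypothetical backend that accepts at most limit bytes
// per call.
type limitedSink struct {
	limit int
	data  []byte
}

// write reports how many bytes were accepted. A short count is returned
// as-is with a nil error instead of being converted to io.ErrShortWrite.
func (s *limitedSink) write(p []byte) (int, error) {
	n := len(p)
	if n > s.limit {
		n = s.limit
	}
	s.data = append(s.data, p[:n]...)
	return n, nil
}

// writeAll is a hypothetical caller that handles short writes by retrying
// with the remaining bytes until everything is written or no progress is
// made.
func writeAll(s *limitedSink, p []byte) (int, error) {
	total := 0
	for total < len(p) {
		n, err := s.write(p[total:])
		total += n
		if err != nil {
			return total, err
		}
		if n == 0 {
			break // no progress; report what was written so far
		}
	}
	return total, nil
}

func main() {
	s := &limitedSink{limit: 3}
	n, err := s.write([]byte("abcdefgh"))
	fmt.Println("single write:", n, err)
	total, err := writeAll(s, []byte("abcdefgh"))
	fmt.Println("writeAll:", total, err)
}
```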
|
|
Fixes #2714
PiperOrigin-RevId: 342950412
|
|
This is to cover the common pattern: open->read/write->close,
where SetAttr needs to be called to update atime/mtime before
the file is closed.
Benchmark results:
BM_OpenReadClose/10240 CPU
setattr+clunk: 63783 ns
VFS2: 68109 ns
VFS1: 72507 ns
Updates #1198
PiperOrigin-RevId: 329628461
|
|
- Remove sendDone, which currently does nothing whatsoever (errors sent to the
channel are completely unused). Instead, have request handlers log errors
they get from p9.send() inline.
- Replace recvOkay and recvDone with recvMu/recvIdle/recvShutdown. In addition
to being slightly clearer (IMO), this eliminates the p9.connState.service()
goroutine, significantly reducing the overhead involved in passing connection
receive access between goroutines (from buffered chan send/recv + unbuffered
chan send/recv to just a mutex unlock/lock).
PiperOrigin-RevId: 327476755
|
|
... including those invoked via flipcall.
PiperOrigin-RevId: 327283194
|
|
Ported from https://github.com/hugelgupf/p9/pull/44.
name              old time/op    new time/op    delta
SendRecvLegacy-6  61.5µs ± 6%    60.1µs ±11%    ~        (p=0.063 n=9+9)
SendRecv-6        40.7µs ± 2%    39.8µs ± 5%    -2.27%   (p=0.035 n=10+10)
name              old alloc/op   new alloc/op   delta
SendRecvLegacy-6  769B ± 0%      705B ± 0%      -8.37%   (p=0.000 n=8+10)
SendRecv-6        320B ± 0%      256B ± 0%      -20.00%  (p=0.000 n=10+10)
name              old allocs/op  new allocs/op  delta
SendRecvLegacy-6  25.0 ± 0%      23.0 ± 0%      -8.00%   (p=0.000 n=10+10)
SendRecv-6        14.0 ± 0%      12.0 ± 0%      -14.29%  (p=0.000 n=10+10)
PiperOrigin-RevId: 326127979
|
|
|
|
PiperOrigin-RevId: 319283715
|
|
PiperOrigin-RevId: 315745386
|
|
Continues the modifications in cl/272963663. This prevents non-syscall errors
from being propagated to kernel/task_syscall.go:ExtractErrno(), which causes a
sentry panic.
PiperOrigin-RevId: 305913127
|
|
PiperOrigin-RevId: 305721329
|
|
PiperOrigin-RevId: 305171772
|
|
PiperOrigin-RevId: 304542967
|
|
These are not used outside of the p9 package.
PiperOrigin-RevId: 295200052
|
|
Note that these are only implemented for tmpfs, and other impls will still
return EOPNOTSUPP.
PiperOrigin-RevId: 293899385
|
|
PiperOrigin-RevId: 293617493
|
|
PiperOrigin-RevId: 291745021
|
|
PiperOrigin-RevId: 290145451
|
|
There was a very bare get/setxattr in the InodeOperations interface. Add
context.Context to both, size to getxattr, and flags to setxattr.
Note that extended attributes are passed around as strings in this
implementation, so size is automatically encoded into the value. Size is
added in getxattr so that implementations can return ERANGE if a value is larger
than can fit in the user-allocated buffer. This prevents us from unnecessarily
passing around an arbitrarily large xattr when the user buffer is actually too
small.
Instead of using the existing xattrwalk and xattrcreate messages, define our
own, mainly for the sake of simplicity.
Extended attributes will be implemented in future commits.
PiperOrigin-RevId: 290121300
|
|
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.
This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.
Updates #1472
PiperOrigin-RevId: 289033387
|
|
These comments provided nothing, and have been copy-pasted into all
implementations. The code is clear without them.
I considered also removing the "handle implements handler.handle" comments, but
will let those stay for now.
PiperOrigin-RevId: 285876428
|
|
Note that the Sentry still calls Truncate() on the file before calling Open.
A new p9 version check was added to ensure that the p9 server can handle
the OpenTruncate flag. If not, then the flag is stripped before sending.
PiperOrigin-RevId: 281609112
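The version check described above amounts to masking out a flag when the peer is too old to understand it. In the sketch below the constant values and the names `openTruncate`, `versionOpenTruncate`, and `flagsForServer` are made up for illustration; they are not the real p9 definitions.

```go
package main

import "fmt"

// Hypothetical values for illustration only; not the real p9 constants.
const (
	openTruncate        uint32 = 1 << 9
	versionOpenTruncate uint32 = 9
)

// flagsForServer strips the truncate flag when the server's protocol
// version predates support for it, and passes the flags through otherwise.
func flagsForServer(flags, serverVersion uint32) uint32 {
	if serverVersion < versionOpenTruncate {
		return flags &^ openTruncate
	}
	return flags
}

func main() {
	flags := openTruncate | 1
	fmt.Printf("old server: %#x\n", flagsForServer(flags, 8))
	fmt.Printf("new server: %#x\n", flagsForServer(flags, 9))
}
```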
|
|
Aside from the performance hit, there is no guarantee that p9.ClientFile's
finalizer runs before the associated p9.Client is closed.
PiperOrigin-RevId: 280702509
|
|
This is required to implement O_TRUNC correctly on filesystems backed by
gofers.
9P2000.L: "lopen prepares fid for file I/O. flags contains Linux open(2) flags
bits, e.g. O_RDONLY, O_RDWR, O_WRONLY."
open(2): "The argument flags must include one of the following access modes:
O_RDONLY, O_WRONLY, or O_RDWR. ... In addition, zero or more file creation
flags and file status flags can be bitwise-or'd in flags."
The reference 9P2000.L implementation also appears to expect arbitrary flags,
not just access modes, in Tlopen.flags:
https://github.com/chaos/diod/blob/master/diod/ops.c#L703
PiperOrigin-RevId: 278972683
|
|
This gets quite spammy, especially in tests.
PiperOrigin-RevId: 277970468
|
|
Since syscall.Stat_t.Nlink is defined as different types on
amd64 and arm64 (uint64 and uint32 respectively), we need to cast
it to a unified uint64 type in gVisor code.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I7542b99b195c708f3fc49b1cbe6adebdd2f6e96b
|
|
Also ensure that all flipcall transport errors not returned by p9 (converted to
EIO by the client, or dropped on the floor by channel server goroutines) are
logged.
PiperOrigin-RevId: 272963663
|
|
PiperOrigin-RevId: 270789146
|
|
- Do not call Rread.SetPayload(flipcall packet window) in p9.channel.recv().
- Ignore EINTR from ppoll() in p9.Client.watch().
- Clean up handling of client socket FD lifetimes so that p9.Client.watch()
never ppoll()s a closed FD.
- Make p9test.Harness.Finish() call clientSocket.Shutdown() instead of
clientSocket.Close() for the same reason.
- Rework channel reuse to avoid leaking channels in the following case (suppose
we have two channels):
sendRecvChannel
len(channels) == 2 => idx = 1
inuse[1] = ch0
sendRecvChannel
len(channels) == 1 => idx = 0
inuse[0] = ch1
inuse[1] = nil
sendRecvChannel
len(channels) == 1 => idx = 0
inuse[0] = ch0
inuse[0] = nil
inuse[0] == nil => ch0 leaked
- Avoid deadlocking p9.Client.watch() by calling channelsWg.Wait() without
holding channelsMu.
- Bump p9test:client_test size to medium.
PiperOrigin-RevId: 270200314
|
|
PiperOrigin-RevId: 268845090
|
|
They are no-ops, so the standard rule works fine.
PiperOrigin-RevId: 268776264
|
|
PiperOrigin-RevId: 255713414
|
|
Addresses obvious typos, in the documentation only.
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
|
|
Currently, the path tracking in the gofer involves an O(n) lookup of
child fidRefs. This causes a significant overhead on unlinks in
directories with lots of child fidRefs (<4k).
In this transition, pathNode moves from sync.Map to normal synchronized
maps. There is a small chance of contention in walk, but the lock is
held for a very short time (and sync.Map also had a chance of requiring
locking).
OTOH, sync.Map makes it very difficult to add a fidRef reverse map.
PiperOrigin-RevId: 254489952
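The reverse map mentioned above can be sketched with two ordinary maps guarded by one mutex: a forward map from child name to references and a reverse map from reference to name, so that removing a reference no longer requires scanning every child. The field and method names below are illustrative assumptions rather than a copy of the gofer code.

```go
package main

import (
	"fmt"
	"sync"
)

// fidRef is a hypothetical stand-in for a reference to a child file.
type fidRef struct{ id int }

// pathNode tracks child references by name, plus a reverse map so a
// reference can be located without an O(n) search.
type pathNode struct {
	mu            sync.Mutex
	childRefs     map[string]map[*fidRef]struct{} // name -> set of refs
	childRefNames map[*fidRef]string              // ref -> name (reverse map)
}

func newPathNode() *pathNode {
	return &pathNode{
		childRefs:     make(map[string]map[*fidRef]struct{}),
		childRefNames: make(map[*fidRef]string),
	}
}

// addChild records ref under name in both maps.
func (p *pathNode) addChild(ref *fidRef, name string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.childRefs[name] == nil {
		p.childRefs[name] = make(map[*fidRef]struct{})
	}
	p.childRefs[name][ref] = struct{}{}
	p.childRefNames[ref] = name
}

// removeChild drops ref using the reverse map to find its name directly.
func (p *pathNode) removeChild(ref *fidRef) {
	p.mu.Lock()
	defer p.mu.Unlock()
	name, ok := p.childRefNames[ref]
	if !ok {
		return
	}
	delete(p.childRefNames, ref)
	delete(p.childRefs[name], ref)
	if len(p.childRefs[name]) == 0 {
		delete(p.childRefs, name)
	}
}

// nameFor reports the name under which ref is registered, if any.
func (p *pathNode) nameFor(ref *fidRef) (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	name, ok := p.childRefNames[ref]
	return name, ok
}

func main() {
	p := newPathNode()
	r := &fidRef{id: 1}
	p.addChild(r, "file")
	fmt.Println(p.nameFor(r))
	p.removeChild(r)
	fmt.Println(p.nameFor(r))
}
```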
|
|
Neither fidRefs nor children are (directly) synchronized by mu. Remove
the preconditions that say so.
That said, the surrounding code does enforce some synchronization guarantees
(e.g., fidRef.renameChildTo does not atomically replace the child in the
maps). I've tried to note the need for callers to do this
synchronization.
I've also renamed the maps to what are (IMO) clearer names. As is, it is
not obvious that pathNode.fidRefs is a map of *child* fidRefs rather
than self fidRefs.
PiperOrigin-RevId: 254446965
|
|
Otherwise future renames may miss Renamed calls.
PiperOrigin-RevId: 254060946
|