Age | Commit message (Collapse) | Author |
|
These were out-of-band notes that can help provide additional context
and simplify automated imports.
PiperOrigin-RevId: 293525915
|
|
Updates #1198
Opening host pipes (by spinning in fdpipe) and host sockets is not yet
complete, and will be done in a future CL.
Major differences from VFS1 gofer client (sentry/fs/gofer), with varying levels
of backportability:
- "Cache policies" are replaced by InteropMode, which control the behavior of
timestamps in addition to caching. Under InteropModeExclusive (analogous to
cacheAll) and InteropModeWritethrough (analogous to cacheAllWritethrough),
client timestamps are *not* written back to the server (it is not possible in
9P or Linux for clients to set ctime, so writing back client-authoritative
timestamps results in incoherence between atime/mtime and ctime). Under
InteropModeShared (analogous to cacheRemoteRevalidating), client timestamps
are not used at all (remote filesystem clocks are authoritative). cacheNone
is translated to InteropModeShared + new option
filesystemOptions.specialRegularFiles.
- Under InteropModeShared, "unstable attribute" reloading for permission
checks, lookup, and revalidation are fused, which is feasible in VFS2 since
gofer.filesystem controls path resolution. This results in a ~33% reduction
in RPCs for filesystem operations compared to cacheRemoteRevalidating. For
example, consider stat("/foo/bar/baz") where "/foo/bar/baz" fails
revalidation, resulting in the instantiation of a new dentry:
VFS1 RPCs:
getattr("/") // fs.MountNamespace.FindLink() => fs.Inode.CheckPermission() => gofer.inodeOperations.check() => gofer.inodeOperations.UnstableAttr()
walkgetattr("/", "foo") = fid1 // fs.Dirent.walk() => gofer.session.Revalidate() => gofer.cachePolicy.Revalidate()
clunk(fid1)
getattr("/foo") // CheckPermission
walkgetattr("/foo", "bar") = fid2 // Revalidate
clunk(fid2)
getattr("/foo/bar") // CheckPermission
walkgetattr("/foo/bar", "baz") = fid3 // Revalidate
clunk(fid3)
walkgetattr("/foo/bar", "baz") = fid4 // fs.Dirent.walk() => gofer.inodeOperations.Lookup
getattr("/foo/bar/baz") // linux.stat() => gofer.inodeOperations.UnstableAttr()
VFS2 RPCs:
getattr("/") // gofer.filesystem.walkExistingLocked()
walkgetattr("/", "foo") = fid1 // gofer.filesystem.stepExistingLocked()
clunk(fid1)
// No getattr: walkgetattr already updated metadata for permission check
walkgetattr("/foo", "bar") = fid2
clunk(fid2)
walkgetattr("/foo/bar", "baz") = fid3
// No clunk: fid3 used for new gofer.dentry
// No getattr: walkgetattr already updated metadata for stat()
- gofer.filesystem.unlinkAt() does not require instantiation of a dentry that
represents the file to be deleted. Updates #898.
- gofer.regularFileFD.OnClose() skips Tflushf for regular files under
InteropModeExclusive, as it's nonsensical to request a remote file flush
without flushing locally-buffered writes to that remote file first.
- Symlink targets are cached when InteropModeShared is not in effect.
- p9.QID.Path (which is already required to be unique for each file within a
server, and is accordingly already synthesized from device/inode numbers in
all known gofers) is used as-is for inode numbers, rather than being mapped
along with attr.RDev in the client to yet another synthetic inode number.
- Relevant parts of fsutil.CachingInodeOperations are inlined directly into
gofer package code. This avoids having to duplicate part of its functionality
in fsutil.HostMappable.
PiperOrigin-RevId: 293190213
|
|
Internal pipes are supported similarly to how internal UDS is done.
It is also controlled by the same flag.
Fixes #1102
PiperOrigin-RevId: 293150045
|
|
PiperOrigin-RevId: 292587459
|
|
Splice must not allow negative offsets. Writes also must not allow offset +
size to overflow int64. Reads are similarly broken, but not just in splice
(b/148095030).
Reported-by: syzbot+0e1ff0b95fb2859b4190@syzkaller.appspotmail.com
PiperOrigin-RevId: 292361208
|
|
Special files can have additional requirements for granularity.
For example, read from eventfd returns EINVAL if a size is less 8 bytes.
Reported-by: syzbot+3905f5493bec08eb7b02@syzkaller.appspotmail.com
PiperOrigin-RevId: 292002926
|
|
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.
PiperOrigin-RevId: 291811289
|
|
PiperOrigin-RevId: 291745021
|
|
Also renames TMutex to Mutex.
These custom mutexes aren't any worse than the standard library versions (same
code), so having both seems redundant.
PiperOrigin-RevId: 290873587
|
|
Some files were missing the last line break.
PiperOrigin-RevId: 290808898
|
|
Java 11 parses /proc/self/mountinfo for cgroup information. Java 11.0.4 uses
the mount path to determine what cgroups existed, but Java 11.0.5 reads the
cgroup names from the superblock options.
This CL adds the cgroup name to the superblock options if the filesystem type
is "cgroup". Since gVisor doesn't actually support cgroups yet, we just infer
the cgroup name from the path.
PiperOrigin-RevId: 290434323
|
|
We must hold fs.renameMu to access Dirent.parent.
PiperOrigin-RevId: 290340804
|
|
We were setting queue.readable without holding the lock.
PiperOrigin-RevId: 290306922
|
|
PiperOrigin-RevId: 290272560
|
|
PiperOrigin-RevId: 290198756
|
|
PiperOrigin-RevId: 290186303
|
|
There is a lot of code duplication for VFSv2 and this
serves as remind to keep the copies in sync.
Updates #1195
PiperOrigin-RevId: 290139234
|
|
There was a very bare get/setxattr in the InodeOperations interface. Add
context.Context to both, size to getxattr, and flags to setxattr.
Note that extended attributes are passed around as strings in this
implementation, so size is automatically encoded into the value. Size is
added in getxattr so that implementations can return ERANGE if a value is larger
than can fit in the user-allocated buffer. This prevents us from unnecessarily
passing around an arbitrarily large xattr when the user buffer is actually too
small.
Don't use the existing xattrwalk and xattrcreate messages and define our
own, mainly for the sake of simplicity.
Extended attributes will be implemented in future commits.
PiperOrigin-RevId: 290121300
|
|
Except for one under /proc/sys/net/ipv4/tcp_sack.
/proc/pid/* is still incomplete.
Updates #1195
PiperOrigin-RevId: 290120438
|
|
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.
This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.
Updates #1472
PiperOrigin-RevId: 289033387
|
|
PiperOrigin-RevId: 288642552
|
|
- Renamed memfs to tmpfs.
- Copied fileRangeSet bits from fs/fsutil/ to fsimpl/tmpfs/
- Changed tmpfs to be backed by filemem instead of byte slice.
- regularFileReadWriter uses a sync.Pool, similar to gofer client.
PiperOrigin-RevId: 288356380
|
|
Otherwise a copy happens, which triggers a data race when reading
masterInodeOperations.SimpleFileOperations.uattr, which must be accessed with a
lock held.
PiperOrigin-RevId: 286464473
|
|
PiperOrigin-RevId: 286248378
|
|
PiperOrigin-RevId: 286051631
|
|
PiperOrigin-RevId: 285874181
|
|
Add checks for input arguments, file type, permissions, etc. that match
the Linux implementation. A call to get/setxattr that passes all the
checks will still currently return EOPNOTSUPP. Actual support will be
added in following commits.
Only allow user.* extended attributes for the time being.
PiperOrigin-RevId: 285835159
|
|
Copy up parent when binding UDS on overlayfs is supported in commit
02ab1f187cd24c67b754b004229421d189cee264.
But the using of copyUp in overlayBind will cause sentry stuck, reason
is dead lock in renameMu.
1 [Process A] Invoke a Unix socket bind operation
renameMu is hold in fs.(*Dirent).genericCreate by process A
2 [Process B] Invoke a read syscall on /proc/task/mounts
waitng on Lock of renameMu in fs.(*MountNamespace).FindMount
3 [Process A] Continue Unix socket bind operation
wating on RLock of renameMu in fs.copyUp
Root cause is recursive reading lock of reanmeMu in bind call trace,
if there are writing lock between the two reading lock, then deadlock
occured.
Fixes #1397
|
|
After the finalizer optimize in 76039f895995c3fe0deef5958f843868685ecc38
commit, clientFile needs to closed before finalizer release it.
The clientFile is not closed if it is created via
gofer.(*inodeOperations).Bind, this will cause fd leak which is hold
by gofer process.
Fixes #1396
Signed-off-by: Yong He <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
|
|
PiperOrigin-RevId: 284606233
|
|
Threadgroups already know their TTY (if they have one), which now contains the
TTY Index, and is returned in the Processes() call.
PiperOrigin-RevId: 284263850
|
|
PiperOrigin-RevId: 284038840
|
|
PiperOrigin-RevId: 283630669
|
|
This allows writable proc and devices files to be opened with O_CREAT|O_TRUNC.
This is encountered most frequently when interacting with proc or devices files
via the command line.
e.g. $ echo 8192 1048576 4194304 > /proc/sys/net/ipv4/tcp_rmem
Also adds a test to test the behavior of open(O_TRUNC), truncate, and ftruncate
on named pipes.
Fixes #1116
PiperOrigin-RevId: 282677425
|
|
PiperOrigin-RevId: 282382564
|
|
PiperOrigin-RevId: 281795269
|
|
Note that the Sentry still calls Truncate() on the file before calling Open.
A new p9 version check was added to ensure that the p9 server can handle the
the OpenTruncate flag. If not, then the flag is stripped before sending.
PiperOrigin-RevId: 281609112
|
|
PiperOrigin-RevId: 281112758
|
|
It was possible to panic the sentry by opening a cache revalidating folder with
O_TRUNC|O_CREAT.
Avoids breaking php tests.
PiperOrigin-RevId: 280533213
|
|
PiperOrigin-RevId: 280131840
|
|
newfstatat() syscall is not supported on arm64, so we resort
to use the fstatat() syscall.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: Iea95550ea53bcf85c01f7b3b95da70ad0952177d
|
|
This patch also include a minor change to replace syscall.Dup2
with syscall.Dup3 which was missed in a previous commit(ref a25a976).
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I00beb9cc492e44c762ebaa3750201c63c1f7c2f3
|
|
PiperOrigin-RevId: 279365629
|
|
It was possible to panic the sentry by opening a cache revalidating folder with
O_TRUNC|O_CREAT.
PiperOrigin-RevId: 278417533
|
|
PiperOrigin-RevId: 276380008
|
|
PiperOrigin-RevId: 275139066
|
|
Linux kernel before 4.19 doesn't implement a feature that updates
open FD after a file is open for write (and is copied to the upper
layer). Already open FD will continue to read the old file content
until they are reopened. This is especially problematic for gVisor
because it caches open files.
Flag was added to force readonly files to be reopenned when the
same file is open for write. This is only needed if using kernels
prior to 4.19.
Closes #1006
It's difficult to really test this because we never run on tests
on older kernels. I'm adding a test in GKE which uses kernels
with the overlayfs problem for 1.14 and lower.
PiperOrigin-RevId: 275115289
|
|
When any of these flags are set, all writes will trigger a subsequent fsync
call. This behavior already existed for "write-through" mounts.
O_DIRECT is treated as an alias for O_SYNC. Better support coming soon.
PiperOrigin-RevId: 275114392
|
|
PiperOrigin-RevId: 275114157
|
|
This proc file reports routing information to applications inside the
container.
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Change-Id: I498e47f8c4c185419befbb42d849d0b099ec71f3
|