Age | Commit message (Collapse) | Author |
|
Currently, the overlay dirCache is only used for a single logical use of
getdents. i.e., it is discard when the FD is closed or seeked back to
the beginning.
But the initial work of getting the directory contents can be quite
expensive (particularly sorting large directories), so we should keep it
as long as possible.
This is very similar to the readdirCache in fs/gofer.
Since the upper filesystem does not have to allow caching readdir
entries, the new CacheReaddir MountSourceOperations method controls this
behavior.
This caching should be trivially movable to all Inodes if desired,
though that adds an additional copy step for non-overlay Inodes.
(Overlay Inodes already do the extra copy).
PiperOrigin-RevId: 255477592
|
|
All functions which allocate objects containing AtomicRefCounts will soon need
a context.
PiperOrigin-RevId: 253147709
|
|
This can be merged after:
https://github.com/google/gvisor-website/pull/77
or
https://github.com/google/gvisor-website/pull/78
PiperOrigin-RevId: 253132620
|
|
SockType isn't specific to unix domain sockets, and the current
definition basically mirrors the linux ABI's definition.
PiperOrigin-RevId: 251956740
|
|
and don't report a sender address if it doesn't have one
PiperOrigin-RevId: 251371284
|
|
This does not actually implement an efficient splice or sendfile. Rather, it
adds a generic plumbing to the file internals so that this can be added. All
file implementations use the stub fileutil.NoSplice implementation, which
causes sendfile and splice to fall back to an internal copy.
A basic splice system call interface is added, along with a test.
PiperOrigin-RevId: 249335960
Change-Id: Ic5568be2af0a505c19e7aec66d5af2480ab0939b
|
|
The backing 9p server must allow named pipe creation, which the runsc
fsgofer currently does not.
There are small changes to the overlay here. GetFile may block when
opening a named pipe, which can cause a deadlock:
1. open(O_RDONLY) -> copyMu.Lock() -> GetFile()
2. open(O_WRONLY) -> copyMu.Lock() -> Deadlock
A named pipe usable for writing must already be on the upper filesystem,
but we are still taking copyMu for write when checking for upper. That
can be changed to a read lock to fix the common case.
However, a named pipe on the lower filesystem would still deadlock in
open(O_WRONLY) when it tries to actually perform copy up (which would
simply return EINVAL). Move the copy up type check before taking copyMu
for write to avoid this.
p9 must be modified, as it was incorrectly removing the file mode when
sending messages on the wire.
PiperOrigin-RevId: 249154033
Change-Id: Id6637130e567b03758130eb6c7cdbc976384b7d6
|
|
* Creation of files, directories (and other fs objects) in a directory
should always update ctime.
* Same for removal.
* atime should not be updated on lookup, only readdir.
I've also renamed some misleading functions that update mtime and ctime.
PiperOrigin-RevId: 249115063
Change-Id: I30fa275fa7db96d01aa759ed64628c18bb3a7dc7
|
|
This more directly matches what Linux does with unsupported
nodes.
PiperOrigin-RevId: 248780425
Change-Id: I17f3dd0b244f6dc4eb00e2e42344851b8367fbec
|
|
There is a lot of redundancy that we can simplify in the stat_times
test. This will make it easier to add new tests. However, the
simplification reveals that cached uattrs on goferfs don't properly
update ctime on rename.
PiperOrigin-RevId: 248773425
Change-Id: I52662728e1e9920981555881f9a85f9ce04041cf
|
|
PiperOrigin-RevId: 248437159
Change-Id: Ife71f6ca032fca59ec97a82961000ed0af257101
|
|
PiperOrigin-RevId: 248249285
Change-Id: I9b6d267baa666798b22def590ff20c9a118efd47
|
|
Closes #225
PiperOrigin-RevId: 247508791
Change-Id: I04f47cf2770b30043e5a272aba4ba6e11d0476cc
|
|
Most are single line methods in hot paths.
PiperOrigin-RevId: 247050267
Change-Id: I428d78723fe00b57483185899dc8fa9e1f01e2ea
|
|
Fixes #219
PiperOrigin-RevId: 246568639
Change-Id: Ic7afd15dde922638d77f6429c508d1cbe2e4288a
|
|
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.
1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.
Fixes #209
PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
|
|
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
|
|
Maximum filename length is filesystem-dependent, and obtained via
statfs::f_namelen. This limit is usually 255 bytes (NAME_MAX), but not
always. For example, VFAT supports filenames of up to 255... UCS-2
characters, which Linux conservatively takes to mean UTF-8-encoded
bytes: fs/fat/inode.c:fat_statfs(), FAT_LFN_LEN * NLS_MAX_CHARSET_SIZE.
As a result, Linux's VFS does not enforce NAME_MAX:
$ rg --maxdepth=1 '\WNAME_MAX\W' fs/ include/linux/
fs/libfs.c
38: buf->f_namelen = NAME_MAX;
64: if (dentry->d_name.len > NAME_MAX)
include/linux/relay.h
74: char base_filename[NAME_MAX]; /* saved base filename */
include/linux/fscrypt.h
149: * filenames up to NAME_MAX bytes, since base64 encoding expands the length.
include/linux/exportfs.h
176: * understanding that it is already pointing to a a %NAME_MAX+1 sized
Remove this check from core VFS, and add it to ramfs (and by extension
tmpfs), where it is actually applicable:
mm/shmem.c:shmem_dir_inode_operations.lookup == simple_lookup *does*
enforce NAME_MAX.
PiperOrigin-RevId: 245324748
Change-Id: I17567c4324bfd60e31746a5270096e75db963fac
|
|
FD limit and file size limit is read from the host, instead
of using hard-coded defaults, given that they effect the sandbox
process. Also limit the direct cache to use no more than half
if the available FDs.
PiperOrigin-RevId: 244050323
Change-Id: I787ad0fdf07c49d589e51aebfeae477324fe26e6
|
|
PiperOrigin-RevId: 243018347
Change-Id: I1e5b80607c1df0747482abea61db7fcf24536d37
|
|
RootFromContext can return a dirent with reference taken, or nil. We must call
DecRef if (and only if) a real dirent is returned.
PiperOrigin-RevId: 242965515
Change-Id: Ie2b7b4cb19ee09b6ccf788b71f3fd7efcdf35a11
|
|
- Document fsutil.CachedFileObject.FD() requirements on access
permissions, and change gofer.inodeFileState.FD() to honor them.
Fixes #147.
- Combine gofer.inodeFileState.readonly and
gofer.inodeFileState.readthrough, and simplify handle caching logic.
- Inline gofer.cachePolicy.cacheHandles into
gofer.inodeFileState.setSharedHandles, because users with access to
gofer.inodeFileState don't necessarily have access to the fs.Inode
(predictably, this is a save/restore problem).
Before this CL:
$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@34d51017ed67:/# /root/repro/runsc-b147
mmap: 0x7f3c01e45000
Segmentation fault
After this CL:
$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@d3c3cb56bbf9:/# /root/repro/runsc-b147
mmap: 0x7f78987ec000
o
PiperOrigin-RevId: 240818413
Change-Id: I49e1d4a81a0cb9177832b0a9f31a10da722a896b
|
|
In the case of a rename replacing an existing destination inode, ramfs
Rename failed to first remove the replaced inode. This caused:
1. A leak of a reference to the inode (making it live indefinitely).
2. For directories, a leak of the replaced directory's .. link to the
parent. This would cause the parent's link count to incorrectly
increase.
(2) is much simpler to test than (1), so that's what I've done.
agentfs has a similar bug with link count only, so the Dirent layer
informs the Inode if this is a replacing rename.
Fixes #133
PiperOrigin-RevId: 239105698
Change-Id: I4450af2462d8ae3339def812287213d2cbeebde0
|
|
p9.Twalk.handle() with a non-empty path also stats the walked-to path
anyway, so the preceding GetAttr is completely wasted.
PiperOrigin-RevId: 238440645
Change-Id: I7fbc7536f46b8157639d0d1f491e6aaa9ab688a3
|
|
PiperOrigin-RevId: 238360231
Change-Id: I5eaf8d26f8892f77d71c7fbd6c5225ef471cedf1
|
|
PiperOrigin-RevId: 235053594
Change-Id: Ie3d7b11843d0710184a2463886c7034e8f5305d1
|
|
PiperOrigin-RevId: 233802562
Change-Id: I40e1b13fd571daaf241b00f8df4bcedd034dc3f1
|
|
fs/gofer/inodeOperations.Release does some asynchronous work. Previously it
was calling fs.Async with an anonymous function, which caused the function to
be allocated on the heap. Because Release is relatively hot, this results in a
lot of small allocations and increased GC pressure, noticeable in perf profiles.
This CL adds a new function, AsyncWithContext, which is just like Async, but
passes a context to the async function. It avoids the need for an extra
anonymous function in fs/gofer/inodeOperations.Release. The Async function
itself still requires a single anonymous function.
PiperOrigin-RevId: 233141763
Change-Id: I1dce4a883a7be9a8a5b884db01e654655f16d19c
|
|
Dirty should be set only when the attribute is changed in the cache
only. Instances where the change was also sent to the backing file
doesn't need to dirty the attribute.
Also remove size update during WriteOut as writing dirty page would
naturaly grow the file if needed.
RELNOTES: relnotes is needed for the parent CL.
PiperOrigin-RevId: 232068978
Change-Id: I00ba54693a2c7adc06efa9e030faf8f2e8e7f188
|
|
This changed required making fsutil.HostMappable use
a backing file to ensure the correct FD would be used
for read/write operations.
RELNOTES: relnotes is needed for the parent CL.
PiperOrigin-RevId: 231836164
Change-Id: I8ae9639715529874ea7d80a65e2c711a5b4ce254
|
|
Nothing reads them and they can simply get stale.
Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD
PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
|
|
When file size changes outside the sandbox, page cache was not
refreshing file size which is required for cacheRemoteRevalidating.
In fact, cacheRemoteRevalidating should be skipping the cache
completely since it's not really benefiting from it. The cache is
cache is already bypassed for unstable attributes (see
cachePolicy.cacheUAttrs). And althought the cache is called to
map pages, they will always miss the cache and map directly from
the host.
Created a HostMappable struct that maps directly to the host and
use it for files with cacheRemoteRevalidating.
Closes #124
PiperOrigin-RevId: 230998440
Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc
|
|
More helper structs have been added to the fsutil package to make it easier to
implement fs.InodeOperations and fs.FileOperations.
PiperOrigin-RevId: 229305982
Change-Id: Ib6f8d3862f4216745116857913dbfa351530223b
|
|
We must wait for all lazy resources to be released before closing the rootFile.
PiperOrigin-RevId: 226419499
Change-Id: I1d4d961a92b3816e02690cf3eaf0a88944d730cc
|
|
Replaces the WaitGroup with a RWMutex. Calls to Async hold the mutex for
reading, while AsyncBarrier takes the lock for writing. This ensures that all
executing Async work finishes before AsyncBarrier returns.
Also pushes the Async() call from Inode.Release into
gofer/InodeOperations.Release(). This removes a recursive Async call which
should not have been allowed in the first place. The gofer Release call is the
slow one (since it may make RPCs to the gofer), so putting the Async call there
makes sense.
PiperOrigin-RevId: 223093067
Change-Id: I116da7b20fce5ebab8d99c2ab0f27db7c89d890e
|
|
Previously this code used the tcpip error space. Since it is no longer part of
netstack, it can use the sentry's error space (except for a few cases where
there is still some shared code. This reduces the number of error space
conversions required for hot Unix socket operations.
PiperOrigin-RevId: 218541611
Change-Id: I3d13047006a8245b5dfda73364d37b8a453784bb
|
|
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.
PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
|
|
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
|
|
PiperOrigin-RevId: 217557656
Change-Id: I63d27635b1a6c12877279995d2d9847b6a19da9b
|
|
host.endpoint already has the check, but it is missing from
host.ConnectedEndpoint.
PiperOrigin-RevId: 214962762
Change-Id: I88bb13a5c5871775e4e7bf2608433df8a3d348e6
|
|
When revalidating a Dirent, if the inode id is the same, then we don't need to
throw away the entire Dirent. We can just update the unstable attributes in
place.
If the inode id has changed, then the remote file has been deleted or moved,
and we have no choice but to throw away the dirent we have a look up another.
In this case, we may still end up losing a mounted dirent that is a child of
the revalidated dirent. However, that seems appropriate here because the entire
mount point has been pulled out from underneath us.
Because gVisor's overlay is at the Inode level rather than the Dirent level, we
must pass the parent Inode and name along with the Inode that is being
revalidated.
PiperOrigin-RevId: 210431270
Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42
|
|
PiperOrigin-RevId: 209817767
Change-Id: Iddf2b8441bc44f31f9a8cf6f2bd8e7a5b824b487
|
|
When an inode file state failed to load asynchronuously, we want to report
the error instead of potentially panicing in another async loading goroutine
incorrectly unblocked.
PiperOrigin-RevId: 209683977
Change-Id: I591cde97710bbe3cdc53717ee58f1d28bbda9261
|
|
Previously, gofer filesystems were configured with the default "fscache"
policy, which caches filesystem metadata and contents aggressively. While this
setting is best for performance, it means that changes from inside the sandbox
may not be immediately propagated outside the sandbox, and vice-versa.
This CL changes volumes and the root fs configuration to use a new
"remote-revalidate" cache policy which tries to retain as much caching as
possible while still making fs changes visible across the sandbox boundary.
This cache policy is enabled by default for the root filesystem. The default
value for the "--file-access" flag is still "proxy", but the behavior is
changed to use the new cache policy.
A new value for the "--file-access" flag is added, called "proxy-exclusive",
which turns on the previous aggressive caching behavior. As the name implies,
this flag should be used when the sandbox has "exclusive" access to the
filesystem.
All volume mounts are configured to use the new cache policy, since it is
safest and most likely to be correct. There is not currently a way to change
this behavior, but it's possible to add such a mechanism in the future. The
configurability is a smaller issue for volumes, since most of the expensive
application fs operations (walking + stating files) will likely served by the
root fs.
PiperOrigin-RevId: 208735037
Change-Id: Ife048fab1948205f6665df8563434dbc6ca8cfc9
|
|
Previously, an overlay would panic if either the upper or lower fs required
revalidation for a given Dirent. Now, we allow revalidation from the upper
file, but not the lower.
If a cached overlay inode does need revalidation (because the upper needs
revalidation), then the entire overlay Inode will be discarded and a new
overlay Inode will be built with a fresh copy of the upper file.
As a side effect of this change, Revalidate must take an Inode instead of a
Dirent, since an overlay needs to revalidate individual Inodes.
PiperOrigin-RevId: 208293638
Change-Id: Ic8f8d1ffdc09114721745661a09522b54420c5f1
|
|
The cache policy determines whether Lookup should return a negative dirent, or
just ENOENT. This CL fixes one spot where we returned a negative dirent without
first consulting the policy.
PiperOrigin-RevId: 208280230
Change-Id: I8f963bbdb45a95a74ad0ecc1eef47eff2092d3a4
|
|
Previously, processes which used file-system Unix Domain Sockets could not be
checkpoint-ed in runsc because the sockets were saved with their inode
numbers which do not necessarily remain the same upon restore. Now,
the sockets are also saved with their paths so that the new inodes
can be determined for the sockets based on these paths after restoring.
Tests for cases with UDS use are included. Test cleanup to come.
PiperOrigin-RevId: 208268781
Change-Id: Ieaa5d5d9a64914ca105cae199fd8492710b1d7ec
|
|
This CL adds a new cache-policy for gofer filesystems that uses the host page
cache, but causes dirents to be reloaded on each Walk, and does not cache
readdir results.
This policy is useful when the remote filesystem may change out from underneath
us, as any remote changes will be reflected on the next Walk.
Importantly, this cache policy is only consistent if we do not use gVisor's
internal page cache, since that page cache is tied to the Inode and may be
thrown away upon Revalidation.
This cache policy should only be used when the gofer supports donating host
FDs, since then gVisor will make use of the host kernel page cache, which will
be consistent for all open files in the gofer. In fact, a panic will be raised
if a file is opened without a donated FD.
PiperOrigin-RevId: 207752937
Change-Id: I233cb78b4695bbe00a4605ae64080a47629329b8
|
|
PiperOrigin-RevId: 207125440
Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
|
|
PiperOrigin-RevId: 207037226
Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
|