Age | Commit message (Collapse) | Author |
|
Update/remove most syserror errors to linuxerr equivalents. For list
of removed errors, see //pkg/syserror/syserror.go.
PiperOrigin-RevId: 382574582
|
|
Update all instances of the above errors to the faster linuxerr implementation.
With the temporary linuxerr.Equals(), no logical changes are made.
PiperOrigin-RevId: 382306655
|
|
Add Equals method to compare syserror and unix.Errno errors to linuxerr errors.
This will facilitate removal of syserror definitions in a followup, and
finding needed conversions from unix.Errno to linuxerr.
PiperOrigin-RevId: 380909667
|
|
It's in VFS1 code, so we probably will not do it.
PiperOrigin-RevId: 378474174
|
|
PiperOrigin-RevId: 375780659
|
|
This metric is replaced by /cloud/gvisor/sandbox/sentry/suspicious_operations
metric with field value opened_write_execute_file.
PiperOrigin-RevId: 374509823
|
|
The new metric contains fields and will replace the below existing metric:
- opened_write_execute_file
PiperOrigin-RevId: 373884604
|
|
Split usermem package to help remove syserror dependency in go_marshal.
New hostarch package contains code not dependent on syserror.
PiperOrigin-RevId: 365651233
|
|
PiperOrigin-RevId: 364370595
|
|
This validates that struct fields if annotated with "// checklocks:mu" where
"mu" is a mutex field in the same struct then access to the field is only
done with "mu" locked.
All types that are guarded by a mutex must be annotated with
// +checklocks:<mutex field name>
For more details please refer to README.md.
PiperOrigin-RevId: 360729328
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink related functionalities
are not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
returns it and not unix.Stat_t.
Updates #214
PiperOrigin-RevId: 360701387
|
|
This improves type-assertion safety.
PiperOrigin-RevId: 353931228
|
|
open() has to return ENXIO in this case.
O_PATH isn't supported by vfs1.
PiperOrigin-RevId: 348820478
|
|
PiperOrigin-RevId: 347047550
|
|
We would like to track locks ordering to detect ordering violations. Detecting
violations is much simpler if mutexes must be unlocked by the same goroutine
that locked them.
Thus, as a first step to tracking lock ordering, add this lock/unlock
requirement to gVisor's sync.Mutex. This is more strict than the Go standard
library's sync.Mutex, but initial testing indicates only a single lock that is
used across goroutines. The new sync.CrossGoroutineMutex relaxes the
requirement (but will not provide lock order checking).
Due to the additional overhead, enforcement is only enabled with the
"checklocks" build tag. Build with this tag using:
bazel build --define=gotags=checklocks ...
From my spot-checking, this has no changed inlining properties when disabled.
Updates #4804
PiperOrigin-RevId: 343370200
|
|
PiperOrigin-RevId: 343196927
|
|
The default pipe size already matched linux, and is unchanged.
Furthermore `atomicIOBytes` is made a proper constant (as it is in Linux). We
were plumbing usermem.PageSize everywhere, so this is no functional change.
PiperOrigin-RevId: 340497006
|
|
context is passed to DecRef() and Release() which is
needed for SO_LINGER implementation.
PiperOrigin-RevId: 324672584
|
|
When the file closes, it attempts to write dirty cached
attributes to the file. This should not be done when the
mount is readonly.
PiperOrigin-RevId: 315585058
|
|
Linux 4.18 and later make reads and writes coherent between pre-copy-up and
post-copy-up FDs representing the same file on an overlay filesystem. However,
memory mappings remain incoherent:
- Documentation/filesystems/overlayfs.rst, "Non-standard behavior": "If a file
residing on a lower layer is opened for read-only and then memory mapped with
MAP_SHARED, then subsequent changes to the file are not reflected in the
memory mapping."
- fs/overlay/file.c:ovl_mmap() passes through to the underlying FD without any
management of coherence in the overlay.
- Experimentally on Linux 5.2:
```
$ cat mmap_cat_page.c
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
int main(int argc, char **argv) {
if (argc < 2) {
errx(1, "syntax: %s [FILE]", argv[0]);
}
const int fd = open(argv[1], O_RDONLY);
if (fd < 0) {
err(1, "open(%s)", argv[1]);
}
const size_t page_size = sysconf(_SC_PAGE_SIZE);
void* page = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
if (page == MAP_FAILED) {
err(1, "mmap");
}
for (;;) {
write(1, page, strnlen(page, page_size));
if (getc(stdin) == EOF) {
break;
}
}
return 0;
}
$ gcc -O2 -o mmap_cat_page mmap_cat_page.c
$ mkdir lowerdir upperdir workdir overlaydir
$ echo old > lowerdir/file
$ sudo mount -t overlay -o "lowerdir=lowerdir,upperdir=upperdir,workdir=workdir" none overlaydir
$ ./mmap_cat_page overlaydir/file
old
^Z
[1]+ Stopped ./mmap_cat_page overlaydir/file
$ echo new > overlaydir/file
$ cat overlaydir/file
new
$ fg
./mmap_cat_page overlaydir/file
old
```
Therefore, while the VFS1 gofer client's behavior of reopening read FDs is only
necessary pre-4.18, replacing existing memory mappings (in both sentry and
application address spaces) with mappings of the new FD is required regardless
of kernel version, and this latter behavior is common to both VFS1 and VFS2.
Re-document accordingly, and change the runsc flag to enabled by default.
New test:
- Before this CL: https://source.cloud.google.com/results/invocations/5b222d2c-e918-4bae-afc4-407f5bac509b
- After this CL: https://source.cloud.google.com/results/invocations/f28c747e-d89c-4d8c-a461-602b33e71aab
PiperOrigin-RevId: 311361267
|
|
Fixes #2651.
PiperOrigin-RevId: 311193661
|
|
Synthetic sockets do not have the race condition issue in VFS2, and we will
get rid of privateunixsocket as well.
Fixes #1200.
PiperOrigin-RevId: 310386474
|
|
p9.NoUID/GID (== uint32(-1) == auth.NoID) is not a valid auth.KUID/KGID; in
particular, using it for file ownership causes capabilities to be ineffective
since file capabilities require that the file's KUID and KGID are mapped into
the capability holder's user namespace [1], and auth.NoID is not mapped into
any user namespace. Map p9.NoUID/GID to a different, valid KUID/KGID; in the
unlikely case that an application actually using the overflow KUID/KGID
attempts an operation that is consequently permitted by client permission
checks, the remote operation will still fail with EPERM.
Since this changes the VFS2 gofer client to no longer ignore the invalid IDs
entirely, this CL both permits and requires that we change synthetic mount point
creation to use root credentials.
[1] See fs.Inode.CheckCapability or vfs.GenericCheckPermissions.
PiperOrigin-RevId: 309856455
|
|
Named pipes and sockets can be represented in two ways in gofer fs:
1. As a file on the remote filesystem. In this case, all file operations are
passed through 9p.
2. As a synthetic file that is internal to the sandbox. In this case, the
dentry stores an endpoint or VFSPipe for sockets and pipes respectively,
which replaces interactions with the remote fs through the gofer.
In gofer.filesystem.MknodAt, we attempt to call mknod(2) through 9p,
and if it fails, fall back to the synthetic version.
Updates #1200.
PiperOrigin-RevId: 308828161
|
|
Sentry metrics with nanoseconds units are labeled as such, and non-cumulative
sentry metrics are supported.
PiperOrigin-RevId: 307621080
|
|
The comments in the ticket indicate that this behavior
is fine and that the ticket should be closed, so we shouldn't
need pointers to the ticket.
PiperOrigin-RevId: 306266071
|
|
The sentry doesn't allow execve, but it's a good defense
in-depth measure.
PiperOrigin-RevId: 305958737
|
|
PiperOrigin-RevId: 305588941
|
|
Determine system time from within the sentry rather than relying on the remote
filesystem to prevent inconsistencies.
Resolve related TODOs; the time discrepancies in question don't exist anymore.
PiperOrigin-RevId: 305557099
|
|
PiperOrigin-RevId: 294295852
|
|
Note that these are only implemented for tmpfs, and other impls will still
return EOPNOTSUPP.
PiperOrigin-RevId: 293899385
|
|
Internal pipes are supported similarly to how internal UDS is done.
It is also controlled by the same flag.
Fixes #1102
PiperOrigin-RevId: 293150045
|
|
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.
PiperOrigin-RevId: 291811289
|
|
PiperOrigin-RevId: 291745021
|
|
There was a very bare get/setxattr in the InodeOperations interface. Add
context.Context to both, size to getxattr, and flags to setxattr.
Note that extended attributes are passed around as strings in this
implementation, so size is automatically encoded into the value. Size is
added in getxattr so that implementations can return ERANGE if a value is larger
than can fit in the user-allocated buffer. This prevents us from unnecessarily
passing around an arbitrarily large xattr when the user buffer is actually too
small.
Don't use the existing xattrwalk and xattrcreate messages and define our
own, mainly for the sake of simplicity.
Extended attributes will be implemented in future commits.
PiperOrigin-RevId: 290121300
|
|
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.
This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.
Updates #1472
PiperOrigin-RevId: 289033387
|
|
After the finalizer optimize in 76039f895995c3fe0deef5958f843868685ecc38
commit, clientFile needs to closed before finalizer release it.
The clientFile is not closed if it is created via
gofer.(*inodeOperations).Bind, this will cause fd leak which is hold
by gofer process.
Fixes #1396
Signed-off-by: Yong He <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
|
|
PiperOrigin-RevId: 284606233
|
|
PiperOrigin-RevId: 282382564
|
|
Note that the Sentry still calls Truncate() on the file before calling Open.
A new p9 version check was added to ensure that the p9 server can handle the
the OpenTruncate flag. If not, then the flag is stripped before sending.
PiperOrigin-RevId: 281609112
|
|
This patch also include a minor change to replace syscall.Dup2
with syscall.Dup3 which was missed in a previous commit(ref a25a976).
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I00beb9cc492e44c762ebaa3750201c63c1f7c2f3
|
|
PiperOrigin-RevId: 275139066
|
|
Linux kernel before 4.19 doesn't implement a feature that updates
open FD after a file is open for write (and is copied to the upper
layer). Already open FD will continue to read the old file content
until they are reopened. This is especially problematic for gVisor
because it caches open files.
Flag was added to force readonly files to be reopenned when the
same file is open for write. This is only needed if using kernels
prior to 4.19.
Closes #1006
It's difficult to really test this because we never run on tests
on older kernels. I'm adding a test in GKE which uses kernels
with the overlayfs problem for 1.14 and lower.
PiperOrigin-RevId: 275115289
|
|
When any of these flags are set, all writes will trigger a subsequent fsync
call. This behavior already existed for "write-through" mounts.
O_DIRECT is treated as an alias for O_SYNC. Better support coming soon.
PiperOrigin-RevId: 275114392
|
|
The gofer's CachingInodeOperations implementation contains an optimization for
the common open-read-close pattern when we have a host FD. In this case, the
host kernel will update the timestamp for us to a reasonably close time, so we
don't need an extra RPC to the gofer.
However, when the app explicitly sets the timestamps (via futimes or similar)
then we actually DO need to update the timestamps, because the host kernel
won't do it for us.
To fix this, a new boolean `forceSetTimestamps` was added to
CachineInodeOperations.SetMaskedAttributes. It is only set by
gofer.InodeOperations.SetTimestamps.
PiperOrigin-RevId: 272048146
|
|
They are no-ops, so the standard rule works fine.
PiperOrigin-RevId: 268776264
|
|
PiperOrigin-RevId: 266177409
|
|
PiperOrigin-RevId: 255711454
|
|
Addresses obvious typos, in the documentation only.
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
|
|
Currently, the overlay dirCache is only used for a single logical use of
getdents. i.e., it is discard when the FD is closed or seeked back to
the beginning.
But the initial work of getting the directory contents can be quite
expensive (particularly sorting large directories), so we should keep it
as long as possible.
This is very similar to the readdirCache in fs/gofer.
Since the upper filesystem does not have to allow caching readdir
entries, the new CacheReaddir MountSourceOperations method controls this
behavior.
This caching should be trivially movable to all Inodes if desired,
though that adds an additional copy step for non-overlay Inodes.
(Overlay Inodes already do the extra copy).
PiperOrigin-RevId: 255477592
|