Age | Commit message (Collapse) | Author |
|
For comparison:
```
$ docker run --rm -it ubuntu:focal bash -c 'cat /proc/self/status'
Name: cat
Umask: 0022
State: R (running)
Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 64
Groups:
NStgid: 1
NSpid: 1
NSpgid: 1
NSsid: 1
VmPeak: 2660 kB
VmSize: 2660 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 528 kB
VmRSS: 528 kB
...
$ docker run --runtime=runsc-vfs2 --rm -it ubuntu:focal bash -c 'cat /proc/self/status'
Name: cat
State: R (running)
Tgid: 1
Pid: 1
PPid: 0
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 4
Groups:
VmSize: 10708 kB
VmRSS: 3124 kB
VmData: 316 kB
...
```
Fixes #6374
PiperOrigin-RevId: 387465655
|
|
PiperOrigin-RevId: 387442805
|
|
PiperOrigin-RevId: 387427887
|
|
Make hasSlot scan allocated slot, rather than the whole slice.
It is supposed to store physicalStart in usedSlot.
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
PiperOrigin-RevId: 386577891
|
|
Reported-by: syzbot+beb099a67f670386a367@syzkaller.appspotmail.com
PiperOrigin-RevId: 386521361
|
|
PiperOrigin-RevId: 386511818
|
|
We opted to move forward with FUSE instead.
PiperOrigin-RevId: 386344258
|
|
PiperOrigin-RevId: 386323389
|
|
PiperOrigin-RevId: 386312456
|
|
- Creates new metric "/tcp/segments_acked_with_dsack" to count the number of
segments acked with DSACK.
- Added check to verify the metric is getting incremented when a DSACK is sent
in the unit tests.
PiperOrigin-RevId: 386135949
|
|
Reported-by: syzbot+59550b48e06cc0d3b638@syzkaller.appspotmail.com
PiperOrigin-RevId: 386075453
|
|
PiperOrigin-RevId: 385919423
|
|
PiperOrigin-RevId: 385894869
|
|
fs.renameMu is released and reacquired in `dentry.destroyLocked()` allowing
a dentry to be in `fs.syncableDentries` with a negative reference count.
Fixes #5263
PiperOrigin-RevId: 385054337
|
|
TCP is fully supported. As with SO_RCVBUF, other transport protocols perform
no-ops per DefaultSocketOptionsHandler.OnSetReceiveBufferSize.
PiperOrigin-RevId: 385023239
|
|
PiperOrigin-RevId: 384586164
|
|
Remove the hack in gVisor vfs that allows verity to bypass the O_PATH
check, since ioctl is not allowed on fds opened with O_PATH in linux.
Verity still opens the lowerFD with O_PATH to open it as a symlink, but
the API no longer expects O_PATH to open a fd to be verity enabled.
Now only O_FOLLOW should be specified when opening and enabling verity
features.
PiperOrigin-RevId: 384567833
|
|
Add support for msgget, and msgctl(IPC_RMID), and enable msgqueue
syscall tests.
Updates #135
|
|
Remove implements the behaviour or msgctl(IPC_RMID).
Updates #135
|
|
FindOrCreate implements the behaviour of msgget(2).
Updates #135
|
|
Create package msgqueue, define primitives to be used for message
queues, and add a msgqueue.Registry to IPCNamespace.
Updates #135
|
|
Create ipc.Registry to hold fields, and define functionality common to
all SysV registries, and have registries use it.
|
|
Create ipc.Object to define fields and functionality used in SysV
mechanisms, and have them use it.
|
|
- Keeps Linux-specific behavior out of //pkg/tcpip
- Makes it clearer that clamping is done only for setsockopt calls from users
- Removes code duplication
PiperOrigin-RevId: 384389809
|
|
Kernfs provides an internal mechanism to defer calls to `DecRef()` because
on the last reference `Filesystem.mu` must be held and most places that
need to call `DecRef()` are inside the lock. The same can be true for
filesystems that extend kernfs. procfs needs to look up files and `DecRef()`
them inside the `kernfs.Filesystem.mu`. If the files happen to be procfs
files, it can deadlock trying to decrement if it's the last reference.
This change extends the mechanism to external callers to defer DecRefs
to `vfs.FileDescription` and `vfs.VirtualDentries`.
PiperOrigin-RevId: 384361647
|
|
Set stdio ownership based on the container's user to ensure the
user can open/read/write to/from stdios.
1. stdios in the host are changed to have the owner be the same
uid/gid of the process running the sandbox. This ensures that the
sandbox has full control over it.
2. stdios owner owner inside the sandbox is changed to match the
container's user to give access inside the container and make it
behave the same as runc.
Fixes #6180
PiperOrigin-RevId: 384347009
|
|
Update the following from syserror to the linuxerr equivalent:
EEXIST
EFAULT
ENOTDIR
ENOTTY
EOPNOTSUPP
ERANGE
ESRCH
PiperOrigin-RevId: 384329869
|
|
PiperOrigin-RevId: 384305599
|
|
Go 1.17 adds a new register-based calling convention. While transparent for
most applications, the KVM platform needs special work in a few cases.
First of all, we need the actual address of some assembly functions, rather
than the address of a wrapper. See http://gvisor.dev/pr/5832 for complete
discussion of this.
More relevant to this CL is that ABI0-to-ABIInternal wrappers (i.e., calls from
assembly to Go) access the G via FS_BASE. The KVM quite fast-and-loose about
the Go environment, often calling into (nosplit) Go functions with
uninitialized FS_BASE.
That will no longer work in Go 1.17, so this CL changes the platform to
consistently restore FS_BASE before calling into Go code.
This CL does not affect arm64 code. Go 1.17 does not support the register-based
calling convention for arm64 (it will come in 1.18), but arm64 also does not
use a non-standard register like FS_BASE for TLS, so it may not require any
changes.
PiperOrigin-RevId: 384234305
|
|
PiperOrigin-RevId: 383940663
|
|
- LockOSThread() around prctl(PR_SET_NO_NEW_PRIVS) => seccomp(). go:nosplit
"mostly" prevents async preemption, but IIUC preemption is still permitted
during function prologues:
funcpctab "".seccomp [valfunc=pctopcdata]
0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
0 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) SUBQ $72, SP
4 00004 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) MOVQ BP, 64(SP)
9 00009 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) LEAQ 64(SP), BP
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $0, gclocals·ba30782f8935b28ed1adaec603e72627(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $1, gclocals·663f8c6bfa83aa777198789ce63d9ab4(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $2, "".seccomp.stkobj(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) PCDATA $0, $-2
e -2 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) MOVQ "".ptr+88(SP), AX
(-1 is objabi.PCDATA_UnsafePointSafe and -2 is objabi.PCDATA_UnsafePointUnsafe,
from Go's cmd/internal/objabi.)
- Handle non-errno failures from seccomp() with SECCOMP_FILTER_FLAG_TSYNC.
PiperOrigin-RevId: 383757580
|
|
PiperOrigin-RevId: 383705129
|
|
PiperOrigin-RevId: 383684320
|
|
PiperOrigin-RevId: 382788878
|
|
This change makes the checklocks analyzer considerable more powerful, adding:
* The ability to traverse complex structures, e.g. to have multiple nested
fields as part of the annotation.
* The ability to resolve simple anonymous functions and closures, and perform
lock analysis across these invocations. This does not apply to closures that
are passed elsewhere, since it is not possible to know the context in which
they might be invoked.
* The ability to annotate return values in addition to receivers and other
parameters, with the same complex structures noted above.
* Ignoring locking semantics for "fresh" objects, i.e. objects that are
allocated in the local frame (typically a new-style function).
* Sanity checking of locking state across block transitions and returns, to
ensure that no unexpected locks are held.
Note that initially, most of these findings are excluded by a comprehensive
nogo.yaml. The findings that are included are fundamental lock violations.
The changes here should be relatively low risk, minor refactorings to either
include necessary annotations to simplify the code structure (in general
removing closures in favor of methods) so that the analyzer can be easily
track the lock state.
This change additional includes two changes to nogo itself:
* Sanity checking of all types to ensure that the binary and ast-derived
types have a consistent objectpath, to prevent the bug above from occurring
silently (and causing much confusion). This also requires a trick in
order to ensure that serialized facts are consumable downstream. This can
be removed with https://go-review.googlesource.com/c/tools/+/331789 merged.
* A minor refactoring to isolation the objdump settings in its own package.
This was originally used to implement the sanity check above, but this
information is now being passed another way. The minor refactor is preserved
however, since it cleans up the code slightly and is minimal risk.
PiperOrigin-RevId: 382613300
|
|
PiperOrigin-RevId: 382603592
|
|
Update/remove most syserror errors to linuxerr equivalents. For list
of removed errors, see //pkg/syserror/syserror.go.
PiperOrigin-RevId: 382574582
|
|
Update all instances of the above errors to the faster linuxerr implementation.
With the temporary linuxerr.Equals(), no logical changes are made.
PiperOrigin-RevId: 382306655
|
|
The unordered map may generate different hash due to its order. The
children map needs to be sorted each time before hashing to avoid false
verification failure due to the map.
Store the sorted children map in verity dentry to avoid sorting it each
time verification happens.
Also serialize the whole VerityDescriptor struct to hash now that the
map is removed from it.
PiperOrigin-RevId: 382201560
|
|
PiperOrigin-RevId: 382194711
|
|
Remove three syserror entries duplicated in linuxerr. Because of the
linuxerr.Equals method, this is a mere change of return values from
syserror to linuxerr definitions.
Done with only these three errnos as CLs removing all grow to a significantly
large size.
PiperOrigin-RevId: 382173835
|
|
In Linux the list entries command returns the name of the input interface assigned to the iptable rule.
iptables -S
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
Meanwhile, in gVsior this interface name is ignored.
iptables -S
> -A FORWARD -o docker0 -j ACCEPT
|
|
PiperOrigin-RevId: 381982257
|
|
PiperOrigin-RevId: 381375705
|
|
- These metrics are replaced with WeirdnessMetric with fields
watchdog_stuck_startup and watchdog_stuck_tasks.
PiperOrigin-RevId: 381365617
|
|
A caller of CreateProcessGroup looks up a thread group without locks, so
the target process can exit before CreateProcessGroup will be called.
Reported-by: syzbot+6abb7c34663dacbd55a8@syzkaller.appspotmail.com
PiperOrigin-RevId: 381351069
|
|
PiperOrigin-RevId: 381145216
|
|
Compare
if (!thread_group_leader(tracee))
tracee = rcu_dereference(tracee->group_leader);
in security/yama/yama_lsm.c:ptracer_exception_found().
PiperOrigin-RevId: 381074242
|
|
Add Equals method to compare syserror and unix.Errno errors to linuxerr errors.
This will facilitate removal of syserror definitions in a followup, and
finding needed conversions from unix.Errno to linuxerr.
PiperOrigin-RevId: 380909667
|