Age | Commit message (Collapse) | Author |
|
Go 1.17 adds a new register-based calling convention. While transparent for
most applications, the KVM platform needs special work in a few cases.
First of all, we need the actual address of some assembly functions, rather
than the address of a wrapper. See http://gvisor.dev/pr/5832 for complete
discussion of this.
More relevant to this CL is that ABI0-to-ABIInternal wrappers (i.e., calls from
assembly to Go) access the G via FS_BASE. The KVM quite fast-and-loose about
the Go environment, often calling into (nosplit) Go functions with
uninitialized FS_BASE.
That will no longer work in Go 1.17, so this CL changes the platform to
consistently restore FS_BASE before calling into Go code.
This CL does not affect arm64 code. Go 1.17 does not support the register-based
calling convention for arm64 (it will come in 1.18), but arm64 also does not
use a non-standard register like FS_BASE for TLS, so it may not require any
changes.
PiperOrigin-RevId: 384234305
|
|
PiperOrigin-RevId: 383940663
|
|
- LockOSThread() around prctl(PR_SET_NO_NEW_PRIVS) => seccomp(). go:nosplit
"mostly" prevents async preemption, but IIUC preemption is still permitted
during function prologues:
funcpctab "".seccomp [valfunc=pctopcdata]
0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
0 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) SUBQ $72, SP
4 00004 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) MOVQ BP, 64(SP)
9 00009 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) LEAQ 64(SP), BP
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $0, gclocals·ba30782f8935b28ed1adaec603e72627(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $1, gclocals·663f8c6bfa83aa777198789ce63d9ab4(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $2, "".seccomp.stkobj(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) PCDATA $0, $-2
e -2 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) MOVQ "".ptr+88(SP), AX
(-1 is objabi.PCDATA_UnsafePointSafe and -2 is objabi.PCDATA_UnsafePointUnsafe,
from Go's cmd/internal/objabi.)
- Handle non-errno failures from seccomp() with SECCOMP_FILTER_FLAG_TSYNC.
PiperOrigin-RevId: 383757580
|
|
PiperOrigin-RevId: 383705129
|
|
PiperOrigin-RevId: 383684320
|
|
Commit 16b751b6c610ec2c5a913cb8a818e9239ee7da71 introduced a bug where writes of
zero size would end up queueing a zero sized segment which will cause the
sandbox to panic when trying to send a zero sized segment(e.g. after an RTO) as
netstack asserts that the all non FIN segments have size > 0.
This change adds the check for a zero sized payload back to avoid queueing
such segments. The associated test panics without the fix and passes with it.
PiperOrigin-RevId: 383677884
|
|
PiperOrigin-RevId: 383481745
|
|
PiperOrigin-RevId: 383472507
|
|
PiperOrigin-RevId: 383426091
|
|
PiperOrigin-RevId: 382788878
|
|
More-specific route discovery allows hosts to pick a more appropriate
router for off-link destinations.
Fixes #6172.
PiperOrigin-RevId: 382779880
|
|
This change makes the checklocks analyzer considerable more powerful, adding:
* The ability to traverse complex structures, e.g. to have multiple nested
fields as part of the annotation.
* The ability to resolve simple anonymous functions and closures, and perform
lock analysis across these invocations. This does not apply to closures that
are passed elsewhere, since it is not possible to know the context in which
they might be invoked.
* The ability to annotate return values in addition to receivers and other
parameters, with the same complex structures noted above.
* Ignoring locking semantics for "fresh" objects, i.e. objects that are
allocated in the local frame (typically a new-style function).
* Sanity checking of locking state across block transitions and returns, to
ensure that no unexpected locks are held.
Note that initially, most of these findings are excluded by a comprehensive
nogo.yaml. The findings that are included are fundamental lock violations.
The changes here should be relatively low risk, minor refactorings to either
include necessary annotations to simplify the code structure (in general
removing closures in favor of methods) so that the analyzer can be easily
track the lock state.
This change additional includes two changes to nogo itself:
* Sanity checking of all types to ensure that the binary and ast-derived
types have a consistent objectpath, to prevent the bug above from occurring
silently (and causing much confusion). This also requires a trick in
order to ensure that serialized facts are consumable downstream. This can
be removed with https://go-review.googlesource.com/c/tools/+/331789 merged.
* A minor refactoring to isolation the objdump settings in its own package.
This was originally used to implement the sanity check above, but this
information is now being passed another way. The minor refactor is preserved
however, since it cleans up the code slightly and is minimal risk.
PiperOrigin-RevId: 382613300
|
|
In gVisor today its possible that when trying to bind a TCP socket
w/ SO_REUSEADDR specified and requesting the kernel pick a port by
setting port to zero can result in a previously bound port being
returned. This behaviour is incorrect as the user is clearly requesting
a free port. The behaviour is fine when the user explicity specifies
a port.
This change now checks if the user specified a port when making a port
reservation for a TCP port and only returns unbound ports even if
SO_REUSEADDR was specified.
Fixes #6209
PiperOrigin-RevId: 382607638
|
|
PiperOrigin-RevId: 382603592
|
|
Update/remove most syserror errors to linuxerr equivalents. For list
of removed errors, see //pkg/syserror/syserror.go.
PiperOrigin-RevId: 382574582
|
|
PiperOrigin-RevId: 382427879
|
|
Update all instances of the above errors to the faster linuxerr implementation.
With the temporary linuxerr.Equals(), no logical changes are made.
PiperOrigin-RevId: 382306655
|
|
This change prepares for a later change which supports the NDP
Route Information option to discover more-specific routes, as
per RFC 4191.
Updates #6172.
PiperOrigin-RevId: 382225812
|
|
PiperOrigin-RevId: 382202462
|
|
The unordered map may generate different hash due to its order. The
children map needs to be sorted each time before hashing to avoid false
verification failure due to the map.
Store the sorted children map in verity dentry to avoid sorting it each
time verification happens.
Also serialize the whole VerityDescriptor struct to hash now that the
map is removed from it.
PiperOrigin-RevId: 382201560
|
|
PiperOrigin-RevId: 382194711
|
|
Remove three syserror entries duplicated in linuxerr. Because of the
linuxerr.Equals method, this is a mere change of return values from
syserror to linuxerr definitions.
Done with only these three errnos as CLs removing all grow to a significantly
large size.
PiperOrigin-RevId: 382173835
|
|
The PID files are not used after they are read, so there is
no point in keeping them around until the shim is deleted.
Updates #6225
PiperOrigin-RevId: 382169916
|
|
This is to ensure that Go 1.13 error wrapping is correctly
translated to gRPC errors before returning from the shim.
Updates #6225
PiperOrigin-RevId: 382120441
|
|
In Linux the list entries command returns the name of the input interface assigned to the iptable rule.
iptables -S
> -A FORWARD -i docker0 -o docker0 -j ACCEPT
Meanwhile, in gVsior this interface name is ignored.
iptables -S
> -A FORWARD -o docker0 -j ACCEPT
|
|
When TUN is created with IFF_NO_PI flag, there will be no Ethernet header and no packet info, therefore, both read and write will fail.
This commit fix this bug.
|
|
PiperOrigin-RevId: 381982257
|
|
There was a race wherein Accept() could fail, then the handshake would complete,
and then a waiter would be created to listen for the handshake. In such cases,
no notification was ever sent and the test timed out.
PiperOrigin-RevId: 381913041
|
|
PiperOrigin-RevId: 381561785
|
|
sndQueue made sense when the worker goroutine and the syscall context held
different locks. Now both lock the endpoint lock before doing anything which
means adding to sndQueue is pointless as we move it to writeList immediately
after that in endpoint.Write() by calling e.drainSendQueue.
PiperOrigin-RevId: 381523177
|
|
...instead of calculating a fresh checksum to avoid re-calcalculating
a checksum on unchanged bytes.
Fixes #5340.
PiperOrigin-RevId: 381403888
|
|
This change prepares for a later change which supports the NDP
Route Information option to discover more-specific routes, as
per RFC 4191.
The newly introduced off-link route state will be used to hold
both the state for default routers (which is a default (off-link)
route through the router, and more-specific routes (which are
routes through some router to some destination subnet more specific
than the IPv6 empty subnet).
Updates #6172.
PiperOrigin-RevId: 381403761
|
|
PiperOrigin-RevId: 381375705
|
|
- These metrics are replaced with WeirdnessMetric with fields
watchdog_stuck_startup and watchdog_stuck_tasks.
PiperOrigin-RevId: 381365617
|
|
A caller of CreateProcessGroup looks up a thread group without locks, so
the target process can exit before CreateProcessGroup will be called.
Reported-by: syzbot+6abb7c34663dacbd55a8@syzkaller.appspotmail.com
PiperOrigin-RevId: 381351069
|
|
puppetlabs:fix-shim-pid-leaking-on-stopped-processes
PiperOrigin-RevId: 381341920
|
|
PiperOrigin-RevId: 381145216
|
|
PiperOrigin-RevId: 381100861
|
|
Compare
if (!thread_group_leader(tracee))
tracee = rcu_dereference(tracee->group_leader);
in security/yama/yama_lsm.c:ptracer_exception_found().
PiperOrigin-RevId: 381074242
|
|
While #6204 addressed the stopped state for handling signals in the main
process, it did not update exec processes in the same way. This change
mirrors that adjustment for exec processes.
|
|
This change wraps containerd's errdefs.ToGRPC function with one that
understands Go 1.13-style error wrapping style, which is used
pervasively throughout the shim. With this change, errors that have been
marked with, e.g., `errdefs.ErrNotFound`, will be correctly propagated
back to the containerd server.
|
|
PiperOrigin-RevId: 380967023
|
|
There are unnecessarily short timeouts in several places.
Note: a later change will switch tcp_test to fake clocks intead of the built-in
`time` package.
PiperOrigin-RevId: 380935400
|
|
Add Equals method to compare syserror and unix.Errno errors to linuxerr errors.
This will facilitate removal of syserror definitions in a followup, and
finding needed conversions from unix.Errno to linuxerr.
PiperOrigin-RevId: 380909667
|
|
PiperOrigin-RevId: 380904249
|
|
The typical sequence of calls to start a container looks like this
ct, err := container.New(conf, containerArgs)
defer ct.Destroy()
ct.Start(conf)
ws, err := ct.Wait()
For the root container, ct.Destroy() kills the sandbox process. This
doesn't look like a right wait to stop it. For example, all ongoing rpc
calls are aborted in this case. If everything is going alright, we can
just wait and it will exit itself.
Reported-by: syzbot+084fca334720887441e7@syzkaller.appspotmail.com
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
Fixes #2726
PiperOrigin-RevId: 380753516
|
|
tcpdump is largely supported. We've also chose not to implement writeable
AF_PACKET sockets, and there's a bug specifically for promiscuous mode (#3333).
Fixes #173.
PiperOrigin-RevId: 380733686
|
|
Getting state of a stopped container would fail and could lead containerd
to not detecting that the container had actually stopped. Now stopped and
deleted containers return `stopped` state.
Also makes other messages more consistent when container is stopped. Some
where still sending messages to runsc and failing in different ways. Now
they go through `initState` state machine like the other messages.
There are a few changes to improve debugability with it as well.
Fixes #5861
PiperOrigin-RevId: 380698513
|
|
Updates #5940.
PiperOrigin-RevId: 380668609
|