Age | Commit message (Collapse) | Author |
|
|
|
PiperOrigin-RevId: 399560357
|
|
|
|
PiperOrigin-RevId: 398572735
|
|
Signed-off-by: Andrei Vagin <avagin@google.com>
|
|
We install seccomp rules so that the SIGSYS signal is generated for
each mmap system call. Then our signal handler executes the real mmap
syscall and if a new regions is created, it maps it to the guest.
Signed-off-by: Andrei Vagin <avagin@google.com>
|
|
|
|
|
|
|
|
PiperOrigin-RevId: 392554743
|
|
Right now, the first slot starts with an address of a memory region and its size is faultBlockSize,
but the second slot starts with (physicalStart + faultBlockSize) & faultBlockMask.
It means they will overlap if a start address of a memory region are not aligned to faultBlockSize.
The kernel doesn't allow to add overlapped regions, but we ignore the EEXIST error.
Signed-off-by: Andrei Vagin <avagin@google.com>
|
|
|
|
Right now, it contains the code:
origState := atomic.LoadUint32(&c.state)
atomicbitops.AndUint32(&c.state, ^vCPUUser)
The problem here is that vCPU.bounce that is called from another thread can add
vCPUWaiter when origState has been read but vCPUUser isn't cleared yet. In this
case, vCPU.unlock doesn't notify other threads about changes and c.bounce will
be stuck in the futex_wait call.
PiperOrigin-RevId: 389697411
|
|
|
|
This CL introduces a 'checklinkname' analyzer, which provides rudimentary
type-checking that verifies that function signatures on the local and remote
sides of //go:linkname directives match expected values.
If the Go standard library changes the definitions of any of these function,
checklinkname will flag the change as a finding, providing an error informing
the gVisor team to adapt to the upstream changes. This allows us to eliminate
the majority of gVisor's forward-looking negative build tags, as we can catch
mismatches in testing [1].
The remaining forward-looking negative build tags are covering shared struct
definitions, which I hope to add to checklinkname in a future CL.
[1] Of course, semantics/requirements can change without the signature
changing, so we still must be careful, but this covers the common case.
PiperOrigin-RevId: 387873847
|
|
|
|
Make hasSlot scan allocated slot, rather than the whole slice.
It is supposed to store physicalStart in usedSlot.
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
|
|
|
|
PiperOrigin-RevId: 385919423
|
|
|
|
PiperOrigin-RevId: 385894869
|
|
|
|
PiperOrigin-RevId: 384305599
|
|
|
|
Go 1.17 adds a new register-based calling convention. While transparent for
most applications, the KVM platform needs special work in a few cases.
First of all, we need the actual address of some assembly functions, rather
than the address of a wrapper. See http://gvisor.dev/pr/5832 for complete
discussion of this.
More relevant to this CL is that ABI0-to-ABIInternal wrappers (i.e., calls from
assembly to Go) access the G via FS_BASE. The KVM quite fast-and-loose about
the Go environment, often calling into (nosplit) Go functions with
uninitialized FS_BASE.
That will no longer work in Go 1.17, so this CL changes the platform to
consistently restore FS_BASE before calling into Go code.
This CL does not affect arm64 code. Go 1.17 does not support the register-based
calling convention for arm64 (it will come in 1.18), but arm64 also does not
use a non-standard register like FS_BASE for TLS, so it may not require any
changes.
PiperOrigin-RevId: 384234305
|
|
|
|
- LockOSThread() around prctl(PR_SET_NO_NEW_PRIVS) => seccomp(). go:nosplit
"mostly" prevents async preemption, but IIUC preemption is still permitted
during function prologues:
funcpctab "".seccomp [valfunc=pctopcdata]
0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
0 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) SUBQ $72, SP
4 00004 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) MOVQ BP, 64(SP)
9 00009 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) LEAQ 64(SP), BP
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $0, gclocals·ba30782f8935b28ed1adaec603e72627(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $1, gclocals·663f8c6bfa83aa777198789ce63d9ab4(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $2, "".seccomp.stkobj(SB)
e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) PCDATA $0, $-2
e -2 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) MOVQ "".ptr+88(SP), AX
(-1 is objabi.PCDATA_UnsafePointSafe and -2 is objabi.PCDATA_UnsafePointUnsafe,
from Go's cmd/internal/objabi.)
- Handle non-errno failures from seccomp() with SECCOMP_FILTER_FLAG_TSYNC.
PiperOrigin-RevId: 383757580
|
|
|
|
PiperOrigin-RevId: 380904249
|
|
When sentry is running in guest ring0, the goroutine stack is changing
and it will not be the stack when bluepill() is called.
If PMU interrupt hits when the CPU is in host ring 0, the perf handler
will try to get the stack of the kernel and the userspace(sentry). It
can travel back to sighandler() and try to continue to the stack of
the goroutine with the outdated frame pointer if sentry has been running
in the guest. The perf handler can't record correct addresses
from the outdated and wrong frames. Those addresses are often
irresolvable, and even if it is resolvable accidentally, it would
be misleading names.
To fix the problem, we just set the frame pointer(%RBP) to zero and
disconnect the link when the zeroed frame pointer is saved in the
frame in bluepillHandler().
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
|
|
|
|
UpperHalf is shared with all address spaces.
PiperOrigin-RevId: 379790539
|
|
|
|
PiperOrigin-RevId: 379337677
|
|
|
|
Fixes #214
PiperOrigin-RevId: 378680466
|
|
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
|
|
on Arm64 platform, we can use TLBI with 'IS' instead of IPI operation.
According to my understanding, the logic in invalidate() is much like
an IPI operation.
On Arm64, we can simply perform vmalle1is invalidation here, not
use IPI.
Reference:
https://github.com/torvalds/linux/blob/v5.12/arch/arm64/kvm/mmu.c#L81
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
|
|
PiperOrigin-RevId: 372608247
|
|
This patch is to solve problem that vCPU timer mess up when
adding vCPU dynamically on ARM64, for detailed information
please refer to:
https://github.com/google/gvisor/issues/5739
There is no influence on x86 and here are main changes for
ARM64:
1. create maxVCPUs number of vCPU in machine initialization
2. we want to sync gvisor vCPU number with host CPU number,
so use smaller number between runtime.NumCPU and
KVM_CAP_MAX_VCPUS to be maxVCPUS
3. put unused vCPUs into architecture-specific map initialvCPUs
4. When machine need to bind a new vCPU with tid, rather
than creating new one, it would pick a vCPU from map initalvCPUs
5. change the setSystemTime function. When vCPU number increasing,
the time cost for function setTSC(use syscall to set cntvoff) is
liner growth from around 300 ns to 100000 ns, and this leads to
the function setSystemTimeLegacy can not get correct offset
value.
6. initializing StdioFDs and goferFD before a platform to avoid
StdioFDs confects with vCPU fds
Signed-off-by: howard zhang <howard.zhang@arm.com>
|
|
|
|
The root table physical page has to be mapped to not fault in iret or sysret
after switching into a user address space. sysret and iret are in the upper
half that is global and so page tables of lower levels are already mapped.
Fixes #5742
PiperOrigin-RevId: 371458644
|
|
|
|
PiperOrigin-RevId: 369758655
|
|
If the host doesn't have TSC scaling feature, then scaling down TSC to
the lowest value will fail, and we will fall back to legacy logic
anyway, but we leave an ugly log message in host's kernel log.
kernel: user requested TSC rate below hardware speed
Instead, check for KVM_CAP_TSC_CONTROL when initializing KVM, and fall
back to legacy logic early if host's cpu doesn't support that.
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
|
|
|
|
Go 1.17 is adding a new register-based calling convention [1] ("ABIInternal"),
which used is when calling between Go functions. Assembly functions are still
written using the old ABI ("ABI0"). That is, they still accept arguments on the
stack, and pass arguments to other functions on the stack. The call rules look
approximately like this:
1. Direct call from Go function to Go function: compiler emits direct
ABIInternal call.
2. Indirect call from Go function to Go function: compiler emits indirect
ABIInternal call.
3. Direct call from Go function to assembly function: compiler emits direct
ABI0 call.
4. Indirect call from Go function to assembly function: compiler emits indirect
ABIInternal call to ABI conversion wrapper function.
5. Direct or indirect call from assembly function to assembly function:
assembly/linker emits call to original ABI0 function.
6. Direct or indirect call from assembly function to Go function:
assembly/linker emits ABI0 call to ABI conversion wrapper function.
Case 4 is the interesting one here. Since the compiler can't know the ABI of an
indirect call, all indirect calls are made with ABIInternal. In order to
support indirect ABI0 assembly function calls, a wrapper is generated that
translates ABIInternal arguments to ABI0 arguments, calls the target function,
and then converts results back.
When the address of an ABI0 function is taken from Go code, it evaluates to the
address of this wrapper function rather than the target function so that later
indirect calls will work as expected.
This is normally fine, but gVisor does more than just call some of the assembly
functions we take the address of: either noting the start and end address for
future reference from a signal handler (safecopy), or copying the function text
to a new mapping (platforms).
Both of these fail with wrappers enabled (currently, this is Go tip with
GOEXPERIMENT=regabiwrappers) because these operations end up operating on the
wrapper instead of the target function.
We work around this issue by taking advantage of case 5: references to assembly
symbols from other assembly functions resolve directly to the desired target
symbol. Thus, rather than using reflect to get the address of a Go reference to
the functions, we create assembly stubs that return the address of the
function. This approach works just as well on current versions of Go, so the
change can be made immediately and doesn't require any build tags.
[1] https://go.googlesource.com/go/+/refs/heads/master/src/cmd/compile/abi-internal.md
PiperOrigin-RevId: 368505655
|