Age | Commit message (Collapse) | Author |
|
PiperOrigin-RevId: 392554743
|
|
Right now, the first slot starts with an address of a memory region and its size is faultBlockSize,
but the second slot starts with (physicalStart + faultBlockSize) & faultBlockMask.
It means they will overlap if a start address of a memory region are not aligned to faultBlockSize.
The kernel doesn't allow to add overlapped regions, but we ignore the EEXIST error.
Signed-off-by: Andrei Vagin <avagin@google.com>
|
|
Right now, it contains the code:
origState := atomic.LoadUint32(&c.state)
atomicbitops.AndUint32(&c.state, ^vCPUUser)
The problem here is that vCPU.bounce that is called from another thread can add
vCPUWaiter when origState has been read but vCPUUser isn't cleared yet. In this
case, vCPU.unlock doesn't notify other threads about changes and c.bounce will
be stuck in the futex_wait call.
PiperOrigin-RevId: 389697411
|
|
This CL introduces a 'checklinkname' analyzer, which provides rudimentary
type-checking that verifies that function signatures on the local and remote
sides of //go:linkname directives match expected values.
If the Go standard library changes the definitions of any of these function,
checklinkname will flag the change as a finding, providing an error informing
the gVisor team to adapt to the upstream changes. This allows us to eliminate
the majority of gVisor's forward-looking negative build tags, as we can catch
mismatches in testing [1].
The remaining forward-looking negative build tags are covering shared struct
definitions, which I hope to add to checklinkname in a future CL.
[1] Of course, semantics/requirements can change without the signature
changing, so we still must be careful, but this covers the common case.
PiperOrigin-RevId: 387873847
|
|
Make hasSlot scan allocated slot, rather than the whole slice.
It is supposed to store physicalStart in usedSlot.
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
PiperOrigin-RevId: 385919423
|
|
PiperOrigin-RevId: 385894869
|
|
Go 1.17 adds a new register-based calling convention. While transparent for
most applications, the KVM platform needs special work in a few cases.
First of all, we need the actual address of some assembly functions, rather
than the address of a wrapper. See http://gvisor.dev/pr/5832 for complete
discussion of this.
More relevant to this CL is that ABI0-to-ABIInternal wrappers (i.e., calls from
assembly to Go) access the G via FS_BASE. The KVM quite fast-and-loose about
the Go environment, often calling into (nosplit) Go functions with
uninitialized FS_BASE.
That will no longer work in Go 1.17, so this CL changes the platform to
consistently restore FS_BASE before calling into Go code.
This CL does not affect arm64 code. Go 1.17 does not support the register-based
calling convention for arm64 (it will come in 1.18), but arm64 also does not
use a non-standard register like FS_BASE for TLS, so it may not require any
changes.
PiperOrigin-RevId: 384234305
|
|
PiperOrigin-RevId: 380904249
|
|
When sentry is running in guest ring0, the goroutine stack is changing
and it will not be the stack when bluepill() is called.
If PMU interrupt hits when the CPU is in host ring 0, the perf handler
will try to get the stack of the kernel and the userspace(sentry). It
can travel back to sighandler() and try to continue to the stack of
the goroutine with the outdated frame pointer if sentry has been running
in the guest. The perf handler can't record correct addresses
from the outdated and wrong frames. Those addresses are often
irresolvable, and even if it is resolvable accidentally, it would
be misleading names.
To fix the problem, we just set the frame pointer(%RBP) to zero and
disconnect the link when the zeroed frame pointer is saved in the
frame in bluepillHandler().
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
|
|
UpperHalf is shared with all address spaces.
PiperOrigin-RevId: 379790539
|
|
PiperOrigin-RevId: 379337677
|
|
Fixes #214
PiperOrigin-RevId: 378680466
|
|
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
on Arm64 platform, we can use TLBI with 'IS' instead of IPI operation.
According to my understanding, the logic in invalidate() is much like
an IPI operation.
On Arm64, we can simply perform vmalle1is invalidation here, not
use IPI.
Reference:
https://github.com/torvalds/linux/blob/v5.12/arch/arm64/kvm/mmu.c#L81
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
PiperOrigin-RevId: 372608247
|
|
This patch is to solve problem that vCPU timer mess up when
adding vCPU dynamically on ARM64, for detailed information
please refer to:
https://github.com/google/gvisor/issues/5739
There is no influence on x86 and here are main changes for
ARM64:
1. create maxVCPUs number of vCPU in machine initialization
2. we want to sync gvisor vCPU number with host CPU number,
so use smaller number between runtime.NumCPU and
KVM_CAP_MAX_VCPUS to be maxVCPUS
3. put unused vCPUs into architecture-specific map initialvCPUs
4. When machine need to bind a new vCPU with tid, rather
than creating new one, it would pick a vCPU from map initalvCPUs
5. change the setSystemTime function. When vCPU number increasing,
the time cost for function setTSC(use syscall to set cntvoff) is
liner growth from around 300 ns to 100000 ns, and this leads to
the function setSystemTimeLegacy can not get correct offset
value.
6. initializing StdioFDs and goferFD before a platform to avoid
StdioFDs confects with vCPU fds
Signed-off-by: howard zhang <howard.zhang@arm.com>
|
|
The root table physical page has to be mapped to not fault in iret or sysret
after switching into a user address space. sysret and iret are in the upper
half that is global and so page tables of lower levels are already mapped.
Fixes #5742
PiperOrigin-RevId: 371458644
|
|
PiperOrigin-RevId: 369758655
|
|
If the host doesn't have TSC scaling feature, then scaling down TSC to
the lowest value will fail, and we will fall back to legacy logic
anyway, but we leave an ugly log message in host's kernel log.
kernel: user requested TSC rate below hardware speed
Instead, check for KVM_CAP_TSC_CONTROL when initializing KVM, and fall
back to legacy logic early if host's cpu doesn't support that.
Signed-off-by: Daniel Dao <dqminh89@gmail.com>
|
|
Go 1.17 is adding a new register-based calling convention [1] ("ABIInternal"),
which used is when calling between Go functions. Assembly functions are still
written using the old ABI ("ABI0"). That is, they still accept arguments on the
stack, and pass arguments to other functions on the stack. The call rules look
approximately like this:
1. Direct call from Go function to Go function: compiler emits direct
ABIInternal call.
2. Indirect call from Go function to Go function: compiler emits indirect
ABIInternal call.
3. Direct call from Go function to assembly function: compiler emits direct
ABI0 call.
4. Indirect call from Go function to assembly function: compiler emits indirect
ABIInternal call to ABI conversion wrapper function.
5. Direct or indirect call from assembly function to assembly function:
assembly/linker emits call to original ABI0 function.
6. Direct or indirect call from assembly function to Go function:
assembly/linker emits ABI0 call to ABI conversion wrapper function.
Case 4 is the interesting one here. Since the compiler can't know the ABI of an
indirect call, all indirect calls are made with ABIInternal. In order to
support indirect ABI0 assembly function calls, a wrapper is generated that
translates ABIInternal arguments to ABI0 arguments, calls the target function,
and then converts results back.
When the address of an ABI0 function is taken from Go code, it evaluates to the
address of this wrapper function rather than the target function so that later
indirect calls will work as expected.
This is normally fine, but gVisor does more than just call some of the assembly
functions we take the address of: either noting the start and end address for
future reference from a signal handler (safecopy), or copying the function text
to a new mapping (platforms).
Both of these fail with wrappers enabled (currently, this is Go tip with
GOEXPERIMENT=regabiwrappers) because these operations end up operating on the
wrapper instead of the target function.
We work around this issue by taking advantage of case 5: references to assembly
symbols from other assembly functions resolve directly to the desired target
symbol. Thus, rather than using reflect to get the address of a Go reference to
the functions, we create assembly stubs that return the address of the
function. This approach works just as well on current versions of Go, so the
change can be made immediately and doesn't require any build tags.
[1] https://go.googlesource.com/go/+/refs/heads/master/src/cmd/compile/abi-internal.md
PiperOrigin-RevId: 368505655
|
|
PiperOrigin-RevId: 367730917
|
|
PiperOrigin-RevId: 367523491
|
|
Goruntime sets mxcsr once and never changes it.
Reported-by: syzbot+ec55cea6e57ec083b7a6@syzkaller.appspotmail.com
Fixes: #5754
|
|
Split usermem package to help remove syserror dependency in go_marshal.
New hostarch package contains code not dependent on syserror.
PiperOrigin-RevId: 365651233
|
|
It is enough to invalidate the tlb of local vcpu in switch().
TLBI with inner-sharable will invalidate the tlb in other vcpu.
Arm64 hardware supports at least 256 pcid, so I think it's ok
to set the length of pcid pool to 128.
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
PiperOrigin-RevId: 364728696
|
|
This change is inspired by Adin's cl/355256448.
PiperOrigin-RevId: 364695931
|
|
If physical pages of a memory region are not mapped yet, the kernel will
trigger KVM_EXIT_MMIO and we will map physical pages in bluepillHandler().
An instruction that triggered a fault will not be re-executed, it
will be emulated in the kernel, but it can't emulate complex
instructions like xsave, xrstor. We can touch the memory with
simple instructions to workaround this problem.
|
|
The syscall package has been deprecated in favor of golang.org/x/sys.
Note that syscall is still used in the following places:
- pkg/sentry/socket/hostinet/stack.go: some netlink related functionalities
are not yet available in golang.org/x/sys.
- syscall.Stat_t is still used in some places because os.FileInfo.Sys() still
returns it and not unix.Stat_t.
Updates #214
PiperOrigin-RevId: 360701387
|
|
These are bumped to allow early testing of Go 1.17. Use will be audited closer
to the 1.17 release.
PiperOrigin-RevId: 358278615
|
|
Implement basic lazy save and restore for FPSIMD registers, which only
restore FPSIMD state on el0_fpsimd_acc and save FPSIMD state in switch().
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
This allows the package to serve as a general purpose ring0 support package, as
opposed to being bound to specific sentry platforms.
Updates #5039
PiperOrigin-RevId: 355220044
|
|
PiperOrigin-RevId: 351638451
|
|
PiperOrigin-RevId: 350862699
|
|
global
In order to improve the performance, some kpti related codes(TCR.A1) have
been reverted, and set kernel pagetable as global.
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
Add more comments and more handling for exceptions.
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
If no vild syndrome(data abort outside memslots) was reported by kvm, let userspace to do the
ext_dabt injection to bail out this issue.
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
I added 2 unified processing functions for all exceptions of el/el0
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
|
|
Signed-off-by: Robin Luk <lubin.lu@alibaba-inc.com>
|
|
PiperOrigin-RevId: 340484823
|
|
Use an sErr injection to trigger sigbus when we receive EFAULT from the
run ioctl.
After applying this patch, mmap_test_runsc_kvm will be passed on
Arm64.
Signed-off-by: Bin Lu <bin.lu@arm.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/4542 from lubinszARM:pr_kvm_mmap_1 f81bd42466d1d60a581e5fb34de18b78878c68c1
PiperOrigin-RevId: 340461239
|
|
Fixes: #509
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
PiperOrigin-RevId: 339921446
|
|
I have added support for setSystemTimeLegacy() by setting cntvoff.
With this pr, TestRdtsc and other kvm syscall test cases(nanosleep,
wait...) can be passed on Arm64.
TO-DO: Add precise synchronization to KVM for Arm64.
Reference PR: https://github.com/google/gvisor/pull/4397
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
PiperOrigin-RevId: 338321125
|
|
Consistent with the linux kernel, bad regs.Sp
return SIGSEGV
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
Consistent with the linux approach, we will produce a sigill to handle
el0_undef.
After applying this patch, exec_binary_test_runsc_kvm will be passed on
Arm64.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
PiperOrigin-RevId: 337544656
|