path: root/pkg/sentry/platform
Age  Commit message  Author
2021-10-26  Merge release-20211019.0-42-g722d7ca74 (automated)  gVisor bot
2021-10-26  platform/kvm: map vdso and vvar into a guest address space  Andrei Vagin
Right now, each vdso call triggers a vmexit. VDSO and VVAR pages are mapped with VM_IO, and get_user_pages fails for such VMAs. KVM was not able to handle this case up to the v4.8 kernel. This problem was fixed by add6a0cd1c5ba ("KVM: MMU: try to fix up page faults before giving up"). For some unknown reason, it still doesn't work in the case of nested virtualization.
Before: BenchmarkKernelVDSO-6 252519 4598 ns/op
After: BenchmarkKernelVDSO-6 34431957 34.91 ns/op
PiperOrigin-RevId: 405715941
2021-10-19  Fix typo in FIXME  Fabricio Voznika
PiperOrigin-RevId: 404400399
2021-10-09  Merge release-20210927.0-53-g3f1642e4b (automated)  gVisor bot
2021-10-08  Remove ring0 floating point save/load functions on amd64.  Jamie Liu
ring0.Save/LoadFloatingPoint() are only usable if the caller can ensure that Go will not clobber floating point registers before/after calling them respectively. Due to regabig in Go 1.17, this is no longer the case; regabig (among other things) maintains a zeroed XMM15 during ABIInternal execution, including by zeroing it after ABI0-to-ABIInternal transitions. In ring0.sysenter/exception, this happens in ring0.kernelSyscall/kernelException.abi0 respectively; in ring0.CPU.SwitchToUser, this happens after returning from ring0.sysret/iret.abi0. Delete these functions and do floating point save/load in assembly. While arm64 doesn't appear to be immediately affected (so this CL permits us to resume usage of Go 1.17), its use of Save/LoadFloatingPoint() still seems to be incorrect for the same fundamental reason (Go code can't sanely assume what registers the Go compiler will or won't use) and should be fixed eventually. PiperOrigin-RevId: 401895658
2021-10-07  Merge release-20210927.0-47-g710e51372 (automated)  gVisor bot
2021-10-07  tests: use a proper path to the kvm device  Andrei Vagin
PiperOrigin-RevId: 401624134
2021-09-29  Merge release-20210921.0-39-g65698b627 (automated)  gVisor bot
2021-09-28  Move `safecopy.ReplaceSignalHandler` into `sighandling` package.  Etienne Perot
PiperOrigin-RevId: 399560357
2021-09-23  Merge release-20210921.0-25-g93ac15577 (automated)  gVisor bot
2021-09-23  Merge pull request #6573 from avagin:kvm-seccomp-mmap  gVisor bot
PiperOrigin-RevId: 398572735
2021-09-22  kvm: check that safecopy is handled correctly in the guest ring0  Andrei Vagin
Signed-off-by: Andrei Vagin <avagin@google.com>
2021-09-22  kvm: trap mmap syscalls to map new regions to the guest  Andrei Vagin
We install seccomp rules so that a SIGSYS signal is generated for each mmap system call. Our signal handler then executes the real mmap syscall and, if a new region is created, maps it into the guest. Signed-off-by: Andrei Vagin <avagin@google.com>
2021-09-22  kvm/arm: calculate virtual-to-physical mappings only once  Andrei Vagin
2021-09-22  kvm: fix tests on arm64  AV
2021-08-24  Merge release-20210816.0-29-g2c3d7cb07 (automated)  gVisor bot
2021-08-23  Merge pull request #6491 from avagin:kvm-mem-slot-overlap  gVisor bot
PiperOrigin-RevId: 392554743
2021-08-21  platform/kvm: set physical slots without overlapping  Andrei Vagin
Right now, the first slot starts at the address of a memory region and its size is faultBlockSize, but the second slot starts at (physicalStart + faultBlockSize) & faultBlockMask. This means they will overlap if the start address of a memory region is not aligned to faultBlockSize. The kernel doesn't allow adding overlapping regions, but we ignore the EEXIST error. Signed-off-by: Andrei Vagin <avagin@google.com>
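The overlap described in this message can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the value of faultBlockSize, the definition of faultBlockMask as a round-down mask, the slot type, and both helper functions are assumptions made for the example, and alignedSlots shows just one way to avoid the overlap, not necessarily what the commit does.

```go
package main

import "fmt"

// Assumed values for illustration; the real constants live in the KVM platform.
const (
	faultBlockSize uint64 = 2 << 20
	faultBlockMask        = ^(faultBlockSize - 1) // rounds an address down to a block boundary
)

// slot is a hypothetical [start, start+length) physical range.
type slot struct{ start, length uint64 }

// overlappingSlots mirrors the behavior described in the commit message:
// the first slot starts at the unaligned region address, the second at the
// next aligned boundary.
func overlappingSlots(physicalStart uint64) (slot, slot) {
	first := slot{physicalStart, faultBlockSize}
	second := slot{(physicalStart + faultBlockSize) & faultBlockMask, faultBlockSize}
	return first, second
}

// alignedSlots rounds the start down first, so consecutive slots tile the
// address space without overlapping.
func alignedSlots(physicalStart uint64) (slot, slot) {
	base := physicalStart & faultBlockMask
	return slot{base, faultBlockSize}, slot{base + faultBlockSize, faultBlockSize}
}

// overlaps reports whether two half-open ranges intersect.
func overlaps(a, b slot) bool {
	return a.start < b.start+b.length && b.start < a.start+a.length
}

func main() {
	const start = 0x201000 // deliberately not aligned to faultBlockSize
	a, b := overlappingSlots(start)
	fmt.Println("unaligned first slot overlaps second:", overlaps(a, b))
	c, d := alignedSlots(start)
	fmt.Println("aligned slots overlap:", overlaps(c, d))
}
```

With an unaligned start, the first slot extends past the next block boundary, which is exactly where the second slot begins.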
2021-08-09  Merge release-20210726.0-45-g14d6cb443 (automated)  gVisor bot
2021-08-09  platform/kvm: fix a race condition in vCPU.unlock()  Andrei Vagin
Right now, it contains the code:
    origState := atomic.LoadUint32(&c.state)
    atomicbitops.AndUint32(&c.state, ^vCPUUser)
The problem is that vCPU.bounce, called from another thread, can add vCPUWaiter after origState has been read but before vCPUUser is cleared. In this case, vCPU.unlock doesn't notify other threads about the change, and c.bounce gets stuck in the futex_wait call. PiperOrigin-RevId: 389697411
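One general way to close this kind of window is to clear the bit and capture the prior state in a single atomic step. The sketch below does that with a compare-and-swap loop from sync/atomic; the flag values and the clearUser helper are assumptions for illustration, and this is not claimed to be the change the commit actually made.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Assumed flag layout for illustration only.
const (
	vCPUUser uint32 = 1 << iota
	vCPUGuest
	vCPUWaiter
)

// clearUser atomically clears vCPUUser and returns the state that was
// current at the instant the bit was cleared. A waiter bit set by another
// thread before the swap is therefore always visible in the result.
func clearUser(state *uint32) uint32 {
	for {
		old := atomic.LoadUint32(state)
		if atomic.CompareAndSwapUint32(state, old, old&^vCPUUser) {
			return old
		}
		// Another thread changed the state in between; retry with the new value.
	}
}

func main() {
	state := vCPUUser | vCPUWaiter
	old := clearUser(&state)
	if old&vCPUWaiter != 0 {
		fmt.Println("waiter present: a wake-up would be issued here")
	}
	fmt.Println("user bit cleared:", state&vCPUUser == 0)
}
```

Because the returned value comes from the successful swap rather than from an earlier load, there is no gap in which a waiter can be added unnoticed.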
2021-07-30  Merge release-20210726.0-12-g62ea5c0a2 (automated)  gVisor bot
2021-07-30  checklinkname: rudimentary type-checking of linkname directives  Michael Pratt
This CL introduces a 'checklinkname' analyzer, which provides rudimentary type-checking that verifies that function signatures on the local and remote sides of //go:linkname directives match expected values. If the Go standard library changes the definitions of any of these functions, checklinkname will flag the change as a finding, providing an error informing the gVisor team to adapt to the upstream changes. This allows us to eliminate the majority of gVisor's forward-looking negative build tags, as we can catch mismatches in testing [1]. The remaining forward-looking negative build tags cover shared struct definitions, which I hope to add to checklinkname in a future CL. [1] Of course, semantics/requirements can change without the signature changing, so we still must be careful, but this covers the common case. PiperOrigin-RevId: 387873847
2021-07-28  Merge release-20210720.0-43-g01f7dd442 (automated)  gVisor bot
2021-07-28  tuning hasSlot function and fix store wrong value in usedSlots  Howard Zhang
Make hasSlot scan only the allocated slots rather than the whole slice. usedSlots is supposed to store physicalStart. Signed-off-by: Howard Zhang <howard.zhang@arm.com>
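The idea of scanning only the allocated prefix of a preallocated slice can be sketched as follows. The machine type, its fields, and the recordSlot helper are simplified stand-ins invented for this example; only the names hasSlot, usedSlots, and physicalStart come from the commit message.

```go
package main

import "fmt"

// machine is a hypothetical, simplified stand-in for the KVM machine.
type machine struct {
	usedSlots []uintptr // preallocated to the maximum slot count
	nextSlot  int       // number of entries in usedSlots that are in use
}

func newMachine(maxSlots int) *machine {
	return &machine{usedSlots: make([]uintptr, maxSlots)}
}

// recordSlot stores the physical start address of a newly mapped slot.
// It panics if all slots are already in use.
func (m *machine) recordSlot(physicalStart uintptr) {
	m.usedSlots[m.nextSlot] = physicalStart
	m.nextSlot++
}

// hasSlot scans only the allocated entries. Scanning the whole slice would
// also visit zero-valued, unallocated entries, which wastes time and would
// wrongly report address 0 as mapped.
func (m *machine) hasSlot(physicalStart uintptr) bool {
	for _, s := range m.usedSlots[:m.nextSlot] {
		if s == physicalStart {
			return true
		}
	}
	return false
}

func main() {
	m := newMachine(8)
	m.recordSlot(0x200000)
	fmt.Println(m.hasSlot(0x200000), m.hasSlot(0))
}
```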
2021-07-28  Merge release-20210720.0-39-g964fb3ca7 (automated)  gVisor bot
2021-07-21  Merge release-20210712.0-31-g49d9ef498 (automated)  gVisor bot
2021-07-20  Merge pull request #6220 from laijs:disconnect-fp  gVisor bot
PiperOrigin-RevId: 385919423
2021-07-20  Merge release-20210712.0-29-g1ad382220 (automated)  gVisor bot
2021-07-20  Add go:build directives as required by Go 1.17's gofmt.  Jamie Liu
PiperOrigin-RevId: 385894869
2021-07-12  Merge release-20210705.0-10-gebe99977a (automated)  gVisor bot
2021-07-12  Mark all functions that are called from a forked child with go:norace  Andrei Vagin
PiperOrigin-RevId: 384305599
2021-07-12  Merge release-20210628.0-35-g36a17a814 (automated)  gVisor bot
2021-07-12  Go 1.17 support for the KVM platform  Michael Pratt
Go 1.17 adds a new register-based calling convention. While transparent for most applications, the KVM platform needs special work in a few cases. First of all, we need the actual address of some assembly functions, rather than the address of a wrapper. See http://gvisor.dev/pr/5832 for complete discussion of this. More relevant to this CL is that ABI0-to-ABIInternal wrappers (i.e., calls from assembly to Go) access the G via FS_BASE. The KVM platform is quite fast-and-loose about the Go environment, often calling into (nosplit) Go functions with uninitialized FS_BASE. That will no longer work in Go 1.17, so this CL changes the platform to consistently restore FS_BASE before calling into Go code. This CL does not affect arm64 code. Go 1.17 does not support the register-based calling convention for arm64 (it will come in 1.18), but arm64 also does not use a non-standard register like FS_BASE for TLS, so it may not require any changes. PiperOrigin-RevId: 384234305
2021-07-09  Merge release-20210628.0-33-gde29d8d41 (automated)  gVisor bot
2021-07-08  Fix some //pkg/seccomp bugs.  Jamie Liu
- LockOSThread() around prctl(PR_SET_NO_NEW_PRIVS) => seccomp(). go:nosplit "mostly" prevents async preemption, but IIUC preemption is still permitted during function prologues:
    funcpctab "".seccomp [valfunc=pctopcdata]
    0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
    0 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) TEXT "".seccomp(SB), NOSPLIT|ABIInternal, $72-32
    0 -1 00000 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) SUBQ $72, SP
    4 00004 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) MOVQ BP, 64(SP)
    9 00009 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) LEAQ 64(SP), BP
    e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $0, gclocals·ba30782f8935b28ed1adaec603e72627(SB)
    e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $1, gclocals·663f8c6bfa83aa777198789ce63d9ab4(SB)
    e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:110) FUNCDATA $2, "".seccomp.stkobj(SB)
    e 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) PCDATA $0, $-2
    e -2 00014 (gvisor/pkg/seccomp/seccomp_unsafe.go:111) MOVQ "".ptr+88(SP), AX
  (-1 is objabi.PCDATA_UnsafePointSafe and -2 is objabi.PCDATA_UnsafePointUnsafe, from Go's cmd/internal/objabi.)
- Handle non-errno failures from seccomp() with SECCOMP_FILTER_FLAG_TSYNC.
PiperOrigin-RevId: 383757580
2021-06-22  Merge release-20210614.0-13-g01bcd55c3 (automated)  gVisor bot
2021-06-22  Merge pull request #5051 from lubinszARM:pr_escapes_1  gVisor bot
PiperOrigin-RevId: 380904249
2021-06-22  Disconnect call-chain between sighandler() and bluepill().  Lai Jiangshan
When the sentry is running in guest ring 0, the goroutine stack keeps changing, so it is no longer the stack that was in use when bluepill() was called. If a PMU interrupt hits while the CPU is in host ring 0, the perf handler tries to walk the stacks of both the kernel and userspace (the sentry). It can travel back to sighandler() and then try to continue onto the goroutine stack using an outdated frame pointer if the sentry has been running in the guest. The perf handler can't record correct addresses from these outdated, wrong frames. Those addresses are often unresolvable, and even when one happens to resolve, the resulting name is misleading. To fix the problem, we set the frame pointer (%RBP) to zero; the link is disconnected when the zeroed frame pointer is saved in the frame in bluepillHandler(). Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
2021-06-16  Merge release-20210607.0-50-g47149b7c4 (automated)  gVisor bot
2021-06-16  kvm: mark UpperHalf PTE-s as global  Andrei Vagin
UpperHalf is shared with all address spaces. PiperOrigin-RevId: 379790539
2021-06-14  Merge release-20210607.0-42-gb9db1c031 (automated)  gVisor bot
2021-06-14  Fix typo  Michael Pratt
PiperOrigin-RevId: 379337677
2021-06-10  Merge release-20210601.0-39-g9ede1a605 (automated)  gVisor bot
2021-06-10  [op] Move SignalInfo to abi/linux package.  Ayush Ranjan
Fixes #214 PiperOrigin-RevId: 378680466
2021-06-01  Fix errors for noescape cases  Robin Luk
Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2021-05-27  Merge release-20210518.0-56-g62ec2422a (automated)  gVisor bot
2021-05-24  arm64 kvm: use TLBI with "Inner Shareable" instead of IPI operation  Robin Luk
On the Arm64 platform, we can use TLBI with 'IS' instead of an IPI operation. As I understand it, the logic in invalidate() is much like an IPI operation. On Arm64, we can simply perform a vmalle1is invalidation here instead of using an IPI. Reference: https://github.com/torvalds/linux/blob/v5.12/arch/arm64/kvm/mmu.c#L81 Signed-off-by: Robin Luk <lubin.lu@antgroup.com>
2021-05-07  Merge release-20210419.0-79-ge691004e0 (automated)  gVisor bot
2021-05-07  Merge pull request #5758 from zhlhahaha:2125  gVisor bot
PiperOrigin-RevId: 372608247
2021-05-07  Init all vCPU when initializing machine on ARM64  howard zhang
This patch fixes a problem where the vCPU timer gets messed up when vCPUs are added dynamically on ARM64. For detailed information, please refer to: https://github.com/google/gvisor/issues/5739
There is no influence on x86. The main changes for ARM64 are:
1. Create maxVCPUs vCPUs during machine initialization.
2. We want to sync the gVisor vCPU count with the host CPU count, so use the smaller of runtime.NumCPU and KVM_CAP_MAX_VCPUS as maxVCPUs.
3. Put unused vCPUs into the architecture-specific map initialvCPUs.
4. When the machine needs to bind a new vCPU to a tid, pick a vCPU from the initialvCPUs map rather than creating a new one.
5. Change the setSystemTime function. As the number of vCPUs increases, the time cost of setTSC (which uses a syscall to set cntvoff) grows linearly from around 300 ns to 100000 ns, so setSystemTimeLegacy cannot obtain a correct offset value.
6. Initialize StdioFDs and goferFD before the platform to avoid StdioFDs conflicting with vCPU fds.
Signed-off-by: howard zhang <howard.zhang@arm.com>
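Steps 1 through 4 above amount to a preallocated pool. The sketch below models only the bookkeeping: the vCPU and machine types, the newMachine and bind functions, the vCPUsByTID map, and the cap of 8 passed in main are all hypothetical, and no real KVM calls are made. In the real platform the cap would come from querying KVM_CAP_MAX_VCPUS.

```go
package main

import (
	"fmt"
	"runtime"
)

// vCPU is a stand-in for a real KVM vCPU; only bookkeeping is modeled here.
type vCPU struct {
	id  int
	tid uint64
}

// machine is a hypothetical, simplified model of the KVM machine.
type machine struct {
	maxVCPUs     int
	initialvCPUs map[int]*vCPU    // created up front, not yet bound to a thread
	vCPUsByTID   map[uint64]*vCPU // bound vCPUs, keyed by thread ID
}

// newMachine creates min(hostCPUs, kvmMaxVCPUs) vCPUs during initialization
// and parks all of them in initialvCPUs.
func newMachine(hostCPUs, kvmMaxVCPUs int) *machine {
	n := hostCPUs
	if kvmMaxVCPUs < n {
		n = kvmMaxVCPUs
	}
	m := &machine{
		maxVCPUs:     n,
		initialvCPUs: make(map[int]*vCPU, n),
		vCPUsByTID:   make(map[uint64]*vCPU, n),
	}
	for id := 0; id < n; id++ {
		m.initialvCPUs[id] = &vCPU{id: id}
	}
	return m
}

// bind returns the vCPU already bound to tid, or takes one from the
// preallocated pool. It returns nil when the pool is exhausted.
func (m *machine) bind(tid uint64) *vCPU {
	if c, ok := m.vCPUsByTID[tid]; ok {
		return c
	}
	for id, c := range m.initialvCPUs {
		delete(m.initialvCPUs, id)
		c.tid = tid
		m.vCPUsByTID[tid] = c
		return c
	}
	return nil
}

func main() {
	// 8 is an arbitrary stand-in for the value reported by KVM_CAP_MAX_VCPUS.
	m := newMachine(runtime.NumCPU(), 8)
	c := m.bind(1234)
	fmt.Println("maxVCPUs:", m.maxVCPUs, "bound:", c != nil, "remaining:", len(m.initialvCPUs))
}
```

Since every vCPU exists from the start, binding never creates one later, which is the property the patch relies on to keep the timers consistent.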