path: root/pkg/sentry/platform
Age | Commit message | Author
2020-10-22 | arm64 kvm: added the implementation of setSystemTimeLegacy() | Bin Lu
I have added support for setSystemTimeLegacy() by setting cntvoff. With this PR, TestRdtsc and other kvm syscall test cases (nanosleep, wait, ...) pass on Arm64. TODO: Add precise synchronization to KVM for Arm64. Reference PR: https://github.com/google/gvisor/pull/4397 Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-10-21 | Merge pull request #4535 from lubinszARM:pr_kvm_exec_binary_1 | gVisor bot
PiperOrigin-RevId: 338321125
2020-10-20 | Merge pull request #4524 from lemin9538:lemin_arm64 | gVisor bot
PiperOrigin-RevId: 338126491
2020-10-18 | arm64 kvm: handle exception from accessing undefined instruction | Bin Lu
Consistent with the Linux approach, we raise a SIGILL to handle el0_undef. After applying this patch, exec_binary_test_runsc_kvm passes on Arm64. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-10-16 | Merge pull request #4387 from lubinszARM:pr_tls_host_sentry_1 | gVisor bot
PiperOrigin-RevId: 337544656
2020-10-15 | arm64: the ASID offset of TTBR register is 48 | Min Le
Signed-off-by: Min Le <lemin.lm@antgroup.com>
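As an illustration of what the ASID offset means, here is a minimal Go sketch that packs a translation table base address and an ASID into one TTBR-style value. It assumes only what the commit title states (the ASID field starts at bit 48); the names `ttbrASIDOffset`, `makeTTBR`, and `asidOf`, and the choice of masking the base to the low 48 bits, are hypothetical and not taken from the gVisor sources.

```go
package main

import "fmt"

// ttbrASIDOffset is the bit position of the ASID field, as stated in the
// commit title. The constant name is hypothetical.
const ttbrASIDOffset = 48

// ttbrBaseMask keeps the low 48 bits for the table base address.
// This split is an assumption made for illustration.
const ttbrBaseMask = (uint64(1) << ttbrASIDOffset) - 1

// makeTTBR combines a page table base address and an ASID into one value.
func makeTTBR(base uint64, asid uint16) uint64 {
	return (base & ttbrBaseMask) | uint64(asid)<<ttbrASIDOffset
}

// asidOf extracts the ASID field again.
func asidOf(ttbr uint64) uint16 {
	return uint16(ttbr >> ttbrASIDOffset)
}

func main() {
	ttbr := makeTTBR(0x40001000, 0x2a)
	fmt.Printf("ttbr=%#x asid=%#x\n", ttbr, asidOf(ttbr))
}
```

Using a wrong offset would place the ASID bits inside the address field or outside the ASID field, so the hardware would see a different ASID than the one intended.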
2020-10-13 | Merge pull request #4482 from lemin9538:lemin_arm64 | gVisor bot
PiperOrigin-RevId: 336976081
2020-10-13 | Merge pull request #4386 from lubinszARM:pr_testutil_tls_usr | gVisor bot
PiperOrigin-RevId: 336970511
2020-10-13 | Merge pull request #4374 from lubinszARM:pr_ffmpeg_kvm_01 | gVisor bot
PiperOrigin-RevId: 336962937
2020-10-13 | Avoid excessive Tgkill and wait operations. | Adin Scannell
The required states may simply not be observed by the thread running bounce, so track guest and user generations to ensure that at least one of the desired state transitions happens. Fixes #3532 PiperOrigin-RevId: 336908216
2020-10-12 | Merge pull request #4072 from adamliyi:droppt_fix | gVisor bot
PiperOrigin-RevId: 336719900
2020-10-11 | arm64 kvm: add tls-usr support | Bin Lu
The tls of guest-el1-sentry and host-el0-sentry may be different on Arm64. I added a solution for it. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-10-10 | arm64: set DZE bit to make EL0 can use DC ZVA | Min Le
Signed-off-by: Min Le <lemin.lm@antgroup.com>
2020-10-09 | platform/kvm: remove the unused field | Andrei Vagin
PiperOrigin-RevId: 336366624
2020-10-09 | Merge pull request #4040 from lemin9538:lemin_arm64 | gVisor bot
PiperOrigin-RevId: 336362818
2020-10-08 | arm64: the mair_el1 value is wrong | Min Le
The correct value is 0xbbff440c0400, but the defined constant is 0x000000000000ffc0 because of an operator error in _MT_EL1_INIT. Both kernel and user space memory attributes should be Normal memory, not DEVICE_nGnRE. Signed-off-by: Min Le <lemin.lm@antgroup.com>
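The value quoted in this commit can be reproduced by packing one 8-bit attribute per slot into a 64-bit word. The Go sketch below assumes the attribute encodings and slot indices used by the Linux arm64 kernel of that period; the constant and function names are hypothetical and are not gVisor's `_MT_EL1_INIT` definition, and the commit text does not say which operator was wrong, so the sketch shows only the intended composition.

```go
package main

import "fmt"

// Attribute encodings (one byte each). Assumed to match the Linux arm64
// MAIR_EL1 setup; names are hypothetical.
const (
	attrDeviceNGnRnE = 0x00
	attrDeviceNGnRE  = 0x04
	attrDeviceGRE    = 0x0c
	attrNormalNC     = 0x44
	attrNormal       = 0xff
	attrNormalWT     = 0xbb
)

// Slot indices inside MAIR_EL1 (assumed, as above).
const (
	idxDeviceNGnRnE = 0
	idxDeviceNGnRE  = 1
	idxDeviceGRE    = 2
	idxNormalNC     = 3
	idxNormal       = 4
	idxNormalWT     = 5
)

// mairField places an 8-bit attribute into its slot: slot i occupies
// bits [8*i+7 : 8*i].
func mairField(attr, idx uint64) uint64 {
	return attr << (idx * 8)
}

// mairEL1Init ORs all six fields together.
func mairEL1Init() uint64 {
	return mairField(attrDeviceNGnRnE, idxDeviceNGnRnE) |
		mairField(attrDeviceNGnRE, idxDeviceNGnRE) |
		mairField(attrDeviceGRE, idxDeviceGRE) |
		mairField(attrNormalNC, idxNormalNC) |
		mairField(attrNormal, idxNormal) |
		mairField(attrNormalWT, idxNormalWT)
}

func main() {
	fmt.Printf("%#x\n", mairEL1Init())
}
```

Under these assumptions the fields combine to the value the commit names as correct, which makes the shift-by-`8*index` layout easy to check by hand.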
2020-10-07 | Merge pull request #4376 from lubinszARM:pr_usr_tls_new | gVisor bot
PiperOrigin-RevId: 335930035
2020-10-07 | Add precise synchronization to KVM. | Adin Scannell
By using TSC scaling as a hack, we can trick the kernel into setting an offset of exactly zero. Huzzah! PiperOrigin-RevId: 335922019
2020-10-06 | Implement membarrier(2) commands other than *_SYNC_CORE. | Jamie Liu
Updates #267 PiperOrigin-RevId: 335713923
2020-10-05 | Merge pull request #4079 from lemin9538:arm64_fix | gVisor bot
PiperOrigin-RevId: 335532690
2020-10-02 | kvm/x86: handle a case when interrupts are enabled in the kernel space | Andrei Vagin
Previously we assumed that interrupts are always disabled in kernel space, but there is a case where the Go runtime switches to a goroutine that was saved in host mode. On restore, the popf instruction is used to restore flags, which means that all flags that the goroutine had in host mode are restored in kernel mode. And in host mode, interrupts are always enabled. Long story short, we can't use the IF flag to determine whether a task is running in user or kernel mode. This patch reworks the code so that in userspace, the first bit of the IOPL flag is always set. This doesn't give a task any new privileges because CPL in userspace is always 3. But then we can use this flag to distinguish user and kernel modes. The IOPL flag is never set in the kernel and host modes. Reported-by: syzbot+5036b325a8eb15c030cf@syzkaller.appspotmail.com Reported-by: syzbot+034d580e89ad67b8dc75@syzkaller.appspotmail.com Signed-off-by: Andrei Vagin <avagin@gmail.com>
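The flag test described in this commit can be sketched in a few lines of Go. The bit positions are the architectural x86 EFLAGS ones (IF is bit 9, IOPL occupies bits 12 and 13); the function and constant names are hypothetical and this is not the ring0 code itself, only an illustration of the idea.

```go
package main

import "fmt"

const (
	// flagIF is the x86 interrupt-enable flag (EFLAGS bit 9).
	flagIF = uint64(1) << 9
	// flagIOPL0 is the low bit of the two-bit IOPL field (EFLAGS bit 12).
	flagIOPL0 = uint64(1) << 12
)

// userFlags returns the flags to load when entering userspace: the low
// IOPL bit is always set, marking the context as "user".
func userFlags(flags uint64) uint64 {
	return flags | flagIOPL0
}

// isUserMode reports whether saved flags belong to a userspace context.
// It deliberately ignores IF, which may be set in kernel mode too.
func isUserMode(flags uint64) bool {
	return flags&flagIOPL0 != 0
}

func main() {
	kernelWithIRQs := flagIF // kernel context restored with IF enabled
	user := userFlags(flagIF)
	fmt.Println(isUserMode(kernelWithIRQs), isUserMode(user))
}
```

A kernel-mode context that happens to have IF set is still classified as kernel mode, which is exactly the case the old IF-based check got wrong.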
2020-09-30 | arm64 kvm: fix panic in kvm.dropPageTables | Yi Li
Related to issues #3019 and #4056. When running hello-world with gvisor-kvm, there is a panic on exit:
"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x3c0 pc=0x7c3f18]
goroutine 284 [running]:
... ...
gvisor.dev/gvisor/pkg/sentry/platform/kvm.(*machine).dropPageTables(0x4000166840, 0x400032a040)
pkg/sentry/platform/kvm/machine_arm64.go:111 +0x88 fp=0x4000479e00 sp=0x4000479da0 pc=0x7c3f18
"
Also make dropPageTables() arch independent.
2020-09-30 | Merge pull request #2256 from laijs:kpti | gVisor bot
PiperOrigin-RevId: 334674481
2020-09-30 | arm64 kvm: add a test case for kernel-tls checking | Bin Lu
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-29 | arm64 kvm: keep sentry-tls and usr-tls separately | Bin Lu
Currently there is a problem with the preservation of usr-tls, which leads to the contamination of sentry tls. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-29 | arm64 kvm: remove some redundant codes to improve the performance | Bin Lu
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-25 | make sure use the kernel space after change ASID | Min Le
After SWITCH_TO_APP_PAGETABLE, the ASID is changed to the application ASID, but some instructions still execute before ERET. Since these instructions do not use the kernel address space, they may use the application's TLB entries, which causes a fault. This patch makes sure that after SWITCH_TO_APP_PAGETABLE the sentry still uses the kernel address space, which is mapped as Global. Signed-off-by: Min Le <lemin.lm@antgroup.com>
2020-09-22 | arm64: set SCTLR_UCI bit in SCTLR_EL1 | Min Le
Some applications, such as OpenJDK, execute DC CVAU at EL0. If SCTLR_UCI is not set, the instruction traps to EL1, which causes a panic. Signed-off-by: Min Le <lemin.lm@antgroup.com>
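For readers unfamiliar with the control bit involved, a small Go sketch of the check follows. The bit position (UCI is bit 26 of SCTLR_EL1) comes from the Arm architecture; the constant and function names are hypothetical and this is not the gVisor initialization code.

```go
package main

import "fmt"

// sctlrUCI is SCTLR_EL1.UCI (bit 26). When clear, EL0 cache maintenance
// instructions such as DC CVAU trap to EL1. The name is hypothetical.
const sctlrUCI = uint64(1) << 26

// allowEL0CacheOps returns the SCTLR_EL1 value with UCI set.
func allowEL0CacheOps(sctlr uint64) uint64 {
	return sctlr | sctlrUCI
}

// el0CacheOpsTrap reports whether EL0 cache maintenance would trap.
func el0CacheOpsTrap(sctlr uint64) bool {
	return sctlr&sctlrUCI == 0
}

func main() {
	var sctlr uint64 // illustrative starting value with UCI clear
	fmt.Println(el0CacheOpsTrap(sctlr), el0CacheOpsTrap(allowEL0CacheOps(sctlr)))
}
```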
2020-09-16 | Merge pull request #3893 from lubinszARM:pr_n1_03 | gVisor bot
PiperOrigin-RevId: 332069743
2020-09-15 | Add support for OCI seccomp filters in the sandbox. | Ian Lewis
OCI configuration includes support for specifying seccomp filters. In runc, these filter configurations are converted into seccomp BPF programs and loaded into the kernel via libseccomp. runsc needs to be a static binary so, for runsc, we cannot rely on a C library and need to implement the functionality in Go. The generator added here implements basic support for taking OCI seccomp configuration and converting it into a seccomp BPF program with the same behavior as a program generated by libseccomp.
- New conditional operations were added to pkg/seccomp to support operations available in OCI.
- AllowAny and AllowValue were renamed to MatchAny and EqualTo to better reflect that syscalls matching the conditionals result in the provided action, not simply SCMP_RET_ALLOW.
- BuildProgram in pkg/seccomp no longer panics if provided an empty list of rules. It now builds a program with the architecture sanity check only.
- ProgramBuilder now allows adding labels that are unused. However, backwards jumps are still not permitted.
Fixes #510 PiperOrigin-RevId: 331938697
2020-09-11 | arm64 mm: asid and tlb support | Bin Lu
Some optimizations in this PR:
1. Move ASID from TTBR0 to TTBR1
2. tlb_flush_all
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-10 | arm64:place an SB sequence following an ERET instruction | Bin Lu
Some CPUs (e.g., ampere-emag) can speculate past an ERET instruction and potentially perform speculative accesses to memory before processing the exception return. Since the register state is often controlled by a lower privilege level at the point of an ERET, this could potentially be used as part of a side-channel attack. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-09 | Don't sched_setaffinity in ptrace platform. | Jamie Liu
PiperOrigin-RevId: 330777900
2020-08-26 | Merge pull request #3742 from lubinszARM:pr_n1_1 | gVisor bot
PiperOrigin-RevId: 328639254
2020-08-26 | Support stdlib analyzers with nogo. | Adin Scannell
This immediately revealed an escape analysis violation (!), where the sync.Map was being used in a context where escapes were not allowed. This is a relatively minor fix and is included. PiperOrigin-RevId: 328611237
2020-08-24 | Device major number greater than 2 digits in /proc/self/maps on arm64 N1 machine | Bin Lu
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-08-24 | Bump build constraints to 1.17 | Michael Pratt
This enables pre-release testing with 1.16. The intention is to replace these with a nogo check before the next release. PiperOrigin-RevId: 328193911
2020-08-20 | Consistent precondition formatting | Michael Pratt
Our "Preconditions:" blocks are very useful to determine the input invariants, but they are a bit inconsistent throughout the codebase, which makes them harder to read (particularly cases with 5+ conditions in a single paragraph). I've reformatted all of the cases to fit in simple rules:
1. Cases with a single condition are placed on a single line.
2. Cases with multiple conditions are placed in a bulleted list.
This format has been added to the style guide. I've also mentioned "Postconditions:", though those are much less frequently used, and all uses already match this style. PiperOrigin-RevId: 327687465
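The two formatting rules can be illustrated with a pair of small Go functions. The functions themselves (`half`, `window`) are hypothetical examples written for this illustration, and the `*` bullet character is an assumption about the style guide; only the single-line versus bulleted-list distinction comes from the commit message.

```go
package main

import "fmt"

// half returns n divided by two.
//
// Preconditions: n is even.
func half(n int) int {
	return n / 2
}

// window returns the sub-slice b[start:end].
//
// Preconditions:
// * 0 <= start <= end.
// * end <= len(b).
func window(b []byte, start, end int) []byte {
	return b[start:end]
}

func main() {
	fmt.Println(half(10), string(window([]byte("platform"), 0, 4)))
}
```

With one condition the invariant fits on the "Preconditions:" line; with several, each gets its own bullet so a reader can check them one at a time.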
2020-08-12 | Running hello-world on Thunderx2 with kvm | Bin Lu
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-08-07 | Add context.FullStateChanged() | Andrei Vagin
It indicates that the Sentry has changed the state of the thread and that subsequent calls to PullFullState() must do nothing. PiperOrigin-RevId: 325567415
2020-08-07 | Merge pull request #3069 from lubinszARM:pr_serr_injection2 | gVisor bot
PiperOrigin-RevId: 325546308
2020-08-06 | amd64: implement KPTI for gvisor | Lai Jiangshan
Actually, gvisor has KPTI (Kernel PageTable Isolation) between gr0 and gr3. But the upper half of the userCR3 contains the whole sentry kernel, which makes the kernel vulnerable to gr3 applications through CPU bugs. This patch implements full KPTI functionality for gvisor. It doesn't map the whole kernel in the upper half. It maps only the text section of the binary and the entry area required by the ISA. The entry area contains the global idt, the percpu gdt/tss, etc. The entry area packs all of these together and is less than 350k for 512 vCPUs. The text section is normally nonsensitive. It is possible to map only the entry functions (interrupt handlers etc.), but that requires some hacks. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-05 | amd64: introduce kernelEntry | Lai Jiangshan
kernelEntry is split out of CPU. It contains the minimal CPU-specific arch state that can be mapped in the upper half of the address space. This prepares for KPTI for gvisor. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-05 | amd64: don't check vcpu in bluepill() | Lai Jiangshan
m.Get() guarantees that if any OS thread TID is in guest, m.vCPUs[TID] points to the vCPU in which the OS thread TID is running. So if m.Get() returns with the current context in guest, its vCPU must be the same as what Get() returns. So bluepill() doesn't need to check whether the vCPU matches. The check needs to access the %gs register, which will no longer point to the vCPU once KPTI for gvisor is enabled. We can still fetch the vCPU pointer from %gs later (when %gs points to kernelEntry), but it needs the ENTRY_CPU_SELF which is generated by ring0/offset_amd64.go. So we simply remove the check. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-04 | amd64: less code and data in the upper half | Lai Jiangshan
Call jumpToKernel() in sysret()/iret() so that there is less code and data in the upper half and, especially, so that the current goroutine's stack and user regs are not accessed from the upper half (also with the help of previous patches, which leave less code in the userCR3 context). jumpToUser() is no longer needed, because the current goroutine's stack and the return value on the stack are lower-half addresses. This prepares for KPTI for gvisor. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-04 | amd64: switch to kernelCR3 when just return to gr0 | Lai Jiangshan
KernelCR3 takes effect as early as possible so that less code runs in the userCR3 environment. This prepares for the next patches, which reduce the code and data in the upper half in preparation for KPTI for gvisor. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-04 | amd64: switch to userCR3 just before return to gr3 | Lai Jiangshan
UserCR3 takes effect as late as possible so that less code runs in the userCR3 environment. This prepares for the next patches, which reduce the code and data in the upper half in preparation for KPTI for gvisor. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
2020-08-03 | Add callbacks to support lazy loading/restoring thread states | Andrei Vagin
PiperOrigin-RevId: 324748508
2020-07-31 | Merge pull request #3300 from lubinszARM:pr_fpsimd_usr | gVisor bot
PiperOrigin-RevId: 324309862
2020-07-30 | Merge pull request #3448 from lubinszARM:pr_tls_tests | gVisor bot
PiperOrigin-RevId: 324127810