Age | Commit message (Collapse) | Author |
|
PiperOrigin-RevId: 340484823
|
|
Use an sErr injection to trigger sigbus when we receive EFAULT from the
run ioctl.
After applying this patch, mmap_test_runsc_kvm will be passed on
Arm64.
Signed-off-by: Bin Lu <bin.lu@arm.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/4542 from lubinszARM:pr_kvm_mmap_1 f81bd42466d1d60a581e5fb34de18b78878c68c1
PiperOrigin-RevId: 340461239
|
|
Fixes: #509
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
PiperOrigin-RevId: 339921446
|
|
PiperOrigin-RevId: 339540747
|
|
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
I have added support for setSystemTimeLegacy() by setting cntvoff.
With this pr, TestRdtsc and other kvm syscall test cases(nanosleep,
wait...) can be passed on Arm64.
TO-DO: Add precise synchronization to KVM for Arm64.
Reference PR: https://github.com/google/gvisor/pull/4397
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
PiperOrigin-RevId: 338321125
|
|
PiperOrigin-RevId: 338126491
|
|
Consistent with the linux kernel, bad regs.Sp
return SIGSEGV
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
Consistent with the linux approach, we will produce a sigill to handle
el0_undef.
After applying this patch, exec_binary_test_runsc_kvm will be passed on
Arm64.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
PiperOrigin-RevId: 337544656
|
|
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
PiperOrigin-RevId: 336976081
|
|
PiperOrigin-RevId: 336970511
|
|
PiperOrigin-RevId: 336962937
|
|
The required states may simply not be observed by the thread running bounce, so
track guest and user generations to ensure that at least one of the desired
state transitions happens.
Fixes #3532
PiperOrigin-RevId: 336908216
|
|
PiperOrigin-RevId: 336719900
|
|
The tls of guest-el1-sentry and host-el0-sentry may be different on Arm64.
I added a solution for it.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
PiperOrigin-RevId: 336366624
|
|
PiperOrigin-RevId: 336362818
|
|
the correct value needed is 0xbbff440c0400 but the const
defined is 0x000000000000ffc0 due to the operator error
in _MT_EL1_INIT, both kernel and user space memory
attribute should be Normal memory not DEVICE_nGnRE
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
PiperOrigin-RevId: 335930035
|
|
By using TSC scaling as a hack, we can trick the kernel into setting an offset
of exactly zero. Huzzah!
PiperOrigin-RevId: 335922019
|
|
Updates #267
PiperOrigin-RevId: 335713923
|
|
PiperOrigin-RevId: 335532690
|
|
Before we thought that interrupts are always disabled in the kernel
space, but here is a case when goruntime switches on a goroutine which
has been saved in the host mode. On restore, the popf instruction is
used to restore flags and this means that all flags what the goroutine
has in the host mode will be restored in the kernel mode. And in the
host mode, interrupts are always enabled.
The long story short, we can't use the IF flag for determine whether a
tasks is running in user or kernel mode.
This patch reworks the code so that in userspace, the first bit of the
IOPL flag will be always set. This doesn't give any new privilidges for
a task because CPL in userspace is always 3. But then we can use this
flag to distinguish user and kernel modes. The IOPL flag is never set in
the kernel and host modes.
Reported-by: syzbot+5036b325a8eb15c030cf@syzkaller.appspotmail.com
Reported-by: syzbot+034d580e89ad67b8dc75@syzkaller.appspotmail.com
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
Related with issue #3019, #4056.
When running hello-world with gvisor-kvm, there is panic when exits:
"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x3c0 pc=0x7c3f18]
goroutine 284 [running]:
... ...
gvisor.dev/gvisor/pkg/sentry/platform/kvm.(*machine).dropPageTables(0x4000166840, 0x400032a040)
pkg/sentry/platform/kvm/machine_arm64.go:111 +0x88 fp=0x4000479e00 sp=0x4000479da0 pc=0x7c3f18
"
Also make dropPageTables() arch independent.
|
|
PiperOrigin-RevId: 334674481
|
|
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
Currently there is a problem with the preservation of usr-tls, which leads
to the contamination of sentry tls.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
after the SWITCH_TO_APP_PAGETABLE, the ASID is changed
to the application ASID, but there are still some
instruction before ERET, since these instruction is
not use the kernel address space, it may use the application's
TLB, which will cause fault, this patch can make sure that
after SWITCH_TO_APP_PAGETABLE sentry is still use kernel
address space which is mapped as Global.
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
some application such as openjdk will excute
DC CVAU at el0, if SCTLR_UCI is not set, it
will trap to EL1 which will cause panic.
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
PiperOrigin-RevId: 332069743
|
|
OCI configuration includes support for specifying seccomp filters. In runc,
these filter configurations are converted into seccomp BPF programs and loaded
into the kernel via libseccomp. runsc needs to be a static binary so, for
runsc, we cannot rely on a C library and need to implement the functionality
in Go.
The generator added here implements basic support for taking OCI seccomp
configuration and converting it into a seccomp BPF program with the same
behavior as a program generated by libseccomp.
- New conditional operations were added to pkg/seccomp to support operations
available in OCI.
- AllowAny and AllowValue were renamed to MatchAny and EqualTo to better reflect
that syscalls matching the conditionals result in the provided action not
simply SCMP_RET_ALLOW.
- BuildProgram in pkg/seccomp no longer panics if provided an empty list of
rules. It now builds a program with the architecture sanity check only.
- ProgramBuilder now allows adding labels that are unused. However, backwards
jumps are still not permitted.
Fixes #510
PiperOrigin-RevId: 331938697
|
|
Some optimizations in this pr:
1, Move ASID from TTBR0 to TTBR1
2, tlb_flush_all
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
Some CPUs(eg: ampere-emag) can speculate past an ERET instruction and potentially perform
speculative accesses to memory before processing the exception return.
Since the register state is often controlled by a lower privilege level
at the point of an ERET, this could potentially be used as part of a
side-channel attack.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
PiperOrigin-RevId: 330777900
|
|
PiperOrigin-RevId: 328639254
|
|
This immediately revealed an escape analysis violation (!), where
the sync.Map was being used in a context that escapes were not
allowed. This is a relatively minor fix and is included.
PiperOrigin-RevId: 328611237
|
|
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
This enables pre-release testing with 1.16. The intention is to replace these
with a nogo check before the next release.
PiperOrigin-RevId: 328193911
|
|
Our "Preconditions:" blocks are very useful to determine the input invariants,
but they are bit inconsistent throughout the codebase, which makes them harder
to read (particularly cases with 5+ conditions in a single paragraph).
I've reformatted all of the cases to fit in simple rules:
1. Cases with a single condition are placed on a single line.
2. Cases with multiple conditions are placed in a bulleted list.
This format has been added to the style guide.
I've also mentioned "Postconditions:", though those are much less frequently
used, and all uses already match this style.
PiperOrigin-RevId: 327687465
|
|
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
It indicates that the Sentry has changed the state of the thread and
next calls of PullFullState() has to do nothing.
PiperOrigin-RevId: 325567415
|
|
PiperOrigin-RevId: 325546308
|
|
Actually, gvisor has KPTI (Kernel PageTable Isolation) between
gr0 and gr3. But the upper half of the userCR3 contains the
whole sentry kernel which makes the kernel vulnerable to
gr3 APP through CPU bugs.
This patch implement full KPTI functionality for gvisor. It doesn't
map the whole kernel in the upper. It maps only the text section
of the binary and the entry area required by the ISA. The entry area
contains the global idt, the percpu gdt/tss etc. The entry area
packs all these together which is less than 350k for 512 vCPUs.
The text section is normally nonsensitive. It is possible to
map only the entry functions (interrupt handler etc.) only.
But it requires some hacks.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|
|
kernelEntry is split from CPU that contains minimal CPU-specific
arch state that can be mapped at the upper of the address space.
It is prepared for KPTI for gvisor.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antfin.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
|