summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry
AgeCommit message (Collapse)Author
2020-10-12Change Merkle tree library to use ReaderAtChong Cai
Merkle tree library was originally using Read/Seek to access data and tree, since the parameters are io.ReadSeeker. This could cause race conditions if multiple threads accesses the same fd to read. Here we change to use ReaderAt, and implement it with PRead to make it thread safe. PiperOrigin-RevId: 336779260
2020-10-12[vfs] kernfs: Fix inode memory leak issue.Ayush Ranjan
This change aims to fix the memory leak issue reported inĀ #3933. Background: VFS2 kernfs kept accumulating invalid dentries if those dentries were not walked on. After substantial consideration of the problem by our team, we decided to have an LRU cache solution. This change is the first part to that solution, where we don't cache anything. The LRU cache can be added on top of this. What has changed: - Introduced the concept of an inode tree in kernfs.OrderedChildren. This is helpful is cases where the lifecycle of an inode is different from that of a dentry. - OrderedChildren now deals with initialized inodes instead of initialized dentries. It now implements Lookup() where it constructs a new dentry using the inode. - OrderedChildren holds a ref on all its children inodes. With this change, now an inode can "outlive" a dentry pointing to it. See comments in kernfs.OrderedChildren. - The kernfs dentry tree is solely maintained by kernfs only. Inode implementations can not modify the dentry tree. - Dentries that reach ref count 0 are removed from the dentry tree. - revalidateChildLocked now defer-DecRefs the newly created dentry from Inode.Lookup(), limiting its life to the current filesystem operation. If refs are picked on the dentry during the FS op (via an FD or something), then it will stick around and will be removed when the FD is closed. So there is essentially _no caching_ for Look()ed up dentries. - kernfs.DecRef does not have the precondition that fs.mu must be locked. Fixes #3933 PiperOrigin-RevId: 336768576
2020-10-12Merge pull request #4072 from adamliyi:droppt_fixgVisor bot
PiperOrigin-RevId: 336719900
2020-10-12[vfs2] Don't leak disconnected mounts.Dean Deng
PiperOrigin-RevId: 336694658
2020-10-09Include stat in Verity hashChong Cai
PiperOrigin-RevId: 336395445
2020-10-09platform/kvm: remove the unused fieldAndrei Vagin
PiperOrigin-RevId: 336366624
2020-10-09Merge pull request #4040 from lemin9538:lemin_arm64gVisor bot
PiperOrigin-RevId: 336362818
2020-10-09Reduce the cost of sysinfo(2).Jamie Liu
- sysinfo(2) does not actually require a fine-grained breakdown of memory usage. Accordingly, instead of calling pgalloc.MemoryFile.UpdateUsage() to update the sentry's fine-grained memory accounting snapshot, just use pgalloc.MemoryFile.TotalUsage() (which is a single fstat(), and therefore far cheaper). - Use the number of threads in the root PID namespace (i.e. globally) rather than in the task's PID namespace for consistency with Linux (which just reads global variable nr_threads), and add a new method to kernel.PIDNamespace to allow this to be read directly from an underlying map rather than requiring the allocation and population of an intermediate slice. PiperOrigin-RevId: 336353100
2020-10-09Automated rollback of changelist 336304024Ghanan Gowripalan
PiperOrigin-RevId: 336339194
2020-10-09syscalls: Don't leak a file on the error pathAndrei Vagin
Reported-by: syzbot+bb82fb556d5d0a43f632@syzkaller.appspotmail.com PiperOrigin-RevId: 336324720
2020-10-09Automated rollback of changelist 336185457Bhasker Hariharan
PiperOrigin-RevId: 336304024
2020-10-08Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ.Jamie Liu
cf. 2a36ab717e8f "rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ" PiperOrigin-RevId: 336186795
2020-10-08Do not resolve routes immediatelyGhanan Gowripalan
When a response needs to be sent to an incoming packet, the stack should consult its neighbour table to determine the remote address's link address. When an entry does not exist in the stack's neighbor table, the stack should queue the packet while link resolution completes. See comments. PiperOrigin-RevId: 336185457
2020-10-08arm64: the mair_el1 value is wrongMin Le
the correct value needed is 0xbbff440c0400 but the const defined is 0x000000000000ffc0 due to the operator error in _MT_EL1_INIT, both kernel and user space memory attribute should be Normal memory not DEVICE_nGnRE Signed-off-by: Min Le <lemin.lm@antgroup.com>
2020-10-07Add staticcheck and staticstyle analyzers.Adin Scannell
This change also adds support to go_stateify for detecting an appropriate receiver name, avoiding a large number of false positives. PiperOrigin-RevId: 335994587
2020-10-07Merge pull request #4376 from lubinszARM:pr_usr_tls_newgVisor bot
PiperOrigin-RevId: 335930035
2020-10-07Add precise synchronization to KVM.Adin Scannell
By using TSC scaling as a hack, we can trick the kernel into setting an offset of exactly zero. Huzzah! PiperOrigin-RevId: 335922019
2020-10-06Implement membarrier(2) commands other than *_SYNC_CORE.Jamie Liu
Updates #267 PiperOrigin-RevId: 335713923
2020-10-06[vfs2] Don't leak reference from Mountnamespace.Root().Dean Deng
PiperOrigin-RevId: 335583637
2020-10-05Simplify nil assignment in kcov.Dean Deng
PiperOrigin-RevId: 335548610
2020-10-05Merge pull request #4079 from lemin9538:arm64_fixgVisor bot
PiperOrigin-RevId: 335532690
2020-10-05Merge pull request #4368 from zhlhahaha:1979gVisor bot
PiperOrigin-RevId: 335492800
2020-10-03Fix kcov enabling and disabling procedures.Dean Deng
- When the KCOV_ENABLE_TRACE ioctl is called with the trace kind KCOV_TRACE_PC, the kcov mode should be set to KCOV_*MODE*_TRACE_PC. - When the owning task of kcov exits, the memory mapping should not be cleared so it can be used by other tasks. - Add more tests (also tested on native Linux kcov). PiperOrigin-RevId: 335202585
2020-10-02kvm/x86: handle a case when interrupts are enabled in the kernel spaceAndrei Vagin
Before we thought that interrupts are always disabled in the kernel space, but here is a case when goruntime switches on a goroutine which has been saved in the host mode. On restore, the popf instruction is used to restore flags and this means that all flags what the goroutine has in the host mode will be restored in the kernel mode. And in the host mode, interrupts are always enabled. The long story short, we can't use the IF flag for determine whether a tasks is running in user or kernel mode. This patch reworks the code so that in userspace, the first bit of the IOPL flag will be always set. This doesn't give any new privilidges for a task because CPL in userspace is always 3. But then we can use this flag to distinguish user and kernel modes. The IOPL flag is never set in the kernel and host modes. Reported-by: syzbot+5036b325a8eb15c030cf@syzkaller.appspotmail.com Reported-by: syzbot+034d580e89ad67b8dc75@syzkaller.appspotmail.com Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-10-02Convert uses of the binary package in kernel to go-marshal.Rahat Mahmood
PiperOrigin-RevId: 335077195
2020-10-02Merge pull request #4035 from lubinszARM:pr_misc_01gVisor bot
PiperOrigin-RevId: 335051794
2020-10-01Add a verity test for modified parent Merkle fileChong Cai
When a child's root hash or its Merkle path is modified in its parent's Merkle tree file, opening the file should fail, provided the directory is verity enabled. The test for this behavior is added. PiperOrigin-RevId: 334963690
2020-09-30Merge pull request #3824 from btw616:fix/issue-3823gVisor bot
PiperOrigin-RevId: 334721453
2020-09-30ip6tables: redirect supportKevin Krakauer
Adds support for the IPv6-compatible redirect target. Redirection is a limited form of DNAT, where the destination is always the localhost. Updates #3549. PiperOrigin-RevId: 334698344
2020-09-30arm64 kvm: fix panic in kvm.dropPageTablesYi Li
Related with issue #3019, #4056. When running hello-world with gvisor-kvm, there is panic when exits: " panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x3c0 pc=0x7c3f18] goroutine 284 [running]: ... ... gvisor.dev/gvisor/pkg/sentry/platform/kvm.(*machine).dropPageTables(0x4000166840, 0x400032a040) pkg/sentry/platform/kvm/machine_arm64.go:111 +0x88 fp=0x4000479e00 sp=0x4000479da0 pc=0x7c3f18 " Also make dropPageTables() arch independent.
2020-09-30Implement ioctl with measure in verity fsChong Cai
PiperOrigin-RevId: 334682753
2020-09-30Internal change.Chong Cai
PiperOrigin-RevId: 334678513
2020-09-30Merge pull request #2256 from laijs:kptigVisor bot
PiperOrigin-RevId: 334674481
2020-09-30[go-marshal] Port ext codebase to use go marshal.Ayush Ranjan
PiperOrigin-RevId: 334656292
2020-09-30Make all Target.Action implementation pointer receiversKevin Krakauer
PiperOrigin-RevId: 334652998
2020-09-30Add verity fs testsChong Cai
The tests confirms that when a file is opened in verity, the corresponding Merkle trees are generated. Also a normal read succeeds on verity enabled files, but fails if either the verity file or the Merkle tree file is modified. PiperOrigin-RevId: 334640331
2020-09-29iptables: remove unused min/max NAT range fieldsKevin Krakauer
PiperOrigin-RevId: 334531794
2020-09-29Replace remaining uses of reflection-based marshalling.Rahat Mahmood
- Rewrite arch.Stack.{Push,Pop}. For the most part, stack now implements marshal.CopyContext and can be used as the target of marshal operations. Stack.Push had some extra logic for automatically null-terminating slices. This was only used for two specific types of slices, and is now handled explicitly. - Delete usermem.CopyObject{In,Out}. - Replace most remaining uses of the encoding/binary package with go-marshal. Most of these were using the binary package to compute the size of a struct, which go-marshal can directly replace. ~3 uses of the binary package remain. These aren't reasonably replaceable by go-marshal: for example one use is to construct the syscall trampoline for systrap. - Fill out remaining convenience wrappers in the primitive package. PiperOrigin-RevId: 334502375
2020-09-29Don't allow broadcast/multicast source addressGhanan Gowripalan
As per relevant IP RFCS (see code comments), broadcast (for IPv4) and multicast addresses are not allowed. Currently checks for these are done at the transport layer, but since it is explicitly forbidden at the IP layers, check for them there. This change also removes the UDP.InvalidSourceAddress stat since there is no longer a need for it. Test: ip_test.TestSourceAddressValidation PiperOrigin-RevId: 334490971
2020-09-29Add /proc/[pid]/cwdFabricio Voznika
PiperOrigin-RevId: 334478850
2020-09-29iptables: refactor to make targets extendableKevin Krakauer
Like matchers, targets should use a module-like register/lookup system. This replaces the brittle switch statements we had before. The only behavior change is supporing IPT_GET_REVISION_TARGET. This makes it much easier to add IPv6 redirect in the next change. Updates #3549. PiperOrigin-RevId: 334469418
2020-09-29Merge pull request #3875 from btw616:fix/issue-3874gVisor bot
PiperOrigin-RevId: 334428344
2020-09-29arm64 kvm: keep sentry-tls and usr-tls separatelyBin Lu
Currently there is a problem with the preservation of usr-tls, which leads to the contamination of sentry tls. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-29add related arm64 syscall for vfs2Howard Zhang
arm64 vfs2: Add support for io_submit/fallocate/ sendfile/newfstatat/readahead/fadvise64 Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-09-28Don't leak dentries returned by sockfs.NewDentry().Jamie Liu
PiperOrigin-RevId: 334263322
2020-09-28Support inotify in overlayfs.Dean Deng
Fixes #1479, #317. PiperOrigin-RevId: 334258052
2020-09-27Fix kernfs race condition.Dean Deng
Do not release dirMu between checking whether to create a child and actually inserting it. Also fixes a bug in fusefs which was causing it to deadlock under the new lock ordering. We do not need to call kernfs.Dentry.InsertChild from newEntry because it will always be called at the kernfs filesystem layer. Updates #1193. PiperOrigin-RevId: 334049264
2020-09-27Clean up kcov.Dean Deng
Previously, we did not check the kcov mode when performing task work. As a result, disabling kcov did not do anything. Also avoid expensive atomic RMW when consuming coverage data. We don't need the swap if the value is already zero (which is most of the time), and it is ok if there are slight inconsistencies due to a race between coverage data generation (incrementing the value) and consumption (reading a nonzero value and writing zero). PiperOrigin-RevId: 334049207
2020-09-25arm64: some minor changesBin Lu
This patch adds minor changes for Arm64 platform: 1, add SetRobustList/GetRobustList support for arm64 syscall module. 2, add newfstatat support for arm64 vfs2 syscall module. 3, add tls value in ProtoBuf. Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-09-25make sure use the kernel space after change ASIDMin Le
after the SWITCH_TO_APP_PAGETABLE, the ASID is changed to the application ASID, but there are still some instruction before ERET, since these instruction is not use the kernel address space, it may use the application's TLB, which will cause fault, this patch can make sure that after SWITCH_TO_APP_PAGETABLE sentry is still use kernel address space which is mapped as Global. Signed-off-by: Min Le <lemin.lm@antgroup.com>