Age | Commit message (Collapse) | Author |
|
Merkle tree library was originally using Read/Seek to access data and
tree, since the parameters are io.ReadSeeker. This could cause race
conditions if multiple threads accesses the same fd to read. Here we
change to use ReaderAt, and implement it with PRead to make it thread
safe.
PiperOrigin-RevId: 336779260
|
|
This change aims to fix the memory leak issue reported inĀ #3933.
Background:
VFS2 kernfs kept accumulating invalid dentries if those dentries were not
walked on. After substantial consideration of the problem by our team, we
decided to have an LRU cache solution. This change is the first part to that
solution, where we don't cache anything. The LRU cache can be added on top of
this.
What has changed:
- Introduced the concept of an inode tree in kernfs.OrderedChildren.
This is helpful is cases where the lifecycle of an inode is different from
that of a dentry.
- OrderedChildren now deals with initialized inodes instead of initialized
dentries. It now implements Lookup() where it constructs a new dentry
using the inode.
- OrderedChildren holds a ref on all its children inodes. With this change,
now an inode can "outlive" a dentry pointing to it. See comments in
kernfs.OrderedChildren.
- The kernfs dentry tree is solely maintained by kernfs only. Inode
implementations can not modify the dentry tree.
- Dentries that reach ref count 0 are removed from the dentry tree.
- revalidateChildLocked now defer-DecRefs the newly created dentry from
Inode.Lookup(), limiting its life to the current filesystem operation. If
refs are picked on the dentry during the FS op (via an FD or something),
then it will stick around and will be removed when the FD is closed. So there
is essentially _no caching_ for Look()ed up dentries.
- kernfs.DecRef does not have the precondition that fs.mu must be locked.
Fixes #3933
PiperOrigin-RevId: 336768576
|
|
PiperOrigin-RevId: 336719900
|
|
PiperOrigin-RevId: 336694658
|
|
PiperOrigin-RevId: 336395445
|
|
PiperOrigin-RevId: 336366624
|
|
PiperOrigin-RevId: 336362818
|
|
- sysinfo(2) does not actually require a fine-grained breakdown of memory
usage. Accordingly, instead of calling pgalloc.MemoryFile.UpdateUsage() to
update the sentry's fine-grained memory accounting snapshot, just use
pgalloc.MemoryFile.TotalUsage() (which is a single fstat(), and therefore far
cheaper).
- Use the number of threads in the root PID namespace (i.e. globally) rather
than in the task's PID namespace for consistency with Linux (which just reads
global variable nr_threads), and add a new method to kernel.PIDNamespace to
allow this to be read directly from an underlying map rather than requiring
the allocation and population of an intermediate slice.
PiperOrigin-RevId: 336353100
|
|
PiperOrigin-RevId: 336339194
|
|
Reported-by: syzbot+bb82fb556d5d0a43f632@syzkaller.appspotmail.com
PiperOrigin-RevId: 336324720
|
|
PiperOrigin-RevId: 336304024
|
|
cf. 2a36ab717e8f "rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ"
PiperOrigin-RevId: 336186795
|
|
When a response needs to be sent to an incoming packet, the stack should
consult its neighbour table to determine the remote address's link
address.
When an entry does not exist in the stack's neighbor table, the stack
should queue the packet while link resolution completes. See comments.
PiperOrigin-RevId: 336185457
|
|
the correct value needed is 0xbbff440c0400 but the const
defined is 0x000000000000ffc0 due to the operator error
in _MT_EL1_INIT, both kernel and user space memory
attribute should be Normal memory not DEVICE_nGnRE
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|
|
This change also adds support to go_stateify for detecting an appropriate
receiver name, avoiding a large number of false positives.
PiperOrigin-RevId: 335994587
|
|
PiperOrigin-RevId: 335930035
|
|
By using TSC scaling as a hack, we can trick the kernel into setting an offset
of exactly zero. Huzzah!
PiperOrigin-RevId: 335922019
|
|
Updates #267
PiperOrigin-RevId: 335713923
|
|
PiperOrigin-RevId: 335583637
|
|
PiperOrigin-RevId: 335548610
|
|
PiperOrigin-RevId: 335532690
|
|
PiperOrigin-RevId: 335492800
|
|
- When the KCOV_ENABLE_TRACE ioctl is called with the trace kind KCOV_TRACE_PC,
the kcov mode should be set to KCOV_*MODE*_TRACE_PC.
- When the owning task of kcov exits, the memory mapping should not be cleared
so it can be used by other tasks.
- Add more tests (also tested on native Linux kcov).
PiperOrigin-RevId: 335202585
|
|
Before we thought that interrupts are always disabled in the kernel
space, but here is a case when goruntime switches on a goroutine which
has been saved in the host mode. On restore, the popf instruction is
used to restore flags and this means that all flags what the goroutine
has in the host mode will be restored in the kernel mode. And in the
host mode, interrupts are always enabled.
The long story short, we can't use the IF flag for determine whether a
tasks is running in user or kernel mode.
This patch reworks the code so that in userspace, the first bit of the
IOPL flag will be always set. This doesn't give any new privilidges for
a task because CPL in userspace is always 3. But then we can use this
flag to distinguish user and kernel modes. The IOPL flag is never set in
the kernel and host modes.
Reported-by: syzbot+5036b325a8eb15c030cf@syzkaller.appspotmail.com
Reported-by: syzbot+034d580e89ad67b8dc75@syzkaller.appspotmail.com
Signed-off-by: Andrei Vagin <avagin@gmail.com>
|
|
PiperOrigin-RevId: 335077195
|
|
PiperOrigin-RevId: 335051794
|
|
When a child's root hash or its Merkle path is modified in its parent's
Merkle tree file, opening the file should fail, provided the directory
is verity enabled. The test for this behavior is added.
PiperOrigin-RevId: 334963690
|
|
PiperOrigin-RevId: 334721453
|
|
Adds support for the IPv6-compatible redirect target. Redirection is a limited
form of DNAT, where the destination is always the localhost.
Updates #3549.
PiperOrigin-RevId: 334698344
|
|
Related with issue #3019, #4056.
When running hello-world with gvisor-kvm, there is panic when exits:
"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x3c0 pc=0x7c3f18]
goroutine 284 [running]:
... ...
gvisor.dev/gvisor/pkg/sentry/platform/kvm.(*machine).dropPageTables(0x4000166840, 0x400032a040)
pkg/sentry/platform/kvm/machine_arm64.go:111 +0x88 fp=0x4000479e00 sp=0x4000479da0 pc=0x7c3f18
"
Also make dropPageTables() arch independent.
|
|
PiperOrigin-RevId: 334682753
|
|
PiperOrigin-RevId: 334678513
|
|
PiperOrigin-RevId: 334674481
|
|
PiperOrigin-RevId: 334656292
|
|
PiperOrigin-RevId: 334652998
|
|
The tests confirms that when a file is opened in verity, the
corresponding Merkle trees are generated. Also a normal read succeeds on
verity enabled files, but fails if either the verity file or the Merkle
tree file is modified.
PiperOrigin-RevId: 334640331
|
|
PiperOrigin-RevId: 334531794
|
|
- Rewrite arch.Stack.{Push,Pop}. For the most part, stack now
implements marshal.CopyContext and can be used as the target of
marshal operations. Stack.Push had some extra logic for
automatically null-terminating slices. This was only used for two
specific types of slices, and is now handled explicitly.
- Delete usermem.CopyObject{In,Out}.
- Replace most remaining uses of the encoding/binary package with
go-marshal. Most of these were using the binary package to compute
the size of a struct, which go-marshal can directly replace. ~3 uses
of the binary package remain. These aren't reasonably replaceable by
go-marshal: for example one use is to construct the syscall
trampoline for systrap.
- Fill out remaining convenience wrappers in the primitive package.
PiperOrigin-RevId: 334502375
|
|
As per relevant IP RFCS (see code comments), broadcast (for IPv4) and
multicast addresses are not allowed. Currently checks for these are
done at the transport layer, but since it is explicitly forbidden at
the IP layers, check for them there.
This change also removes the UDP.InvalidSourceAddress stat since there
is no longer a need for it.
Test: ip_test.TestSourceAddressValidation
PiperOrigin-RevId: 334490971
|
|
PiperOrigin-RevId: 334478850
|
|
Like matchers, targets should use a module-like register/lookup system. This
replaces the brittle switch statements we had before.
The only behavior change is supporing IPT_GET_REVISION_TARGET. This makes it
much easier to add IPv6 redirect in the next change.
Updates #3549.
PiperOrigin-RevId: 334469418
|
|
PiperOrigin-RevId: 334428344
|
|
Currently there is a problem with the preservation of usr-tls, which leads
to the contamination of sentry tls.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
arm64 vfs2: Add support for io_submit/fallocate/
sendfile/newfstatat/readahead/fadvise64
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
|
|
PiperOrigin-RevId: 334263322
|
|
Fixes #1479, #317.
PiperOrigin-RevId: 334258052
|
|
Do not release dirMu between checking whether to create a child and actually
inserting it.
Also fixes a bug in fusefs which was causing it to deadlock under the new
lock ordering. We do not need to call kernfs.Dentry.InsertChild from newEntry
because it will always be called at the kernfs filesystem layer.
Updates #1193.
PiperOrigin-RevId: 334049264
|
|
Previously, we did not check the kcov mode when performing task work. As a
result, disabling kcov did not do anything.
Also avoid expensive atomic RMW when consuming coverage data. We don't need the
swap if the value is already zero (which is most of the time), and it is ok if
there are slight inconsistencies due to a race between coverage data generation
(incrementing the value) and consumption (reading a nonzero value and writing
zero).
PiperOrigin-RevId: 334049207
|
|
This patch adds minor changes for Arm64 platform:
1, add SetRobustList/GetRobustList support for arm64 syscall module.
2, add newfstatat support for arm64 vfs2 syscall module.
3, add tls value in ProtoBuf.
Signed-off-by: Bin Lu <bin.lu@arm.com>
|
|
after the SWITCH_TO_APP_PAGETABLE, the ASID is changed
to the application ASID, but there are still some
instruction before ERET, since these instruction is
not use the kernel address space, it may use the application's
TLB, which will cause fault, this patch can make sure that
after SWITCH_TO_APP_PAGETABLE sentry is still use kernel
address space which is mapped as Global.
Signed-off-by: Min Le <lemin.lm@antgroup.com>
|