summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/kernel
AgeCommit message (Collapse)Author
2020-08-26Merge release-20200818.0-56-gdf3c105f4 (automated)gVisor bot
2020-08-25Use new reference count utility throughout gvisor.Dean Deng
This uses the refs_vfs2 template in vfs2 as well as objects common to vfs1 and vfs2. Note that vfs1-only refcounts are not replaced, since vfs1 will be deleted soon anyway. The following structs now use the new tool, with leak check enabled: devpts:rootInode fuse:inode kernfs:Dentry kernfs:dir kernfs:readonlyDir kernfs:StaticDirectory proc:fdDirInode proc:fdInfoDirInode proc:subtasksInode proc:taskInode proc:tasksInode vfs:FileDescription vfs:MountNamespace vfs:Filesystem sys:dir kernel:FSContext kernel:ProcessGroup kernel:Session shm:Shm mm:aioMappable mm:SpecialMappable transport:queue And the following use the template, but because they currently are not leak checked, a TODO is left instead of enabling leak check in this patch: kernel:FDTable tun:tunEndpoint Updates #1486. PiperOrigin-RevId: 328460377
2020-08-25Merge release-20200818.0-54-gcb573c8e0 (automated)gVisor bot
2020-08-25Expose basic coverage information to userspace through kcov interface.Dean Deng
In Linux, a kernel configuration is set that compiles the kernel with a custom function that is called at the beginning of every basic block, which updates the memory-mapped coverage information. The Go coverage tool does not allow us to inject arbitrary instructions into basic blocks, but it does provide data that we can convert to a kcov-like format and transfer them to userspace through a memory mapping. Note that this is not a strict implementation of kcov, which is especially tricky to do because we do not have the same coverage tools available in Go that that are available for the actual Linux kernel. In Linux, a kernel configuration is set that compiles the kernel with a custom function that is called at the beginning of every basic block to write program counters to the kcov memory mapping. In Go, however, coverage tools only give us a count of basic blocks as they are executed. Every time we return to userspace, we collect the coverage information and write out PCs for each block that was executed, providing userspace with the illusion that the kcov data is always up to date. For convenience, we also generate a unique synthetic PC for each block instead of using actual PCs. Finally, we do not provide thread-specific coverage data (each kcov instance only contains PCs executed by the thread owning it); instead, we will supply data for any file specified by -- instrumentation_filter. Also, fix issue in nogo that was causing pkg/coverage:coverage_nogo compilation to fail. PiperOrigin-RevId: 328426526
2020-08-21Merge release-20200810.0-84-g5f33fdf37 (automated)gVisor bot
2020-08-21Pass overlay credentials via context in copy up.Nicolas Lacasse
Some VFS operations (those which operate on FDs) get their credentials via the context instead of via an explicit creds param. For these cases, we must pass the overlay credentials on the context. PiperOrigin-RevId: 327881259
2020-08-20Merge release-20200810.0-74-g129018ab3 (automated)gVisor bot
2020-08-20Consistent precondition formattingMichael Pratt
Our "Preconditions:" blocks are very useful to determine the input invariants, but they are bit inconsistent throughout the codebase, which makes them harder to read (particularly cases with 5+ conditions in a single paragraph). I've reformatted all of the cases to fit in simple rules: 1. Cases with a single condition are placed on a single line. 2. Cases with multiple conditions are placed in a bulleted list. This format has been added to the style guide. I've also mentioned "Postconditions:", though those are much less frequently used, and all uses already match this style. PiperOrigin-RevId: 327687465
2020-08-19Merge release-20200810.0-57-gf2822da54 (automated)gVisor bot
2020-08-18Move ERESTART* error definitions to syserror package.Dean Deng
This is needed to avoid circular dependencies between the vfs and kernel packages. PiperOrigin-RevId: 327355524
2020-08-17Merge release-20200810.0-39-g3bd066d50 (automated)gVisor bot
2020-08-17Remove weak references from unix sockets.Dean Deng
The abstract socket namespace no longer holds any references on sockets. Instead, TryIncRef() is used when a socket is being retrieved in BoundEndpoint(). Abstract sockets are now responsible for removing themselves from the namespace they are in, when they are destroyed. Updates #1486. PiperOrigin-RevId: 327064173
2020-08-08Merge release-20200804.0-53-g13a8ae81b (automated)gVisor bot
2020-08-07Add context.FullStateChanged()Andrei Vagin
It indicates that the Sentry has changed the state of the thread and next calls of PullFullState() has to do nothing. PiperOrigin-RevId: 325567415
2020-08-04Merge release-20200622.1-318-g25798f214 (automated)gVisor bot
2020-08-03Add callbacks to support lazy loading/restoring thread statesAndrei Vagin
PiperOrigin-RevId: 324748508
2020-08-03Merge release-20200622.1-313-gb2ae7ea1b (automated)gVisor bot
2020-08-03Plumbing context.Context to DecRef() and Release().Nayana Bidari
context is passed to DecRef() and Release() which is needed for SO_LINGER implementation. PiperOrigin-RevId: 324672584
2020-07-27Merge release-20200622.1-232-gf347a578b (automated)gVisor bot
2020-07-27Move platform.File in memmapAndrei Vagin
The subsequent systrap changes will need to import memmap from the platform package. PiperOrigin-RevId: 323409486
2020-07-24Merge release-20200622.1-208-g82a5cada5 (automated)gVisor bot
2020-07-23Add AfterFunc to tcpip.ClockSam Balana
Changes the API of tcpip.Clock to also provide a method for scheduling and rescheduling work after a specified duration. This change also implements the AfterFunc method for existing implementations of tcpip.Clock. This is the groundwork required to mock time within tests. All references to CancellableTimer has been replaced with the tcpip.Job interface, allowing for custom implementations of scheduling work. This is a BREAKING CHANGE for clients that implement their own tcpip.Clock or use tcpip.CancellableTimer. Migration plan: 1. Add AfterFunc(d, f) to tcpip.Clock 2. Replace references of tcpip.CancellableTimer with tcpip.Job 3. Replace calls to tcpip.CancellableTimer#StopLocked with tcpip.Job#Cancel 4. Replace calls to tcpip.CancellableTimer#Reset with tcpip.Job#Schedule 5. Replace calls to tcpip.NewCancellableTimer with tcpip.NewJob. PiperOrigin-RevId: 322906897
2020-07-24Merge release-20200622.1-207-g4ec351633 (automated)gVisor bot
2020-07-23Implement get/set_robust_list.Nicolas Lacasse
PiperOrigin-RevId: 322904430
2020-07-23Merge release-20200622.1-203-g8fed97794 (automated)gVisor bot
2020-07-23Add task work mechanism.Dean Deng
Like task_work in Linux, this allows us to register callbacks to be executed before returning to userspace. This is needed for kcov support, which requires coverage information to be up-to-date whenever we are in user mode. We will provide coverage data through the kcov interface to enable coverage-directed fuzzing in syzkaller. One difference from Linux is that task work cannot queue work before the transition to userspace that it precedes; queued work will be picked up before the next transition. PiperOrigin-RevId: 322889984
2020-07-15Merge release-20200622.1-160-g8939fae0a (automated)gVisor bot
2020-07-15Merge pull request #3165 from ridwanmsharif:ridwanmsharif/fuse-off-by-defaultgVisor bot
PiperOrigin-RevId: 321411758
2020-07-13Merge release-20200622.1-108-g59a547940 (automated)gVisor bot
2020-07-13Disable debug time adjustment loggingFabricio Voznika
When --debug is enabled, the following log messages are printed every second filling up the log: D0430 18:04:42.823775 129561 parameters.go:238] Clock(Monotonic): error: 46 ns, adjusted frequency from 3591713733 Hz to 3591714196 Hz D0430 18:04:42.823870 129561 parameters.go:238] Clock(Realtime): error: 36 ns, adjusted frequency from 3591714003 Hz to 3591714169 Hz D0430 18:04:42.823892 129561 timekeeper.go:209] Updating VDSO parameters: {monotonicReady:1 monotonicBaseCycles:15758797714254696 monotonicBaseRef:29000233837 monotonicFrequency:3591714196 realtimeReady:1 realtimeBaseCycles:15758797714610880 realtimeBaseRef:1588269882823867374 realtimeFrequency:3591714169} Info and warning messages for larger changes are kept the same. PiperOrigin-RevId: 321048523
2020-07-09Gate FUSE behind a runsc flagRidwan Sharif
This change gates all FUSE commands (by gating /dev/fuse) behind a runsc flag. In order to use FUSE commands, use the --fuse flag with the --vfs2 flag. Check if FUSE is enabled by running dmesg in the sandbox.
2020-07-01Merge release-20200622.1-52-g6a90c88b9 (automated)gVisor bot
2020-07-01Port fallocate to VFS2.Zach Koopmans
PiperOrigin-RevId: 319283715
2020-07-01Merge release-20200622.1-49-gcda2979b6 (automated)gVisor bot
2020-07-01Complete async signal delivery support in vfs2.Dean Deng
- Support FIOASYNC, FIO{SET,GET}OWN, SIOC{G,S}PGRP (refactor getting/setting owner in the process). - Unset signal recipient when setting owner with pid == 0 and valid owner type. Updates #2923. PiperOrigin-RevId: 319231420
2020-06-28Merge release-20200622.1-39-ge8f1a5c1f (automated)gVisor bot
2020-06-27Port GETOWN, SETOWN fcntls to vfs2.Dean Deng
Also make some fixes to vfs1's F_SETOWN. The fcntl test now entirely passes on vfs2. Fixes #2920. PiperOrigin-RevId: 318669529
2020-06-26Merge release-20200622.1-31-g9cfc15497 (automated)gVisor bot
2020-06-26Require CAP_SYS_ADMIN in the root user namespace for TTY theftKevin Krakauer
PiperOrigin-RevId: 318563543
2020-06-25Merge release-20200622.1-22-g406946187 (automated)gVisor bot
2020-06-25Avoid an allocation in epollTamir Duberstein
PiperOrigin-RevId: 318346153
2020-06-24Merge release-20200608.0-121-g10930b0f8 (automated)gVisor bot
2020-06-24Remove waiter.Entry.ContextTamir Duberstein
This field is redundant since state can be stored in the callback. PiperOrigin-RevId: 318134855
2020-06-24Merge release-20200608.0-119-g364ac92ba (automated)gVisor bot
2020-06-23Support for saving pointers to fields in the state package.Adin Scannell
Previously, it was not possible to encode/decode an object graph which contained a pointer to a field within another type. This was because the encoder was previously unable to disambiguate a pointer to an object and a pointer within the object. This CL remedies this by constructing an address map tracking the full memory range object occupy. The encoded Refvalue message has been extended to allow references to children objects within another object. Because the encoding process may learn about object structure over time, we cannot encode any objects under the entire graph has been generated. This CL also updates the state package to use standard interfaces intead of reflection-based dispatch in order to improve performance overall. This includes a custom wire protocol to significantly reduce the number of allocations and take advantage of structure packing. As part of these changes, there are a small number of minor changes in other places of the code base: * The lists used during encoding are changed to use intrusive lists with the objectEncodeState directly, which required that the ilist Len() method is updated to work properly with the ElementMapper mechanism. * A bug is fixed in the list code wherein Remove() called on an element that is already removed can corrupt the list (removing the element if there's only a single element). Now the behavior is correct. * Standard error wrapping is introduced. * Compressio was updated to implement the new wire.Reader and wire.Writer inteface methods directly. The lack of a ReadByte and WriteByte caused issues not due to interface dispatch, but because underlying slices for a Read or Write call through an interface would always escape to the heap! * Statify has been updated to support the new APIs. See README.md for a description of how the new mechanism works. PiperOrigin-RevId: 318010298
2020-06-17Merge release-20200608.0-69-g96519e2c9 (automated)gVisor bot
2020-06-17Implement POSIX locksFabricio Voznika
- Change FileDescriptionImpl Lock/UnlockPOSIX signature to take {start,length,whence}, so the correct offset can be calculated in the implementations. - Create PosixLocker interface to make it possible to share the same locking code from different implementations. Closes #1480 PiperOrigin-RevId: 316910286
2020-06-16Merge release-20200608.0-63-g810748f5c (automated)gVisor bot
2020-06-16Port aio to VFS2.Nicolas Lacasse
In order to make sure all aio goroutines have stopped during S/R, a new WaitGroup was added to TaskSet, analagous to runningGoroutines. This WaitGroup is incremented with each aio goroutine, and waited on during kernel.Pause. The old VFS1 aio code was changed to use this new WaitGroup, rather than fs.Async. The only uses of fs.Async are now inode and mount Release operations, which do not call fs.Async recursively. This fixes a lock-ordering violation that can cause deadlocks. Updates #1035. PiperOrigin-RevId: 316689380
2020-06-16Merge release-20200608.0-62-g3b0b1f104 (automated)gVisor bot