summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/kernel
AgeCommit message (Collapse)Author
2021-09-16Merge pull request #6579 from prattmic:runsc_do_profilegVisor bot
PiperOrigin-RevId: 397114051
2021-09-16runsc: add global profile collection flagsMichael Pratt
Add global flags -profile-{block,cpu,heap,mutex} and -trace which enable collection of the specified profile for the entire duration of a container execution. This provides a way to definitively start profiling before that application starts, rather than attempting to race with an out-of-band `runsc debug`. Note that only the main boot process is profiled. This exposed a bug in Task.traceExecEvent: a crash when tracing and -race are enabled. traceExecEvent is called off of the task goroutine, but uses the Task as a context, which is a violation of the Task contract. Switching to the AsyncContext fixes the issue. Fixes #220
2021-09-14Fix race on msgrcv(MSG_COPY).Rahat Mahmood
Previously, we weren't making a copy when a sysv message queue was receiving a message with the MSG_COPY flag. This flag indicates the message being received should be left in the queue and a copy of the message should be returned to userspace. Without the copy, a racing process can modify the original message while it's being marshalled to user memory. Reported-by: syzbot+cb15e644698b20ff4e17@syzkaller.appspotmail.com PiperOrigin-RevId: 396712856
2021-09-03Add //pkg/sentry/seccheck.Jamie Liu
This defines common infrastructure for dynamically-configured security checks, including an example usage in the clone(2) path. PiperOrigin-RevId: 394797270
2021-08-27Fix lock order violations: mm.mappingMu > Task.mu.Nicolas Lacasse
Document this ordering in mm/mm.go. PiperOrigin-RevId: 393413203
2021-08-24Merge pull request #6438 from gystemd:tcsetpgrp_SIGTTOUgVisor bot
PiperOrigin-RevId: 392774712
2021-08-17Merge pull request #6262 from sudo-sturbia:msgqueue/syscalls3gVisor bot
PiperOrigin-RevId: 391416650
2021-08-17Added a SIGTTOU block check in SetForegroundProcessGroupgystemd
2021-08-17Implement control operations on msgqueue.Zyad A. Ali
For IPCInfo, update value of MSGSEG constant in abi to avoid overflow in MsgInfo.MsgSeg. MSGSEG was originaly simplified in abi, and is unused (by us and within the kernel), so updating it is okay. Updates #135
2021-08-17Implement ipc.Object.Set and use it in ipc mechanisms.Zyad A. Ali
Set provides functionality of {sem,shm,msg}ctl(IPC_SET).
2021-08-16fix sending of SIGTTOU signal in SetForegroundProcessGroupgystemd
Changed sendSignal to sendSignalLocked because tg.pidns.owner.mu and tg.signalHandlers.mu are already locked in SetForegroundProcess Added a control to verify whether the calling process is ignoring SIGTTOU before sending the signal
2021-08-13[syserror] Remove pkg syserror.Zach Koopmans
Removes package syserror and moves still relevant code to either linuxerr or to syserr (to be later removed). Internal errors are converted from random types to *errors.Error types used in linuxerr. Internal errors are in linuxerr/internal.go. PiperOrigin-RevId: 390724202
2021-08-12[syserror] Convert remaining syserror definitions to linuxerr.Zach Koopmans
Convert remaining public errors (e.g. EINTR) from syserror to linuxerr. PiperOrigin-RevId: 390471763
2021-08-11Initial cgroupfs support for subcontainersRahat Mahmood
Allow creation and management of subcontainers through cgroupfs directory syscalls. Also add a mechanism to specify a default root container to start new jobs in. This implements the filesystem support for subcontainers, but doesn't implement hierarchical resource accounting or task migration. PiperOrigin-RevId: 390254870
2021-08-10fix missing SIGTTOU signal in SetForegroundProcessGroupgystemd
2021-08-05Correctly handle interruptions in blocking msgqueue syscalls.Rahat Mahmood
Reported-by: syzbot+63bde04529f701c76168@syzkaller.appspotmail.com Reported-by: syzbot+69866b9a16ec29993e6a@syzkaller.appspotmail.com PiperOrigin-RevId: 389084629
2021-08-03Implement MSG_COPY option for msgrcv(2).Zyad A. Ali
Implement Queue.Copy and add more tests for it. Updates #135
2021-08-03Implement stubs for msgsnd(2) and msgrcv(2).Zyad A. Ali
Add support for msgsnd and msgrcv and enable syscall tests. Updates #135
2021-08-03Implement Queue.Receive.Zyad A. Ali
Receive implements the behaviour of msgrcv(2) without the MSG_COPY flag. Updates #135
2021-08-03Implement Queue.Send.Zyad A. Ali
Send implements the functionality of msgsnd(2). Updates #135
2021-07-30Merge pull request #6257 from zhlhahaha:2193-1gVisor bot
PiperOrigin-RevId: 387885663
2021-07-27Don't create an extra fd bitmap to allocate a new fd.Andrei Vagin
2021-07-23Don't panic on user-controlled state in semaphore syscalls.Rahat Mahmood
Reported-by: syzbot+beb099a67f670386a367@syzkaller.appspotmail.com PiperOrigin-RevId: 386521361
2021-07-22Merge pull request #6108 from sudo-sturbia:msgqueue/syscallsgVisor bot
PiperOrigin-RevId: 386323389
2021-07-22Replace kernel package types for clone and unshare with linux package types.Jamie Liu
PiperOrigin-RevId: 386312456
2021-07-20Add go:build directives as required by Go 1.17's gofmt.Jamie Liu
PiperOrigin-RevId: 385894869
2021-07-13Implement stubs for msgget(2) and msgctl(IPC_RMID).Zyad A. Ali
Add support for msgget, and msgctl(IPC_RMID), and enable msgqueue syscall tests. Updates #135
2021-07-13Implement Registry.Remove.Zyad A. Ali
Remove implements the behaviour or msgctl(IPC_RMID). Updates #135
2021-07-13Implement Registry.FindOrCreate.Zyad A. Ali
FindOrCreate implements the behaviour of msgget(2). Updates #135
2021-07-13Create package msgqueue.Zyad A. Ali
Create package msgqueue, define primitives to be used for message queues, and add a msgqueue.Registry to IPCNamespace. Updates #135
2021-07-13Create ipc.Registry.Zyad A. Ali
Create ipc.Registry to hold fields, and define functionality common to all SysV registries, and have registries use it.
2021-07-13Create ipc package and ipc.Object.Zyad A. Ali
Create ipc.Object to define fields and functionality used in SysV mechanisms, and have them use it.
2021-07-13apply bitmap for fd_tableHoward Zhang
Apply bitmap in fd_table to record open file fd. It can accelerate the speed of allocating or removing fd from fdtable. Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-07-12Fix deadlock in procfsFabricio Voznika
Kernfs provides an internal mechanism to defer calls to `DecRef()` because on the last reference `Filesystem.mu` must be held and most places that need to call `DecRef()` are inside the lock. The same can be true for filesystems that extend kernfs. procfs needs to look up files and `DecRef()` them inside the `kernfs.Filesystem.mu`. If the files happen to be procfs files, it can deadlock trying to decrement if it's the last reference. This change extends the mechanism to external callers to defer DecRefs to `vfs.FileDescription` and `vfs.VirtualDentries`. PiperOrigin-RevId: 384361647
2021-07-12[syserror] Update syserror to linuxerr for more errors.Zach Koopmans
Update the following from syserror to the linuxerr equivalent: EEXIST EFAULT ENOTDIR ENOTTY EOPNOTSUPP ERANGE ESRCH PiperOrigin-RevId: 384329869
2021-07-09Drop unnecessary checklocksignore.Adin Scannell
PiperOrigin-RevId: 383940663
2021-07-08Replace kernel.ExitStatus with linux.WaitStatus.Jamie Liu
PiperOrigin-RevId: 383705129
2021-07-01Mix checklocks and atomic analyzers.Adin Scannell
This change makes the checklocks analyzer considerable more powerful, adding: * The ability to traverse complex structures, e.g. to have multiple nested fields as part of the annotation. * The ability to resolve simple anonymous functions and closures, and perform lock analysis across these invocations. This does not apply to closures that are passed elsewhere, since it is not possible to know the context in which they might be invoked. * The ability to annotate return values in addition to receivers and other parameters, with the same complex structures noted above. * Ignoring locking semantics for "fresh" objects, i.e. objects that are allocated in the local frame (typically a new-style function). * Sanity checking of locking state across block transitions and returns, to ensure that no unexpected locks are held. Note that initially, most of these findings are excluded by a comprehensive nogo.yaml. The findings that are included are fundamental lock violations. The changes here should be relatively low risk, minor refactorings to either include necessary annotations to simplify the code structure (in general removing closures in favor of methods) so that the analyzer can be easily track the lock state. This change additional includes two changes to nogo itself: * Sanity checking of all types to ensure that the binary and ast-derived types have a consistent objectpath, to prevent the bug above from occurring silently (and causing much confusion). This also requires a trick in order to ensure that serialized facts are consumable downstream. This can be removed with https://go-review.googlesource.com/c/tools/+/331789 merged. * A minor refactoring to isolation the objdump settings in its own package. This was originally used to implement the sanity check above, but this information is now being passed another way. The minor refactor is preserved however, since it cleans up the code slightly and is minimal risk. PiperOrigin-RevId: 382613300
2021-07-01[syserror] Update several syserror errors to linuxerr equivalents.Zach Koopmans
Update/remove most syserror errors to linuxerr equivalents. For list of removed errors, see //pkg/syserror/syserror.go. PiperOrigin-RevId: 382574582
2021-06-30[syserror] Update syserror to linuxerr for EACCES, EBADF, and EPERM.Zach Koopmans
Update all instances of the above errors to the faster linuxerr implementation. With the temporary linuxerr.Equals(), no logical changes are made. PiperOrigin-RevId: 382306655
2021-06-29[syserror] Change syserror to linuxerr for E2BIG, EADDRINUSE, and EINVALZach Koopmans
Remove three syserror entries duplicated in linuxerr. Because of the linuxerr.Equals method, this is a mere change of return values from syserror to linuxerr definitions. Done with only these three errnos as CLs removing all grow to a significantly large size. PiperOrigin-RevId: 382173835
2021-06-24CreateProcessGroup has to check whether a target process stil exists or notAndrei Vagin
A caller of CreateProcessGroup looks up a thread group without locks, so the target process can exit before CreateProcessGroup will be called. Reported-by: syzbot+6abb7c34663dacbd55a8@syzkaller.appspotmail.com PiperOrigin-RevId: 381351069
2021-06-23Fix PR_SET_PTRACER applicability to non-leader threads.Jamie Liu
Compare if (!thread_group_leader(tracee)) tracee = rcu_dereference(tracee->group_leader); in security/yama/yama_lsm.c:ptracer_exception_found(). PiperOrigin-RevId: 381074242
2021-06-22[syserror] Add conversions to linuxerr with temporary Equals method.Zach Koopmans
Add Equals method to compare syserror and unix.Errno errors to linuxerr errors. This will facilitate removal of syserror definitions in a followup, and finding needed conversions from unix.Errno to linuxerr. PiperOrigin-RevId: 380909667
2021-06-17Move tcpip.Clock impl to TimekeeperTamir Duberstein
...and pass it explicitly. This reverts commit b63e61828d0652ad1769db342c17a3529d2d24ed. PiperOrigin-RevId: 380039167
2021-06-16[syserror] Refactor linuxerr and error package.Zach Koopmans
Move Error struct to pkg/errors package for use in multiple places. Move linuxerr static definitions under pkg/errors/linuxerr. Add a lookup list for quick lookup of *errors.Error by errno. This is useful when converting syserror errors and unix.Errno/syscall.Errrno values to *errors.Error. Update benchmarks routines to include conversions. The below benchmarks show *errors.Error usage to be comparable to using unix.Errno. BenchmarkAssignUnix BenchmarkAssignUnix-32 787875022 1.284 ns/op BenchmarkAssignLinuxerr BenchmarkAssignLinuxerr-32 1000000000 1.209 ns/op BenchmarkAssignSyserror BenchmarkAssignSyserror-32 759269229 1.429 ns/op BenchmarkCompareUnix BenchmarkCompareUnix-32 1000000000 1.310 ns/op BenchmarkCompareLinuxerr BenchmarkCompareLinuxerr-32 1000000000 1.241 ns/op BenchmarkCompareSyserror BenchmarkCompareSyserror-32 147196165 8.248 ns/op BenchmarkSwitchUnix BenchmarkSwitchUnix-32 373233556 3.664 ns/op BenchmarkSwitchLinuxerr BenchmarkSwitchLinuxerr-32 476323929 3.294 ns/op BenchmarkSwitchSyserror BenchmarkSwitchSyserror-32 39293408 29.62 ns/op BenchmarkReturnUnix BenchmarkReturnUnix-32 1000000000 0.5042 ns/op BenchmarkReturnLinuxerr BenchmarkReturnLinuxerr-32 1000000000 0.8152 ns/op BenchmarkConvertUnixLinuxerr BenchmarkConvertUnixLinuxerr-32 739948875 1.547 ns/op BenchmarkConvertUnixLinuxerrZero BenchmarkConvertUnixLinuxerrZero-32 977733974 1.489 ns/op PiperOrigin-RevId: 379806801
2021-06-14Cleanup iptables bug TODOsKevin Krakauer
There are many references to unimplemented iptables features that link to #170, but that bug is about Istio support specifically. Istio is supported, so the references should change. Some TODOs are addressed, some removed because they are not features requested by users, and some are left as implementation notes. Fixes #170. PiperOrigin-RevId: 379328488
2021-06-13Remove usermem dependency from marshalIan Lewis
Both marshal and usermem are depended on by many packages and a dependency on marshal can often create circular dependencies. marshal should consider adding internal dependencies carefully moving forward. Fixes #6160 PiperOrigin-RevId: 379199882
2021-06-10Report task exit in /proc/[pid]/{stat,status} before task goroutine exit.Jamie Liu
Between when runExitNotify.execute() returns nil (indicating that the task goroutine should exit) and when Task.run() advances Task.gosched.State to TaskGoroutineNonexistent (indicating that the task goroutine is exiting), there is a race window in which the Task is waitable (since TaskSet.mu is unlocked and Task.exitParentNotified is true) but will be reported by /proc/[pid]/status as running. Close the window by checking Task.exitState before task goroutine exit. PiperOrigin-RevId: 378711484
2021-06-10[op] Move SignalInfo to abi/linux package.Ayush Ranjan
Fixes #214 PiperOrigin-RevId: 378680466