path: root/pkg/sentry
Age | Commit message | Author
2018-12-06 | Fix tcpip.Endpoint.Write contract regarding short writes | Ian Gudger
* Clarify tcpip.Endpoint.Write contract regarding short writes.
* Enforce tcpip.Endpoint.Write contract regarding short writes.
* Update relevant users of tcpip.Endpoint.Write.
PiperOrigin-RevId: 224377586 Change-Id: I24299ecce902eb11317ee13dae3b8d8a7c5b097d
2018-12-06 | Add counters for memory events. | Rahat Mahmood
Also ensure an event is emitted at startup. PiperOrigin-RevId: 224372065 Change-Id: I5f642b6d6b13c6468ee8f794effe285fcbbf29cf
2018-12-06 | Fixing O_TRUNC behavior to match Linux. | Zach Koopmans
PiperOrigin-RevId: 224351139 Change-Id: I9453bd75e5a8d38db406bb47fdc01038ac60922e
2018-12-05 | Enforce directory accessibility before delete Walk | Michael Pratt
By Walking before checking that the directory is writable and executable, MayDelete may return the Walk error (e.g., ENOENT) which would normally be masked by a permission error (EACCES). PiperOrigin-RevId: 224222453 Change-Id: I108a7f730e6bdaa7f277eaddb776267c00805475
2018-12-05 | Update MM.usageAS when mremap copies or moves a mapping. | Jamie Liu
PiperOrigin-RevId: 224221509 Change-Id: I7aaea74629227d682786d3e435737364921249bf
2018-12-05 | Add context to mount errors | Michael Pratt
This makes it more obvious why a mount failed. PiperOrigin-RevId: 224203880 Change-Id: I7961774a7b6fdbb5493a791f8b3815c49b8f7631
2018-12-05 | Check for CAP_SYS_RESOURCE in prctl(PR_SET_MM, ...) | Zach Koopmans
If sys_prctl is called with PR_SET_MM without CAP_SYS_RESOURCE, the syscall should return failure with errno set to EPERM. See: http://man7.org/linux/man-pages/man2/prctl.2.html PiperOrigin-RevId: 224182874 Change-Id: I630d1dd44af8b444dd16e8e58a0764a0cf1ad9a3
2018-12-04 | Remove initRegs arg from clone | Michael Pratt
It is always the same as t.initRegs. PiperOrigin-RevId: 224085550 Change-Id: I5cc4ddc3b481d4748c3c43f6f4bb50da1dbac694
2018-12-04 | Partial writes should loop in rpcinet. | Brian Geffon
FileOperations.Write should return ErrWouldBlock to allow the upper layer to loop and sendmsg should continue writing where it left off on a partial write. PiperOrigin-RevId: 224081631 Change-Id: Ic61f6943ea6b7abbd82e4279decea215347eac48
2018-12-04 | Linkat(2) should sanity check flags. | Brian Geffon
PiperOrigin-RevId: 224047765 Change-Id: I6f3c75b33c32bf8f8910ea3fab35406d7d672d87
2018-12-04 | Max link traversals should be for an entire path. | Brian Geffon
The limit on the number of symbolic links that may be followed applies to a full path, not just to a chain of symbolic links. PiperOrigin-RevId: 224047321 Change-Id: I5e3c4caf66a93c17eeddcc7f046d1e8bb9434a40
2018-12-04 | sentry: save / restore netstack procfs configuration. | Zhaozhong Ni
PiperOrigin-RevId: 224047120 Change-Id: Ia6cb17fa978595cd73857b6178c4bdba401e185e
2018-12-04 | Enforce name length restriction on paths. | Brian Geffon
NAME_LENGTH must be enforced per component. PiperOrigin-RevId: 224046749 Change-Id: Iba8105b00d951f2509dc768af58e4110dafbe1c9
2018-12-04 | Fix mempolicy_test on bazel. | Rahat Mahmood
Bazel runs multiple test cases on the same thread. Some of the test cases rely on the test thread starting with the default memory policy, while other tests modify the test thread's memory policy. This obviously breaks when the test framework doesn't run each test case on a new thread. Also fixing an incompatibility where set_mempolicy(2) was prevented from specifying an empty nodemask, which is allowed for some modes. PiperOrigin-RevId: 224038957 Change-Id: Ibf780766f2706ebc9b129dbc8cf1b85c2a275074
2018-12-04 | Fix data race caused by unlocked call of Dirent.descendantOf. | Nicolas Lacasse
PiperOrigin-RevId: 224025363 Change-Id: I98864403c779832e9e1436f7d3c3f6fb2fba9904
2018-12-03 | Return an int32 for netlink SO_RCVBUF | Ian Gudger
Untyped integer constants default to type int and the binary package will panic if one tries to encode an int. PiperOrigin-RevId: 223890001 Change-Id: Iccc3afd6d74bad24c35d764508e450fd317b76ec
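The same pitfall can be reproduced with the standard library, used here as a stand-in for the sentry's own binary package. The standard `encoding/binary` package returns an error for a plain `int` rather than panicking. `encode` and `defaultRcvBuf` are hypothetical names for this sketch.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// defaultRcvBuf is an untyped constant with an arbitrary illustrative value.
const defaultRcvBuf = 16384

// encode serializes v in little-endian order using encoding/binary.
func encode(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Passed through interface{}, the untyped constant becomes an int,
	// which has no fixed size and cannot be encoded.
	_, err := encode(defaultRcvBuf)
	fmt.Println("int:", err)

	// An explicit int32 conversion gives a fixed 4-byte encoding.
	b, err := encode(int32(defaultRcvBuf))
	fmt.Println("int32:", len(b), err)
}
```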
2018-11-27 | Fix data race in fs.Async. | Nicolas Lacasse
Replaces the WaitGroup with a RWMutex. Calls to Async hold the mutex for reading, while AsyncBarrier takes the lock for writing. This ensures that all executing Async work finishes before AsyncBarrier returns. Also pushes the Async() call from Inode.Release into gofer/InodeOperations.Release(). This removes a recursive Async call which should not have been allowed in the first place. The gofer Release call is the slow one (since it may make RPCs to the gofer), so putting the Async call there makes sense. PiperOrigin-RevId: 223093067 Change-Id: I116da7b20fce5ebab8d99c2ab0f27db7c89d890e
2018-11-27 | Save shutdown flags first. | Brian Geffon
With rpcinet, if shutdown flags are not saved before making the RPC, a race is possible in which blocked threads are woken up before the flags have been persisted. Threads could then block indefinitely in recvmsg after a shutdown(SHUT_RD) has happened. PiperOrigin-RevId: 223089783 Change-Id: If595e7add12aece54bcdf668ab64c570910d061a
2018-11-27 | Add procid support for arm64 platform | Haibo Xu
Change-Id: I7c3db8dfdf95a125d7384c1d67c3300dbb99a47e PiperOrigin-RevId: 223039923
2018-11-26 | Implementation of preadv2 for Linux 4.4 support | Zach Koopmans
RWF_HIPRI (Linux 4.6) is accepted silently and the read proceeds as usual. An offset of -1 calls readv. PiperOrigin-RevId: 222840324 Change-Id: If9ddc1e8d086e1a632bdf5e00bae08205f95b6b0
2018-11-20 | Use RET_KILL_PROCESS if available in kernel | Fabricio Voznika
RET_KILL_THREAD doesn't work well for Go because it will kill only the offending thread and leave the process hanging. RET_TRAP can be masked out and it's not guaranteed to kill the process. RET_KILL_PROCESS is available since 4.14. For older kernels, continue to use RET_TRAP, as this is the best option (likely to kill the process, easy to debug). PiperOrigin-RevId: 222357867 Change-Id: Icc1d7d731274b16c2125b7a1ba4f7883fbdb2cbd
2018-11-20 | Dumps stacks if watchdog thread is stuck | Fabricio Voznika
PiperOrigin-RevId: 222332703 Change-Id: Id5c3cf79591c5d2949895b4e323e63c48c679820
2018-11-20 | Fix recursive read lock taken on TaskSet | Fabricio Voznika
SyncSyscallFiltersToThreadGroup and Task.ThreadID() both acquired the TaskSet RWLock in read mode and could deadlock if a writer came in between the two acquisitions. PiperOrigin-RevId: 222313551 Change-Id: I4221057d8d46fec544cbfa55765c9a284fe7ebfa
2018-11-20 | Reference upstream licenses | Michael Pratt
Include copyright notices and the referenced LICENSE file. PiperOrigin-RevId: 222171321 Change-Id: I0cc0b167ca51b536d1087bf1c4742fdf1430bc2a
2018-11-20 | Add unsupported syscall events for get/setsockopt | Fabricio Voznika
PiperOrigin-RevId: 222148953 Change-Id: I21500a9f08939c45314a6414e0824490a973e5aa
2018-11-20 | Parse the tmpfs mode before validating. | Nicolas Lacasse
This gets rid of the problematic modeRegex. PiperOrigin-RevId: 221835959 Change-Id: I566b8d8a43579a4c30c0a08a620a964bbcd826dd
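Parsing first and validating the numeric value afterwards could look like the sketch below. It is an assumption-laden illustration: `parseMode` and `allowedMask` are hypothetical names, and the set of permitted bits in the real tmpfs code may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// allowedMask is a hypothetical set of permitted mode bits
// (plain rwx permissions only).
const allowedMask = 0777

// parseMode parses s as an octal number and only then validates the
// resulting bits, so "755" and "0755" are treated the same way.
func parseMode(s string) (uint32, error) {
	m, err := strconv.ParseUint(s, 8, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid mode %q: %v", s, err)
	}
	if m&^allowedMask != 0 {
		return 0, fmt.Errorf("unsupported mode bits in %q", s)
	}
	return uint32(m), nil
}

func main() {
	for _, s := range []string{"755", "0755", "7777", "rwx"} {
		m, err := parseMode(s)
		fmt.Printf("%q -> %o, %v\n", s, m, err)
	}
}
```

Unlike a pattern match on the string, this approach does not depend on how many leading zeros the caller writes.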
2018-11-20 | Update futex to use usermem abstractions. | Adin Scannell
This eliminates the indirection that existed in task_futex. PiperOrigin-RevId: 221832498 Change-Id: Ifb4c926d493913aa6694e193deae91616a29f042
2018-11-15 | Advertise vsyscall support via /proc/<pid>/maps. | Rahat Mahmood
Also update test utilities for probing vsyscall support and add a metric to see if vsyscalls are actually used in sandboxes. PiperOrigin-RevId: 221698834 Change-Id: I57870ecc33ea8c864bd7437833f21aa1e8117477
2018-11-15 | Allow setting sticky bit in tmpfs permissions. | Nicolas Lacasse
PiperOrigin-RevId: 221683127 Change-Id: Ide6a9f41d75aa19d0e2051a05a1e4a114a4fb93c
2018-11-13 | Implement TCP_NODELAY and TCP_CORK | Ian Gudger
Previously, TCP_NODELAY was always enabled and we would lie about it being configurable. TCP_NODELAY is now disabled by default (to match Linux) in the socket layer so that non-gVisor users don't automatically start using this questionable optimization. PiperOrigin-RevId: 221368472 Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-11-12 | Internal change. | Googler
PiperOrigin-RevId: 221189534 Change-Id: Id20d318bed97d5226b454c9351df396d11251e1f
2018-11-08 | Implement sync_file_range() | Andrei Vagin
sync_file_range - sync a file segment with disk

In Linux, sync_file_range() accepts three flags:

SYNC_FILE_RANGE_WAIT_BEFORE
    Wait upon write-out of all pages in the specified range that have already been submitted to the device driver for write-out before performing any write.

SYNC_FILE_RANGE_WRITE
    Initiate write-out of all dirty pages in the specified range which are not presently submitted for write-out. Note that even this may block if you attempt to write more than request queue size.

SYNC_FILE_RANGE_WAIT_AFTER
    Wait upon write-out of all pages in the range after performing any write.

In this implementation:

SYNC_FILE_RANGE_WAIT_BEFORE without SYNC_FILE_RANGE_WAIT_AFTER isn't supported right now.

SYNC_FILE_RANGE_WRITE is skipped. It should initiate write-out of all dirty pages, but it doesn't wait, so it should be safe to do nothing while nobody uses SYNC_FILE_RANGE_WAIT_BEFORE.

SYNC_FILE_RANGE_WAIT_AFTER is equal to fdatasync(). In Linux, sync_file_range() doesn't write out the file's metadata, but fdatasync() does if the file size has changed.

PiperOrigin-RevId: 220730840 Change-Id: Iae5dfb23c2c916967d67cf1a1ad32f25eb3f6286
2018-11-08 | Create stubs for syscalls up to Linux 4.4. | Rahat Mahmood
Create syscall stubs for missing syscalls up to Linux 4.4 and advertise a kernel version of 4.4. PiperOrigin-RevId: 220667680 Change-Id: Idbdccde538faabf16debc22f492dd053a8af0ba7
2018-11-01 | Make error messages a bit more user friendly. | Ian Lewis
Updated error messages so that they don't print full Go struct representations when running a new container in a sandbox. For example, this occurs frequently when commands are not found when doing a 'kubectl exec'. PiperOrigin-RevId: 219729141 Change-Id: Ic3a7bc84cd7b2167f495d48a1da241d621d3ca09
2018-11-01 | Prevent premature destruction of shm segments. | Rahat Mahmood
Shm segments can be marked for lazy destruction via shmctl(IPC_RMID), which destroys a segment once it is no longer attached to any processes. We were unconditionally decrementing the segment refcount on shmctl(IPC_RMID) which allowed a user to force a segment to be destroyed by repeatedly calling shmctl(IPC_RMID), with outstanding memory maps to the segment. This is problematic because the memory released by a segment destroyed this way can be reused by a different process while remaining accessible by the process with outstanding maps to the segment. PiperOrigin-RevId: 219713660 Change-Id: I443ab838322b4fb418ed87b2722c3413ead21845
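The fix amounts to dropping the registry's reference at most once. The sketch below models that idea with hypothetical names (`segment`, `attach`, `detach`, `markDestroyed`). It is not the sentry's shm implementation and ignores the memory mappings themselves.

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a toy model of a shm segment. refs counts one reference held
// by the registry plus one per attachment.
type segment struct {
	mu                 sync.Mutex
	refs               int
	pendingDestruction bool
	destroyed          bool
}

func newSegment() *segment { return &segment{refs: 1} }

// decRefLocked drops one reference and destroys the segment at zero.
// The caller must hold s.mu.
func (s *segment) decRefLocked() {
	s.refs--
	if s.refs == 0 {
		s.destroyed = true
	}
}

func (s *segment) attach() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.refs++
}

func (s *segment) detach() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.decRefLocked()
}

// markDestroyed models shmctl(IPC_RMID): the registry's reference is
// dropped only on the first call, and repeated calls have no effect.
func (s *segment) markDestroyed() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.pendingDestruction {
		return
	}
	s.pendingDestruction = true
	s.decRefLocked()
}

func main() {
	s := newSegment()
	s.attach()
	for i := 0; i < 3; i++ {
		s.markDestroyed()
	}
	fmt.Println("destroyed while attached:", s.destroyed)
	s.detach()
	fmt.Println("destroyed after detach:", s.destroyed)
}
```

Without the `pendingDestruction` guard, the second `markDestroyed` call would bring the count to zero while an attachment still exists, which is the bug described above.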
2018-11-01 | modify modeRegexp to adapt the default spec of containerd | Juan
The default containerd spec (https://github.com/containerd/containerd/blob/master/oci/spec.go#L206) uses mode=755, which didn't match the pattern modeRegexp = regexp.MustCompile("0[0-7][0-7][0-7]"). Closes #112 Signed-off-by: Juan <xionghuan.cn@gmail.com> Change-Id: I469e0a68160a1278e34c9e1dbe4b7784c6f97e5a PiperOrigin-RevId: 219672525
2018-10-31 | kvm: simplify floating point logic. | Adin Scannell
This reduces the number of floating point save/restore cycles required (since we don't need to restore immediately following the switch, this always happens in a known context) and allows the kernel hooks to capture state. This lets us remove calls like "Current()". PiperOrigin-RevId: 219552844 Change-Id: I7676fa2f6c18b9919718458aa888b832a7db8cab
2018-10-31 | kvm: add detailed traces on vCPU errors. | Adin Scannell
This improves debuggability greatly. PiperOrigin-RevId: 219551560 Change-Id: I2ecaffdd1c17b0d9f25911538ea6f693e2bc699f
2018-10-31 | kvm: avoid siginfo allocations. | Adin Scannell
PiperOrigin-RevId: 219492587 Change-Id: I47f6fc0b74a4907ab0aff03d5f26453bdb983bb5
2018-10-30 | kvm: use private futexes. | Adin Scannell
Use private futexes for performance and to align with other runtime uses. PiperOrigin-RevId: 219422634 Change-Id: Ief2af5e8302847ea6dc246e8d1ee4d64684ca9dd
2018-10-24 | Use TRAP to simplify vsyscall emulation. | Adin Scannell
PiperOrigin-RevId: 218592058 Change-Id: I373a2d813aa6cc362500dd5a894c0b214a1959d7
2018-10-24 | Convert Unix transport to syserr | Ian Gudger
Previously this code used the tcpip error space. Since it is no longer part of netstack, it can use the sentry's error space (except for a few cases where there is still some shared code). This reduces the number of error space conversions required for hot Unix socket operations. PiperOrigin-RevId: 218541611 Change-Id: I3d13047006a8245b5dfda73364d37b8a453784bb
2018-10-24 | Run ptrace stubs in their own session and process group. | Nicolas Lacasse
Pseudoterminal job control signals are meant to be received and handled by the sandbox process, but if the ptrace stubs are running in the same process group, they will receive the signals as well and inject them into the sentry kernel. This can result in duplicate signals being delivered (often to the wrong process), or a sentry panic if the ptrace stub is inactive. This CL makes the ptrace stub run in a new session. PiperOrigin-RevId: 218536851 Change-Id: Ie593c5687439bbfbf690ada3b2197ea71ed60a0e
2018-10-23 | Fix panic on creation of zero-len shm segments. | Rahat Mahmood
Attempting to create a zero-len shm segment causes a panic since we try to allocate a zero-len filemem region. The existing code had a guard to disallow this, but the check didn't encode the fact that requesting a private segment implies a segment creation regardless of whether IPC_CREAT is explicitly specified. PiperOrigin-RevId: 218405743 Change-Id: I30aef1232b2125ebba50333a73352c2f907977da
2018-10-23 | Track paths and provide a rename hook. | Adin Scannell
This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-20 | Refcount Unix transport queue | Ian Gudger
This allows us to release messages in the queue when all users close. PiperOrigin-RevId: 218033550 Change-Id: I2f6e87650fced87a3977e3b74c64775c7b885c1b
2018-10-20 | Add more unimplemented syscall events | Fabricio Voznika
Added events for *ctl syscalls that may have multiple different commands. For runsc, each syscall event is only logged once. For *ctl syscalls, use the cmd as identifier, not only the syscall number. PiperOrigin-RevId: 218015941 Change-Id: Ie3c19131ae36124861e9b492a7dbe1765d9e5e59
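Logging once per (syscall, command) pair rather than once per syscall number can be modelled with a keyed set. `eventKey`, `onceLogger`, and `shouldLog` are hypothetical names for this sketch, and the numbers in `main` (ioctl is syscall 16 on x86-64, TCGETS is 0x5401, TCSETS is 0x5402) are purely illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// eventKey identifies an unimplemented-syscall event. For *ctl syscalls
// the command is part of the identity. Other syscalls can use cmd = 0.
type eventKey struct {
	sysno uintptr
	cmd   uintptr
}

// onceLogger remembers which events have already been reported.
type onceLogger struct {
	mu   sync.Mutex
	seen map[eventKey]bool
}

func newOnceLogger() *onceLogger {
	return &onceLogger{seen: make(map[eventKey]bool)}
}

// shouldLog reports true the first time a (sysno, cmd) pair is seen and
// false on every later call with the same pair.
func (l *onceLogger) shouldLog(sysno, cmd uintptr) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	k := eventKey{sysno: sysno, cmd: cmd}
	if l.seen[k] {
		return false
	}
	l.seen[k] = true
	return true
}

func main() {
	l := newOnceLogger()
	fmt.Println(l.shouldLog(16, 0x5401)) // first ioctl(TCGETS)
	fmt.Println(l.shouldLog(16, 0x5401)) // repeat of the same command
	fmt.Println(l.shouldLog(16, 0x5402)) // same syscall, different command
}
```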
2018-10-19 | Use correct company name in copyright header | Ian Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-17 | Use generic ilist in Unix transport queue | Ian Gudger
This should improve performance. PiperOrigin-RevId: 217610560 Change-Id: I370f196ea2396f1715a460b168ecbee197f94d6c
2018-10-17 | Check thread group CPU timers in the CPU clock ticker. | Jamie Liu
This reduces the number of goroutines and runtime timers when ITIMER_VIRTUAL or ITIMER_PROF are enabled, or when RLIMIT_CPU is set. This also ensures that thread group CPU timers only advance if running tasks are observed at the time the CPU clock advances, mostly eliminating the possibility that a CPU timer expiration observes no running tasks and falls back to the group leader. PiperOrigin-RevId: 217603396 Change-Id: Ia24ce934d5574334857d9afb5ad8ca0b6a6e65f4