summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/fs
AgeCommit message (Collapse)Author
2018-12-05Enforce directory accessibility before delete WalkMichael Pratt
By Walking before checking that the directory is writable and executable, MayDelete may return the Walk error (e.g., ENOENT) which would normally be masked by a permission error (EACCES). PiperOrigin-RevId: 224222453 Change-Id: I108a7f730e6bdaa7f277eaddb776267c00805475
2018-12-05Add context to mount errorsMichael Pratt
This makes it more obvious why a mount failed. PiperOrigin-RevId: 224203880 Change-Id: I7961774a7b6fdbb5493a791f8b3815c49b8f7631
2018-12-04Max link traversals should be for an entire path.Brian Geffon
The number of symbolic links that are allowed to be followed are for a full path and not just a chain of symbolic links. PiperOrigin-RevId: 224047321 Change-Id: I5e3c4caf66a93c17eeddcc7f046d1e8bb9434a40
2018-12-04sentry: save / restore netstack procfs configuration.Zhaozhong Ni
PiperOrigin-RevId: 224047120 Change-Id: Ia6cb17fa978595cd73857b6178c4bdba401e185e
2018-12-04Enforce name length restriction on paths.Brian Geffon
NAME_LENGTH must be enforced per component. PiperOrigin-RevId: 224046749 Change-Id: Iba8105b00d951f2509dc768af58e4110dafbe1c9
2018-12-04Fix data race caused by unlocked call of Dirent.descendantOf.Nicolas Lacasse
PiperOrigin-RevId: 224025363 Change-Id: I98864403c779832e9e1436f7d3c3f6fb2fba9904
2018-11-27Fix data race in fs.Async.Nicolas Lacasse
Replaces the WaitGroup with a RWMutex. Calls to Async hold the mutex for reading, while AsyncBarrier takes the lock for writing. This ensures that all executing Async work finishes before AsyncBarrier returns. Also pushes the Async() call from Inode.Release into gofer/InodeOperations.Release(). This removes a recursive Async call which should not have been allowed in the first place. The gofer Release call is the slow one (since it may make RPCs to the gofer), so putting the Async call there makes sense. PiperOrigin-RevId: 223093067 Change-Id: I116da7b20fce5ebab8d99c2ab0f27db7c89d890e
2018-11-20Parse the tmpfs mode before validating.Nicolas Lacasse
This gets rid of the problematic modeRegex. PiperOrigin-RevId: 221835959 Change-Id: I566b8d8a43579a4c30c0a08a620a964bbcd826dd
2018-11-15Allow setting sticky bit in tmpfs permissions.Nicolas Lacasse
PiperOrigin-RevId: 221683127 Change-Id: Ide6a9f41d75aa19d0e2051a05a1e4a114a4fb93c
2018-11-12Internal change.Googler
PiperOrigin-RevId: 221189534 Change-Id: Id20d318bed97d5226b454c9351df396d11251e1f
2018-11-08Create stubs for syscalls upto Linux 4.4.Rahat Mahmood
Create syscall stubs for missing syscalls upto Linux 4.4 and advertise a kernel version of 4.4. PiperOrigin-RevId: 220667680 Change-Id: Idbdccde538faabf16debc22f492dd053a8af0ba7
2018-11-01modify modeRegexp to adapt the default spec of containerdJuan
https://github.com/containerd/containerd/blob/master/oci/spec.go#L206, the mode=755 didn't match the pattern modeRegexp = regexp.MustCompile("0[0-7][0-7][0-7]"). Closes #112 Signed-off-by: Juan <xionghuan.cn@gmail.com> Change-Id: I469e0a68160a1278e34c9e1dbe4b7784c6f97e5a PiperOrigin-RevId: 219672525
2018-10-24Convert Unix transport to syserrIan Gudger
Previously this code used the tcpip error space. Since it is no longer part of netstack, it can use the sentry's error space (except for a few cases where there is still some shared code. This reduces the number of error space conversions required for hot Unix socket operations. PiperOrigin-RevId: 218541611 Change-Id: I3d13047006a8245b5dfda73364d37b8a453784bb
2018-10-23Track paths and provide a rename hook.Adin Scannell
This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-20Add more unimplemented syscall eventsFabricio Voznika
Added events for *ctl syscalls that may have multiple different commands. For runsc, each syscall event is only logged once. For *ctl syscalls, use the cmd as identifier, not only the syscall number. PiperOrigin-RevId: 218015941 Change-Id: Ie3c19131ae36124861e9b492a7dbe1765d9e5e59
2018-10-19Use correct company name in copyright headerIan Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-17Fix typos in socket_testIan Gudger
PiperOrigin-RevId: 217576188 Change-Id: I82e45c306c5c9161e207311c7dbb8a983820c1df
2018-10-17Move Unix transport out of netstackIan Gudger
PiperOrigin-RevId: 217557656 Change-Id: I63d27635b1a6c12877279995d2d9847b6a19da9b
2018-10-15Refactor host.ConnectedEndpointIan Gudger
* Integrate recvMsg and sendMsg functions into Recv and Send respectively as they are no longer shared. * Clean up partial read/write error handling code. * Re-order code to make sense given that there is no longer a host.endpoint type. PiperOrigin-RevId: 217255072 Change-Id: Ib43fe9286452f813b8309d969be11f5fa40694cd
2018-10-15Merge host.endpoint into host.ConnectedEndpointIan Gudger
host.endpoint contained duplicated logic from the sockerpair implementation and host.ConnectedEndpoint. Remove host.endpoint in favor of a host.ConnectedEndpoint wrapped in a socketpair end. PiperOrigin-RevId: 217240096 Change-Id: I4a3d51e3fe82bdf30e2d0152458b8499ab4c987c
2018-10-15Clean up Rename and Unlink checks for EBUSY.Nicolas Lacasse
- Change Dirent.Busy => Dirent.isMountPoint. The function body is unchanged, and it is no longer exported. - fs.MayDelete now checks that the victim is not the process root. This aligns with Linux's namei.c:may_delete(). - Fix "is-ancestor" checks to actually compare all ancestors, not just the parents. - Fix handling of paths that end in dots, which are handled differently in Rename vs. Unlink. PiperOrigin-RevId: 217239274 Change-Id: I7a0eb768e70a1b2915017ce54f7f95cbf8edf1fb
2018-10-15sentry: save fs.Dirent deleted info.Zhaozhong Ni
PiperOrigin-RevId: 217155458 Change-Id: Id3265b1ec784787039e2131c80254ac4937330c7
2018-10-11sentry: allow saving of unlinked files with open fds on virtual fs.Zhaozhong Ni
PiperOrigin-RevId: 216733414 Change-Id: I33cd3eb818f0c39717d6656fcdfff6050b37ebb0
2018-10-10Enforce message size limits and avoid host calls with too many iovecsMichael Pratt
Currently, in the face of FileMem fragmentation and a large sendmsg or recvmsg call, host sockets may pass > 1024 iovecs to the host, which will immediately cause the host to return EMSGSIZE. When we detect this case, use a single intermediate buffer to pass to the kernel, copying to/from the src/dst buffer. To avoid creating unbounded intermediate buffers, enforce message size checks and truncation w.r.t. the send buffer size. The same functionality is added to netstack unix sockets for feature parity. PiperOrigin-RevId: 216590198 Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
2018-10-03Implement TIOCSCTTY ioctl as a noop.Nicolas Lacasse
PiperOrigin-RevId: 215658757 Change-Id: If63b33293f3e53a7f607ae72daa79e2b7ef6fcfd
2018-10-03Add S/R support for FIOASYNCIan Gudger
PiperOrigin-RevId: 215655197 Change-Id: I668b1bc7c29daaf2999f8f759138bcbb09c4de6f
2018-10-01runsc: Support job control signals in "exec -it".Nicolas Lacasse
Terminal support in runsc relies on host tty file descriptors that are imported into the sandbox. Application tty ioctls are sent directly to the host fd. However, those host tty ioctls are associated in the host kernel with a host process (in this case runsc), and the host kernel intercepts job control characters like ^C and send signals to the host process. Thus, typing ^C into a "runsc exec" shell will send a SIGINT to the runsc process. This change makes "runsc exec" handle all signals, and forward them into the sandbox via the "ContainerSignal" urpc method. Since the "runsc exec" is associated with a particular container process in the sandbox, the signal must be associated with the same container process. One big difficulty is that the signal should not necessarily be sent to the sandbox process started by "exec", but instead must be sent to the foreground process group for the tty. For example, we may exec "bash", and from bash call "sleep 100". A ^C at this point should SIGINT sleep, not bash. To handle this, tty files inside the sandbox must keep track of their foreground process group, which is set/get via ioctls. When an incoming ContainerSignal urpc comes in, we look up the foreground process group via the tty file. Unfortunately, this means we have to expose and cache the tty file in the Loader. Note that "runsc exec" now handles signals properly, but "runs run" does not. That will come in a later CL, as this one is complex enough already. Example: root@:/usr/local/apache2# sleep 100 ^C root@:/usr/local/apache2# sleep 100 ^Z [1]+ Stopped sleep 100 root@:/usr/local/apache2# fg sleep 100 ^C root@:/usr/local/apache2# PiperOrigin-RevId: 215334554 Change-Id: I53cdce39653027908510a5ba8d08c49f9cf24f39
2018-09-28Require AF_UNIX sockets from the goferMichael Pratt
host.endpoint already has the check, but it is missing from host.ConnectedEndpoint. PiperOrigin-RevId: 214962762 Change-Id: I88bb13a5c5871775e4e7bf2608433df8a3d348e6
2018-09-27Forward ioctl(TCSETSF) calls on host ttys to the host kernel.Nicolas Lacasse
We already forward TCSETS and TCSETSW. TCSETSF is roughly equivalent but discards pending input. The filters were relaxed to allow host ioctls with TCSETSF argument. This fixes programs like "passwd" that prevent user input from being displayed on the terminal. Before: root@b8a0240fc836:/# passwd Enter new UNIX password: 123 Retype new UNIX password: 123 passwd: password updated successfully After: root@ae6f5dabe402:/# passwd Enter new UNIX password: Retype new UNIX password: passwd: password updated successfully PiperOrigin-RevId: 214869788 Change-Id: I31b4d1373c1388f7b51d0f2f45ce40aa8e8b0b58
2018-09-18Short-circuit Readdir calls on overlay files when the dirent is frozen.Nicolas Lacasse
If we have an overlay file whose corresponding Dirent is frozen, then we should not bother calling Readdir on the upper or lower files, since DirentReaddir will calculate children based on the frozen Dirent tree. A test was added that fails without this change. PiperOrigin-RevId: 213531215 Change-Id: I4d6c98f1416541a476a34418f664ba58f936a81d
2018-09-17Allow kernel.(*Task).Block to accept an extract only channelIan Gudger
PiperOrigin-RevId: 213328293 Change-Id: I4164133e6f709ecdb89ffbb5f7df3324c273860a
2018-09-14Reuse readlink parameter, add sockaddr max.Michael Pratt
PiperOrigin-RevId: 213058623 Change-Id: I522598c655d633b9330990951ff1c54d1023ec29
2018-09-14Make gVisor hard link check match Linux's.Nicolas Lacasse
Linux permits hard-linking if the target is owned by the user OR the target has Read+Write permission. PiperOrigin-RevId: 213024613 Change-Id: If642066317b568b99084edd33ee4e8822ec9cbb3
2018-09-07runsc: Support multi-container exec.Nicolas Lacasse
We must use a context.Context with a Root Dirent that corresponds to the container's chroot. Previously we were using the root context, which does not have a chroot. Getting the correct context required refactoring some of the path-lookup code. We can't lookup the path without a context.Context, which requires kernel.CreateProcArgs, which we only get inside control.Execute. So we have to do the path lookup much later than we previously were. PiperOrigin-RevId: 212064734 Change-Id: I84a5cfadacb21fd9c3ab9c393f7e308a40b9b537
2018-09-05Imported FD in exec was leakingFabricio Voznika
Imported file needs to be closed after it's been imported. PiperOrigin-RevId: 211732472 Change-Id: Ia9249210558b77be076bcce465b832a22eed301f
2018-09-04/proc/PID/mounts is not tab-delimitedMichael Pratt
PiperOrigin-RevId: 211513847 Change-Id: Ib484dd2d921c3e5d70d0e410cd973d3bff4f6b73
2018-09-04Distinguish Element and Linker for ilist.Adin Scannell
Furthermore, allow for the specification of an ElementMapper. This allows a single "Element" type to exist on multiple inline lists, and work without having to embed the entry type. This is a requisite change for supporting a per-Inode list of Dirents. PiperOrigin-RevId: 211467497 Change-Id: If2768999b43e03fdaecf8ed15f435fe37518d163
2018-08-31Do not use fs.FileOwnerFromContext in fs/proc.file.UnstableAttr().Jamie Liu
From //pkg/sentry/context/context.go: // - It is *not safe* to retain a Context passed to a function beyond the scope // of that function call. Passing a stored kernel.Task as a context.Context to fs.FileOwnerFromContext violates this requirement. PiperOrigin-RevId: 211143021 Change-Id: I4c5b02bd941407be4c9cfdbcbdfe5a26acaec037
2018-08-30fs: Add empty dir at /sys/class/power_supply.Nicolas Lacasse
PiperOrigin-RevId: 210953512 Change-Id: I07d2d7fb0d268aa8eca26d81ef28b5b5c42289ee
2018-08-29fs: Fix renameMu lock recursion.Nicolas Lacasse
dirent.walk() takes renameMu, but is often called with renameMu already held, which can lead to a deadlock. Fix this by requiring renameMu to be held for reading when dirent.walk() is called. This causes walks and existence checks to block while a rename operation takes place, but that is what we were already trying to enforce by taking renameMu in walk() anyways. PiperOrigin-RevId: 210760780 Change-Id: Id61018e6e4adbeac53b9c1b3aa24ab77f75d8a54
2018-08-29fs: Drop reference to over-written file before renaming over it.Nicolas Lacasse
dirent.go:Rename() walks to the file being replaced and defers replaced.DecRef(). After the rename, the reference is dropped, triggering a writeout and SettAttr call to the gofer. Because of lazyOpenForWrite, the gofer opens the replaced file BY ITS OLD NAME and calls ftruncate on it. This CL changes Remove to drop the reference on replaced (and thus trigger writeout) before the actual rename call. PiperOrigin-RevId: 210756097 Change-Id: I01ea09a5ee6c2e2d464560362f09943641638e0f
2018-08-28fs: Don't bother saving negative dirents.Nicolas Lacasse
PiperOrigin-RevId: 210616454 Change-Id: I3f536e2b4d603e540cdd9a67c61b8ec3351f4ac3
2018-08-28fs: Add tests for dirent ref counting with an overlay.Nicolas Lacasse
PiperOrigin-RevId: 210614669 Change-Id: I408365ff6d6c7765ed7b789446d30e7079cbfc67
2018-08-28sentry: optimize dirent weakref map save / restore.Zhaozhong Ni
Weak references save / restore involves multiple interface indirection and cause material latency overhead when there are lots of dirents, each containing a weak reference map. The nil entries in the map should also be purged. PiperOrigin-RevId: 210593727 Change-Id: Ied6f4c3c0726fcc53a24b983d9b3a79121b6b758
2018-08-27Add /proc/sys/kernel/shm[all,max,mni].Brian Geffon
PiperOrigin-RevId: 210459956 Change-Id: I51859b90fa967631e0a54a390abc3b5541fbee66
2018-08-27fs: Fix remote-revalidate cache policy.Nicolas Lacasse
When revalidating a Dirent, if the inode id is the same, then we don't need to throw away the entire Dirent. We can just update the unstable attributes in place. If the inode id has changed, then the remote file has been deleted or moved, and we have no choice but to throw away the dirent we have a look up another. In this case, we may still end up losing a mounted dirent that is a child of the revalidated dirent. However, that seems appropriate here because the entire mount point has been pulled out from underneath us. Because gVisor's overlay is at the Inode level rather than the Dirent level, we must pass the parent Inode and name along with the Inode that is being revalidated. PiperOrigin-RevId: 210431270 Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42
2018-08-27sentry: mark fsutil.DirFileOperations as savable.Zhaozhong Ni
PiperOrigin-RevId: 210405166 Change-Id: I252766015885c418e914007baf2fc058fec39b3e
2018-08-27runsc: Terminal resizing support.Kevin Krakauer
Implements the TIOCGWINSZ and TIOCSWINSZ ioctls, which allow processes to resize the terminal. This allows, for example, sshd to properly set the window size for ssh sessions. PiperOrigin-RevId: 210392504 Change-Id: I0d4789154d6d22f02509b31d71392e13ee4a50ba
2018-08-24runsc: Terminal support for "docker exec -ti".Nicolas Lacasse
This CL adds terminal support for "docker exec". We previously only supported consoles for the container process, but not exec processes. The SYS_IOCTL syscall was added to the default seccomp filter list, but only for ioctls that get/set winsize and termios structs. We need to allow these ioctl for all containers because it's possible to run "exec -ti" on a container that was started without an attached console, after the filters have been installed. Note that control-character signals are still not properly supported. Tested with: $ docker run --runtime=runsc -it alpine In another terminial: $ docker exec -it <containerid> /bin/sh PiperOrigin-RevId: 210185456 Change-Id: I6d2401e53a7697bb988c120a8961505c335f96d9
2018-08-24fs: Drop unused WaitGroup in Dirent.destroy.Nicolas Lacasse
PiperOrigin-RevId: 210182476 Change-Id: I655a2a801e2069108d30323f7f5ae76deb3ea3ec