path: root/pkg
Age  Commit message  Author
2018-08-29  fs: Drop reference to over-written file before renaming over it.  Nicolas Lacasse
dirent.go:Rename() walks to the file being replaced and defers replaced.DecRef(). After the rename, the reference is dropped, triggering a writeout and SetAttr call to the gofer. Because of lazyOpenForWrite, the gofer opens the replaced file BY ITS OLD NAME and calls ftruncate on it. This CL changes Rename to drop the reference on replaced (and thus trigger writeout) before the actual rename call. PiperOrigin-RevId: 210756097 Change-Id: I01ea09a5ee6c2e2d464560362f09943641638e0f
2018-08-28  fasync: don't keep mutex after return  Ian Gudger
PiperOrigin-RevId: 210637533 Change-Id: I3536c3f9efb54732a0d8ada8bc299142b2c1682f
2018-08-28  fs: Don't bother saving negative dirents.  Nicolas Lacasse
PiperOrigin-RevId: 210616454 Change-Id: I3f536e2b4d603e540cdd9a67c61b8ec3351f4ac3
2018-08-28  fs: Add tests for dirent ref counting with an overlay.  Nicolas Lacasse
PiperOrigin-RevId: 210614669 Change-Id: I408365ff6d6c7765ed7b789446d30e7079cbfc67
2018-08-28  sentry: optimize dirent weakref map save / restore.  Zhaozhong Ni
Saving and restoring weak references involves multiple interface indirections and causes material latency overhead when there are many dirents, each containing a weak reference map. The nil entries in the map should also be purged. PiperOrigin-RevId: 210593727 Change-Id: Ied6f4c3c0726fcc53a24b983d9b3a79121b6b758
2018-08-28  Bump to Go 1.11  Michael Pratt
The procid offset is unchanged. PiperOrigin-RevId: 210551969 Change-Id: I33ba1ce56c2f5631b712417d870aa65ef24e6022
2018-08-28  sentry: avoid double counting map objects in save / restore stats.  Zhaozhong Ni
PiperOrigin-RevId: 210551929 Change-Id: Idd05935bffc63b39166cc3751139aff61b689faa
2018-08-27  Add command-line parameter to trigger panic on signal  Fabricio Voznika
This is to troubleshoot problems with a hung process that is not responding to the 'runsc debug --stack' command. PiperOrigin-RevId: 210483513 Change-Id: I4377b210b4e51bc8a281ad34fd94f3df13d9187d
2018-08-27  Add /proc/sys/kernel/shm[all,max,mni].  Brian Geffon
PiperOrigin-RevId: 210459956 Change-Id: I51859b90fa967631e0a54a390abc3b5541fbee66
2018-08-27  Add various statistics  Tamir Duberstein
PiperOrigin-RevId: 210442599 Change-Id: I9498351f461dc69c77b7f815d526c5693bec8e4a
2018-08-27  fs: Fix remote-revalidate cache policy.  Nicolas Lacasse
When revalidating a Dirent, if the inode id is the same, then we don't need to throw away the entire Dirent. We can just update the unstable attributes in place. If the inode id has changed, then the remote file has been deleted or moved, and we have no choice but to throw away the dirent we have and look up another. In this case, we may still end up losing a mounted dirent that is a child of the revalidated dirent. However, that seems appropriate here because the entire mount point has been pulled out from underneath us. Because gVisor's overlay is at the Inode level rather than the Dirent level, we must pass the parent Inode and name along with the Inode that is being revalidated. PiperOrigin-RevId: 210431270 Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42
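The revalidation rule described in this commit can be sketched as follows. This is only an illustration under stated assumptions: the types `attrs` and `cachedEntry` and the function `revalidate` are hypothetical names invented for the sketch and are not taken from the gVisor source.

```go
package main

import "fmt"

// attrs is a hypothetical stand-in for the unstable attributes of a remote file.
type attrs struct {
	InodeID uint64
	Size    int64
}

// cachedEntry is a hypothetical stand-in for a cached dirent.
type cachedEntry struct {
	attr attrs
}

// revalidate compares the cached entry against freshly fetched remote
// attributes. It reports true when the entry must be discarded and looked
// up again (the inode id changed), and otherwise refreshes the cached
// attributes in place and reports false.
func revalidate(c *cachedEntry, remote attrs) bool {
	if c.attr.InodeID != remote.InodeID {
		// The remote file was deleted or moved: the cached entry is stale.
		return true
	}
	// Same inode: keep the entry and update its attributes in place.
	c.attr = remote
	return false
}

func main() {
	c := &cachedEntry{attr: attrs{InodeID: 7, Size: 10}}
	fmt.Println(revalidate(c, attrs{InodeID: 7, Size: 42}), c.attr.Size)
	fmt.Println(revalidate(c, attrs{InodeID: 8, Size: 42}))
}
```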
2018-08-27  sentry: mark fsutil.DirFileOperations as savable.  Zhaozhong Ni
PiperOrigin-RevId: 210405166 Change-Id: I252766015885c418e914007baf2fc058fec39b3e
2018-08-27  runsc: Terminal resizing support.  Kevin Krakauer
Implements the TIOCGWINSZ and TIOCSWINSZ ioctls, which allow processes to resize the terminal. This allows, for example, sshd to properly set the window size for ssh sessions. PiperOrigin-RevId: 210392504 Change-Id: I0d4789154d6d22f02509b31d71392e13ee4a50ba
2018-08-25  Upstreaming DHCP changes from Fuchsia  Tamir Duberstein
PiperOrigin-RevId: 210221388 Change-Id: Ic82d592b8c4778855fa55ba913f6b9a10b2d511f
2018-08-24  runsc: Terminal support for "docker exec -ti".  Nicolas Lacasse
This CL adds terminal support for "docker exec". We previously only supported consoles for the container process, but not exec processes. The SYS_IOCTL syscall was added to the default seccomp filter list, but only for ioctls that get/set winsize and termios structs. We need to allow these ioctls for all containers because it's possible to run "exec -ti" on a container that was started without an attached console, after the filters have been installed. Note that control-character signals are still not properly supported.
Tested with:
$ docker run --runtime=runsc -it alpine
In another terminal:
$ docker exec -it <containerid> /bin/sh
PiperOrigin-RevId: 210185456 Change-Id: I6d2401e53a7697bb988c120a8961505c335f96d9
2018-08-24  fs: Drop unused WaitGroup in Dirent.destroy.  Nicolas Lacasse
PiperOrigin-RevId: 210182476 Change-Id: I655a2a801e2069108d30323f7f5ae76deb3ea3ec
2018-08-24  compressio: support optional hashing and eliminate hashio.  Zhaozhong Ni
Compared to the previous compressio / hashio nesting, there is up to 100% speedup. PiperOrigin-RevId: 210161269 Change-Id: I481aa9fe980bb817fe465fe34d32ea33fc8abf1c
2018-08-24  SyscallRules merge and add were dropping AllowAny rules  Fabricio Voznika
PiperOrigin-RevId: 210131001 Change-Id: I285707c5143b3e4c9a6948c1d1a452b6f16e65b7
2018-08-23  Implement POSIX per-process interval timers.  Jamie Liu
PiperOrigin-RevId: 210021612 Change-Id: If7c161e6fd08cf17942bfb6bc5a8d2c4e271c61e
2018-08-23  netstack: make listening tcp socket close state setting and cleanup atomic.  Zhaozhong Ni
Otherwise the socket saving logic might find workers still running for closed sockets unexpectedly. PiperOrigin-RevId: 210018905 Change-Id: I443a04d355613f5f9983252cc6863bff6e0eda3a
2018-08-23  sentry: mark idMapSeqHandle as savable.  Zhaozhong Ni
PiperOrigin-RevId: 209994384 Change-Id: I16186cf79cb4760a134f3968db30c168a5f4340e
2018-08-23  Encapsulate netstack metrics  Ian Gudger
PiperOrigin-RevId: 209943212 Change-Id: I96dcbc7c2ab2426e510b94a564436505256c5c79
2018-08-22  Add separate Recycle method for allocator.  Adin Scannell
This improves debugging for pagetable-related issues. PiperOrigin-RevId: 209827795 Change-Id: I4cfa11664b0b52f26f6bc90a14c5bb106f01e038
2018-08-22  Allow building on !linux  Googler
PiperOrigin-RevId: 209819644 Change-Id: I329d054bf8f4999e7db0dcd95b13f7793c65d4e2
2018-08-22  sentry: mark S/R stating errors as save rejections / fs corruptions.  Zhaozhong Ni
PiperOrigin-RevId: 209817767 Change-Id: Iddf2b8441bc44f31f9a8cf6f2bd8e7a5b824b487
2018-08-22  Always add AT_BASE even if there is no interpreter.  Brian Geffon
Linux will ALWAYS add AT_BASE even for a static binary, except that it will be set to 0 [1]. 1. https://github.com/torvalds/linux/blob/master/fs/binfmt_elf.c#L253 PiperOrigin-RevId: 209811129 Change-Id: I92cc66532f23d40f24414a921c030bd3481e12a0
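A minimal sketch of the behavior this commit describes, assuming a hypothetical `auxEntry` type and `appendBase` helper (neither name comes from the gVisor source). The key value 7 is AT_BASE as defined in Linux's auxvec.h; the point is that the entry is appended unconditionally, with value 0 when there is no interpreter.

```go
package main

import "fmt"

// atBase is the auxiliary vector key AT_BASE from Linux's auxvec.h.
const atBase = 7

// auxEntry is a hypothetical key/value pair type used only in this sketch.
type auxEntry struct {
	Key   uint64
	Value uint64
}

// appendBase always appends an AT_BASE entry. interpBase is the load
// address of the interpreter, or 0 for a static binary that has none.
func appendBase(auxv []auxEntry, interpBase uint64) []auxEntry {
	return append(auxv, auxEntry{Key: atBase, Value: interpBase})
}

func main() {
	static := appendBase(nil, 0)               // no interpreter
	dynamic := appendBase(nil, 0x7f0000000000) // illustrative load address
	fmt.Println(static, dynamic)
}
```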
2018-08-22  fs: Hold Dirent.mu when calling Dirent.flush().  Nicolas Lacasse
As required by the contract in Dirent.flush(). Also inline Dirent.freeze() into Dirent.Freeze(), since it is only called from there. PiperOrigin-RevId: 209783626 Change-Id: Ie6de4533d93dd299ffa01dabfa257c9cc259b1f4
2018-08-21  sentry: do not release gofer inode file state loading lock upon error.  Zhaozhong Ni
When an inode file state fails to load asynchronously, we want to report the error instead of potentially panicking in another async loading goroutine that was incorrectly unblocked. PiperOrigin-RevId: 209683977 Change-Id: I591cde97710bbe3cdc53717ee58f1d28bbda9261
2018-08-21  binary: append slices  Ian Gudger
A new optimization in Go 1.11 improves the efficiency of slice extension: "The compiler now optimizes slice extension of the form append(s, make([]T, n)...)." https://tip.golang.org/doc/go1.11#performance-compiler
Before:
BenchmarkMarshalUnmarshal-12 2000000 664 ns/op 0 B/op 0 allocs/op
BenchmarkReadWrite-12 500000 2395 ns/op 304 B/op 24 allocs/op
After:
BenchmarkMarshalUnmarshal-12 2000000 628 ns/op 0 B/op 0 allocs/op
BenchmarkReadWrite-12 500000 2411 ns/op 304 B/op 24 allocs/op
BenchmarkMarshalUnmarshal benchmarks the code in this package, BenchmarkReadWrite benchmarks the code in the standard library. PiperOrigin-RevId: 209679979 Change-Id: I51c6302e53f60bf79f84576b1ead4d36658897cb
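The slice-extension form quoted from the Go 1.11 release notes looks like this in practice. The helper name `extend` is an assumption made for illustration; it is not claimed to be the function changed in this commit.

```go
package main

import "fmt"

// extend grows b by n zeroed bytes and returns the extended slice.
// This is the append(s, make([]T, n)...) form that the Go 1.11
// release notes say the compiler optimizes.
func extend(b []byte, n int) []byte {
	return append(b, make([]byte, n)...)
}

func main() {
	b := []byte{1, 2, 3}
	b = extend(b, 4)
	fmt.Println(len(b), b) // prints: 7 [1 2 3 0 0 0 0]
}
```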
2018-08-21  Expose route table  Googler
PiperOrigin-RevId: 209670528 Change-Id: I2890bcdef36f0b5f24b372b42cf628b38dd5764e
2018-08-21  Build PCAP file with atomic blocking writes  Ian Gudger
The previous use of non-blocking writes could result in corrupt PCAP files if a partial write occurs. Using (*os.File).Write solves this problem by not allowing partial writes. This change does not increase allocations (in one path it actually reduces them), but does add additional copying. PiperOrigin-RevId: 209652974 Change-Id: I4b1cf2eda4cfd7f237a4245aceb7391b3055a66c
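One way to read this change is that each record is assembled in a single buffer and handed to the writer in one call, at the cost of an extra copy of the payload. The sketch below assumes the classic pcap per-packet record header (four 32-bit fields: seconds, microseconds, captured length, original length) in little-endian order; the function names `encodePacket` and `writePacket` are hypothetical and not taken from the gVisor source.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// encodePacket builds one pcap record (16-byte header plus payload) in a
// single buffer. Little-endian byte order is an assumption of this sketch.
func encodePacket(sec, usec uint32, payload []byte) []byte {
	buf := make([]byte, 16+len(payload))
	binary.LittleEndian.PutUint32(buf[0:], sec)
	binary.LittleEndian.PutUint32(buf[4:], usec)
	binary.LittleEndian.PutUint32(buf[8:], uint32(len(payload)))  // captured length
	binary.LittleEndian.PutUint32(buf[12:], uint32(len(payload))) // original length
	copy(buf[16:], payload) // the additional copy mentioned in the commit
	return buf
}

// writePacket issues exactly one Write per record. With an *os.File as w,
// Write returns a non-nil error whenever it writes fewer bytes than asked.
func writePacket(w io.Writer, sec, usec uint32, payload []byte) error {
	_, err := w.Write(encodePacket(sec, usec, payload))
	return err
}

func main() {
	var out bytes.Buffer // stands in for the PCAP file in this demo
	if err := writePacket(&out, 1, 2, []byte("abc")); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println(out.Len())
}
```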
2018-08-21  Fix races in kernel.(*Task).Value()  Ian Gudger
PiperOrigin-RevId: 209627180 Change-Id: Idc84afd38003427e411df6e75abfabd9174174e1
2018-08-20  Fix handling of abstract Unix socket addresses  Ian Gudger
* Don't truncate abstract addresses at second null.
* Properly handle abstract addresses with length < 108 bytes.
PiperOrigin-RevId: 209502703 Change-Id: I49053f2d18b5a78208c3f640c27dbbdaece4f1a9
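The two rules above can be sketched as follows, assuming the raw path bytes have already been separated from the address family field. The function `extractPath` and the constant `maxPathLen` are hypothetical names for this illustration; 108 is the size of `sun_path` on Linux, the limit the commit mentions.

```go
package main

import (
	"bytes"
	"fmt"
)

// maxPathLen is the size of sun_path in Linux's sockaddr_un.
const maxPathLen = 108

// extractPath interprets raw as the path bytes of a Unix socket address.
// Abstract addresses start with a NUL byte and are kept byte-for-byte,
// including any later NULs; pathname addresses end at the first NUL.
func extractPath(raw []byte) (string, error) {
	if len(raw) > maxPathLen {
		return "", fmt.Errorf("address too long: %d bytes", len(raw))
	}
	if len(raw) == 0 {
		return "", nil // unnamed socket
	}
	if raw[0] == 0 {
		// Abstract: the supplied length decides where the name ends,
		// so a second NUL is part of the name and must not truncate it.
		return string(raw), nil
	}
	if i := bytes.IndexByte(raw, 0); i >= 0 {
		raw = raw[:i]
	}
	return string(raw), nil
}

func main() {
	abstract, _ := extractPath([]byte("\x00ab\x00cd"))
	pathname, _ := extractPath([]byte("/tmp/s\x00junk"))
	fmt.Printf("%q %q\n", abstract, pathname)
}
```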
2018-08-20  getdents should return type=DT_DIR for SpecialDirectories.  Nicolas Lacasse
It was returning DT_UNKNOWN, and this was breaking numpy. PiperOrigin-RevId: 209459351 Change-Id: Ic6f548e23aa9c551b2032b92636cb5f0df9ccbd4
2018-08-20  sysfs: Add (empty) cpu directories for each cpu in /sys/devices/system/cpu.  Nicolas Lacasse
Numpy needs these. Also added the "present" directory, since the contents are the same as possible and online. PiperOrigin-RevId: 209451777 Change-Id: I2048de3f57bf1c57e9b5421d607ca89c2a173684
2018-08-16  fs: Support possible and online knobs for cpu  Chenggang Qin
Some Linux commands, such as 'lscpu', depend on /sys/devices/system/cpu/possible. Add 2 knobs for cpu:
/sys/devices/system/cpu/possible
/sys/devices/system/cpu/online
Both values are '0 - Kernel.ApplicationCores()-1'. Change-Id: Iabd8a4e559cbb630ed249686b92c22b4e7120663 PiperOrigin-RevId: 209070163
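As a rough illustration of the value described above, the sketch below formats a CPU range for a given core count. The function name `cpuRange` and the exact output formatting (no spaces, no trailing newline) are assumptions; the commit message only states that the range runs from 0 to the application core count minus one.

```go
package main

import "fmt"

// cpuRange formats the range of CPUs for n application cores as
// "0-(n-1)". It returns an empty string for n == 0, a case the
// commit message does not address.
func cpuRange(n uint) string {
	if n == 0 {
		return ""
	}
	return fmt.Sprintf("0-%d", n-1)
}

func main() {
	fmt.Println(cpuRange(4)) // prints: 0-3
}
```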
2018-08-16  Internal change.  Googler
PiperOrigin-RevId: 209060862 Change-Id: I2cd02f0032b80d0087110095548b1a8ffa696ac2
2018-08-15  Remove obsolete comment about panicking  Ian Gudger
PiperOrigin-RevId: 208908702 Change-Id: I6be9c765c257a9ddb1a965a03942ab3fc3a34a43
2018-08-15  runsc fsgofer: Support dynamic serving of filesystems.  Kevin Krakauer
When multiple containers run inside a sentry, each container has its own root filesystem and set of mounts. Containers are also added after sentry boot rather than all configured and known at boot time. The fsgofer needs to be able to serve the root filesystem of each container. Thus, it must be possible to add filesystems after the fsgofer has already started. This change:
* Creates a URPC endpoint within the gofer process that listens for requests to serve new content.
* Enables the sentry, when starting a new container, to add the new container's filesystem.
* Mounts those new filesystems at separate roots within the sentry.
PiperOrigin-RevId: 208903248 Change-Id: Ifa91ec9c8caf5f2f0a9eead83c4a57090ce92068
2018-08-14  Reduce map lookups in syserr  Ian Gudger
PiperOrigin-RevId: 208755352 Change-Id: Ia24630f452a4a42940ab73a8113a2fd5ea2cfca2
2018-08-14  runsc: Change cache policy for root fs and volume mounts.  Nicolas Lacasse
Previously, gofer filesystems were configured with the default "fscache" policy, which caches filesystem metadata and contents aggressively. While this setting is best for performance, it means that changes from inside the sandbox may not be immediately propagated outside the sandbox, and vice-versa. This CL changes volumes and the root fs configuration to use a new "remote-revalidate" cache policy which tries to retain as much caching as possible while still making fs changes visible across the sandbox boundary. This cache policy is enabled by default for the root filesystem. The default value for the "--file-access" flag is still "proxy", but the behavior is changed to use the new cache policy. A new value for the "--file-access" flag is added, called "proxy-exclusive", which turns on the previous aggressive caching behavior. As the name implies, this flag should be used when the sandbox has "exclusive" access to the filesystem. All volume mounts are configured to use the new cache policy, since it is safest and most likely to be correct. There is not currently a way to change this behavior, but it's possible to add such a mechanism in the future. The configurability is a smaller issue for volumes, since most of the expensive application fs operations (walking + stating files) will likely be served by the root fs. PiperOrigin-RevId: 208735037 Change-Id: Ife048fab1948205f6665df8563434dbc6ca8cfc9
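The mapping from flag value to cache policy described in this commit can be summarized in a short sketch. The function `cachePolicy` and its signature are hypothetical; only the strings "proxy", "proxy-exclusive", "remote-revalidate" and "fscache" come from the commit message. Any other flag value is rejected here simply because the message does not describe it, which says nothing about whether runsc accepts other values.

```go
package main

import "fmt"

// cachePolicy returns the cache policy name for a mount, following the
// commit message: volumes always use "remote-revalidate"; the root
// filesystem uses "remote-revalidate" for --file-access=proxy and
// "fscache" for --file-access=proxy-exclusive.
func cachePolicy(fileAccess string, isVolume bool) (string, error) {
	if isVolume {
		// Volumes are not configurable in this change.
		return "remote-revalidate", nil
	}
	switch fileAccess {
	case "proxy":
		return "remote-revalidate", nil
	case "proxy-exclusive":
		return "fscache", nil
	default:
		return "", fmt.Errorf("file-access value %q is not covered by this sketch", fileAccess)
	}
}

func main() {
	for _, fa := range []string{"proxy", "proxy-exclusive"} {
		root, _ := cachePolicy(fa, false)
		vol, _ := cachePolicy(fa, true)
		fmt.Printf("%s: root=%s volume=%s\n", fa, root, vol)
	}
}
```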
2018-08-14  TTY: Fix data race where calls into tty.queue's waiter were not synchronized.  Kevin Krakauer
Now, there's a waiter for each end (master and slave) of the TTY, and each waiter.Entry is only enqueued in one of the waiters. PiperOrigin-RevId: 208734483 Change-Id: I06996148f123075f8dd48cde5a553e2be74c6dce
2018-08-14  Fix `ls -laR | wc -l` hanging.  Kevin Krakauer
stat()-ing /proc/PID/fd/FD incremented but didn't decrement the refcount for FD. This behavior wasn't usually noticeable, but in the above case:
- ls would never decrement the refcount of the write end of the pipe to 0.
- This caused the write end of the pipe never to close.
- wc would then hang read()-ing from the pipe.
PiperOrigin-RevId: 208728817 Change-Id: I4fca1ba5ca24e4108915a1d30b41dc63da40604d
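The class of bug described here, and the usual shape of its fix, can be illustrated with a toy reference-counted object. Everything in this sketch (`file`, `IncRef`, `DecRef`, `statFD`) is a simplified, single-threaded stand-in and not the gVisor implementation; it only shows why a missing decrement keeps the object from ever closing.

```go
package main

import "fmt"

// file is a toy reference-counted object; it "closes" when refs hits 0.
// It is not safe for concurrent use.
type file struct {
	refs   int
	size   int64
	closed bool
}

func (f *file) IncRef() { f.refs++ }

func (f *file) DecRef() {
	f.refs--
	if f.refs == 0 {
		f.closed = true
	}
}

// statFD takes a temporary reference while reading the file's attributes
// and releases it before returning. Omitting the deferred DecRef would
// leak one reference per call, so the file could never reach zero.
func statFD(f *file) int64 {
	f.IncRef()
	defer f.DecRef()
	return f.size
}

func main() {
	pipeEnd := &file{refs: 1, size: 0} // one reference held by the writer
	statFD(pipeEnd)
	pipeEnd.DecRef() // the writer exits and drops its reference
	fmt.Println(pipeEnd.closed)
}
```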
2018-08-14  Enforce Unix socket address length limit  Ian Gudger
PiperOrigin-RevId: 208720936 Change-Id: Ic943a88b6efeff49574306d4d4e1f113116ae32e
2018-08-14  Automated rollback of changelist 208284483  Nicolas Lacasse
PiperOrigin-RevId: 208685417 Change-Id: Ie2849c4811e3a2d14a002f521cef018ded0c6c4a
2018-08-14  Fix bind() on overlays.  Nicolas Lacasse
InodeOperations.Bind now returns a Dirent which will be cached in the Dirent tree. When an overlay is in use, Bind cannot return the Dirent created by the upper filesystem because the Dirent does not know about the overlay. Instead, overlayBind must create a new overlay-aware Inode and Dirent and return that. This is analogous to how Lookup and overlayLookup work. PiperOrigin-RevId: 208670710 Change-Id: I6390affbcf94c38656b4b458e248739b4853da29
2018-08-13  Prevent renames across walk fast path.  Adin Scannell
PiperOrigin-RevId: 208533436 Change-Id: Ifc1a4e2d6438a424650bee831c301b1ac0d670a3
2018-08-13  Add path sanity checks.  Adin Scannell
PiperOrigin-RevId: 208527333 Change-Id: I55291bc6b8bc6b88fdd75baf899a71854c39c1a7
2018-08-10  fs: Allow overlays to revalidate files from the upper fs.  Nicolas Lacasse
Previously, an overlay would panic if either the upper or lower fs required revalidation for a given Dirent. Now, we allow revalidation from the upper file, but not the lower. If a cached overlay inode does need revalidation (because the upper needs revalidation), then the entire overlay Inode will be discarded and a new overlay Inode will be built with a fresh copy of the upper file. As a side effect of this change, Revalidate must take an Inode instead of a Dirent, since an overlay needs to revalidate individual Inodes. PiperOrigin-RevId: 208293638 Change-Id: Ic8f8d1ffdc09114721745661a09522b54420c5f1
2018-08-10  Implemented the splice(2) syscall.  Justine Olshan
Currently the implementation matches the behavior of moving data between two file descriptors. However, it does not implement this through zero-copy movement. Thus, this code is a starting point to build the more complex implementation. PiperOrigin-RevId: 208284483 Change-Id: Ibde79520a3d50bc26aead7ad4f128d2be31db14e