Age Commit message Author
2018-08-27Add various statisticsTamir Duberstein
PiperOrigin-RevId: 210442599 Change-Id: I9498351f461dc69c77b7f815d526c5693bec8e4a
2018-08-27fs: Fix remote-revalidate cache policy.Nicolas Lacasse
When revalidating a Dirent, if the inode id is the same, then we don't need to throw away the entire Dirent. We can just update the unstable attributes in place. If the inode id has changed, then the remote file has been deleted or moved, and we have no choice but to throw away the dirent we have and look up another. In this case, we may still end up losing a mounted dirent that is a child of the revalidated dirent. However, that seems appropriate here because the entire mount point has been pulled out from underneath us. Because gVisor's overlay is at the Inode level rather than the Dirent level, we must pass the parent Inode and name along with the Inode that is being revalidated. PiperOrigin-RevId: 210431270 Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42
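The revalidation rule described above can be sketched as follows. This is only an illustration under stated assumptions: `attrs`, `cachedEntry`, and `revalidate` are hypothetical names, not the types used in gVisor's fs package, and the overlay handling (parent Inode and name) is left out.

```go
package main

import "fmt"

// attrs holds the unstable attributes that may change on the remote side.
// Hypothetical type for illustration only.
type attrs struct {
	size  int64
	mtime int64
}

// cachedEntry is a hypothetical stand-in for a cached Dirent and its Inode.
type cachedEntry struct {
	inodeID uint64
	attrs   attrs
}

// revalidate reports whether the cached entry can be kept. If the remote
// inode id is unchanged, the unstable attributes are updated in place.
// If it differs, the caller must drop the entry and look the name up again.
func revalidate(c *cachedEntry, remoteID uint64, remote attrs) bool {
	if c.inodeID != remoteID {
		return false
	}
	c.attrs = remote
	return true
}

func main() {
	c := &cachedEntry{inodeID: 7, attrs: attrs{size: 10, mtime: 100}}
	fmt.Println(revalidate(c, 7, attrs{size: 20, mtime: 200}), c.attrs.size)
	fmt.Println(revalidate(c, 8, attrs{size: 5, mtime: 300}), c.attrs.size)
}
```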
2018-08-27runsc: fsgofer should return a unique QID.Path for each file.Nicolas Lacasse
Previously, we were only using the host inode id as the QID path. But the host filesystem can have multiple devices with conflicting inode ids. This resulted in duplicate inode ids in the sentry. This CL generates a unique QID for each <host inode, host device> pair. PiperOrigin-RevId: 210424813 Change-Id: I16d106f61c7c8f910c0da4ceec562a010ffca2fb
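One way to produce a unique value per <host inode, host device> pair is to key a counter on both fields. The sketch below assumes that approach; `hostKey` and `qidAllocator` are hypothetical names, the actual fsgofer may derive the path differently, and the map is not safe for concurrent use without a lock.

```go
package main

import "fmt"

// hostKey identifies a host file by device and inode number.
type hostKey struct {
	dev uint64
	ino uint64
}

// qidAllocator hands out one unique path value per hostKey.
// Hypothetical type; not the fsgofer's actual implementation.
type qidAllocator struct {
	next  uint64
	paths map[hostKey]uint64
}

func newQIDAllocator() *qidAllocator {
	return &qidAllocator{paths: make(map[hostKey]uint64)}
}

// pathFor returns the path already assigned to key, or allocates a new one.
func (a *qidAllocator) pathFor(key hostKey) uint64 {
	if p, ok := a.paths[key]; ok {
		return p
	}
	a.next++
	a.paths[key] = a.next
	return a.next
}

func main() {
	a := newQIDAllocator()
	// The same inode number on two devices yields two distinct paths,
	// while repeating a pair returns the path assigned earlier.
	fmt.Println(a.pathFor(hostKey{dev: 1, ino: 42}))
	fmt.Println(a.pathFor(hostKey{dev: 2, ino: 42}))
	fmt.Println(a.pathFor(hostKey{dev: 1, ino: 42}))
}
```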
2018-08-27Add runsc-race target.Adin Scannell
PiperOrigin-RevId: 210422178 Change-Id: I984dd348d467908bc3180a20fc79b8387fcca05e
2018-08-27sentry: mark fsutil.DirFileOperations as savable.Zhaozhong Ni
PiperOrigin-RevId: 210405166 Change-Id: I252766015885c418e914007baf2fc058fec39b3e
2018-08-27Put fsgofer inside chrootFabricio Voznika
Now each container gets its own dedicated gofer that is chroot'd to the rootfs path. This is done to add an extra layer of security in case the gofer gets compromised. PiperOrigin-RevId: 210396476 Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0
2018-08-27runsc: Terminal resizing support.Kevin Krakauer
Implements the TIOCGWINSZ and TIOCSWINSZ ioctls, which allow processes to resize the terminal. This allows, for example, sshd to properly set the window size for ssh sessions. PiperOrigin-RevId: 210392504 Change-Id: I0d4789154d6d22f02509b31d71392e13ee4a50ba
2018-08-25Upstreaming DHCP changes from FuchsiaTamir Duberstein
PiperOrigin-RevId: 210221388 Change-Id: Ic82d592b8c4778855fa55ba913f6b9a10b2d511f
2018-08-24runsc: Terminal support for "docker exec -ti".Nicolas Lacasse
This CL adds terminal support for "docker exec". We previously only supported consoles for the container process, but not exec processes. The SYS_IOCTL syscall was added to the default seccomp filter list, but only for ioctls that get/set winsize and termios structs. We need to allow these ioctls for all containers because it's possible to run "exec -ti" on a container that was started without an attached console, after the filters have been installed. Note that control-character signals are still not properly supported. Tested with: $ docker run --runtime=runsc -it alpine In another terminal: $ docker exec -it <containerid> /bin/sh PiperOrigin-RevId: 210185456 Change-Id: I6d2401e53a7697bb988c120a8961505c335f96d9
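Restricting ioctl to the winsize and termios requests amounts to an allowlist on the request argument. The sketch below models that check in plain Go rather than as a real seccomp filter. The request numbers are the generic Linux values; `allowedIoctls` and `ioctlAllowed` are hypothetical names, and the exact set that runsc permits is an assumption here.

```go
package main

import "fmt"

// Linux ioctl request numbers (generic values, as used on x86).
const (
	tcgets     = 0x5401 // get termios
	tcsets     = 0x5402 // set termios
	tiocgwinsz = 0x5413 // get window size
	tiocswinsz = 0x5414 // set window size
)

// allowedIoctls is a hypothetical allowlist; the real filter may list
// more requests than these four.
var allowedIoctls = map[uintptr]bool{
	tcgets:     true,
	tcsets:     true,
	tiocgwinsz: true,
	tiocswinsz: true,
}

// ioctlAllowed reports whether an ioctl with the given request would pass.
func ioctlAllowed(req uintptr) bool {
	return allowedIoctls[req]
}

func main() {
	fmt.Println(ioctlAllowed(tiocgwinsz))
	fmt.Println(ioctlAllowed(0x541B)) // FIONREAD, not in the allowlist
}
```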
2018-08-24fs: Drop unused WaitGroup in Dirent.destroy.Nicolas Lacasse
PiperOrigin-RevId: 210182476 Change-Id: I655a2a801e2069108d30323f7f5ae76deb3ea3ec
2018-08-24compressio: support optional hashing and eliminate hashio.Zhaozhong Ni
Compared to previous compressio / hashio nesting, there is up to 100% speedup. PiperOrigin-RevId: 210161269 Change-Id: I481aa9fe980bb817fe465fe34d32ea33fc8abf1c
2018-08-24runsc: Allow runsc to properly search the PATH for executable name.Kevin Krakauer
Previously, runsc improperly attempted to find an executable in the container's PATH. We now search the PATH via the container's fsgofer rather than the host FS, eliminating the confusing differences between paths on the host and within a container. PiperOrigin-RevId: 210159488 Change-Id: I228174dbebc4c5356599036d6efaa59f28ff28d2
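The lookup can be pictured as a PATH walk that consults the container's filesystem rather than the host's. In this sketch the `exists` callback stands in for a query through the fsgofer; `resolveExecutable` is a hypothetical helper, not runsc's function.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// resolveExecutable searches pathDirs for name using exists, which is
// expected to check the container's filesystem (hypothetical callback).
// Names that already contain a slash are returned unchanged.
func resolveExecutable(name string, pathDirs []string, exists func(string) bool) (string, error) {
	if strings.Contains(name, "/") {
		return name, nil
	}
	for _, dir := range pathDirs {
		candidate := path.Join(dir, name)
		if exists(candidate) {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("executable %q not found in PATH", name)
}

func main() {
	// A fake container filesystem for demonstration.
	containerFiles := map[string]bool{"/usr/bin/env": true}
	exists := func(p string) bool { return containerFiles[p] }

	p, err := resolveExecutable("env", strings.Split("/bin:/usr/bin", ":"), exists)
	fmt.Println(p, err)
}
```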
2018-08-24SyscallRules merge and add were dropping AllowAny rulesFabricio Voznika
PiperOrigin-RevId: 210131001 Change-Id: I285707c5143b3e4c9a6948c1d1a452b6f16e65b7
2018-08-24Add option to panic gofer if writes are attempted over RO mountsFabricio Voznika
This is used when '--overlay=true' to guarantee writes are not sent to gofer. PiperOrigin-RevId: 210116288 Change-Id: I7616008c4c0e8d3668e07a205207f46e2144bf30
2018-08-23Implement POSIX per-process interval timers.Jamie Liu
PiperOrigin-RevId: 210021612 Change-Id: If7c161e6fd08cf17942bfb6bc5a8d2c4e271c61e
2018-08-23netstack: make listening tcp socket close state setting and cleanup atomic.Zhaozhong Ni
Otherwise the socket saving logic might find workers still running for closed sockets unexpectedly. PiperOrigin-RevId: 210018905 Change-Id: I443a04d355613f5f9983252cc6863bff6e0eda3a
2018-08-23sentry: mark idMapSeqHandle as savable.Zhaozhong Ni
PiperOrigin-RevId: 209994384 Change-Id: I16186cf79cb4760a134f3968db30c168a5f4340e
2018-08-23Clean up syscall filtersFabricio Voznika
Removed syscalls that are only used by whitelistfs which has its own set of filters. PiperOrigin-RevId: 209967259 Change-Id: Idb2e1b9d0201043d7cd25d96894f354729dbd089
2018-08-23Encapsulate netstack metricsIan Gudger
PiperOrigin-RevId: 209943212 Change-Id: I96dcbc7c2ab2426e510b94a564436505256c5c79
2018-08-22runsc: De-flakes container_test TestMultiContainerSanity.Kevin Krakauer
The bug was caused by os.File's finalizer, which closes the file. Because fsgofer.serve() was passed a file descriptor as an int rather than an os.File, callers would pass os.File.Fd(), and the os.File would go out of scope. Thus, the file would get GC'd and finalized nondeterministically, causing failures when the file was used. PiperOrigin-RevId: 209861834 Change-Id: Idf24d5c1f04c9b28659e62c97202ab3b4d72e994
2018-08-22Add separate Recycle method for allocator.Adin Scannell
This improves debugging for pagetable-related issues. PiperOrigin-RevId: 209827795 Change-Id: I4cfa11664b0b52f26f6bc90a14c5bb106f01e038
2018-08-22Allow building on !linuxGoogler
PiperOrigin-RevId: 209819644 Change-Id: I329d054bf8f4999e7db0dcd95b13f7793c65d4e2
2018-08-22sentry: mark S/R stating errors as save rejections / fs corruptions.Zhaozhong Ni
PiperOrigin-RevId: 209817767 Change-Id: Iddf2b8441bc44f31f9a8cf6f2bd8e7a5b824b487
2018-08-22Always add AT_BASE even if there is no interpreter.Brian Geffon
Linux will ALWAYS add AT_BASE even for a static binary, except it will be set to 0 [1]. 1. https://github.com/torvalds/linux/blob/master/fs/binfmt_elf.c#L253 PiperOrigin-RevId: 209811129 Change-Id: I92cc66532f23d40f24414a921c030bd3481e12a0
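In other words, the AT_BASE entry is emitted unconditionally and only its value depends on whether an interpreter was loaded. A minimal sketch follows, with `auxEntry` and `baseEntry` as hypothetical names rather than the sentry loader's types. The key value 7 is the Linux AT_BASE constant.

```go
package main

import "fmt"

// atBase is the Linux auxiliary vector key AT_BASE.
const atBase = 7

// auxEntry is a hypothetical key/value pair in the auxiliary vector.
type auxEntry struct {
	key uint64
	val uint64
}

// baseEntry always returns an AT_BASE entry. The value is the interpreter
// load address, or 0 when the binary has no interpreter.
func baseEntry(interpBase uint64, hasInterp bool) auxEntry {
	if !hasInterp {
		return auxEntry{key: atBase, val: 0}
	}
	return auxEntry{key: atBase, val: interpBase}
}

func main() {
	fmt.Printf("%+v\n", baseEntry(0x7f0000000000, true))
	fmt.Printf("%+v\n", baseEntry(0, false))
}
```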
2018-08-22Fix typoFabricio Voznika
PiperOrigin-RevId: 209788842 Change-Id: I70ecb58009777ce8f642f246bc161af1a0bf2628
2018-08-22fs: Hold Dirent.mu when calling Dirent.flush().Nicolas Lacasse
As required by the contract in Dirent.flush(). Also inline Dirent.freeze() into Dirent.Freeze(), since it is only called from there. PiperOrigin-RevId: 209783626 Change-Id: Ie6de4533d93dd299ffa01dabfa257c9cc259b1f4
2018-08-22Mark postgres as not supportedFabricio Voznika
See https://github.com/google/gvisor/issues/88 PiperOrigin-RevId: 209780532 Change-Id: Iff8004474020511503a0a5cd2cdba2b512c327ef
2018-08-21Fix TestUnixDomainSockets failure when path is too largeFabricio Voznika
UDS has a lower size limit than regular files. When running under bazel this limit is exceeded. Test was changed to always mount /tmp and use it for the test. PiperOrigin-RevId: 209717830 Change-Id: I1dbe19fe2051ffdddbaa32b188a9167f446ed193
2018-08-21sentry: do not release gofer inode file state loading lock upon error.Zhaozhong Ni
When an inode file state failed to load asynchronously, we want to report the error instead of potentially panicking in another async loading goroutine incorrectly unblocked. PiperOrigin-RevId: 209683977 Change-Id: I591cde97710bbe3cdc53717ee58f1d28bbda9261
2018-08-21binary: append slicesIan Gudger
A new optimization in Go 1.11 improves the efficiency of slice extension: "The compiler now optimizes slice extension of the form append(s, make([]T, n)...)." https://tip.golang.org/doc/go1.11#performance-compiler Before: BenchmarkMarshalUnmarshal-12 2000000 664 ns/op 0 B/op 0 allocs/op BenchmarkReadWrite-12 500000 2395 ns/op 304 B/op 24 allocs/op After: BenchmarkMarshalUnmarshal-12 2000000 628 ns/op 0 B/op 0 allocs/op BenchmarkReadWrite-12 500000 2411 ns/op 304 B/op 24 allocs/op BenchmarkMarshalUnmarshal benchmarks the code in this package, BenchmarkReadWrite benchmarks the code in the standard library. PiperOrigin-RevId: 209679979 Change-Id: I51c6302e53f60bf79f84576b1ead4d36658897cb
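For reference, the idiom quoted from the Go 1.11 release notes looks like this in isolation. `grow` is a hypothetical helper and not a function from the binary package.

```go
package main

import "fmt"

// grow extends buf by n zero bytes using the append/make form that the
// Go 1.11 compiler optimizes, and returns the extended slice.
func grow(buf []byte, n int) []byte {
	return append(buf, make([]byte, n)...)
}

func main() {
	b := grow([]byte{1, 2}, 3)
	fmt.Println(len(b), b) // prints "5 [1 2 0 0 0]"
}
```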
2018-08-21Temporarily skip multi-container tests in container_test until deflaked.Kevin Krakauer
PiperOrigin-RevId: 209679235 Change-Id: I527e779eeb113d0c162f5e27a2841b9486f0e39f
2018-08-21Expose route tableGoogler
PiperOrigin-RevId: 209670528 Change-Id: I2890bcdef36f0b5f24b372b42cf628b38dd5764e
2018-08-21nonExclusiveFS is causing timeout with --raceFabricio Voznika
Not sure why, just removed for now to unblock the tests. PiperOrigin-RevId: 209661403 Change-Id: I72785c071687d54e22bda9073d36b447d52a7018
2018-08-21Move container_test to the container packageFabricio Voznika
PiperOrigin-RevId: 209655274 Change-Id: Id381114bdb3197c73e14f74b3f6cf1afd87d60cb
2018-08-21Build PCAP file with atomic blocking writesIan Gudger
The previous use of non-blocking writes could result in corrupt PCAP files if a partial write occurs. Using (*os.File).Write solves this problem by not allowing partial writes. This change does not increase allocations (in one path it actually reduces them), but does add additional copying. PiperOrigin-RevId: 209652974 Change-Id: I4b1cf2eda4cfd7f237a4245aceb7391b3055a66c
2018-08-21Initial change for multi-gofer supportFabricio Voznika
PiperOrigin-RevId: 209647293 Change-Id: I980fca1257ea3fcce796388a049c353b0303a8a5
2018-08-21Fix races in kernel.(*Task).Value()Ian Gudger
PiperOrigin-RevId: 209627180 Change-Id: Idc84afd38003427e411df6e75abfabd9174174e1
2018-08-20Fix handling of abstract Unix socket addressesIan Gudger
* Don't truncate abstract addresses at second null. * Properly handle abstract addresses with length < 108 bytes. PiperOrigin-RevId: 209502703 Change-Id: I49053f2d18b5a78208c3f640c27dbbdaece4f1a9
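Both fixes follow from how Linux interprets sun_path: an abstract address starts with a null byte and its name is every remaining byte covered by the address length, including embedded nulls, while a pathname address ends at the first null. A sketch under those assumptions follows. `parseUnixPath` is a hypothetical helper, and its input is assumed to be sun_path already trimmed to the length passed by the caller.

```go
package main

import (
	"bytes"
	"fmt"
)

// parseUnixPath interprets the sun_path bytes of a Unix socket address.
// sunPath must already be limited to the caller-supplied address length.
func parseUnixPath(sunPath []byte) (name string, abstract bool) {
	if len(sunPath) == 0 {
		return "", false // unnamed socket
	}
	if sunPath[0] == 0 {
		// Abstract: keep every byte after the leading null, including
		// any further nulls, and do not pad to the full 108 bytes.
		return string(sunPath[1:]), true
	}
	// Pathname: terminated by the first null byte, if any.
	if i := bytes.IndexByte(sunPath, 0); i >= 0 {
		sunPath = sunPath[:i]
	}
	return string(sunPath), false
}

func main() {
	name, abstract := parseUnixPath([]byte("\x00a\x00b"))
	fmt.Printf("%q %v\n", name, abstract)
	name, abstract = parseUnixPath([]byte("/tmp/sock\x00junk"))
	fmt.Printf("%q %v\n", name, abstract)
}
```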
2018-08-20getdents should return type=DT_DIR for SpecialDirectories.Nicolas Lacasse
It was returning DT_UNKNOWN, and this was breaking numpy. PiperOrigin-RevId: 209459351 Change-Id: Ic6f548e23aa9c551b2032b92636cb5f0df9ccbd4
2018-08-20Standardize mounts in testsFabricio Voznika
Tests get a readonly rootfs mapped to / (which was the case before) and writable TEST_TMPDIR. This makes it easier to set up containers to write to files and to share state between test and containers. PiperOrigin-RevId: 209453224 Change-Id: I4d988e45dc0909a0450a3bb882fe280cf9c24334
2018-08-20sysfs: Add (empty) cpu directories for each cpu in /sys/devices/system/cpu.Nicolas Lacasse
Numpy needs these. Also added the "present" directory, since the contents are the same as possible and online. PiperOrigin-RevId: 209451777 Change-Id: I2048de3f57bf1c57e9b5421d607ca89c2a173684
2018-08-17Add nonExclusiveFS dimension to more testsFabricio Voznika
The ones using 'kvm' actually mean that they don't want overlay. PiperOrigin-RevId: 209194318 Change-Id: I941a443cb6d783e2c80cf66eb8d8630bcacdb574
2018-08-16fs: Support possible and online knobs for cpuChenggang Qin
Some Linux commands depend on /sys/devices/system/cpu/possible, such as 'lscpu'. Add 2 knobs for cpu: /sys/devices/system/cpu/possible /sys/devices/system/cpu/online Both values are '0 - Kernel.ApplicationCores()-1'. Change-Id: Iabd8a4e559cbb630ed249686b92c22b4e7120663 PiperOrigin-RevId: 209070163
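The file contents described above can be produced with a one-line formatter. This is a sketch only: `cpuRange` is a hypothetical helper, the `cores` argument stands in for Kernel.ApplicationCores(), the trailing newline is an assumption, and the function expects at least one core.

```go
package main

import "fmt"

// cpuRange formats the CPU list "0-(cores-1)" followed by a newline.
// cores must be at least 1.
func cpuRange(cores uint) string {
	return fmt.Sprintf("0-%d\n", cores-1)
}

func main() {
	fmt.Print(cpuRange(4)) // prints "0-3"
}
```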
2018-08-16Internal change.Googler
PiperOrigin-RevId: 209060862 Change-Id: I2cd02f0032b80d0087110095548b1a8ffa696ac2
2018-08-16Combine functions to search for file under one common functionFabricio Voznika
Bazel adds the build type in front of directories making it hard to refer to binaries in code. PiperOrigin-RevId: 209010854 Change-Id: I6c9da1ac3bbe79766868a3b14222dd42d03b4ec5
2018-08-15Remove obsolete comment about panickingIan Gudger
PiperOrigin-RevId: 208908702 Change-Id: I6be9c765c257a9ddb1a965a03942ab3fc3a34a43
2018-08-15runsc fsgofer: Support dynamic serving of filesystems.Kevin Krakauer
When multiple containers run inside a sentry, each container has its own root filesystem and set of mounts. Containers are also added after sentry boot rather than all configured and known at boot time. The fsgofer needs to be able to serve the root filesystem of each container. Thus, it must be possible to add filesystems after the fsgofer has already started. This change: * Creates a URPC endpoint within the gofer process that listens for requests to serve new content. * Enables the sentry, when starting a new container, to add the new container's filesystem. * Mounts those new filesystems at separate roots within the sentry. PiperOrigin-RevId: 208903248 Change-Id: Ifa91ec9c8caf5f2f0a9eead83c4a57090ce92068
2018-08-15runsc: Fix instances of file access "proxy".Nicolas Lacasse
This file access type is actually called "proxy-shared", but I forgot to update all locations. PiperOrigin-RevId: 208832491 Change-Id: I7848bc4ec2478f86cf2de1dcd1bfb5264c6276de
2018-08-14Reduce map lookups in syserrIan Gudger
PiperOrigin-RevId: 208755352 Change-Id: Ia24630f452a4a42940ab73a8113a2fd5ea2cfca2
2018-08-14runsc: Change cache policy for root fs and volume mounts.Nicolas Lacasse
Previously, gofer filesystems were configured with the default "fscache" policy, which caches filesystem metadata and contents aggressively. While this setting is best for performance, it means that changes from inside the sandbox may not be immediately propagated outside the sandbox, and vice-versa. This CL changes volumes and the root fs configuration to use a new "remote-revalidate" cache policy which tries to retain as much caching as possible while still making fs changes visible across the sandbox boundary. This cache policy is enabled by default for the root filesystem. The default value for the "--file-access" flag is still "proxy", but the behavior is changed to use the new cache policy. A new value for the "--file-access" flag is added, called "proxy-exclusive", which turns on the previous aggressive caching behavior. As the name implies, this flag should be used when the sandbox has "exclusive" access to the filesystem. All volume mounts are configured to use the new cache policy, since it is safest and most likely to be correct. There is not currently a way to change this behavior, but it's possible to add such a mechanism in the future. The configurability is a smaller issue for volumes, since most of the expensive application fs operations (walking + stating files) will likely be served by the root fs. PiperOrigin-RevId: 208735037 Change-Id: Ife048fab1948205f6665df8563434dbc6ca8cfc9