Age | Commit message (Collapse) | Author |
|
PiperOrigin-RevId: 210422178
Change-Id: I984dd348d467908bc3180a20fc79b8387fcca05e
|
|
PiperOrigin-RevId: 210405166
Change-Id: I252766015885c418e914007baf2fc058fec39b3e
|
|
Now each container gets its own dedicated gofer that is chroot'd to the
rootfs path. This is done to add an extra layer of security in case the
gofer gets compromised.
PiperOrigin-RevId: 210396476
Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0
|
|
Implements the TIOCGWINSZ and TIOCSWINSZ ioctls, which allow processes to resize
the terminal. This allows, for example, sshd to properly set the window size for
ssh sessions.
PiperOrigin-RevId: 210392504
Change-Id: I0d4789154d6d22f02509b31d71392e13ee4a50ba
|
|
PiperOrigin-RevId: 210221388
Change-Id: Ic82d592b8c4778855fa55ba913f6b9a10b2d511f
|
|
This CL adds terminal support for "docker exec". We previously only supported
consoles for the container process, but not exec processes.
The SYS_IOCTL syscall was added to the default seccomp filter list, but only
for ioctls that get/set winsize and termios structs. We need to allow these
ioctl for all containers because it's possible to run "exec -ti" on a
container that was started without an attached console, after the filters
have been installed.
Note that control-character signals are still not properly supported.
Tested with:
$ docker run --runtime=runsc -it alpine
In another terminial:
$ docker exec -it <containerid> /bin/sh
PiperOrigin-RevId: 210185456
Change-Id: I6d2401e53a7697bb988c120a8961505c335f96d9
|
|
PiperOrigin-RevId: 210182476
Change-Id: I655a2a801e2069108d30323f7f5ae76deb3ea3ec
|
|
Compared to previous compressio / hashio nesting, there is up to 100% speedup.
PiperOrigin-RevId: 210161269
Change-Id: I481aa9fe980bb817fe465fe34d32ea33fc8abf1c
|
|
Previously, runsc improperly attempted to find an executable in the container's
PATH.
We now search the PATH via the container's fsgofer rather than the host FS,
eliminating the confusing differences between paths on the host and within a
container.
PiperOrigin-RevId: 210159488
Change-Id: I228174dbebc4c5356599036d6efaa59f28ff28d2
|
|
PiperOrigin-RevId: 210131001
Change-Id: I285707c5143b3e4c9a6948c1d1a452b6f16e65b7
|
|
This is used when '--overlay=true' to guarantee writes are not sent to gofer.
PiperOrigin-RevId: 210116288
Change-Id: I7616008c4c0e8d3668e07a205207f46e2144bf30
|
|
PiperOrigin-RevId: 210021612
Change-Id: If7c161e6fd08cf17942bfb6bc5a8d2c4e271c61e
|
|
Otherwise the socket saving logic might find workers still running for closed
sockets unexpectedly.
PiperOrigin-RevId: 210018905
Change-Id: I443a04d355613f5f9983252cc6863bff6e0eda3a
|
|
PiperOrigin-RevId: 209994384
Change-Id: I16186cf79cb4760a134f3968db30c168a5f4340e
|
|
Removed syscalls that are only used by whitelistfs
which has its own set of filters.
PiperOrigin-RevId: 209967259
Change-Id: Idb2e1b9d0201043d7cd25d96894f354729dbd089
|
|
PiperOrigin-RevId: 209943212
Change-Id: I96dcbc7c2ab2426e510b94a564436505256c5c79
|
|
The bug was caused by os.File's finalizer, which closes the file. Because
fsgofer.serve() was passed a file descriptor as an int rather than a os.File,
callers would pass os.File.Fd(), and the os.File would go out of scope. Thus,
the file would get GC'd and finalized nondeterministically, causing failures
when the file was used.
PiperOrigin-RevId: 209861834
Change-Id: Idf24d5c1f04c9b28659e62c97202ab3b4d72e994
|
|
This improves debugging for pagetable-related issues.
PiperOrigin-RevId: 209827795
Change-Id: I4cfa11664b0b52f26f6bc90a14c5bb106f01e038
|
|
PiperOrigin-RevId: 209819644
Change-Id: I329d054bf8f4999e7db0dcd95b13f7793c65d4e2
|
|
PiperOrigin-RevId: 209817767
Change-Id: Iddf2b8441bc44f31f9a8cf6f2bd8e7a5b824b487
|
|
Linux will ALWAYS add AT_BASE even for a static binary, expect it
will be set to 0 [1].
1. https://github.com/torvalds/linux/blob/master/fs/binfmt_elf.c#L253
PiperOrigin-RevId: 209811129
Change-Id: I92cc66532f23d40f24414a921c030bd3481e12a0
|
|
PiperOrigin-RevId: 209788842
Change-Id: I70ecb58009777ce8f642f246bc161af1a0bf2628
|
|
As required by the contract in Dirent.flush().
Also inline Dirent.freeze() into Dirent.Freeze(), since it is only called from
there.
PiperOrigin-RevId: 209783626
Change-Id: Ie6de4533d93dd299ffa01dabfa257c9cc259b1f4
|
|
See https://github.com/google/gvisor/issues/88
PiperOrigin-RevId: 209780532
Change-Id: Iff8004474020511503a0a5cd2cdba2b512c327ef
|
|
UDS has a lower size limit than regular files. When running under bazel
this limit is exceeded. Test was changed to always mount /tmp and use
it for the test.
PiperOrigin-RevId: 209717830
Change-Id: I1dbe19fe2051ffdddbaa32b188a9167f446ed193
|
|
When an inode file state failed to load asynchronuously, we want to report
the error instead of potentially panicing in another async loading goroutine
incorrectly unblocked.
PiperOrigin-RevId: 209683977
Change-Id: I591cde97710bbe3cdc53717ee58f1d28bbda9261
|
|
A new optimization in Go 1.11 improves the efficiency of slice extension:
"The compiler now optimizes slice extension of the form append(s, make([]T, n)...)."
https://tip.golang.org/doc/go1.11#performance-compiler
Before:
BenchmarkMarshalUnmarshal-12 2000000 664 ns/op 0 B/op 0 allocs/op
BenchmarkReadWrite-12 500000 2395 ns/op 304 B/op 24 allocs/op
After:
BenchmarkMarshalUnmarshal-12 2000000 628 ns/op 0 B/op 0 allocs/op
BenchmarkReadWrite-12 500000 2411 ns/op 304 B/op 24 allocs/op
BenchmarkMarshalUnmarshal benchmarks the code in this package, BenchmarkReadWrite benchmarks the code in the standard library.
PiperOrigin-RevId: 209679979
Change-Id: I51c6302e53f60bf79f84576b1ead4d36658897cb
|
|
PiperOrigin-RevId: 209679235
Change-Id: I527e779eeb113d0c162f5e27a2841b9486f0e39f
|
|
PiperOrigin-RevId: 209670528
Change-Id: I2890bcdef36f0b5f24b372b42cf628b38dd5764e
|
|
Not sure why, just removed for now to unblock the tests.
PiperOrigin-RevId: 209661403
Change-Id: I72785c071687d54e22bda9073d36b447d52a7018
|
|
PiperOrigin-RevId: 209655274
Change-Id: Id381114bdb3197c73e14f74b3f6cf1afd87d60cb
|
|
The previous use of non-blocking writes could result in corrupt PCAP files if a
partial write occurs. Using (*os.File).Write solves this problem by not
allowing partial writes. This change does not increase allocations (in one path
it actually reduces them), but does add additional copying.
PiperOrigin-RevId: 209652974
Change-Id: I4b1cf2eda4cfd7f237a4245aceb7391b3055a66c
|
|
PiperOrigin-RevId: 209647293
Change-Id: I980fca1257ea3fcce796388a049c353b0303a8a5
|
|
PiperOrigin-RevId: 209627180
Change-Id: Idc84afd38003427e411df6e75abfabd9174174e1
|
|
* Don't truncate abstract addresses at second null.
* Properly handle abstract addresses with length < 108 bytes.
PiperOrigin-RevId: 209502703
Change-Id: I49053f2d18b5a78208c3f640c27dbbdaece4f1a9
|
|
It was returning DT_UNKNOWN, and this was breaking numpy.
PiperOrigin-RevId: 209459351
Change-Id: Ic6f548e23aa9c551b2032b92636cb5f0df9ccbd4
|
|
Tests get a readonly rootfs mapped to / (which was the case before)
and writable TEST_TMPDIR. This makes it easier to setup containers to
write to files and to share state between test and containers.
PiperOrigin-RevId: 209453224
Change-Id: I4d988e45dc0909a0450a3bb882fe280cf9c24334
|
|
Numpy needs these.
Also added the "present" directory, since the contents are the same as possible
and online.
PiperOrigin-RevId: 209451777
Change-Id: I2048de3f57bf1c57e9b5421d607ca89c2a173684
|
|
The ones using 'kvm' actually mean that they don't want overlay.
PiperOrigin-RevId: 209194318
Change-Id: I941a443cb6d783e2c80cf66eb8d8630bcacdb574
|
|
Some linux commands depend on /sys/devices/system/cpu/possible, such
as 'lscpu'.
Add 2 knobs for cpu:
/sys/devices/system/cpu/possible
/sys/devices/system/cpu/online
Both the values are '0 - Kernel.ApplicationCores()-1'.
Change-Id: Iabd8a4e559cbb630ed249686b92c22b4e7120663
PiperOrigin-RevId: 209070163
|
|
PiperOrigin-RevId: 209060862
Change-Id: I2cd02f0032b80d0087110095548b1a8ffa696ac2
|
|
Bazel adds the build type in front of directories making it hard to
refer to binaries in code.
PiperOrigin-RevId: 209010854
Change-Id: I6c9da1ac3bbe79766868a3b14222dd42d03b4ec5
|
|
PiperOrigin-RevId: 208908702
Change-Id: I6be9c765c257a9ddb1a965a03942ab3fc3a34a43
|
|
When multiple containers run inside a sentry, each container has its own root
filesystem and set of mounts. Containers are also added after sentry boot rather
than all configured and known at boot time.
The fsgofer needs to be able to serve the root filesystem of each container.
Thus, it must be possible to add filesystems after the fsgofer has already
started.
This change:
* Creates a URPC endpoint within the gofer process that listens for requests to
serve new content.
* Enables the sentry, when starting a new container, to add the new container's
filesystem.
* Mounts those new filesystems at separate roots within the sentry.
PiperOrigin-RevId: 208903248
Change-Id: Ifa91ec9c8caf5f2f0a9eead83c4a57090ce92068
|
|
This file access type is actually called "proxy-shared", but I forgot to update
all locations.
PiperOrigin-RevId: 208832491
Change-Id: I7848bc4ec2478f86cf2de1dcd1bfb5264c6276de
|
|
PiperOrigin-RevId: 208755352
Change-Id: Ia24630f452a4a42940ab73a8113a2fd5ea2cfca2
|
|
Previously, gofer filesystems were configured with the default "fscache"
policy, which caches filesystem metadata and contents aggressively. While this
setting is best for performance, it means that changes from inside the sandbox
may not be immediately propagated outside the sandbox, and vice-versa.
This CL changes volumes and the root fs configuration to use a new
"remote-revalidate" cache policy which tries to retain as much caching as
possible while still making fs changes visible across the sandbox boundary.
This cache policy is enabled by default for the root filesystem. The default
value for the "--file-access" flag is still "proxy", but the behavior is
changed to use the new cache policy.
A new value for the "--file-access" flag is added, called "proxy-exclusive",
which turns on the previous aggressive caching behavior. As the name implies,
this flag should be used when the sandbox has "exclusive" access to the
filesystem.
All volume mounts are configured to use the new cache policy, since it is
safest and most likely to be correct. There is not currently a way to change
this behavior, but it's possible to add such a mechanism in the future. The
configurability is a smaller issue for volumes, since most of the expensive
application fs operations (walking + stating files) will likely served by the
root fs.
PiperOrigin-RevId: 208735037
Change-Id: Ife048fab1948205f6665df8563434dbc6ca8cfc9
|
|
Now, there's a waiter for each end (master and slave) of the TTY, and each
waiter.Entry is only enqueued in one of the waiters.
PiperOrigin-RevId: 208734483
Change-Id: I06996148f123075f8dd48cde5a553e2be74c6dce
|
|
stat()-ing /proc/PID/fd/FD incremented but didn't decrement the refcount for
FD. This behavior wasn't usually noticeable, but in the above case:
- ls would never decrement the refcount of the write end of the pipe to 0.
- This caused the write end of the pipe never to close.
- wc would then hang read()-ing from the pipe.
PiperOrigin-RevId: 208728817
Change-Id: I4fca1ba5ca24e4108915a1d30b41dc63da40604d
|
|
PiperOrigin-RevId: 208720936
Change-Id: Ic943a88b6efeff49574306d4d4e1f113116ae32e
|