gvisor - Container Runtime Sandbox

Age	Commit message	Author
2021-09-20	[lisa] Plumb lisafs through runsc.	Ayush Ranjan
	lisafs is only supported in VFS2. Added a runsc flag which enables lisafs. When the flag is enabled, the gofer process and the client communicate using the lisafs protocol instead of 9P. Added a filesystem option in fsimpl/gofer which indicates whether lisafs is being used. That will be used to gate lisafs on the gofer client. Note that this change does not make the gofer client use lisafs just yet. Updates #5465 PiperOrigin-RevId: 397917844
2021-09-16	Merge pull request #6579 from prattmic:runsc_do_profile	gVisor bot
	PiperOrigin-RevId: 397114051
2021-09-16	runsc: add global profile collection flags	Michael Pratt
	Add global flags -profile-{block,cpu,heap,mutex} and -trace which enable collection of the specified profile for the entire duration of a container execution. This provides a way to definitively start profiling before the application starts, rather than attempting to race with an out-of-band `runsc debug`. Note that only the main boot process is profiled. This exposed a bug in Task.traceExecEvent: a crash when tracing and -race are enabled. traceExecEvent is called off of the task goroutine, but uses the Task as a context, which is a violation of the Task contract. Switching to the AsyncContext fixes the issue. Fixes #220
2021-09-15	Merge pull request #6581 from prattmic:runsc_rootless	gVisor bot
	PiperOrigin-RevId: 396938550
2021-09-14	Remove extra newline	Michael Pratt
	PiperOrigin-RevId: 396754242
2021-09-14	runsc: allow rootless mode for runsc run	Michael Pratt
	Rootless mode seems to work fine for simple containers with runsc run, so allow its use. Since runsc run is more widely used, require that a workable --network option be passed rather than automatically switching as runsc do does. Fixes #3036
2021-09-13	runsc/cmd: alphabetize runsc debug profiling options	Michael Pratt
	Updates #220
2021-08-13	Add Event controls	Chong Cai
	Add Event controls and implement "stream" commands. PiperOrigin-RevId: 390691702
2021-08-12	Add Usage controls	Chong Cai
	Add Usage controls and implement "usage/usagefd" commands. PiperOrigin-RevId: 390507423
2021-08-12	Clear Merkle files before measuring verity fs	Chong Cai
	PiperOrigin-RevId: 390467957
2021-08-06	[SMT] Refactor runsc mitigate	Zach Koopmans
	Refactor mitigate to use /sys/devices/system/cpu/smt/control instead of individual CPU control files. PiperOrigin-RevId: 389215975
2021-08-04	Add Fs controls	Chong Cai
	Add Fs controls and implement "cat" command. PiperOrigin-RevId: 388812540
2021-07-26	Merge pull request #6292 from btw616:local-timezone	gVisor bot
	PiperOrigin-RevId: 386988406
2021-07-20	Add go:build directives as required by Go 1.17's gofmt.	Jamie Liu
	PiperOrigin-RevId: 385894869
2021-07-13	Replace whitelist with allowlist	Fabricio Voznika
	PiperOrigin-RevId: 384586164
2021-07-12	Fix stdios ownership	Fabricio Voznika
	Set stdio ownership based on the container's user to ensure the user can open/read/write to/from stdios.
	1. stdios in the host are changed to have the owner be the same uid/gid as the process running the sandbox. This ensures that the sandbox has full control over them.
	2. The stdios' owner inside the sandbox is changed to match the container's user, to give access inside the container and make it behave the same as runc.
	Fixes #6180 PiperOrigin-RevId: 384347009
2021-07-12	Fix GoLand analyzer errors under runsc/...	Fabricio Voznika
	PiperOrigin-RevId: 384344990
2021-07-09	runsc: fix the local timezone support in logs	Tiwei Bie
	This patch fixes the local timezone support in logs by creating etc/localtime in the rootfs of the sandbox process and the gofer process, based on the current /etc/localtime on the host. Before this patch, the timestamps in sandbox and gofer logs fell back to the UTC timezone after execve'ing "/proc/self/exe", which is inconvenient for users analyzing the logs:
	I0708 15:37:43.825100 1 chroot.go:69] Setting up sandbox chroot in "/tmp"
	I0708 15:37:43.825189 1 chroot.go:31] Mounting "proc" at "/tmp/proc"
	......
	I0708 15:37:43.850926 1 cmd.go:73] Execve "/proc/self/exe" again, bye!
	I0708 07:37:43.856719 1 main.go:218] ***************************
	I0708 07:37:43.856751 1 main.go:219] Args: [runsc-sandbox --root=/run/...]
	I0708 07:37:43.856785 1 main.go:220] Version release-20210628.0-27-g02fec8dba5a6
	I0708 07:37:43.856795 1 main.go:221] GOOS: linux
	I0708 07:37:43.856803 1 main.go:222] GOARCH: amd64
	......
	Fixes #1984 Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2021-07-09	runsc: check the error when preparing tree for pivot_root	Tiwei Bie
	Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2021-07-08	clarify safemount behavior	Kevin Krakauer
	PiperOrigin-RevId: 383750666
2021-07-08	Replace kernel.ExitStatus with linux.WaitStatus.	Jamie Liu
	PiperOrigin-RevId: 383705129
2021-07-02	runsc: validate mount targets	Kevin Krakauer
	PiperOrigin-RevId: 382845950
2021-07-01	Mix checklocks and atomic analyzers.	Adin Scannell
	This change makes the checklocks analyzer considerably more powerful, adding:
	* The ability to traverse complex structures, e.g. to have multiple nested fields as part of the annotation.
	* The ability to resolve simple anonymous functions and closures, and perform lock analysis across these invocations. This does not apply to closures that are passed elsewhere, since it is not possible to know the context in which they might be invoked.
	* The ability to annotate return values in addition to receivers and other parameters, with the same complex structures noted above.
	* Ignoring locking semantics for "fresh" objects, i.e. objects that are allocated in the local frame (typically a new-style function).
	* Sanity checking of locking state across block transitions and returns, to ensure that no unexpected locks are held.
	Note that initially, most of these findings are excluded by a comprehensive nogo.yaml. The findings that are included are fundamental lock violations. The changes here should be relatively low risk: minor refactorings to either include necessary annotations or simplify the code structure (in general removing closures in favor of methods) so that the analyzer can easily track the lock state.
	This change additionally includes two changes to nogo itself:
	* Sanity checking of all types to ensure that the binary and ast-derived types have a consistent objectpath, to prevent the bug above from occurring silently (and causing much confusion). This also requires a trick in order to ensure that serialized facts are consumable downstream. This can be removed with https://go-review.googlesource.com/c/tools/+/331789 merged.
	* A minor refactoring to isolate the objdump settings in its own package. This was originally used to implement the sanity check above, but this information is now being passed another way. The minor refactor is preserved however, since it cleans up the code slightly and is minimal risk.
	PiperOrigin-RevId: 382613300
2021-06-09	Remove --overlayfs-stale-read flag	Fabricio Voznika
	It defaults to true and setting it to false can cause filesystem corruption. PiperOrigin-RevId: 378518663
2021-05-10	Merge pull request #5764 from zhlhahaha:2126-2	gVisor bot
	PiperOrigin-RevId: 372993341
2021-04-16	Allow runsc to generate coverage reports.	Dean Deng
	Add a coverage-report flag that will cause the sandbox to generate a coverage report (with suffix .cov) in the debug log directory upon exiting. For the report to be generated, runsc must have been built with the following Bazel flags: `--collect_code_coverage --instrumentation_filter=...`.
	With coverage reports, we should be able to aggregate results across all tests to surface code coverage statistics for the project as a whole.
	The report is simply a text file with each line representing a covered block as `file:start_line.start_col,end_line.end_col`. Note that this is similar to the format of coverage reports generated with `go test -coverprofile`, although we omit the count and number of statements, which are not useful for us.
	Some simple ways of getting coverage reports:
	bazel test <some_test> --collect_code_coverage \
	  --instrumentation_filter=//pkg/...
	bazel build //runsc --collect_code_coverage \
	  --instrumentation_filter=//pkg/...
	runsc -coverage-report=dir/ <other_flags> do ...
	PiperOrigin-RevId: 368952911
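The report line format described above, `file:start_line.start_col,end_line.end_col`, is simple enough to parse by hand. The following is a minimal sketch of such a parser, not code from gVisor: the `Block` type, the `parseBlock` and `parsePos` helpers, and the sample file path are all hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Block is one covered block taken from a report line of the form
// file:start_line.start_col,end_line.end_col.
type Block struct {
	File      string
	StartLine int
	StartCol  int
	EndLine   int
	EndCol    int
}

// parsePos parses a "line.col" pair.
func parsePos(s string) (int, int, error) {
	parts := strings.SplitN(s, ".", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("malformed position %q", s)
	}
	line, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, err
	}
	col, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, err
	}
	return line, col, nil
}

// parseBlock parses one report line. The file name is everything before the
// last ':' so that the range is always taken from the end of the line.
func parseBlock(line string) (Block, error) {
	i := strings.LastIndex(line, ":")
	if i < 0 {
		return Block{}, fmt.Errorf("missing ':' in %q", line)
	}
	b := Block{File: line[:i]}
	ends := strings.SplitN(line[i+1:], ",", 2)
	if len(ends) != 2 {
		return Block{}, fmt.Errorf("missing ',' in %q", line)
	}
	var err error
	if b.StartLine, b.StartCol, err = parsePos(ends[0]); err != nil {
		return Block{}, err
	}
	if b.EndLine, b.EndCol, err = parsePos(ends[1]); err != nil {
		return Block{}, err
	}
	return b, nil
}

func main() {
	// Hypothetical report line, made up for illustration.
	b, err := parseBlock("pkg/example/file.go:10.2,12.16")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", b)
}
```

Aggregating results across tests, as the commit suggests, would then amount to collecting parsed blocks from many report files into a set.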
2021-04-16	Internal change	Zach Koopmans
	PiperOrigin-RevId: 368919504
2021-04-05	Set Verity bit in verity_prepare cmd	Chong Cai
	This is needed to enable Xattrs features required by verity. PiperOrigin-RevId: 366843640
2021-04-02	Implement the runsc verity-prepare command.	Rahat Mahmood
	Implement a new runsc command to set up a sandbox with verityfs and run the measure tool. This is loosely forked from the do command, and currently requires the caller to provide the measure tool binary. PiperOrigin-RevId: 366553769
2021-04-01	Disable mitigate and related test on ARM64	Howard Zhang
	As the MDS side-channel attacks do not affect ARM64, we disable mitigate on ARM64 to prevent misuse. For more detail, please refer to: https://access.redhat.com/security/vulnerabilities/mds Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2021-03-30	Fix panic when overriding /dev files with VFS2	Fabricio Voznika
	VFS1 skips over mounts that override files in /dev because the list of files is hardcoded. This is not needed for VFS2 and a recent change lifted this restriction. However, parts of the code were still skipping /dev mounts even in VFS2, causing the loader to panic when it ran short of FDs to connect to the gofer. PiperOrigin-RevId: 365858436
2021-03-23	setgid directory support in goferfs	Kevin Krakauer
	Also adds support for clearing the setuid bit when appropriate (writing, truncating, changing size, changing UID, or changing GID). VFS2 only. PiperOrigin-RevId: 364661835
2021-03-23	Allow FSETXATTR/FGETXATTR host calls for Verity	Chong Cai
	These host calls are needed for Verity fs to generate/verify hashes. PiperOrigin-RevId: 364598180
2021-03-18	Skip /dev submount hack on VFS2.	Jamie Liu
	containerd usually configures both /dev and /dev/shm as tmpfs mounts, e.g.:
	```
	"mounts": [
	  ...
	  {
	    "destination": "/dev",
	    "type": "tmpfs",
	    "source": "/run/containerd/io.containerd.runtime.v2.task/moby/10eedbd6a0e7937ddfcab90f2c25bd9a9968b734c4ae361318142165d445e67e/tmpfs",
	    "options": [
	      "nosuid",
	      "strictatime",
	      "mode=755",
	      "size=65536k"
	    ]
	  },
	  ...
	  {
	    "destination": "/dev/shm",
	    "type": "tmpfs",
	    "source": "/run/containerd/io.containerd.runtime.v2.task/moby/10eedbd6a0e7937ddfcab90f2c25bd9a9968b734c4ae361318142165d445e67e/shm",
	    "options": [
	      "nosuid",
	      "noexec",
	      "nodev",
	      "mode=1777",
	      "size=67108864"
	    ]
	  },
	  ...
	```
	(This is mostly consistent with how Linux is usually configured, except that /dev is conventionally devtmpfs, not regular tmpfs. runc/libcontainer implements OCI-runtime-spec-undocumented behavior to create /dev/{ptmx,fd,stdin,stdout,stderr} in non-bind /dev mounts. runsc silently switches /dev to devtmpfs. In VFS1, this is necessary to get device files like /dev/null at all, since VFS1 doesn't support real device special files, only what is hardcoded in devfs. VFS2 does support device special files, but using devtmpfs is the easiest way to get pre-created files in /dev.)
	runsc ignores many /dev submounts in the spec, including /dev/shm. In VFS1, this appears to be to avoid introducing a submount overlay for /dev, and is mostly fine since the typical mode for the /dev/shm mount is ~consistent with the mode of the /dev/shm directory provided by devfs (modulo the sticky bit). In VFS2, this is vestigial (VFS2 does not use submount overlays), and devtmpfs' /dev/shm mode is correct for the mount point but not the mount. So turn off this behavior for VFS2.
	After this change:
	```
	$ docker run --rm -it ubuntu:focal ls -lah /dev/shm
	total 0
	drwxrwxrwt 2 root root 40 Mar 18 00:16 .
	drwxr-xr-x 5 root root 360 Mar 18 00:16 ..
	$ docker run --runtime=runsc --rm -it ubuntu:focal ls -lah /dev/shm
	total 0
	drwxrwxrwx 1 root root 0 Mar 18 00:16 .
	dr-xr-xr-x 1 root root 0 Mar 18 00:16 ..
	$ docker run --runtime=runsc-vfs2 --rm -it ubuntu:focal ls -lah /dev/shm
	total 0
	drwxrwxrwt 2 root root 40 Mar 18 00:16 .
	drwxr-xr-x 5 root root 320 Mar 18 00:16 ..
	```
	Fixes #5687 PiperOrigin-RevId: 363699385
2021-03-11	Major refactor of runsc mitigate.	Zach Koopmans
	PiperOrigin-RevId: 362360425
2021-03-08	Internal change.	Chong Cai
	PiperOrigin-RevId: 361689477
2021-03-06	[op] Replace syscall package usage with golang.org/x/sys/unix in runsc/.	Ayush Ranjan
	The syscall package has been deprecated in favor of golang.org/x/sys. Note that syscall is still used in some places because the following don't seem to have an equivalent in the unix package:
	- syscall.SysProcIDMap
	- syscall.Credential
	Updates #214 PiperOrigin-RevId: 361381490
2021-03-02	Add reverse flag to mitigate.	Zach Koopmans
	Add reverse operation to mitigate that just enables all CPUs. PiperOrigin-RevId: 360511215
2021-02-22	Fix `runsc kill --pid`	Fabricio Voznika
	Previously, loader.signalProcess was inconsistently using both the root and the container's PID namespace to find the process. It used the root namespace for the exec'd process and the container's PID namespace for other processes. This fixes the code to use the root PID namespace across the board, which is the same PID reported in `runsc ps` (or soon will be after https://github.com/google/gvisor/pull/5519). PiperOrigin-RevId: 358836297
2021-02-10	Add mitigate command to runsc	Zach Koopmans
	PiperOrigin-RevId: 356772367
2021-02-02	Stub out basic `runsc events --stat` CPU functionality	Kevin Krakauer
	Because we lack gVisor-internal cgroups, we take the CPU usage of the entire pod and divide it proportionally according to sentry-internal usage stats. This fixes `kubectl top pods`, which gets a pod's CPU usage by summing the usage of its containers. Addresses #172. PiperOrigin-RevId: 355229833
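The proportional division described above can be sketched in a few lines of Go. This is an illustrative example only, not the gVisor code: `splitCPU`, its parameter names, and the container IDs are hypothetical, and units are left abstract.

```go
package main

import "fmt"

// splitCPU (hypothetical helper) divides the pod's total CPU usage among
// containers in proportion to each container's internally tracked usage.
func splitCPU(podTotal uint64, internal map[string]uint64) map[string]uint64 {
	var sum uint64
	for _, u := range internal {
		sum += u
	}
	out := make(map[string]uint64, len(internal))
	for id, u := range internal {
		if sum == 0 {
			// No internal usage recorded: nothing to attribute.
			out[id] = 0
			continue
		}
		// Floating point avoids overflow in podTotal*u for large values,
		// at the cost of rounding down to a whole unit.
		out[id] = uint64(float64(podTotal) * float64(u) / float64(sum))
	}
	return out
}

func main() {
	// Hypothetical numbers: the pod used 1000 units in total, and the
	// internal stats attribute usage 1:3 between two containers.
	shares := splitCPU(1000, map[string]uint64{"a": 1, "b": 3})
	fmt.Println(shares["a"], shares["b"])
}
```

Because each share is a fraction of the pod total, summing the per-container values gives back approximately the pod's usage, which is what a consumer that sums container usage expects.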
2021-01-13	Switch uses of os.Getenv that check for empty string to os.LookupEnv.	Dean Deng
	Whether the variable was found is already returned by syscall.Getenv. os.Getenv drops this value while os.LookupEnv passes it along. PiperOrigin-RevId: 351674032
2021-01-12	Fix simple mistakes identified by goreportcard.	Adin Scannell
	These are primarily simplification and lint mistakes. However, minor fixes are also included and tests added where appropriate. PiperOrigin-RevId: 351425971
2021-01-11	OCI spec may contain duplicate environment variables	Fabricio Voznika
	Closes #5226 PiperOrigin-RevId: 351259576
2021-01-05	Add benchmarks targets to BuildKite.	Adin Scannell
	This includes minor fix-ups:
	* Handle SIGTERM in runsc debug, to exit gracefully.
	* Fix cmd.debug.go opening all profiles as RDONLY.
	* Fix the test name in fio_test.go, and encode the block size in the test.
	PiperOrigin-RevId: 350205718
2020-12-30	Fix condition checking in `runsc debug`	Fabricio Voznika
	Closes #5052 PiperOrigin-RevId: 349579814
2020-12-29	Make profiling commands synchronous.	Adin Scannell
	This allows for a model of profiling where you can start collection, and it will terminate when the sandbox terminates. Without this synchronous call, it is effectively impossible to collect lengthy blocking and mutex profiles. PiperOrigin-RevId: 349483418
2020-12-17	Add sandbox ID to state file name	Fabricio Voznika
	This allows finding all containers inside a sandbox more efficiently. This operation is required every time a container starts and stops, and previously required loading all container state files to check whether the container belonged to the sandbox. Apart from being inefficient, it has caused problems when state files are stale or corrupt, causing an inability to create any container. Also adjust commands `list` and `debug` to skip over files that fail to load. Resolves #5052 PiperOrigin-RevId: 348050637
2020-12-11	Add runsc symbolize command.	Dean Deng
	This command takes instruction pointers from stdin and converts them into their corresponding file names and line/column numbers in the runsc source code. The inputs are not interpreted as actual addresses, but as synthetic values that are exposed through /sys/kernel/debug/kcov. One can extract coverage information from kcov and translate those values into locations in the source code by running symbolize on the same runsc binary. This will allow us to generate syzkaller coverage reports. PiperOrigin-RevId: 347089624
2020-12-03	Surface usage message for `runsc do`.	Dean Deng
	c.Usage() only returns a string; f.Usage() will print the usage message. PiperOrigin-RevId: 345500123