Age | Commit message (Collapse) | Author |
|
Sandbox root dir was not being saved with the Container state,
so it would point to the wrong directory location when attempting
to lock the sandbox. This led to race conditions saving and
loading container state. Fixing it, led to multiple deadlocks.
I've moved the saving and locking logic to a separate struct and
moved the lock file inside the RootDir (instead of container
root dir), which allows the lock to be taken inside Destroy,
and removes the need to lock the sandbox.
PiperOrigin-RevId: 277599612
|
|
container.startContainers() cannot be called twice in a test
(e.g. TestMultiContainerLoadSandbox) because the cleanup
function deletes the rootDir, together with information from
all other containers that may exist.
PiperOrigin-RevId: 276591806
|
|
Obligatory https://xkcd.com/927
Fixes #626
|
|
Linux kernel before 4.19 doesn't implement a feature that updates
open FD after a file is open for write (and is copied to the upper
layer). Already open FD will continue to read the old file content
until they are reopened. This is especially problematic for gVisor
because it caches open files.
Flag was added to force readonly files to be reopenned when the
same file is open for write. This is only needed if using kernels
prior to 4.19.
Closes #1006
It's difficult to really test this because we never run on tests
on older kernels. I'm adding a test in GKE which uses kernels
with the overlayfs problem for 1.14 and lower.
PiperOrigin-RevId: 275115289
|
|
Options that do not change mount behavior inside the Sentry are
irrelevant and should not be used when looking for possible
incompatibilities between master and slave mounts.
PiperOrigin-RevId: 273593486
|
|
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out from when copying container's
capabilities.
PiperOrigin-RevId: 272260451
|
|
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker
PiperOrigin-RevId: 269337274
|
|
This is done because the root container for CRI is the infrastructure (pause)
container and always gets a low oom_score_adj. We do this to ensure that only
the oom_score_adj of user containers is used to calculated the sandbox
oom_score_adj.
Implemented in runsc rather than the containerd shim as it's a bit cleaner to
implement here (in the shim it would require overwriting the oomScoreAdj and
re-writing out the config.json again). This processing is Kubernetes(CRI)
specific but we are currently only supporting CRI for multi-container support
anyway.
PiperOrigin-RevId: 267507706
|
|
Some processes are reparented to the root container depending
on the kill order and the root container would not reap in time.
So some zombie processes were still present when the test checked.
Fix it by running the second container inside a PID namespace.
PiperOrigin-RevId: 267278591
|
|
The simple test script has gotten out of control. Shard this script into
different pieces and attempt to impose order on overall test structure. This
change helps lay some of the foundations for future improvements.
* The runsc/test directories are moved into just test/.
* The runsc/test/testutil package is split into logical pieces.
* The scripts/ directory contains new top-level targets.
* Each test is now responsible for building targets it requires.
* The install functionality is moved into `runsc` itself for simplicity.
* The existing kokoro run_tests.sh file now just calls all (can be split).
After this change is merged, I will create multiple distinct workflows for
Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today,
which should dramatically reduce the time-to-run for the Kokoro tests, and
provides a better foundation for further improvements to the infrastructure.
PiperOrigin-RevId: 267081397
|
|
This used to be the case, but regressed after a recent change.
Also made a few fixes around it and clean up the code a bit.
Closes #720
PiperOrigin-RevId: 265717496
|
|
Updates #512
PiperOrigin-RevId: 262195448
|
|
PiperOrigin-RevId: 262071646
|
|
Each gofer now has a goroutine that polls on the FDs used
to communicate with the sandbox. The respective gofer is
destroyed if any of the FDs is closed.
Closes #601
PiperOrigin-RevId: 261383725
|
|
Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle.
When new containers are added to the sandbox oom_score_adj for the sandbox and
all other gofers are adjusted so that oom_score_adj is equal to the lowest
oom_score_adj of all containers in the sandbox.
Fixes #512
PiperOrigin-RevId: 261242725
|
|
The different containers in a sandbox used only one pid
namespace before. This results in that a container can see
the processes in another container in the same sandbox.
This patch use different pid namespace for different containers.
Signed-off-by: chris.zn <chris.zn@antfin.com>
|
|
This keeps all container filesystem completely separate from eachother
(including from the root container filesystem), and allows us to get rid of the
"__runsc_containers__" directory.
It also simplifies container startup/teardown as we don't have to muck around
in the root container's filesystem.
PiperOrigin-RevId: 259613346
|
|
PiperOrigin-RevId: 257037608
|
|
PiperOrigin-RevId: 256494243
|
|
Addresses obvious typos, in the documentation only.
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
|
|
PiperOrigin-RevId: 253882115
|
|
There were 3 string arguments that could be easily misplaced
and it makes it easier to add new arguments, especially for
Container that has dozens of callers.
PiperOrigin-RevId: 253872074
|
|
This can be merged after:
https://github.com/google/gvisor-website/pull/77
or
https://github.com/google/gvisor-website/pull/78
PiperOrigin-RevId: 253132620
|
|
'--rootless' flag lets a non-root user execute 'runsc do'.
The drawback is that the sandbox and gofer processes will
run as root inside a user namespace that is mapped to the
caller's user, intead of nobody. And network is defaulted
to '--network=host' inside the root network namespace. On
the bright side, it's very convenient for testing:
runsc --rootless do ls
runsc --rootless do curl www.google.com
PiperOrigin-RevId: 252840970
|
|
Parse annotations containing 'gvisor.dev/spec/mount' that gives
hints about how mounts are shared between containers inside a
pod. This information can be used to better inform how to mount
these volumes inside gVisor. For example, a volume that is shared
between containers inside a pod can be bind mounted inside the
sandbox, instead of being two independent mounts.
For now, this information is used to allow the same tmpfs mounts
to be shared between containers which wasn't possible before.
PiperOrigin-RevId: 252704037
|
|
clearStatus was added to allow detached execution to wait
on the exec'd process and retrieve its exit status. However,
it's not currently used. Both docker and gvisor-containerd-shim
wait on the "shim" process and retrieve the exit status from
there. We could change gvisor-containerd-shim to use waits, but
it will end up also consuming a process for the wait, which is
similar to having the shim process.
Closes #234
PiperOrigin-RevId: 251349490
|
|
Change-Id: I02b30de13f1393df66edf8829fedbf32405d18f8
PiperOrigin-RevId: 246621192
|
|
Opensource tools (e. g. https://github.com/fatih/vim-go) can't hanlde more than
one golang package in one directory.
PiperOrigin-RevId: 246435962
Change-Id: I67487915e3838762424b2d168efc54ae34fb801f
|
|
Sandbox always runsc with IP 192.168.10.2 and the peer
network adds 1 to the address (192.168.10.3). Sandbox
IP can be changed using --ip flag.
Here a few examples:
sudo runsc do curl www.google.com
sudo runsc do --ip=10.10.10.2 bash -c "echo 123 | netcat -l -p 8080"
PiperOrigin-RevId: 246421277
Change-Id: I7b3dce4af46a57300350dab41cb27e04e4b6e9da
|
|
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.
1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.
Fixes #209
PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
|
|
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
|
|
Create, Start, and Destroy were racing to create and destroy the
metadata directory of containers.
This is a re-upload of
https://gvisor-review.googlesource.com/c/gvisor/+/16260, but with the
correct account.
Change-Id: I16b7a9d0971f0df873e7f4145e6ac8f72730a4f1
PiperOrigin-RevId: 244892991
|
|
Otherwise, we will not have capabilities in the user namespace.
And this patch adds the noexec option for mounts.
https://github.com/google/gvisor/issues/145
PiperOrigin-RevId: 242706519
Change-Id: I1b78b77d6969bd18038c71616e8eb7111b71207c
|
|
PiperOrigin-RevId: 241056805
Change-Id: I13ea8f5dbfb01ca02a3b0ab887b8c3bdf4d556a6
|
|
Properly handle propagation options for root and mounts. Now usage of
mount options shared, rshared, and noexec cause error to start. shared/
rshared breaks sandbox=>host isolation. slave however can be supported
because changes propagate from host to sandbox.
Root FS setup moved inside the gofer. Apart from simplifying the code,
it keeps all mounts inside the namespace. And they are torn down when
the namespace is destroyed (DestroyFS is no longer needed).
PiperOrigin-RevId: 239037661
Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0
|
|
This can happen when 'docker run --cgroup-parent=' flag is set.
PiperOrigin-RevId: 235645559
Change-Id: Ieea3ae66939abadab621053551bf7d62d412e7ee
|
|
PiperOrigin-RevId: 231864273
Change-Id: I8545b72b615f5c2945df374b801b80be64ec3e13
|
|
Nothing reads them and they can simply get stale.
Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD
PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
|
|
PiperOrigin-RevId: 231263114
Change-Id: I57467a34fe94e395fdd3685462c4fe9776d040a3
|
|
When file size changes outside the sandbox, page cache was not
refreshing file size which is required for cacheRemoteRevalidating.
In fact, cacheRemoteRevalidating should be skipping the cache
completely since it's not really benefiting from it. The cache is
cache is already bypassed for unstable attributes (see
cachePolicy.cacheUAttrs). And althought the cache is called to
map pages, they will always miss the cache and map directly from
the host.
Created a HostMappable struct that maps directly to the host and
use it for files with cacheRemoteRevalidating.
Closes #124
PiperOrigin-RevId: 230998440
Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc
|
|
In Container.Destroy(), we call c.stop() before calling
executeHooksBestEffort(), therefore, when we call
executeHooksBestEffort(c.Spec.Hooks.Poststop, c.State()) to execute
the poststop hook, it results in a nil pointer dereference since it
reads c.Sandbox.Pid in c.State() after the sandbox has been destroyed.
To fix this bug, we can change container's status to "stopped" before
executing the poststop hook.
Signed-off-by: ShiruRen <renshiru2000@gmail.com>
Change-Id: I4d835e430066fab7e599e188f945291adfc521ef
PiperOrigin-RevId: 230975505
|
|
Mounting lib and lib64 are not necessary anymore and simplifies the test.
PiperOrigin-RevId: 230971195
Change-Id: Ib91a3ffcec4b322cd3687c337eedbde9641685ed
|
|
PiperOrigin-RevId: 230437407
Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b
|
|
Removed "error" and "failed to" prefix that don't add value
from messages. Adjusted a few other messages. In particular,
when the container fail to start, the message returned is easier
for humans to read:
$ docker run --rm --runtime=runsc alpine foobar
docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory
Closes #77
PiperOrigin-RevId: 230022798
Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
|
|
In addition, it fixes a race condition in TestMultiContainerGoferStop.
There are two scripts copy the same set of files into the same directory
and sometime one of this command fails with EXIST.
PiperOrigin-RevId: 230011247
Change-Id: I9289f72e65dc407cdcd0e6cd632a509e01f43e9c
|
|
Runsc wants to mount /tmp using internal tmpfs implementation for
performance. However, it risks hiding files that may exist under
/tmp in case it's present in the container. Now, it only mounts
over /tmp iff:
- /tmp was not explicitly asked to be mounted
- /tmp is empty
If any of this is not true, then /tmp maps to the container's
image /tmp.
Note: checkpoint doesn't have sentry FS mounted to check if /tmp
is empty. It simply looks for explicit mounts right now.
PiperOrigin-RevId: 229607856
Change-Id: I10b6dae7ac157ef578efc4dfceb089f3b94cde06
|
|
PiperOrigin-RevId: 229438125
Change-Id: I58eb0d10178d1adfc709d7b859189d1acbcb2f22
|
|
And we need to wait a gofer process before cgroup.Uninstall,
because it is running in the sandbox cgroups.
PiperOrigin-RevId: 228904020
Change-Id: Iaf8826d5b9626db32d4057a1c505a8d7daaeb8f9
|
|
The original code assumed that it was safe to join and not restore cgroup,
but Container.Run will not exit after calling start, making cgroup cleanup
fail because there were still processes inside the cgroup.
PiperOrigin-RevId: 228529199
Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51
|
|
If the sandbox process is dead (because of a panic or some other problem),
container.Destroy will never remove the container metadata file, since it will
always fail when calling container.stop().
This CL changes container.Destroy() to always perform the three necessary
cleanup operations:
* Stop the sandbox and gofer processes.
* Remove the container fs on the host.
* Delete the container metadata directory.
Errors from these three operations will be concatenated and returned from
Destroy().
PiperOrigin-RevId: 225448164
Change-Id: I99c6311b2e4fe5f6e2ca991424edf1ebeae9df32
|