Age | Commit message | Author |
|
We accidentally set the wrong maximum. I've also added PATH_MAX and
NAME_MAX to the linux abi package.
PiperOrigin-RevId: 216221311
Change-Id: I44805fcf21508831809692184a0eba4cee469633
|
|
- Shared futex objects on shared mappings are represented by Mappable +
offset, analogous to Linux's use of inode + offset. Add type
futex.Key, and change the futex.Manager bucket API to use futex.Keys
instead of addresses.
- Extend the futex.Checker interface to be able to return Keys for
memory mappings. It returns Keys rather than just mappings because
whether the address or the target of the mapping is used in the Key
depends on whether the mapping is MAP_SHARED or MAP_PRIVATE; this
matters because using the mapping target for a futex on a MAP_PRIVATE
mapping causes it to stop working across COW-breaking.
- futex.Manager.WaitComplete depends on atomic updates to
futex.Waiter.addr to determine when it has locked the right bucket,
which is much less straightforward for struct futex.Waiter.key. Switch
to an atomically-accessed futex.Waiter.bucket pointer.
- futex.Manager.Wake now needs to take a futex.Checker to resolve
addresses for shared futexes. CLONE_CHILD_CLEARTID requires the exit
path to perform a shared futex wakeup (Linux:
kernel/fork.c:mm_release() => sys_futex(tsk->clear_child_tid,
FUTEX_WAKE, ...)). This is a problem because futexChecker is in the
syscalls/linux package. Move it to kernel.
PiperOrigin-RevId: 216207039
Change-Id: I708d68e2d1f47e526d9afd95e7fed410c84afccf
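The keying scheme described in this change can be illustrated with a minimal sketch. All names here (Mappable, Key, keyFor) are hypothetical and are not the actual futex package API; the sketch only assumes that a shared futex is identified by backing object plus offset, and a private one by its virtual address.

```go
package main

import "fmt"

// Mappable is a hypothetical stand-in for the object backing a mapping.
type Mappable interface{ Name() string }

type file string

func (f file) Name() string { return string(f) }

// Kind says how a futex key should be interpreted.
type Kind int

const (
	KindPrivate Kind = iota // keyed by virtual address
	KindShared              // keyed by Mappable + offset
)

// Key identifies a futex. It is comparable, so it can be used as a map key.
type Key struct {
	Kind     Kind
	Mappable Mappable // nil for private futexes
	Offset   uint64   // offset into Mappable (shared) or address (private)
}

// keyFor builds the key for addr inside a mapping that starts at mapStart
// and maps m beginning at mapOffset.
func keyFor(addr uint64, shared bool, m Mappable, mapStart, mapOffset uint64) Key {
	if shared {
		return Key{Kind: KindShared, Mappable: m, Offset: mapOffset + (addr - mapStart)}
	}
	// Private mappings use the address, so the key survives COW-breaking.
	return Key{Kind: KindPrivate, Offset: addr}
}

func main() {
	f := file("shm")
	// The same file offset mapped at two different addresses.
	a := keyFor(0x70000010, true, f, 0x70000000, 0)
	b := keyFor(0x90000010, true, f, 0x90000000, 0)
	fmt.Println(a == b)
	// The same two addresses in private mappings give distinct keys.
	c := keyFor(0x70000010, false, f, 0x70000000, 0)
	d := keyFor(0x90000010, false, f, 0x90000000, 0)
	fmt.Println(c == d)
}
```

With this shape, two processes that map the same object at different addresses reach the same bucket entry for a shared futex, while private futexes stay tied to the address.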
|
|
Docker and Containerd both eat the boot process's stderr, making it difficult
to track down panics (which are always written to stderr).
This CL makes the boot process dup its debug log FD to stderr, so that panics
will be captured in the debug log, which is better than nothing.
This is the 3rd try at this CL. Previous attempts were foiled because Docker
expects the 'create' command to pass its stdio directly to the container, so
duping stderr in 'create' caused the application's stderr to go to the log file,
which breaks many applications (including our mysql test).
I added a new image_test that makes sure stdout and stderr are handled
correctly.
PiperOrigin-RevId: 215767328
Change-Id: Icebac5a5dcf39b623b79d7a0e2f968e059130059
|
|
Sandbox was setting chroot, but was not changing the working
dir. Added a test to ensure this doesn't happen in the future.
PiperOrigin-RevId: 215676270
Change-Id: I14352d3de64a4dcb90e50948119dc8328c9c15e1
|
|
PiperOrigin-RevId: 215674589
Change-Id: I4f8871b64c570dc6da448d2fe351cec8a406efeb
|
|
PiperOrigin-RevId: 215664253
Change-Id: Ice2500e669194630c9d03903c35622afb92dcba5
|
|
PiperOrigin-RevId: 215658757
Change-Id: If63b33293f3e53a7f607ae72daa79e2b7ef6fcfd
|
|
PiperOrigin-RevId: 215655197
Change-Id: I668b1bc7c29daaf2999f8f759138bcbb09c4de6f
|
|
PiperOrigin-RevId: 215633475
Change-Id: I7bc471e3b9a2c725fb5e15b3bbcba2ee1ea574b1
|
|
PiperOrigin-RevId: 215620949
Change-Id: I519da4b44386d950443e5784fb8c48ff9a36c5d3
|
|
This can happen if an error is encountered during Create() which causes the
container to be destroyed and set to state Stopped.
Without this transition, errors during Create get hidden by the later panic.
PiperOrigin-RevId: 215599193
Change-Id: Icd3f42e12c685cbf042f46b3929bccdf30ad55b0
|
|
We add an additional (2^3)-1=7 processes, but the code was only waiting for 3.
I switched back to the math.Pow format to make the arithmetic easier to inspect.
PiperOrigin-RevId: 215588140
Change-Id: Iccad4d6f977c1bfc5c4b08d3493afe553fe25733
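The arithmetic in this change can be checked with a small sketch. The function name additionalProcesses is hypothetical, and the sketch assumes the count comes from every existing process forking once per round, which doubles the total each round.

```go
package main

import (
	"fmt"
	"math"
)

// additionalProcesses returns how many new processes exist after `rounds`
// rounds in which every existing process forks once: 2^rounds total, minus
// the original process.
func additionalProcesses(rounds int) int {
	return int(math.Pow(2, float64(rounds))) - 1
}

func main() {
	fmt.Println(additionalProcesses(3)) // prints 7
}
```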
|
|
Docker and containerd do not expose runsc's stderr, so tracking down sentry
panics can be painful.
If we have a debug log file, we should send panics (and all stderr data) to the
log file.
PiperOrigin-RevId: 215585559
Change-Id: I3844259ed0cd26e26422bcdb40dded302740b8b6
|
|
We were previously using the sandbox process's stdio as the root container's
stdio. This makes it difficult/impossible to distinguish application
output from sandbox output, such as panics, which are always written to stderr.
Also close the console socket when we are done with it.
PiperOrigin-RevId: 215585180
Change-Id: I980b8c69bd61a8b8e0a496fd7bc90a06446764e0
|
|
PiperOrigin-RevId: 215574070
Change-Id: Ib36e804adebaf756adb9cbc2752be9789691530b
|
|
PiperOrigin-RevId: 215489101
Change-Id: Iaf96aa8edb1101b70548030c62995841215237d9
|
|
Docker.Run only returns a single argument.
PiperOrigin-RevId: 215427309
Change-Id: I1eebbc628853ca57f79d25e18d4f04dfa5a2a003
|
|
Terminal support in runsc relies on host tty file descriptors that are imported
into the sandbox. Application tty ioctls are sent directly to the host fd.
However, those host tty ioctls are associated in the host kernel with a host
process (in this case runsc), and the host kernel intercepts job control
characters like ^C and sends signals to the host process. Thus, typing ^C into a
"runsc exec" shell will send a SIGINT to the runsc process.
This change makes "runsc exec" handle all signals, and forward them into the
sandbox via the "ContainerSignal" urpc method. Since the "runsc exec" is
associated with a particular container process in the sandbox, the signal must
be associated with the same container process.
One big difficulty is that the signal should not necessarily be sent to the
sandbox process started by "exec", but instead must be sent to the foreground
process group for the tty. For example, we may exec "bash", and from bash call
"sleep 100". A ^C at this point should SIGINT sleep, not bash.
To handle this, tty files inside the sandbox must keep track of their
foreground process group, which is set and retrieved via ioctls. When a
ContainerSignal urpc arrives, we look up the foreground process group via the
tty file. Unfortunately, this means we have to expose and cache the tty file in
the Loader.
Note that "runsc exec" now handles signals properly, but "runsc run" does not.
That will come in a later CL, as this one is complex enough already.
Example:
root@:/usr/local/apache2# sleep 100
^C
root@:/usr/local/apache2# sleep 100
^Z
[1]+ Stopped sleep 100
root@:/usr/local/apache2# fg
sleep 100
^C
root@:/usr/local/apache2#
PiperOrigin-RevId: 215334554
Change-Id: I53cdce39653027908510a5ba8d08c49f9cf24f39
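The routing rule described in this change (deliver the signal to the tty's foreground process group, not to the process started by exec) can be sketched as follows. Process, TTY, and forwardSignal are hypothetical names for illustration, not the Sentry's types; SetForeground and Foreground stand in for the TIOCSPGRP and TIOCGPGRP ioctls.

```go
package main

import (
	"fmt"
	"sync"
)

// Process is a hypothetical stand-in for a process inside the sandbox.
type Process struct {
	Name     string
	PGID     int
	Received []string // signals delivered so far
}

// TTY remembers which process group is in the foreground.
type TTY struct {
	mu     sync.Mutex
	fgPGID int
}

// SetForeground plays the role of the TIOCSPGRP ioctl.
func (t *TTY) SetForeground(pgid int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.fgPGID = pgid
}

// Foreground plays the role of the TIOCGPGRP ioctl.
func (t *TTY) Foreground() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.fgPGID
}

// forwardSignal delivers sig to every process in the tty's foreground
// process group and returns how many processes were signalled.
func forwardSignal(tty *TTY, procs []*Process, sig string) int {
	fg := tty.Foreground()
	n := 0
	for _, p := range procs {
		if p.PGID == fg {
			p.Received = append(p.Received, sig)
			n++
		}
	}
	return n
}

func main() {
	bash := &Process{Name: "bash", PGID: 1}
	sleep := &Process{Name: "sleep", PGID: 2}
	tty := &TTY{}
	tty.SetForeground(sleep.PGID) // bash put "sleep 100" in the foreground
	forwardSignal(tty, []*Process{bash, sleep}, "SIGINT")
	fmt.Println(len(bash.Received), len(sleep.Received))
}
```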
|
|
PiperOrigin-RevId: 215278262
Change-Id: Icd10384c99802be6097be938196044386441e282
|
|
PiperOrigin-RevId: 215274663
Change-Id: I051721f459084db3aa608432831170cd47ae7df0
|
|
There was a race where we checked task.Parent() != nil, and then later called
task.Parent() again, assuming that it is not nil. If the task is exiting, the
parent may have been set to nil in between the two calls, causing a panic.
This CL changes the code to only call task.Parent() once.
PiperOrigin-RevId: 215274456
Change-Id: Ib5a537312c917773265ec72016014f7bc59a5f59
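The fix pattern described in this change, reading the parent once and reusing the result, can be sketched as below. The Task type here is a hypothetical simplification, not the kernel package's Task.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a simplified, hypothetical task with a parent that may be cleared
// concurrently when the task exits.
type Task struct {
	mu     sync.Mutex
	name   string
	parent *Task
}

// Parent returns the current parent, or nil if it has been cleared.
func (t *Task) Parent() *Task {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.parent
}

// clearParent models the exit path setting the parent to nil.
func (t *Task) clearParent() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.parent = nil
}

// parentName calls Parent() exactly once. The racy form would be:
//
//	if t.Parent() != nil { return t.Parent().name }
//
// where the second call can return nil after the check passed.
func parentName(t *Task) string {
	p := t.Parent()
	if p == nil {
		return ""
	}
	return p.name
}

func main() {
	parent := &Task{name: "init"}
	child := &Task{name: "child", parent: parent}
	fmt.Println(parentName(child))
	child.clearParent()
	fmt.Println(parentName(child) == "")
}
```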
|
|
And remove multicontainer option.
PiperOrigin-RevId: 215236981
Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
|
|
This makes runsc more friendly to run without docker or K8s.
PiperOrigin-RevId: 215165586
Change-Id: Id45a9fc24a3c09b1645f60dbaf70e64711a7a4cd
|
|
PiperOrigin-RevId: 215162121
Change-Id: I35f06ac3235cf31c9e8a158dcf6261a7ded6c4c4
|
|
PiperOrigin-RevId: 215025517
Change-Id: I04b9d8022b3d9dfe279e466ddb91310b9860b9af
|
|
PiperOrigin-RevId: 215023376
Change-Id: I139569bd15c013e5dd0f60d0c98a64eaa0ba9e8e
|
|
PiperOrigin-RevId: 215009105
Change-Id: I1ab12eddf7694c4db98f6dafca9dae352a33f7c4
|
|
PiperOrigin-RevId: 215009066
Change-Id: I54ab920fa649cf4d0817f7cb8ea76f9126523330
|
|
PiperOrigin-RevId: 214976251
Change-Id: I631348c3886f41f63d0e77e7c4f21b3ede2ab521
|
|
PiperOrigin-RevId: 214975659
Change-Id: I7bd31a2c54f03ff52203109da312e4206701c44c
|
|
Call out the error that Gerrit returns if there is no CLA on file.
PiperOrigin-RevId: 214964718
Change-Id: I3d92e3eb73f178e8c4c52b5defbe8d21db536215
|
|
host.endpoint already has the check, but it is missing from
host.ConnectedEndpoint.
PiperOrigin-RevId: 214962762
Change-Id: I88bb13a5c5871775e4e7bf2608433df8a3d348e6
|
|
Previously, if address resolution for UDP or Ping sockets required sending
packets using Write in the transport layer, Resolve would return ErrWouldBlock
and Write would return ErrNoLinkAddress. Meanwhile, startAddressResolution
would run in the background. Further calls to Write using the same address would
also return ErrNoLinkAddress until resolution completed successfully.
Since Write is not allowed to block and system calls need to be
interruptible in the system call layer, the caller of Write is responsible for
blocking upon return of ErrWouldBlock.
Now, when startAddressResolution is called, a notification channel for
the completion of the address resolution is returned.
The channel is propagated up to the caller of Write along with
ErrNoLinkAddress. Once address resolution is complete (successfully or not), the
channel is closed. The caller then calls Write again to send packets and
check whether address resolution completed successfully.
Fixes google/gvisor#5
Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93
PiperOrigin-RevId: 214962200
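The notification-channel pattern described in this change can be sketched as follows. This is a minimal illustration, not netstack's code: the resolver type, its write method, and sendBlocking are hypothetical, resolution always succeeds here, and a real caller would also wait on an interrupt source while blocked.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNoLinkAddress mirrors the error named in the commit message; this
// variable is local to the sketch.
var ErrNoLinkAddress = errors.New("no link address")

// resolver is a hypothetical link address cache.
type resolver struct {
	mu       sync.Mutex
	resolved map[string]string
	pending  map[string]chan struct{}
}

func newResolver() *resolver {
	return &resolver{
		resolved: make(map[string]string),
		pending:  make(map[string]chan struct{}),
	}
}

// write never blocks. If addr is unresolved it starts resolution (once) and
// returns ErrNoLinkAddress plus a channel that is closed when resolution ends.
func (r *resolver) write(addr string, payload []byte) (<-chan struct{}, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.resolved[addr]; ok {
		return nil, nil // link address known: the packet would be sent here
	}
	ch, ok := r.pending[addr]
	if !ok {
		ch = make(chan struct{})
		r.pending[addr] = ch
		go r.resolve(addr, ch)
	}
	return ch, ErrNoLinkAddress
}

// resolve stands in for background address resolution.
func (r *resolver) resolve(addr string, ch chan struct{}) {
	r.mu.Lock()
	r.resolved[addr] = "02:00:00:00:00:01" // made-up link address
	delete(r.pending, addr)
	r.mu.Unlock()
	close(ch) // wake every waiter
}

// sendBlocking is the caller's side: block on the channel, then retry write.
func sendBlocking(r *resolver, addr string, payload []byte) error {
	for {
		ch, err := r.write(addr, payload)
		if err != ErrNoLinkAddress {
			return err
		}
		<-ch
	}
}

func main() {
	r := newResolver()
	fmt.Println(sendBlocking(r, "10.0.0.2", []byte("ping")))
}
```

Closing the channel, rather than sending on it, lets any number of blocked writers wake up from a single resolution attempt.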
|
|
Some tests check current capabilities and re-run the tests as root inside
userns if required capabilities are missing. It was checking for
CAP_SYS_ADMIN only, but CAP_SYS_CHROOT is also required now.
PiperOrigin-RevId: 214949226
Change-Id: Ic81363969fa76c04da408fae8ea7520653266312
|
|
It's easier to manage a single map of the processes that we're interested
in tracking. This will make the next change, which cleans up the map on
destroy, easier.
PiperOrigin-RevId: 214894210
Change-Id: I099247323a0487cd0767120df47ba786fac0926d
|
|
PiperOrigin-RevId: 214890335
Change-Id: I42743f0ce46a5a42834133bce2f32d187194fc87
|
|
We already forward TCSETS and TCSETSW. TCSETSF is roughly equivalent but
discards pending input.
The filters were relaxed to allow host ioctls with TCSETSF argument.
This fixes programs like "passwd" that prevent user input from being displayed
on the terminal.
Before:
root@b8a0240fc836:/# passwd
Enter new UNIX password: 123
Retype new UNIX password: 123
passwd: password updated successfully
After:
root@ae6f5dabe402:/# passwd
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
PiperOrigin-RevId: 214869788
Change-Id: I31b4d1373c1388f7b51d0f2f45ce40aa8e8b0b58
|
|
In order to implement kill --all correctly, the Sentry needs
to track all tasks that belong to a given container. This change
introduces ContainerID to the task, that gets inherited by all
children. 'kill --all' then iterates over all tasks comparing the
ContainerID field to find all processes that need to be signalled.
PiperOrigin-RevId: 214841768
Change-Id: I693b2374be8692d88cc441ef13a0ae34abf73ac6
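The tagging scheme described in this change can be sketched as below. Task, Kernel, Fork, and SignalAll are hypothetical simplifications for illustration and do not reflect the Sentry's actual types or method names.

```go
package main

import "fmt"

// Task is a simplified, hypothetical task tagged with its container.
type Task struct {
	TID         int
	ContainerID string
	Signals     []int // signals delivered so far
}

// Kernel holds every task across all containers in the sandbox.
type Kernel struct {
	nextTID int
	tasks   []*Task
}

// NewTask creates the first task of a container.
func (k *Kernel) NewTask(containerID string) *Task {
	k.nextTID++
	t := &Task{TID: k.nextTID, ContainerID: containerID}
	k.tasks = append(k.tasks, t)
	return t
}

// Fork creates a child that inherits the parent's ContainerID.
func (k *Kernel) Fork(parent *Task) *Task {
	return k.NewTask(parent.ContainerID)
}

// SignalAll models 'kill --all': it walks all tasks, signals those whose
// ContainerID matches, and returns how many were signalled.
func (k *Kernel) SignalAll(containerID string, sig int) int {
	n := 0
	for _, t := range k.tasks {
		if t.ContainerID == containerID {
			t.Signals = append(t.Signals, sig)
			n++
		}
	}
	return n
}

func main() {
	k := &Kernel{}
	a := k.NewTask("container-a")
	k.Fork(k.Fork(a)) // grandchild inherits "container-a"
	k.NewTask("container-b")
	fmt.Println(k.SignalAll("container-a", 9))
}
```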
|
|
The //go:linkname directive requires the presence of
assembly files in the package. Even an empty file will do.
There was an empty assembly file commit_arm64.s, but
that is limited to GOARCH=arm64. Renaming to empty.s will
remove the unnecessary build constraint and allow building
netstack for other architectures than amd64 and arm64.
Without this, building directly with go (not bazel)
for e.g., GOARCH=arm gives:
sleep/sleep_unsafe.go:88:6: missing function body
sleep/sleep_unsafe.go:91:6: missing function body
Change-Id: I29d1d13e1ff31506a174d4595b8cd57fa58bf52b
PiperOrigin-RevId: 214820299
|
|
PiperOrigin-RevId: 214798278
Change-Id: Id59d1ceb35037cda0689d3a1c4844e96c6957615
|
|
This makes the flow slightly simpler (no need to call
Loader.SetRootContainer). It is also a required change for tagging
tasks with the container ID inside the Sentry.
PiperOrigin-RevId: 214795210
Change-Id: I6ff4af12e73bb07157f7058bb15fd5bb88760884
|
|
This was done so it's easier to add more functionality
to this file for other tests.
PiperOrigin-RevId: 214782043
Change-Id: I1f38b9ee1219b3ce7b789044ada8e52bdc1e6279
|
|
The old code was returning the ID of the thread that created
the child process. It should be returning the ID of
the parent process instead.
PiperOrigin-RevId: 214720910
Change-Id: I95715c535bcf468ecf1ae771cccd04a4cd345b36
|
|
PiperOrigin-RevId: 214700295
Change-Id: I73d8490572eebe5da584af91914650d1953aeb91
|
|
There is a subtle bug that is the result of two changes made when upstreaming
ICMPv6 support from Fuchsia:
1) ipv6.endpoint.WritePacket writes the local address it was initialized with,
rather than the provided route's local address
2) ipv6.endpoint.handleICMP doesn't set its route's local address to the ICMP
target address before writing the response
The result is that the ICMP response erroneously uses the target ipv6 address
(rather than icmp) as its source address in the response. When trying to debug
this by fixing (2), we ran into problems with bad ipv6 checksums because (1)
didn't respect the local address of the route being passed to it.
This fixes both problems.
PiperOrigin-RevId: 214650822
Change-Id: Ib6148bf432e6428d760ef9da35faef8e4b610d69
|
|
This is useful for Fuchsia.
PiperOrigin-RevId: 214619681
Change-Id: If5a60dd82365c2eae51a12bbc819e5aae8c76ee9
|
|
This CL changes the semantics of the "--file-access" flag so that it only
affects the root filesystem. The default remains "exclusive" which is the
common use case, as neither Docker nor K8s supports sharing the root.
Keeping the root fs as "exclusive" means that the fs-intensive work done during
application startup will mostly be cacheable, and thus faster.
Non-root bind mounts will always be shared.
This CL also removes some redundant FSAccessType validations. We validate this
flag in main(), so we can assume it is valid afterwards.
PiperOrigin-RevId: 214359936
Change-Id: I7e75d7bf52dbd7fa834d0aacd4034868314f3b51
|
|
PiperOrigin-RevId: 214073949
Change-Id: I8fab916cd77362c13dac2c9dcf2ecc1710d87a5e
|
|
PiperOrigin-RevId: 214040901
Change-Id: I74d79497a053da3624921ad2b7c5193ca4a87942
|
|
PiperOrigin-RevId: 214039349
Change-Id: Ia7d09c5f85eddd1e5634f3c21b0bd60b10be6bd2
|