Age | Commit message (Collapse) | Author |
|
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With
runsc, you'll need to pass `--net-raw=true` to enable them.
Binding isn't supported yet.
PiperOrigin-RevId: 275909366
|
|
PiperOrigin-RevId: 275565958
|
|
This change fixes several issues with the fsgofer host UDS support. Notably, it
adds support for SOCK_SEQPACKET and SOCK_DGRAM sockets [1]. It also fixes
unsafe use of unet.Socket, which could cause a panic if Socket.FD is called
when err != nil, and calls to Socket.FD with nothing to prevent the garbage
collector from destroying and closing the socket.
A set of tests is added to exercise host UDS access. This required extracting
most of the syscall test runner into a library that can be used by custom
tests.
Updates #235
Updates #1003
[1] N.B. SOCK_DGRAM sockets are likely not particularly useful, as a server can
only reply to a client that binds first. We don't allow bind, so these are
unlikely to be used.
PiperOrigin-RevId: 275558502
|
|
* Use mknod instead of open&close to create an empty file.
* Limit a number of files to (1<<16) instead of 100K.
In this case, a test set is (1, 8, 64, 512, 4K, 32K, 64K) instead of (1, 8, 64,
512, 4K, 32K, 98K). I think it is easier to compare results for 32K and 64K
than 32K and 98K. And results for 98K doesn't give us more information than for
54K.
PiperOrigin-RevId: 275552507
|
|
Otherwise we need to do a lot of system calls and cooperative_save tests work
slow.
PiperOrigin-RevId: 275536957
|
|
PiperOrigin-RevId: 275114157
|
|
These aren't actually death tests in the GUnit sense. i.e., they don't call
EXPECT_EXIT or EXPECT_DEATH.
PiperOrigin-RevId: 275099957
|
|
PiperOrigin-RevId: 274700093
|
|
This allows for peeking at the length of the next message on a netlink socket
without pulling it off the socket's buffer/queue, allowing tools like 'ip' to
work.
This CL also fixes an issue where dump_done_errno was not included in the
NLMSG_DONE messages payload.
Issue #769
PiperOrigin-RevId: 274068637
|
|
The signalfd descriptors otherwise always show as available. This can lead
programs to spin, assuming they are looking to see what signals are pending.
Updates #139
PiperOrigin-RevId: 274017890
|
|
PiperOrigin-RevId: 273781112
|
|
Also change the default TTL to 64 to match Linux.
PiperOrigin-RevId: 273430341
|
|
Adds two tests. One to make sure that $HOME is set when starting a container
via 'docker run' and one to make sure that $HOME is set for each container in a
multi-container sandbox.
Issue #701
PiperOrigin-RevId: 273395763
|
|
The behavior for sending and receiving local broadcast (255.255.255.255)
traffic is as follows:
Outgoing
--------
* A broadcast packet sent on a socket that is bound to an interface goes out
that interface
* A broadcast packet sent on an unbound socket follows the route table to
select the outgoing interface
+ if an explicit route entry exists for 255.255.255.255/32, use that one
+ else use the default route
* Broadcast packets are looped back and delivered following the rules for
incoming packets (see next). This is the same behavior as for multicast
packets, except that it cannot be disabled via sockopt.
Incoming
--------
* Sockets wishing to receive broadcast packets must bind to either INADDR_ANY
(0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives
broadcast packets.
* Broadcast packets are multiplexed to all sockets matching it. This is the
same behavior as for multicast packets.
* A socket can bind to 255.255.255.255:<port> and then receive its own
broadcast packets sent to 255.255.255.255:<port>
In addition, this change implicitly fixes an issue with multicast reception. If
two sockets want to receive a given multicast stream and one is bound to ANY
while the other is bound to the multicast address, only one of them will
receive the traffic.
PiperOrigin-RevId: 272792377
|
|
The input file descriptor is always a regular file, so sendfile can't lose any
data if it will not be able to write them to the output file descriptor.
Reported-by: syzbot+22d22330a35fa1c02155@syzkaller.appspotmail.com
PiperOrigin-RevId: 272730357
|
|
https://github.com/google/gvisor/commit/dd69b49ed1103bab82a6b2ac95221b89b46f3376
makes this test take longer.
PiperOrigin-RevId: 272535892
|
|
PiperOrigin-RevId: 272522508
|
|
Spoiler alert: it doesn't.
PiperOrigin-RevId: 272513529
|
|
gVisor does not currently implement the functionality that would result in
AT_SECURE = 1, but Linux includes AT_SECURE = 0 in the normal case, so we
should do the same.
PiperOrigin-RevId: 272311488
|
|
Tests in the blacklist will be explicitly skipped (with associated log line).
Checks in a blacklist for the nodejs tests.
PiperOrigin-RevId: 272272749
|
|
Refactoring in 0036d1f7eb95bcc52977f15507f00dd07018e7e2 (v4.10) caused Linux to
start unconditionally zeroing the remainder of the last page in the
interpreter. Previously it did not due so if filesz == memsz, and *still* does
not do so when filesz == memsz for loading binaries, only interpreter.
This inconsistency is not worth replicating in gVisor, as it is arguably a bug,
but our tests must ensure we create interpreter ELFs compatible with this new
requirement.
PiperOrigin-RevId: 272266401
|
|
Kernel.cpuClockTicker increments kernel.cpuClock, which tasks use as a clock to
track their CPU usage. This improves latency in the syscall path by avoid
expensive monotonic clock calls on every syscall entry/exit.
However, this timer fires every 10ms. Thus, when all tasks are idle (i.e.,
blocked or stopped), this forces a sentry wakeup every 10ms, when we may
otherwise be able to sleep until the next app-relevant event. These wakeups
cause the sentry to utilize approximately 2% CPU when the application is
otherwise idle.
Updates to clock are not strictly necessary when the app is idle, as there are
no readers of cpuClock. This commit reduces idle CPU by disabling the timer
when tasks are completely idle, and computing its effects at the next wakeup.
Rather than disabling the timer as soon as the app goes idle, we wait until the
next tick, which provides a window for short sleeps to sleep and wakeup without
doing the (relatively) expensive work of disabling and enabling the timer.
PiperOrigin-RevId: 272265822
|
|
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out from when copying container's
capabilities.
PiperOrigin-RevId: 272260451
|
|
Linux changed this behavior in 16e72e9b30986ee15f17fbb68189ca842c32af58
(v4.11). Previously, extra pages were always mapped RW. Now, those pages will
be executable if the segment specified PF_X. They still must be writeable.
PiperOrigin-RevId: 272256280
|
|
PiperOrigin-RevId: 272059043
|
|
It looks like the old code attempted to do this, but didn't realize that err !=
nil even in the happy case.
PiperOrigin-RevId: 272005887
|
|
PiperOrigin-RevId: 271665517
|
|
PiperOrigin-RevId: 271649711
|
|
PiperOrigin-RevId: 271644926
|
|
PiperOrigin-RevId: 271442321
|
|
This change fixes compile errors:
pty.cc:1460:7: error: expected primary-expression before '.' token
...
PiperOrigin-RevId: 271033729
|
|
Closes #261
PiperOrigin-RevId: 270973347
|
|
This makes them run much faster. Also cleaned up the log reporting.
PiperOrigin-RevId: 270799808
|
|
We already do this for `runsc run`, but need to do the same for `runsc exec`.
PiperOrigin-RevId: 270793459
|
|
The test is checking the wrong poll_fd for POLLHUP. The only
reason it passed till now was because it was also checking
for POLLIN which was always true on the other fd from the
previous poll!
PiperOrigin-RevId: 270780401
|
|
PiperOrigin-RevId: 270764996
|
|
Previously, when we set hostname:
$ strace hostname abc
...
sethostname("abc", 3) = -1 ENAMETOOLONG (File name too long)
...
According to man 2 sethostname:
"The len argument specifies the number of bytes in name. (Thus, name
does not require a terminating null byte.)"
We wrongly use the CopyStringIn() to check terminating zero byte in
the implementation of sethostname syscall.
To fix this, we use CopyInBytes() instead.
Fixes: #861
Reported-by: chenglang.hy <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
|
|
Fixes: #829
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Signed-off-by: Jielong Zhou <jielong.zjl@antfin.com>
|
|
Adresses a deadlock with the rolled back change:
https://github.com/google/gvisor/commit/b6a5b950d28e0b474fdad160b88bc15314cf9259
Creating a session from an orphaned process group was causing a lock to be
acquired twice by a single goroutine. This behavior is addressed, and a test
(OrphanRegression) has been added to pty.cc.
Implemented the following ioctls:
- TIOCSCTTY - set controlling TTY
- TIOCNOTTY - remove controlling tty, maybe signal some other processes
- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
- TIOCSPGRP - set foreground process group. Also enabled tcsetpgrp().
Next steps are to actually turn terminal-generated control characters (e.g. C^c)
into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when
appropriate.
PiperOrigin-RevId: 270088599
|
|
Default of 20 shards was arbitrary and will need fine-tuning in later CLs.
PiperOrigin-RevId: 269922871
|
|
Note that the exact semantics for these signalfds are slightly different from
Linux. These signalfds are bound to the process at creation time. Reads, polls,
etc. are all associated with signals directed at that task. In Linux, all
signalfd operations are associated with current, regardless of where the
signalfd originated.
In practice, this should not be an issue given how signalfds are used. In order
to fix this however, we will need to plumb the context through all the event
APIs. This gets complicated really quickly, because the waiter APIs are all
netstack-specific, and not generally exposed to the context. Probably not
worthwhile fixing immediately.
PiperOrigin-RevId: 269901749
|
|
- Fix ARG syntax in Dockerfiles.
- Fix curl commands in Dockerfiles.
- Fix some paths in proctor binaries.
- Check error from Walk in search helper.
PiperOrigin-RevId: 269641686
|
|
* Use multi-stage builds in Dockerfiles.
* Combine all proctor binaries into a single binary.
* Change the TestRunner interface to reduce code duplication.
PiperOrigin-RevId: 269462101
|
|
absl flags are more modern and we can easily depend on them directly.
The repo now successfully builds with --incompatible_load_cc_rules_from_bzl.
PiperOrigin-RevId: 269387081
|
|
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker
PiperOrigin-RevId: 269337274
|
|
ENOTDIR has to be returned when a component used as a directory in
pathname is not, in fact, a directory.
PiperOrigin-RevId: 269037893
|
|
This also allows the tee(2) implementation to be enabled, since dup can now be
properly supported via WriteTo.
Note that this change necessitated some minor restructoring with the
fs.FileOperations splice methods. If the *fs.File is passed through directly,
then only public API methods are accessible, which will deadlock immediately
since the locking is already done by fs.Splice. Instead, we pass through an
abstract io.Reader or io.Writer, which elide locks and use the underlying
fs.FileOperations directly.
PiperOrigin-RevId: 268805207
|
|
A recent Kokoro change pointed to go_tests.cfg (in line with the
other configurations), which unfortunately broke the presubmits.
This change also enabled the KVM tests, which were still using a
remote execution strategy.
This fixes both of these issues and allows presubmits to pass.
One additional test was caught with this case, which seems to
have been broken. It's unclear why this was not being caught.
PiperOrigin-RevId: 268166291
|
|
See https://github.com/bazelbuild/bazel/issues/8743. This will be required in
Bazel 1.0.
Protobuf was updated in
https://github.com/protocolbuffers/protobuf/commit/bf0c69e1302fe9568fbe310cc54b37d20a9d16a3#diff-96239ee297e0a92ac6ff96a6bc434ef0.
GoogleTest was updated in
https://github.com/google/googletest/commit/6fd262ecf787d0dc2a91696fd4bf1d3ee1ebfa14.
gflags has not yet been updated, so the repo still won't build with
--incompatible_load_cc_rules_from_bzl.
Tested with buildifier -warnings=native-cc -lint=warn **/BUILD.
PiperOrigin-RevId: 267638515
|
|
This is done because the root container for CRI is the infrastructure (pause)
container and always gets a low oom_score_adj. We do this to ensure that only
the oom_score_adj of user containers is used to calculated the sandbox
oom_score_adj.
Implemented in runsc rather than the containerd shim as it's a bit cleaner to
implement here (in the shim it would require overwriting the oomScoreAdj and
re-writing out the config.json again). This processing is Kubernetes(CRI)
specific but we are currently only supporting CRI for multi-container support
anyway.
PiperOrigin-RevId: 267507706
|