Age | Commit message (Collapse) | Author |
|
This eliminates the indirection that existed in task_futex.
PiperOrigin-RevId: 221832498
Change-Id: Ifb4c926d493913aa6694e193deae91616a29f042
|
|
Also update test utilities for probing vsyscall support and add a
metric to see if vsyscalls are actually used in sandboxes.
PiperOrigin-RevId: 221698834
Change-Id: I57870ecc33ea8c864bd7437833f21aa1e8117477
|
|
PiperOrigin-RevId: 221683127
Change-Id: Ide6a9f41d75aa19d0e2051a05a1e4a114a4fb93c
|
|
Moving the wakeup logic into the disable blocks is an optimization.
PiperOrigin-RevId: 221677028
Change-Id: Ib5a5a6d52cc77b4bbc5dedcad9ee1dbb3da98deb
|
|
...to (remote, local), reflecting the (correct) names in the implementation of
DeliverNetworkPacket (see tcpip/stack/nic.go).
Also trim the names in DeliverNetworkPacket and elsewhere to avoid stuttering;
since the type is tcpip.LinkAddress, there's no need to include "LinkAddr" in
the parameter names.
Note that every callsite passes arguments in the order (src, dst).
PiperOrigin-RevId: 221514396
Change-Id: I3637454ad0d6e62a19e4dcbc2a16493798bd0f09
|
|
PiperOrigin-RevId: 221484739
Change-Id: I44c71f79f99d0d00a2e70a7f06d7024a62a5de0a
|
|
Previously, TCP_NODELAY was always enabled and we would lie about it being
configurable. TCP_NODELAY is now disabled by default (to match Linux) in the
socket layer so that non-gVisor users don't automatically start using this
questionable optimization.
PiperOrigin-RevId: 221368472
Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
|
|
PiperOrigin-RevId: 221189534
Change-Id: Id20d318bed97d5226b454c9351df396d11251e1f
|
|
PiperOrigin-RevId: 221117846
Change-Id: I2a43fd8135b1d1194ff81e98644ce6b6182ece50
|
|
PiperOrigin-RevId: 220866996
Change-Id: I89d48215df57c00d6a6ec512fc18712a2ea9080b
|
|
sync_file_range - sync a file segment with disk
In Linux, sync_file_range() accepts three flags:
SYNC_FILE_RANGE_WAIT_BEFORE
Wait upon write-out of all pages in the specified range that
have already been submitted to the device driver for write-out
before performing any write.
SYNC_FILE_RANGE_WRITE
Initiate write-out of all dirty pages in the specified range
which are not presently submitted write-out. Note that even
this may block if you attempt to write more than request queue
size.
SYNC_FILE_RANGE_WAIT_AFTER
Wait upon write-out of all pages in the range after performing
any write.
In this implementation:
SYNC_FILE_RANGE_WAIT_BEFORE without SYNC_FILE_RANGE_WAIT_AFTER isn't
supported right now.
SYNC_FILE_RANGE_WRITE is skipped. It should initiate write-out of all
dirty pages, but it doesn't wait, so it should be safe to do nothing
while nobody uses SYNC_FILE_RANGE_WAIT_BEFORE.
SYNC_FILE_RANGE_WAIT_AFTER is equal to fdatasync(). In Linux,
sync_file_range() doesn't writes out the file's meta-data, but
fdatasync() does if a file size is changed.
PiperOrigin-RevId: 220730840
Change-Id: Iae5dfb23c2c916967d67cf1a1ad32f25eb3f6286
|
|
Create syscall stubs for missing syscalls upto Linux 4.4 and advertise
a kernel version of 4.4.
PiperOrigin-RevId: 220667680
Change-Id: Idbdccde538faabf16debc22f492dd053a8af0ba7
|
|
Increase timeout to prevent the entry from being
found when there is delay on the address resolution
goroutine that doesn't mark the request as failed.
PiperOrigin-RevId: 220504789
Change-Id: I7e44fd95d8624bd69962f862fbf5517a81395f2a
|
|
PiperOrigin-RevId: 220314735
Change-Id: Ic519567e43f6caf042b9f223e517da40640b7d38
|
|
These files were added with the wrong name after all of the existing files
were corrected.
PiperOrigin-RevId: 220202068
Change-Id: Ia0d15233c1aa69330356a7cf16b5aa00d978e09c
|
|
PiperOrigin-RevId: 220185891
Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
|
|
Fluentd configuration uses 'log' for the log message
while containerd uses 'msg'. Since we can't have a single
JSON format for both, add another log format and make
debug log configurable.
PiperOrigin-RevId: 219729658
Change-Id: I2a6afc4034d893ab90bafc63b394c4fb62b2a7a0
|
|
Updated error messages so that it doesn't print full Go struct representations
when running a new container in a sandbox. For example, this occurs frequently
when commands are not found when doing a 'kubectl exec'.
PiperOrigin-RevId: 219729141
Change-Id: Ic3a7bc84cd7b2167f495d48a1da241d621d3ca09
|
|
Shm segments can be marked for lazy destruction via shmctl(IPC_RMID),
which destroys a segment once it is no longer attached to any
processes. We were unconditionally decrementing the segment refcount
on shmctl(IPC_RMID) which allowed a user to force a segment to be
destroyed by repeatedly calling shmctl(IPC_RMID), with outstanding
memory maps to the segment.
This is problematic because the memory released by a segment destroyed
this way can be reused by a different process while remaining
accessible by the process with outstanding maps to the segment.
PiperOrigin-RevId: 219713660
Change-Id: I443ab838322b4fb418ed87b2722c3413ead21845
|
|
https://github.com/containerd/containerd/blob/master/oci/spec.go#L206, the mode=755
didn't match the pattern modeRegexp = regexp.MustCompile("0[0-7][0-7][0-7]").
Closes #112
Signed-off-by: Juan <xionghuan.cn@gmail.com>
Change-Id: I469e0a68160a1278e34c9e1dbe4b7784c6f97e5a
PiperOrigin-RevId: 219672525
|
|
PiperOrigin-RevId: 219575226
Change-Id: If4e67b29110332c94013513fb111ec7e019f2915
|
|
PiperOrigin-RevId: 219571556
Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6
|
|
Replacing map lookups with slice indexing is higher performance.
PiperOrigin-RevId: 219569901
Change-Id: I9b7cd22abd4b95383025edbd5a80d1c1a4496936
|
|
This reduces the number of floating point save/restore cycles required (since
we don't need to restore immediately following the switch, this always happens
in a known context) and allows the kernel hooks to capture state. This lets us
remove calls like "Current()".
PiperOrigin-RevId: 219552844
Change-Id: I7676fa2f6c18b9919718458aa888b832a7db8cab
|
|
This improves debuggability greatly.
PiperOrigin-RevId: 219551560
Change-Id: I2ecaffdd1c17b0d9f25911538ea6f693e2bc699f
|
|
PiperOrigin-RevId: 219492587
Change-Id: I47f6fc0b74a4907ab0aff03d5f26453bdb983bb5
|
|
This field was added in the intial implementation, before Route existed
to pass the local and remote addresses to the packet-writing path.
Today, the Route's members should be respected. A similar bug was
previously fixed in 214650822.
PiperOrigin-RevId: 219474095
Change-Id: Id2a8ee4421d2841c8d88ccb3c193c455086350ee
|
|
Use private futexes for performance and to align with other runtime uses.
PiperOrigin-RevId: 219422634
Change-Id: Ief2af5e8302847ea6dc246e8d1ee4d64684ca9dd
|
|
Actually parse flags from cpuinfo to avoid mistakenly matching
substrings in cpuinfo that happen to match a flags.
Some features were only exposed in recent versions of Linux. Don't
require them to appear in cpuinfo on old versions of Linux.
Move PREFETCHWT1 back to parse only features. It isn't actually exposed
in Linux yet. Move SDBG to shown features. It has been visible since
Linux 4.3.
PiperOrigin-RevId: 219381731
Change-Id: Ied7c0ee7c8a9879683e81933de56c9074b01108f
|
|
Extend the cpuid package to parse and emulate cpuid features that exist
only on AMD and not Intel. The least straightforward part of this is
that AMD duplicates several block 1 features in block 6. Thus we ignore
those features when parsing block 6 and add them when emulating.
PiperOrigin-RevId: 218935032
Change-Id: Id41bf1c24720b0d9b968e2c19ab5bc00a6d62bd4
|
|
Linux added these block 3 features to the end of /proc/cpuinfo in
dfb4a70f20c5b3880da56ee4c9484bdb4e8f1e65.
This also fixes that block 3 features were completely missing from
FeatureSet.FlagsString(false) because FlagsString only prints Linux
blocks regardless of the cpuinfo option.
PiperOrigin-RevId: 218913816
Change-Id: I2f9c38c7c9da4b247a140877d4aca782e80684bd
|
|
PiperOrigin-RevId: 218894181
Change-Id: I97d0c74175f4aa528363f768a0a85d6953ea0bfd
|
|
PiperOrigin-RevId: 218592058
Change-Id: I373a2d813aa6cc362500dd5a894c0b214a1959d7
|
|
Previously this code used the tcpip error space. Since it is no longer part of
netstack, it can use the sentry's error space (except for a few cases where
there is still some shared code. This reduces the number of error space
conversions required for hot Unix socket operations.
PiperOrigin-RevId: 218541611
Change-Id: I3d13047006a8245b5dfda73364d37b8a453784bb
|
|
PiperOrigin-RevId: 218537640
Change-Id: I1c5f55a46390174e1f5caeff74b1a364fa3268d9
|
|
Pseudoterminal job control signals are meant to be received and handled by the
sandbox process, but if the ptrace stubs are running in the same process group,
they will receive the signals as well and inject then into the sentry kernel.
This can result in duplicate signals being delivered (often to the wrong
process), or a sentry panic if the ptrace stub is inactive.
This CL makes the ptrace stub run in a new session.
PiperOrigin-RevId: 218536851
Change-Id: Ie593c5687439bbfbf690ada3b2197ea71ed60a0e
|
|
Attempting to create a zero-len shm segment causes a panic since we
try to allocate a zero-len filemem region. The existing code had a
guard to disallow this, but the check didn't encode the fact that
requesting a private segment implies a segment creation regardless of
whether IPC_CREAT is explicitly specified.
PiperOrigin-RevId: 218405743
Change-Id: I30aef1232b2125ebba50333a73352c2f907977da
|
|
PiperOrigin-RevId: 218390517
Change-Id: Ic891c1626e62a6c4ed57f8180740872bcd1be177
|
|
This should be determined by the filesystem.
PiperOrigin-RevId: 218376553
Change-Id: I55d176e2cdf8acdd6642789af057b98bb8ca25b8
|
|
The channels {cancel,resCh} have roughly the same lifetime and are used for
roughly the same purpose as an entry's waiters; we can unify the state
management of the two mechanisms, while also reducing unncessary mutex locking
and unlocking.
Made some cosmetic changes while I'm here.
PiperOrigin-RevId: 218343915
Change-Id: Ic69546a2b7b390162b2231f07f335dd6199472d7
|
|
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.
PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
|
|
This allows us to release messages in the queue when all users close.
PiperOrigin-RevId: 218033550
Change-Id: I2f6e87650fced87a3977e3b74c64775c7b885c1b
|
|
Added events for *ctl syscalls that may have multiple different commands.
For runsc, each syscall event is only logged once. For *ctl syscalls, use
the cmd as identifier, not only the syscall number.
PiperOrigin-RevId: 218015941
Change-Id: Ie3c19131ae36124861e9b492a7dbe1765d9e5e59
|
|
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
|
|
This should improve performance.
PiperOrigin-RevId: 217610560
Change-Id: I370f196ea2396f1715a460b168ecbee197f94d6c
|
|
This reduces the number of goroutines and runtime timers when
ITIMER_VIRTUAL or ITIMER_PROF are enabled, or when RLIMIT_CPU is set.
This also ensures that thread group CPU timers only advance if running
tasks are observed at the time the CPU clock advances, mostly
eliminating the possibility that a CPU timer expiration observes no
running tasks and falls back to the group leader.
PiperOrigin-RevId: 217603396
Change-Id: Ia24ce934d5574334857d9afb5ad8ca0b6a6e65f4
|
|
This queue only has a single user, so there is no need for it to use an
interface. Merging it into the same package as its sole user allows us to avoid
a circular dependency.
This simplifies the code and should slightly improve performance.
PiperOrigin-RevId: 217595889
Change-Id: Iabbd5164240b935f79933618c61581bc8dcd2822
|
|
PiperOrigin-RevId: 217576188
Change-Id: I82e45c306c5c9161e207311c7dbb8a983820c1df
|
|
PiperOrigin-RevId: 217573168
Change-Id: Ic1914d0ef71bab020e3ee11cf9c4a50a702bd8dd
|
|
Now containers run with "docker run -it" support control characters like ^C and
^Z.
This required refactoring our signal handling a bit. Signals delivered to the
"runsc boot" process are turned into loader.Signal calls with the appropriate
delivery mode. Previously they were always sent directly to PID 1.
PiperOrigin-RevId: 217566770
Change-Id: I5b7220d9a0f2b591a56335479454a200c6de8732
|