summaryrefslogtreecommitdiffhomepage
AgeCommit message (Collapse)Author
2021-08-20Fix instructions refer to `tools/go_mod.sh`Daniel Dao
`tools/go_mod.sh` is not in the repo. In order to update the WORKSPACE dependencies, we can use the same gazelle command in BUILD file. Also changed `go mod get` to `go get`, the former does not exist anymore. Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-08-19Cache verity dentriesChong Cai
Add an LRU cache to cache verity dentries when ref count drop to 0. This way we don't need to hash and verify the previous opened files or directories each time. PiperOrigin-RevId: 391880157
2021-08-19Merge Read calls in verity merkle treeChong Cai
Read all data into memory in one Read call and verify them block by block instead of read each block during verification. This is for performance purpose to avoid invoking multiple syscalls. PiperOrigin-RevId: 391877937
2021-08-19Use MM-mapped I/O instead of buffered copies in gofer.specialFileFD.Jamie Liu
The rationale given for using buffered copies is still valid, but it's unclear whether holding MM locks or allocating buffers is better in practice, and the former is at least consistent with gofer.regularFileFD (and VFS1), making performance easier to reason about. PiperOrigin-RevId: 391877913
2021-08-19Add loopback interface as an ethernet-based deviceGhanan Gowripalan
...to match Linux behaviour. We can see evidence of Linux representing loopback as an ethernet-based device below: ``` # EUI-48 based MAC addresses. $ ip link show lo 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 # tcpdump showing ethernet frames when sniffing loopback and logging the # link-type as EN10MB (Ethernet). $ sudo tcpdump -i lo -e -c 2 -n tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes 03:09:05.002034 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.9557 > 127.0.0.1.36828: Flags [.], ack 3562800815, win 15342, options [nop,nop,TS val 843174495 ecr 843159493], length 0 03:09:05.002094 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.36828 > 127.0.0.1.9557: Flags [.], ack 1, win 6160, options [nop,nop,TS val 843174496 ecr 843159493], length 0 2 packets captured 116 packets received by filter 0 packets dropped by kernel ``` Wireshark shows a similar result as the tcpdump example above. Linux's loopback setup: https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/net/loopback.c#L162 PiperOrigin-RevId: 391836719
2021-08-19Use a hash function to generate tcp timestamp offsetZeling Feng
Also fix an option parsing error in checker.TCPTimestampChecker while I am here. PiperOrigin-RevId: 391828329
2021-08-18Split TCP secrets from Stack to tcp.protocolZeling Feng
Use different secrets for different purposes (port picking, ISN generation, tsOffset generation) and moved the secrets from stack.Stack to tcp.protocol. PiperOrigin-RevId: 391641238
2021-08-18Add control configsChong Cai
Also plumber the controls through runsc PiperOrigin-RevId: 391594318
2021-08-18Declare default outputs from nogo_stdlibMichael Pratt
nogo_stdlib propogate facts and findings to downstream nogo_aspects via NogoStdlibInfo. This all works fine except one case: directly building a nogo_stdlib. e.g., bazel build //tools/nogo:stdlib. In this case, nothing is requesting the NogoStdlibInfo, and thus the target has nothing to do. This can be rather confusing when trying to debug failures in :stdlib, as building :stdlib does nothing. Fix this by declaring the facts and findings as default outputs from nogo_stdlib. Now direct bazel build will request these outputs and actually trigger the analysis. Standard aspect builds are unaffected. PiperOrigin-RevId: 391580126
2021-08-17[op] Deflake SNMP Metric proc_net tests.Ayush Ranjan
Earlier the tests were checking for equality of system-wide metrics before and after some network related operations. That is inherently racy for native tests because depending on the testing infrastructure, multiple tests might run parallely hence trampling over each other's metrics. Tests should only compare metrics that are increasing in nature. The comparison should not be a hard comparison, instead a less-than/greater-than relation test. I have changed the checks and also removed tests for tcpCurrEstab metric which has "SYNTAX Gauge" and hence can not be tested reliably. PiperOrigin-RevId: 391460081
2021-08-17Merge pull request #6262 from sudo-sturbia:msgqueue/syscalls3gVisor bot
PiperOrigin-RevId: 391416650
2021-08-17Deflake test/perf:randread_benchmarkAndrei Vagin
The test expects that pread reads the full buffer, it means that the pread offset has to be equal or less than file_size - buffer_size. PiperOrigin-RevId: 391356863
2021-08-17Implement stub for msgctl(2).Zyad A. Ali
Add support for msgctl and enable tests. Fixes #135
2021-08-17Implement control operations on msgqueue.Zyad A. Ali
For IPCInfo, update value of MSGSEG constant in abi to avoid overflow in MsgInfo.MsgSeg. MSGSEG was originaly simplified in abi, and is unused (by us and within the kernel), so updating it is okay. Updates #135
2021-08-17Implement ipc.Object.Set and use it in ipc mechanisms.Zyad A. Ali
Set provides functionality of {sem,shm,msg}ctl(IPC_SET).
2021-08-17Add tests for msgctl(2).Zyad A. Ali
Updates #135
2021-08-17Internal change.gVisor bot
PiperOrigin-RevId: 391331401
2021-08-16Internal change.gVisor bot
PiperOrigin-RevId: 391217339
2021-08-16test/syscalls/proc_net: /proc/net/snmp can contain system-wide statisticsAndrei Vagin
This is a new kernel feature that are controlled by the net.core.mibs_allocation sysctl. PiperOrigin-RevId: 391215784
2021-08-16imges/syzkaller: add --allow-releaseinfo-change to apt updateAndrei Vagin
Otherwise, it fails with this error: Get:3 http://security.debian.org/debian-security buster/updates InRelease Reading package lists... E: Repository 'http://deb.debian.org/debian buster InRelease' changed its 'Suite' value from 'stable' to 'oldstable' PiperOrigin-RevId: 391155532
2021-08-13[syserror] Remove pkg syserror.Zach Koopmans
Removes package syserror and moves still relevant code to either linuxerr or to syserr (to be later removed). Internal errors are converted from random types to *errors.Error types used in linuxerr. Internal errors are in linuxerr/internal.go. PiperOrigin-RevId: 390724202
2021-08-13[benchmarks] Update BenchmarkStartEmpty benchmark.Zach Koopmans
Update the start benchmark on empty to only "Start" a container, not wait for its completion. TL:DR only measure the actual start call for the empty container. Previously, we were measuring the completion of /bin/true in container alpine AND the cleanup. This was fine until profiling started failing all the time on ptrace. This is a cost that runc is not paying. These changes will reduce the over all timing of the benchmark, but it will give more sane results. Instead, use "Spawn" which is similar to `docker run --detach alpine /bin/sleep 100`. Call sleep so containers stick around long enough for the profiler to read profile data from them. PiperOrigin-RevId: 390705431
2021-08-13Add Event controlsChong Cai
Add Event controls and implement "stream" commands. PiperOrigin-RevId: 390691702
2021-08-13Fix minor typoMichael Pratt
PiperOrigin-RevId: 390659097
2021-08-13Disable SA1019 (deprecation check)Michael Pratt
On Go tip (pre-1.18), http://golang.org/issue/44195 is making SA1019 mistake uses of reflect.Value.Len for reflect.Value.InterfaceData, which is deprecated. It is thus mistakenly raising deprecation errors on uses of reflect.Value.Len. Suppress these errors by disabling SA1019 entirely. This is a bit overkill, but it is unclear to me if we want hard errors on deprecation anyways. That can be reevaluated when http://golang.org/issue/44195 is fixed. The other staticcheck analyzers are moved to alphabetical order. Updates golang/go#44195 PiperOrigin-RevId: 390655918
2021-08-13Update `core` allowed dependenciesMichael Pratt
This list has gotten a little out-of-date. Note that `clockwork` used to be used but was removed in gvisor.dev/pr/5384. PiperOrigin-RevId: 390644841
2021-08-13Free multicastMemberships on UDP endpoint closeGhanan Gowripalan
tcpip.Endpoint.Close is documented to free all resources associated with an endpoint so we don't need to create an empty map to clear the multicast memberships. PiperOrigin-RevId: 390609826
2021-08-12Add Usage controlsChong Cai
Add Usage controls and implement "usage/usagefd" commands. PiperOrigin-RevId: 390507423
2021-08-12[syserror] Convert remaining syserror definitions to linuxerr.Zach Koopmans
Convert remaining public errors (e.g. EINTR) from syserror to linuxerr. PiperOrigin-RevId: 390471763
2021-08-12Clear Merkle files before measuring verity fsChong Cai
PiperOrigin-RevId: 390467957
2021-08-12fix typoKevin Krakauer
PiperOrigin-RevId: 390463819
2021-08-12Automated rollback of changelist 390346783gVisor bot
PiperOrigin-RevId: 390405182
2021-08-12test/pipe: use futex() for sync with the signal handerAndrei Vagin
PiperOrigin-RevId: 390399815
2021-08-12Internal change.gVisor bot
PiperOrigin-RevId: 390346783
2021-08-12Internal change.gVisor bot
PiperOrigin-RevId: 390318725
2021-08-12Add support for TCP send buffer auto tuning.Nayana Bidari
Send buffer size in TCP indicates the amount of bytes available for the sender to transmit. This change will allow TCP to update the send buffer size when - TCP enters established state. - ACK is received. The auto tuning is disabled when the send buffer size is set with the SO_SNDBUF option. PiperOrigin-RevId: 390312274
2021-08-11Add verity stat benchmark testChong Cai
PiperOrigin-RevId: 390284683
2021-08-11[op] Make PacketBuffer Clone() do a deeper copy.Ayush Ranjan
Earlier PacketBuffer.Clone() would do a shallow top level copy of the packet buffer - which involved sharing the *buffer.Buffer between packets. Reading or writing to the buffer in one packet would impact the other. This caused modifications in one packet to affect the other's pkt.Views() which is not desired. Change the clone to do a deeper copy of the underlying buffer list and buffer pointers. The payload buffers (which are immutable) are still shared. This change makes the Clone() operation more expensive as we now need to allocate the entire buffer list. Added unit test to test integrity of packet data after cloning. Reported-by: syzbot+7ffff9a82a227b8f2e31@syzkaller.appspotmail.com Reported-by: syzbot+7d241de0d9072b2b6075@syzkaller.appspotmail.com Reported-by: syzbot+212bc4d75802fa461521@syzkaller.appspotmail.com PiperOrigin-RevId: 390277713
2021-08-11Do not clear merkle files when creating dentryChong Cai
The dentry for each file/directory can be created/destroyed multiple times during sandbox lifetime. We should not clear the Merkle file each time a dentry is created. PiperOrigin-RevId: 390277107
2021-08-11Popluate verity directory children namesChong Cai
We were relying on children adding its name to parent's dentry to populate parent's children list. However, this may not work since the parent dentry could be destroyed if its reference count drops to zero. In that case, a new dentry will be created when enabling the parent and it does not contain the children names info. Therefore we need to populate the child names list again to avoid missing children in the directory. PiperOrigin-RevId: 390270227
2021-08-11Run packet socket tests on FuchsiaGhanan Gowripalan
+ Do not check for CAP_NET_RAW on Fuchsia Fuchsia does not support capabilities the same way Linux does. Instead emulate the check for CAP_NET_RAW by checking if a packet socket may be created. Bug: https://fxbug.dev/79016, https://fxbug.dev/81592 PiperOrigin-RevId: 390263666
2021-08-11Initial cgroupfs support for subcontainersRahat Mahmood
Allow creation and management of subcontainers through cgroupfs directory syscalls. Also add a mechanism to specify a default root container to start new jobs in. This implements the filesystem support for subcontainers, but doesn't implement hierarchical resource accounting or task migration. PiperOrigin-RevId: 390254870
2021-08-11Fix FSSupportsMap checkAdam Barth
Previously, this check always failed because we did not provide MAP_SHARED or MAP_PRIVATE. PiperOrigin-RevId: 390251086
2021-08-11Wrap test queues in Queue object on creation.Rahat Mahmood
PiperOrigin-RevId: 390245901
2021-08-11Fix LinkTest.OldnameDoesNotExistAdam Barth
Previous, this test was the same as OldnameIsEmpty. This CL makes the test check what happens if the old name does not exist. PiperOrigin-RevId: 390243070
2021-08-11[op] Fix //debian:debian.Ayush Ranjan
Co-authored-by: Andrei Vagin <avagin@google.com> PiperOrigin-RevId: 390232925
2021-08-09platform/kvm: fix a race condition in vCPU.unlock()Andrei Vagin
Right now, it contains the code: origState := atomic.LoadUint32(&c.state) atomicbitops.AndUint32(&c.state, ^vCPUUser) The problem here is that vCPU.bounce that is called from another thread can add vCPUWaiter when origState has been read but vCPUUser isn't cleared yet. In this case, vCPU.unlock doesn't notify other threads about changes and c.bounce will be stuck in the futex_wait call. PiperOrigin-RevId: 389697411
2021-08-09Run raw IP socket syscall tests on FuchsiaGhanan Gowripalan
+ Do not check for CAP_NET_RAW on Fuchsia Fuchsia does not support capabilities the same way Linux does. Instead emulate the check for CAP_NET_RAW by checking if a raw IP sockets may be created. PiperOrigin-RevId: 389663218
2021-08-06[SMT] Refactor runsc mititgateZach Koopmans
Refactor mitigate to use /sys/devices/system/cpu/smt/control instead of individual CPU control files. PiperOrigin-RevId: 389215975
2021-08-05Correctly handle interruptions in blocking msgqueue syscalls.Rahat Mahmood
Reported-by: syzbot+63bde04529f701c76168@syzkaller.appspotmail.com Reported-by: syzbot+69866b9a16ec29993e6a@syzkaller.appspotmail.com PiperOrigin-RevId: 389084629