summaryrefslogtreecommitdiffhomepage
AgeCommit message (Collapse)Author
2020-03-31checkpoint/restore: make sure the donated stdioFDs have the same valueAaron Lu
Suppose I start a runsc container using kvm platform like this: $ sudo runsc --debug=true --debug-log=1.txt --platform=kvm run rootbash The donating FD and the corresponding cmdline for runsc-sandbox is: D0313 17:50:12.608203 44389 x:0] Donating FD 3: "1.txt" D0313 17:50:12.608214 44389 x:0] Donating FD 4: "control_server_socket" D0313 17:50:12.608224 44389 x:0] Donating FD 5: "|0" D0313 17:50:12.608229 44389 x:0] Donating FD 6: "/home/ziqian.lzq/bundle/bash/runsc/config.json" D0313 17:50:12.608234 44389 x:0] Donating FD 7: "|1" D0313 17:50:12.608238 44389 x:0] Donating FD 8: "sandbox IO FD" D0313 17:50:12.608242 44389 x:0] Donating FD 9: "/dev/kvm" D0313 17:50:12.608246 44389 x:0] Donating FD 10: "/dev/stdin" D0313 17:50:12.608249 44389 x:0] Donating FD 11: "/dev/stdout" D0313 17:50:12.608253 44389 x:0] Donating FD 12: "/dev/stderr" D0313 17:50:12.608257 44389 x:0] Starting sandbox: /proc/self/exe [runsc-sandbox --root=/run/containerd/runsc/default --debug=true --log= --max-threads=256 --reclaim-period=5 --log-format=text --debug-log=1.txt --debug-log-format=text --file-access=exclusive --overlay=false --fsgofer-host-uds=false --network=sandbox --log-packets=false --platform=kvm --strace=false --strace-syscalls=--strace-log-size=1024 --watchdog-action=Panic --panic-signal=-1 --profile=false --net-raw=true --num-network-channels=1 --rootless=false --alsologtostderr=false --ref-leak-mode=disabled --gso=true --software-gso=true --overlayfs-stale-read=false --shared-volume= --debug-log-fd=3 --panic-signal=15 boot --bundle=/home/ziqian.lzq/bundle/bash/runsc --controller-fd=4 --mounts-fd=5 --spec-fd=6 --start-sync-fd=7 --io-fds=8 --device-fd=9 --stdio-fds=10 --stdio-fds=11 --stdio-fds=12 --pidns=true --setup-root --cpu-num 32 --total-memory 4294967296 rootbash] Note stdioFDs starts from 10 with kvm platform and stderr's FD is 12. If I restore a container from the checkpoint image which is derived by checkpointing the above rootbash container, but either omit the platform switch or specify to use ptrace platform explicitely: $ sudo runsc --debug=true --debug-log=1.txt restore --image-path=some_path restored_rootbash the donating FD and corresponding cmdline for runsc-sandbox is: D0313 17:50:15.258632 44452 x:0] Donating FD 3: "1.txt" D0313 17:50:15.258640 44452 x:0] Donating FD 4: "control_server_socket" D0313 17:50:15.258645 44452 x:0] Donating FD 5: "|0" D0313 17:50:15.258648 44452 x:0] Donating FD 6: "/home/ziqian.lzq/bundle/bash/runsc/config.json" D0313 17:50:15.258653 44452 x:0] Donating FD 7: "|1" D0313 17:50:15.258657 44452 x:0] Donating FD 8: "sandbox IO FD" D0313 17:50:15.258661 44452 x:0] Donating FD 9: "/dev/stdin" D0313 17:50:15.258675 44452 x:0] Donating FD 10: "/dev/stdout" D0313 17:50:15.258680 44452 x:0] Donating FD 11: "/dev/stderr" D0313 17:50:15.258684 44452 x:0] Starting sandbox: /proc/self/exe [runsc-sandbox --root=/run/containerd/runsc/default --debug=true --log= --max-threads=256 --reclaim-period=5 --log-format=text --debug-log=1.txt --debug-log-format=text --file-access=exclusive --overlay=false --fsgofer-host-uds=false --network=sandbox --log-packets=false --platform=ptrace --strace=false --strace-syscalls= --strace-log-size=1024 --watchdog-action=Panic --panic-signal=-1 --profile=false --net-raw=true --num-network-channels=1 --rootless=false --alsologtostderr=false --ref-leak-mode=disabled --gso=true --software-gso=true --overlayfs-stale-read=false --shared-volume= --debug-log-fd=3 --panic-signal=15 boot --bundle=/home/ziqian.lzq/bundle/bash/runsc --controller-fd=4 --mounts-fd=5 --spec-fd=6 --start-sync-fd=7 --io-fds=8 --stdio-fds=9 --stdio-fds=10 --stdio-fds=11 --setup-root --cpu-num 32 --total-memory 4294967296 restored_rootbash] Note this time, stdioFDs starts from 9 and stderr's FD is 11(so the saved host.descritor.origFD which is 12 for stderr is no longer valid). For the three host FD based files, The s.Dev and s.Ino derived from fstat(fd) shall all be the same and since the two fields are used as device.MultiDeviceKey, the host.inodeFileState.sattr.InodeId which is the value of MultiDevice.Map(MultiDeviceKey), shall also all be the same. Note that for MultiDevice m, m.cache records the mapping of key to value and m.rcache records the mapping of value to key. If same value doesn't map to the same key, it will panic on restore. Now that stderr's origFD 12 is no longer valid(it happens to be /memfd:runsc-memory in my test on restore), the s.Dev and s.Ino derived from fstat(fd=12) in host.inodeFileState.afterLoad() will neither be correct. But its InodeID is still the same as saved, MultiDevice.Load() will complain about the same value(InodeID) being mapped to different keys (different from stdin and stdout's) and panic with: "MultiDevice's caches are inconsistent". Solve this problem by making sure stdioFDs for root container's init task are always the same on initial start and on restore time, no matter what cmdline user has used: debug log specified or not, platform changed or not etc. shall not affect the ability to restore. Fixes #1844.
2020-03-30Add AMD Rome CPUID flag.Michael Pratt
This flag is set on Rome CPUs, but it is not documented. PiperOrigin-RevId: 303825532
2020-03-30BigQuery schema for benchmark-tools dashboard.Zach Koopmans
PiperOrigin-RevId: 303805784
2020-03-30kvm: handle exit reasons even under EINTR.Adin Scannell
In the case of other signals (preemption), inject a normal bounce and defer the signal until the vCPU has been returned from guest mode. PiperOrigin-RevId: 303799678
2020-03-30Internal change.Zach Koopmans
PiperOrigin-RevId: 303773475
2020-03-30Merge pull request #2265 from amscanne:arm64_nogogVisor bot
PiperOrigin-RevId: 303753027
2020-03-27Add vfs.PathnameReachable().Jamie Liu
/proc/[pid]/mount* omit mounts whose mount point is outside the chroot, which is checked (indirectly) via __d_path(). PiperOrigin-RevId: 303434226
2020-03-27Add FilesystemType.Name method, and FilesystemType field to Filesystem struct.Nicolas Lacasse
Both have analogues in Linux: * struct file_system_type has a char *name field. * struct super_block keeps a pointer to the file_system_type. These fields are necessary to support the `filesystem type` field in /proc/[pid]/mountinfo. PiperOrigin-RevId: 303434063
2020-03-27Support Hop By Hop and Destination Options ext hdrGhanan Gowripalan
Enables handling the Hop by Hop and Destination Options extension headers, but options are not yet supported. All options will be treated as unknown and their respective action will be followed. Note, the stack does not yet support sending ICMPv6 error messages in response to options that cannot be handled/parsed. That will come in a later change (Issue #2211). Tests: - header_test.TestIPv6UnknownExtHdrOption - header_test.TestIPv6OptionsExtHdrIterErr - header_test.TestIPv6OptionsExtHdrIter - ipv6_test.TestReceiveIPv6ExtHdrs PiperOrigin-RevId: 303433085
2020-03-26Add BoundEndpointAt filesystem operation.Dean Deng
BoundEndpointAt() is needed to support Unix sockets bound at a file path, corresponding to BoundEndpoint() in VFS1. Updates #1476. PiperOrigin-RevId: 303258251
2020-03-26Use host-defined file owner and mode, when possible, for imported fds.Dean Deng
Using the host-defined file owner matches VFS1. It is more correct to use the host-defined mode, since the cached value may become out of date. However, kernfs.Inode.Mode() does not return an error--other filesystems on kernfs are in-memory so retrieving mode should not fail. Therefore, if the host syscall fails, we rely on a cached value instead. Updates #1672. PiperOrigin-RevId: 303220864
2020-03-26Use panic instead of log.FatalfGhanan Gowripalan
PiperOrigin-RevId: 303212189
2020-03-26Merge pull request #2130 from nybidari:iptablesgVisor bot
PiperOrigin-RevId: 303208407
2020-03-26Handle IPv6 Fragment & Routing extension headersGhanan Gowripalan
Enables the reassembly of fragmented IPv6 packets and handling of the Routing extension header with a Segments Left value of 0. Atomic fragments are handled as described in RFC 6946 to not interfere with "normal" fragment traffic. No specific routing header type is supported. Note, the stack does not yet support sending ICMPv6 error messages in response to IPv6 packets that cannot be handled/parsed. That will come in a later change (Issue #2211). Test: - header_test.TestIPv6RoutingExtHdr - header_test.TestIPv6FragmentExtHdr - header_test.TestIPv6ExtHdrIterErr - header_test.TestIPv6ExtHdrIter - ipv6_test.TestReceiveIPv6ExtHdrs - ipv6_test.TestReceiveIPv6Fragments RELNOTES: n/a PiperOrigin-RevId: 303189584
2020-03-26Add unique ID to Mount type.Nicolas Lacasse
Analagous to Linux's mount.mnt_id. This ID is displayed in /proc/[pid]/mountinfo. PiperOrigin-RevId: 303185564
2020-03-26Add nogo exemption for machine_arm64_unsafe.goAdin Scannell
2020-03-26Support owner matching for iptables.Nayana Bidari
This feature will match UID and GID of the packet creator, for locally generated packets. This match is only valid in the OUTPUT and POSTROUTING chains. Forwarded packets do not have any socket associated with them. Packets from kernel threads do have a socket, but usually no owner.
2020-03-26Merge pull request #2254 from kevinGC:container-timeoutgVisor bot
PiperOrigin-RevId: 303159175
2020-03-26Merge pull request #2177 from xiaobo55x:sysret_testgVisor bot
PiperOrigin-RevId: 303158421
2020-03-26Add IPv4 to bind_to_device distribution testJay Zhuang
PiperOrigin-RevId: 303156734
2020-03-26Check error in DropTCP*Port tests and fix comment.Kevin Krakauer
PiperOrigin-RevId: 303147253
2020-03-26Clean up transport_demuxer.go and testJay Zhuang
- Change receiver of endpoint lookup functions - Remove unused struct fields and functions in test - s/%v/%s/ for errors - Capitalize NIC https://github.com/golang/go/wiki/CodeReviewComments#initialisms PiperOrigin-RevId: 303119580
2020-03-26Merge pull request #1986 from lubinszARM:pr_ring0_clean_1gVisor bot
PiperOrigin-RevId: 303105826
2020-03-26Combine file mode and isDir argumentsFabricio Voznika
Updates #1035 PiperOrigin-RevId: 303021328
2020-03-25iptable: fix tests timeoutsKevin Krakauer
Tests were run assuming a runtime of "runsc" was present, and did not have --net-raw enabled.
2020-03-25Merge pull request #2238 from amscanne:nogogVisor bot
PiperOrigin-RevId: 303010530
2020-03-25nogo: enable sanitizers.Adin Scannell
This enables all relevant santizers (though most analyzers will not find much, it will prevent instances from creeping in), and codifies existing exceptions in tools/nogo.js to be fixed.
2020-03-25Fix go_marshal Example name.Adin Scannell
There is a canonical naming convention for Examples, which are checked by analyzers. This must be fixed since adding exceptions for generated code will be more challenging.
2020-03-25Merge pull request #2151 from xiaobo55x:seccomp_testgVisor bot
PiperOrigin-RevId: 302987344
2020-03-25Fix race in TestRunEnvHasHomeFabricio Voznika
It's possible to execute the command that checks user's $HOME dir before the user is created. Move the code that creates the user inside exec so it can be serialized. PiperOrigin-RevId: 302986184
2020-03-25Remove TODO to push down exec permission checkFabricio Voznika
Pushing it down requires all implementation to check for exec individualy which is not maintanable. Making it part of GenericCheckPermissions add extra cost to everyone that calls it. So it's better to keep is in VirtualFilesystem.OpenAt. Updates #1193 PiperOrigin-RevId: 302982993
2020-03-25Misc fixes to make stat_test pass (almost)Fabricio Voznika
The only test failing now requires socket which is not available in VFS2 yet. Updates #1198 PiperOrigin-RevId: 302976572
2020-03-25Set file mode and type to attributeFabricio Voznika
Makes less error prone to find file type. Updates #1197 PiperOrigin-RevId: 302974244
2020-03-25travis: exclude copybara branchesAndrei Vagin
When copybara migrates changes, it creates a new branch and then creates a pull-requests which is based on this branch. In this case, travis-ci triggers build twice for the branch and for the pull-request. PiperOrigin-RevId: 302930634
2020-03-25Fix futex_benchmark.Jamie Liu
- Fix definitions of Futex* wrappers. - Correctly handle glibc syscall() (which returns -1 and sets errno instead of returning the raw syscall return value). - De-parameterize FutexWaitBitset, which was apparently intended to test with deadlines of between 0 and 100000 nanoseconds after the Unix epoch, but was broken due to the preceding two issues. - Use wall time to measure the durations of tests that are expected to block (and thus stop accumulating CPU time). - Require 5s for all tests to improve robustness in the presence of sentry GC. - Remove FutexContend and FutexContendDeadline; it's unclear what these are supposed to measure, given that (1) FutexLock is unrealistically inefficient and (2) the benchmark rewards slow scheduling (since this reduces contention). PiperOrigin-RevId: 302925246
2020-03-25Fix data-race in endpoint.ReadinessBhasker Hariharan
PiperOrigin-RevId: 302924789
2020-03-25Automated rollback of changelist 301837227Bhasker Hariharan
PiperOrigin-RevId: 302891559
2020-03-24Add support for setting TCP segment hash.Bhasker Hariharan
This allows the link layer endpoints to consistenly hash a TCP segment to a single underlying queue in case a link layer endpoint does support multiple underlying queues. Updates #231 PiperOrigin-RevId: 302760664
2020-03-24Open a temp directory before changing capabilities and user ID-sAndrei Vagin
In cl/302130790, we started using a temp directory which is provided by bazel. By default, a test process has enough permissions to open it, but there is not any guarantee that it still will be able to do this after changing credentials. PiperOrigin-RevId: 302702337
2020-03-24Move tcpip.PacketBuffer and IPTables to stack package.Bhasker Hariharan
This is a precursor to be being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-03-23Support basic /proc/net/dev metrics for netstackIan Lewis
Fixes #506 PiperOrigin-RevId: 302540404
2020-03-23Fix data race in SetSockOpt.Bhasker Hariharan
PiperOrigin-RevId: 302539171
2020-03-23Correctly release taskPathOperation for accessAt.Dean Deng
PiperOrigin-RevId: 302518924
2020-03-23iptables: enable iptables tests as nonblockingKevin Krakauer
PiperOrigin-RevId: 302506064
2020-03-20Statically link libpthread for static c++ binaries.Eyal Soha
The posix_server works fine when run in locally or in docker but fails in the kokoro GCP build environment. Linking libpthread statically fixes it. PiperOrigin-RevId: 302139082
2020-03-20test: Create a separate /tmp mount only for tests with the shared tagAndrei Vagin
The root mount is not shared by default, but all other mounts are shared. So if we create the /tmp mount, this means that we run tests on a shared mount even if tests run without the --shared option. PiperOrigin-RevId: 302130790
2020-03-20Actually wrap rand.Reader in bufio.Reader.Bhasker Hariharan
Updates #231 PiperOrigin-RevId: 302127697
2020-03-20Remove unused variable `sndNxtList`.Ting-Yu Wang
PiperOrigin-RevId: 302110328
2020-03-19Whitelist utimensat(2).Dean Deng
utimensat is used by hostfs for setting timestamps on imported fds. Previously, this would crash the sandbox since utimensat was not allowed. Correct the VFS2 version of hostfs to match the call in VFS1. PiperOrigin-RevId: 301970121
2020-03-19Improve error message when pivot_root failsFabricio Voznika
PiperOrigin-RevId: 301949722