summaryrefslogtreecommitdiffhomepage
path: root/runsc
AgeCommit message (Collapse)Author
2021-11-03Merge pull request #6499 from dqminh:cgroup-interfacegVisor bot
PiperOrigin-RevId: 407392578
2021-11-02Merge pull request #6803 from pkit:pkit/copy_arpgVisor bot
PiperOrigin-RevId: 407177936
2021-11-02Allow SetAttr and Allocate for deleted filesFabricio Voznika
It's safe to call SetAttr and Allocate on fsgofer because the file path is not used to open the file, if needed. Fixes #3654 PiperOrigin-RevId: 407149393
2021-11-02copy PERM ARP entries from namespace on bootConstantine Peresypkin
copy and setup PERMANENT (static) ARP entries from CNI namespace to the sandbox Fixes #3301
2021-11-01Add common Cgroup interfaceAndrei Vagin
This is part of cgroupv2 patch set. Here we add a Cgroup interface that both v1 and v2 need to conform to, and port cgroupv1 to use that first. Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-10-27Replace bespoke WaitGroupErr with errgroupTamir Duberstein
PiperOrigin-RevId: 406027220
2021-10-26Simplify vfs.NewDisconnectedMount signature and callpoints.Ayush Ranjan
vfs.NewDisconnectedMount has no error paths. Its much prettier without the error return value. Also simplify MountDisconnected which would immediately drop the refs taken by NewDisconnectedMount. Instead make it directly call newMount. PiperOrigin-RevId: 405767966
2021-10-21Merge pull request #6345 from sudo-sturbia:mq/syscallsgVisor bot
PiperOrigin-RevId: 404901660
2021-10-21Enable VFS2 by defaultFabricio Voznika
This change enables VFS2 by default. VFS2 is much faster than the previous implementation and it's also more compatible. VFS1 is no longer supported and will be deleted from the code. Use `--vfs2=false` if you need to disable it. Make sure to report a bug if you have the need to disable VFS2 or something is not working for you. Closes #1035 PiperOrigin-RevId: 404898135
2021-10-20Add Debug to log file headerFabricio Voznika
PiperOrigin-RevId: 404635832
2021-10-19Drop accept from sentryctl socket filtersMichael Pratt
Now that we use x/sys/unix beyond https://golang.org/cl/313690 we always use accept4 in place of accept. PiperOrigin-RevId: 404265340
2021-10-18Change test to use VFS2Fabricio Voznika
Updates #1035 PiperOrigin-RevId: 404043283
2021-10-18Mount namespace can be nil after task exitsFabricio Voznika
Updates #1035 PiperOrigin-RevId: 404017795
2021-10-14Report total memory based on limit or hostFabricio Voznika
gVisor was previously reporting the lower of cgroup limit or 2GB as total memory. This may cause applications to make bad decisions based on amount of memory available to them when more than 2GB is required. This change makes the lower of cgroup limit or the host total memory to be reported inside the sandbox. This also is more inline with docker which always reports host total memory. Note that reporting cgroup limit is strictly better than host total memory when there is a limit set. Fixes #5608 PiperOrigin-RevId: 403241608
2021-10-13runsc: allow to run rootless containers on cgroupV2Andrei Vagin
Before cl/402392291 and cl/402614820, it worked without any problem. In this case, we just ignore a cgroup configuration. We do the same thing, when we don't have permissions to create new cgroups on cgroupV1. PiperOrigin-RevId: 402913129
2021-10-12Make cgroup creation/deletion more robustFabricio Voznika
- Don't attempt to create directory is controller is not present in the system - Ensure that all files being written exist in cgroupfs - Attempt to delete directories during Uninstall even if other deletions have failed Fixes #6446 PiperOrigin-RevId: 402614820
2021-10-11Create subcontainer cgroups for compatibilityFabricio Voznika
Tools (e.g. cAdvisor) watches for changes inside /sys/fs/cgroup to detect when containers are created and deleted. With gVisor, container cgroups were not created because the containers are not visible to the host. This change enables the creation of [empty] subcontainer cgroups that can be used by tools to detect creation/deletion of subcontainers. This change required a new annotation to be added so that the shim can communicate the pod cgroup path to runsc, so pod and container cgroups can be identified, Fixes #6500 PiperOrigin-RevId: 402392291
2021-10-06Add global lisafs kernel flag.Ayush Ranjan
PiperOrigin-RevId: 401296116
2021-10-04Update bazel packagesAndrei Vagin
2021-10-01Use root context to mount volumesFabricio Voznika
Fixes #6643 PiperOrigin-RevId: 400218778
2021-09-30Add timer_create and timer_settime to filtersMichael Pratt
Go 1.18 (as of golang.org/cl/324129) uses per-thread timers created and set with timer_create/timer_settime for more accurate CPU pprof profiling. Add these syscalls to the allowed syscall filters. PiperOrigin-RevId: 399941561
2021-09-28Register mqfs.FilesystemTypeZyad A. Ali
Updates #136
2021-09-27Move `sighandling` package out of `sentry`.Etienne Perot
PiperOrigin-RevId: 399295737
2021-09-22Remove terminal usage from `runsc spec`Fabricio Voznika
Most usages of `runsc spec`+`runsc run` do not expect stdios to be a terminal. Updates #6619 PiperOrigin-RevId: 398288237
2021-09-21[lisa] Implement lisafs protocol methods in VFS2 gofer client and fsgofer.Ayush Ranjan
Introduces RPC methods in lisafs. Makes that gofer client use lisafs RPCs instead of p9 when lisafs is enabled. Implements the handlers for those methods in fsgofer. Fixes #5465 PiperOrigin-RevId: 398080310
2021-09-20[lisa] Plumb lisafs through runsc.Ayush Ranjan
lisafs is only supported in VFS2. Added a runsc flag which enables lisafs. When the flag is enabled, the gofer process and the client communicate using lisafs protocol instead of 9P. Added a filesystem option in fsimpl/gofer which indicates if lisafs is being used. That will be used to gate lisafs on the gofer client. Note that this change does not make the gofer client use lisafs just yet. Updates #5465 PiperOrigin-RevId: 397917844
2021-09-16Merge pull request #6579 from prattmic:runsc_do_profilegVisor bot
PiperOrigin-RevId: 397114051
2021-09-16runsc: add global profile collection flagsMichael Pratt
Add global flags -profile-{block,cpu,heap,mutex} and -trace which enable collection of the specified profile for the entire duration of a container execution. This provides a way to definitively start profiling before that application starts, rather than attempting to race with an out-of-band `runsc debug`. Note that only the main boot process is profiled. This exposed a bug in Task.traceExecEvent: a crash when tracing and -race are enabled. traceExecEvent is called off of the task goroutine, but uses the Task as a context, which is a violation of the Task contract. Switching to the AsyncContext fixes the issue. Fixes #220
2021-09-15Merge pull request #6581 from prattmic:runsc_rootlessgVisor bot
PiperOrigin-RevId: 396938550
2021-09-15Pass address properties in a single structTony Gong
Replaced the current AddAddressWithOptions method with AddAddressWithProperties which passes all address properties in a single AddressProperties type. More properties that need to be configured in the future are expected, so adding a type makes adding them easier. PiperOrigin-RevId: 396930729
2021-09-14Remove extra newlineMichael Pratt
PiperOrigin-RevId: 396754242
2021-09-14runsc: allow rootless mode for runsc runMichael Pratt
Rootless mode seems to work fine for simple containers with runsc run, so allow its use. Since runsc run is more widely used, require a workable --network option is passed rather than automatically switching like runsc do does. Fixes #3036
2021-09-13runsc/cmd: alphabetize runsc debug profiling optionsMichael Pratt
Updates #220
2021-09-09Use accessor for runsc ControlConfig proto.Rahat Mahmood
PiperOrigin-RevId: 395859347
2021-09-09Remove link/packetsocketGhanan Gowripalan
This change removes NetworkDispatcher.DeliverOutboundPacket. Since all packet writes go through the NIC (the only NetworkDispatcher), we can deliver outgoing packets to interested packet endpoints before writing the packet to the link endpoint as the stack expects that all packets that get delivered to a link endpoint are transmitted on the wire. That is, link endpoints no longer need to let the stack know when it writes a packet as the stack already knows about the packet it writes through a link endpoint. PiperOrigin-RevId: 395761629
2021-09-09Add EthernetHeader only if underlying NIC has a mac address.Bhasker Hariharan
Fixes #6532 PiperOrigin-RevId: 395741741
2021-09-01Support sending with packet socketsGhanan Gowripalan
...through the loopback interface, only. This change only supports sending on packet sockets through the loopback interface as the loopback interface is the only interface used in packet socket syscall tests - the other link endpoints are not excercised with the existing test infrastructure. Support for sending on packet sockets through the other interfaces will be added as needed. BUG: https://fxbug.dev/81592 PiperOrigin-RevId: 394368899
2021-08-20[op] Prevent file leak in MultiGetAttr's error path.Ayush Ranjan
The old implementation was mostly correct but error prone - making way for the issue in question here. In its error path, it would leak the intermediate file being walked. Each return/break needed explicit cleanup. This change implements a more clean way to cleaning up intermediate directories. If the code were to evolve to be more complex, it would still work. PiperOrigin-RevId: 392102826
2021-08-19Add loopback interface as an ethernet-based deviceGhanan Gowripalan
...to match Linux behaviour. We can see evidence of Linux representing loopback as an ethernet-based device below: ``` # EUI-48 based MAC addresses. $ ip link show lo 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 # tcpdump showing ethernet frames when sniffing loopback and logging the # link-type as EN10MB (Ethernet). $ sudo tcpdump -i lo -e -c 2 -n tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes 03:09:05.002034 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.9557 > 127.0.0.1.36828: Flags [.], ack 3562800815, win 15342, options [nop,nop,TS val 843174495 ecr 843159493], length 0 03:09:05.002094 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.36828 > 127.0.0.1.9557: Flags [.], ack 1, win 6160, options [nop,nop,TS val 843174496 ecr 843159493], length 0 2 packets captured 116 packets received by filter 0 packets dropped by kernel ``` Wireshark shows a similar result as the tcpdump example above. Linux's loopback setup: https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/drivers/net/loopback.c#L162 PiperOrigin-RevId: 391836719
2021-08-18Add control configsChong Cai
Also plumber the controls through runsc PiperOrigin-RevId: 391594318
2021-08-13Add Event controlsChong Cai
Add Event controls and implement "stream" commands. PiperOrigin-RevId: 390691702
2021-08-12Add Usage controlsChong Cai
Add Usage controls and implement "usage/usagefd" commands. PiperOrigin-RevId: 390507423
2021-08-12Clear Merkle files before measuring verity fsChong Cai
PiperOrigin-RevId: 390467957
2021-08-06[SMT] Refactor runsc mititgateZach Koopmans
Refactor mitigate to use /sys/devices/system/cpu/smt/control instead of individual CPU control files. PiperOrigin-RevId: 389215975
2021-08-04Add Fs controlsChong Cai
Add Fs controls and implement "cat" command. PiperOrigin-RevId: 388812540
2021-08-03Add Lifecycle controlsChong Cai
Also change runsc pause/resume cmd to access Lifecycle instead of containerManager. PiperOrigin-RevId: 388534928
2021-07-26Merge pull request #6292 from btw616:local-timezonegVisor bot
PiperOrigin-RevId: 386988406
2021-07-23Add support for SIOCGIFCONF ioctl in hostinet.Lucas Manning
PiperOrigin-RevId: 386511818
2021-07-22runsc: Wait child processes without timeoutsAndrei Vagin
* First, we don't need to poll child processes. * Second, the 5 seconds timeout is too small if a host is overloaded. * Third, this can hide bugs in the code when we wait a process that isn't going to exit. PiperOrigin-RevId: 386337586
2021-07-20Don't kill container when volume is unmountedFabricio Voznika
The gofer session is killed when a gofer backed volume is unmounted. The gofer monitor catches the disconnect and kills the container. This changes the gofer monitor to only care about the rootfs connections, which cannot be unmounted. Fixes #6259 PiperOrigin-RevId: 385929039