gvisor - Container Runtime Sandbox

Age	Commit message	Author
2018-10-01	Make multi-container the default mode for runsc	Fabricio Voznika
	And remove multicontainer option. PiperOrigin-RevId: 215236981 Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
2018-09-30	Don't fail if Root is readonly and is not a mount point	Fabricio Voznika
	This makes runsc more friendly to run without docker or K8s. PiperOrigin-RevId: 215165586 Change-Id: Id45a9fc24a3c09b1645f60dbaf70e64711a7a4cd
2018-09-30	Removed duplicate/stale TODOs	Fabricio Voznika
	PiperOrigin-RevId: 215162121 Change-Id: I35f06ac3235cf31c9e8a158dcf6261a7ded6c4c4
2018-09-28	Add test for 'signall --all' with stopped container	Fabricio Voznika
	PiperOrigin-RevId: 215025517 Change-Id: I04b9d8022b3d9dfe279e466ddb91310b9860b9af
2018-09-28	Made a few changes to make testutil.Docker easier to use	Fabricio Voznika
	PiperOrigin-RevId: 215023376 Change-Id: I139569bd15c013e5dd0f60d0c98a64eaa0ba9e8e
2018-09-28	runsc: allow `kill --all` when container is in stopped state.	Lantao Liu
	PiperOrigin-RevId: 215009105 Change-Id: I1ab12eddf7694c4db98f6dafca9dae352a33f7c4
2018-09-28	Add ruby image tests	Fabricio Voznika
	PiperOrigin-RevId: 215009066 Change-Id: I54ab920fa649cf4d0817f7cb8ea76f9126523330
2018-09-28	Make runsc kill and delete more conformant to the "spec"	Fabricio Voznika
	PiperOrigin-RevId: 214976251 Change-Id: I631348c3886f41f63d0e77e7c4f21b3ede2ab521
2018-09-28	Change tcpip.Route.Mask to tcpip.AddressMask.	Googler
	PiperOrigin-RevId: 214975659 Change-Id: I7bd31a2c54f03ff52203109da312e4206701c44c
2018-09-28	Clarify CLA requirements and Gerrit error	Michael Pratt
	Call out the error that Gerrit returns if there is no CLA on file. PiperOrigin-RevId: 214964718 Change-Id: I3d92e3eb73f178e8c4c52b5defbe8d21db536215
2018-09-28	Require AF_UNIX sockets from the gofer	Michael Pratt
	host.endpoint already has the check, but it is missing from host.ConnectedEndpoint. PiperOrigin-RevId: 214962762 Change-Id: I88bb13a5c5871775e4e7bf2608433df8a3d348e6
2018-09-28	Block for link address resolution	Sepehr Raissian
	Previously, if address resolution for UDP or Ping sockets required sending packets using Write in Transport layer, Resolve would return ErrWouldBlock and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution would run in background. Further calls to Write using the same address would also return ErrNoLinkAddress until resolution has been completed successfully. Since Write is not allowed to block and System Calls need to be interruptible in System Call layer, the caller to Write is responsible for blocking upon return of ErrWouldBlock. Now, when startAddressResolution is called a notification channel for the completion of the address resolution is returned. The channel will traverse up to the calling function of Write as well as ErrNoLinkAddress. Once address resolution is complete (success or not) the channel is closed. The caller would call Write again to send packets and check if address resolution was completed successfully or not. Fixes google/gvisor#5 Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93 PiperOrigin-RevId: 214962200
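The retry pattern described in this commit can be sketched roughly as follows. This is only an illustration, not netstack's code: `linkResolver`, `write`, `resolve`, and `sendBlocking` are hypothetical names, resolution is simulated with a short sleep and always succeeds, and signal interruption of the blocked caller is not modeled.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errNoLinkAddress mirrors the error named in the commit message.
var errNoLinkAddress = errors.New("no link address")

// linkResolver is a hypothetical stand-in for the stack's link address cache.
type linkResolver struct {
	mu       sync.Mutex
	resolved bool
	done     chan struct{} // closed once resolution finishes
}

// write never blocks. If the link address is unknown, it starts resolution
// (only once) and returns the notification channel with errNoLinkAddress.
func (r *linkResolver) write(pkt string) (<-chan struct{}, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.resolved {
		return nil, nil // a real stack would transmit pkt here
	}
	if r.done == nil {
		r.done = make(chan struct{})
		go r.resolve(r.done)
	}
	return r.done, errNoLinkAddress
}

// resolve simulates the background resolution exchange (assumed to succeed).
func (r *linkResolver) resolve(done chan struct{}) {
	time.Sleep(10 * time.Millisecond)
	r.mu.Lock()
	r.resolved = true
	r.mu.Unlock()
	close(done)
}

// sendBlocking plays the role of the caller of Write: it blocks on the
// channel, then calls write once more and reports whatever that returns.
func sendBlocking(r *linkResolver, pkt string) error {
	ch, err := r.write(pkt)
	if err == errNoLinkAddress {
		<-ch
		_, err = r.write(pkt)
	}
	return err
}

func main() {
	r := &linkResolver{}
	fmt.Println("send error:", sendBlocking(r, "ping"))
}
```

Keeping the blocking in the caller leaves `write` itself non-blocking, which is the property the commit message says the transport layer has to preserve.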
2018-09-28	Switch to root in userns when CAP_SYS_CHROOT is also missing	Fabricio Voznika
	Some tests check current capabilities and re-run the tests as root inside userns if required capabilities are missing. It was checking for CAP_SYS_ADMIN only, CAP_SYS_CHROOT is also required now. PiperOrigin-RevId: 214949226 Change-Id: Ic81363969fa76c04da408fae8ea7520653266312
2018-09-27	Merge Loader.containerRootTGs and execProcess into a single map	Fabricio Voznika
	It's easier to manage a single map with the processes that we're interested in tracking. This will make the next change to clean up the map on destroy easier. PiperOrigin-RevId: 214894210 Change-Id: I099247323a0487cd0767120df47ba786fac0926d
2018-09-27	Move common test code to function	Fabricio Voznika
	PiperOrigin-RevId: 214890335 Change-Id: I42743f0ce46a5a42834133bce2f32d187194fc87
2018-09-27	Forward ioctl(TCSETSF) calls on host ttys to the host kernel.	Nicolas Lacasse
	We already forward TCSETS and TCSETSW. TCSETSF is roughly equivalent but discards pending input. The filters were relaxed to allow host ioctls with TCSETSF argument. This fixes programs like "passwd" that prevent user input from being displayed on the terminal. Before: root@b8a0240fc836:/# passwd Enter new UNIX password: 123 Retype new UNIX password: 123 passwd: password updated successfully After: root@ae6f5dabe402:/# passwd Enter new UNIX password: Retype new UNIX password: passwd: password updated successfully PiperOrigin-RevId: 214869788 Change-Id: I31b4d1373c1388f7b51d0f2f45ce40aa8e8b0b58
2018-09-27	Implement 'runsc kill --all'	Fabricio Voznika
	In order to implement kill --all correctly, the Sentry needs to track all tasks that belong to a given container. This change introduces ContainerID to the task, that gets inherited by all children. 'kill --all' then iterates over all tasks comparing the ContainerID field to find all processes that need to be signalled. PiperOrigin-RevId: 214841768 Change-Id: I693b2374be8692d88cc441ef13a0ae34abf73ac6
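A minimal sketch of the idea, assuming a simplified task model: the `task` type, `fork`, and `signalAll` below are hypothetical and do not reflect the Sentry's real kernel types. Children inherit the container ID, and signalling walks every task comparing that field.

```go
package main

import "fmt"

// task is a simplified, hypothetical stand-in for a Sentry task.
type task struct {
	pid         int
	containerID string
	pending     []int // signals queued for delivery
}

// fork creates a child that inherits the parent's container ID.
func (t *task) fork(pid int) *task {
	return &task{pid: pid, containerID: t.containerID}
}

// signalAll queues sig on every task tagged with containerID and returns
// how many tasks were signalled.
func signalAll(tasks []*task, containerID string, sig int) int {
	n := 0
	for _, t := range tasks {
		if t.containerID == containerID {
			t.pending = append(t.pending, sig)
			n++
		}
	}
	return n
}

func main() {
	rootA := &task{pid: 1, containerID: "a"}
	rootB := &task{pid: 2, containerID: "b"}
	tasks := []*task{rootA, rootB, rootA.fork(3), rootA.fork(4)}
	fmt.Println("signalled:", signalAll(tasks, "a", 9))
}
```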
2018-09-27	netstack: make go:linkname work for all architectures	Anton Gyllenberg
	The //go:linkname directive requires the presence of assembly files in the package. Even an empty file will do. There was an empty assembly file commit_arm64.s, but that is limited to GOARCH=arm64. Renaming to empty.s will remove the unnecessary build constraint and allow building netstack for other architectures than amd64 and arm64. Without this, building directly with go (not bazel) for e.g., GOARCH=arm gives: sleep/sleep_unsafe.go:88:6: missing function body sleep/sleep_unsafe.go:91:6: missing function body Change-Id: I29d1d13e1ff31506a174d4595b8cd57fa58bf52b PiperOrigin-RevId: 214820299
2018-09-27	sentry: export cpuTime function.	Zhaozhong Ni
	PiperOrigin-RevId: 214798278 Change-Id: Id59d1ceb35037cda0689d3a1c4844e96c6957615
2018-09-27	Refactor 'runsc boot' to take container ID as argument	Fabricio Voznika
	This makes the flow slightly simpler (no need to call Loader.SetRootContainer). And this is a required change to tag tasks with container ID inside the Sentry. PiperOrigin-RevId: 214795210 Change-Id: I6ff4af12e73bb07157f7058bb15fd5bb88760884
2018-09-27	Move uds_test_app to common test_app	Fabricio Voznika
	This was done so it's easier to add more functionality to this file for other tests. PiperOrigin-RevId: 214782043 Change-Id: I1f38b9ee1219b3ce7b789044ada8e52bdc1e6279
2018-09-26	Return correct parent PID	Fabricio Voznika
	Old code was returning ID of the thread that created the child process. It should be returning the ID of the parent process instead. PiperOrigin-RevId: 214720910 Change-Id: I95715c535bcf468ecf1ae771cccd04a4cd345b36
2018-09-26	runsc: fix pid file race condition in exec detach mode.	Lantao Liu
	PiperOrigin-RevId: 214700295 Change-Id: I73d8490572eebe5da584af91914650d1953aeb91
2018-09-26	Use the ICMP target address in responses	Tamir Duberstein
	There is a subtle bug that is the result of two changes made when upstreaming ICMPv6 support from Fuchsia: 1) ipv6.endpoint.WritePacket writes the local address it was initialized with, rather than the provided route's local address 2) ipv6.endpoint.handleICMP doesn't set its route's local address to the ICMP target address before writing the response The result is that the ICMP response erroneously uses the target ipv6 address (rather than icmp) as its source address in the response. When trying to debug this by fixing (2), we ran into problems with bad ipv6 checksums because (1) didn't respect the local address of the route being passed to it. This fixes both problems. PiperOrigin-RevId: 214650822 Change-Id: Ib6148bf432e6428d760ef9da35faef8e4b610d69
2018-09-26	Export ipv6 address helpers	Tamir Duberstein
	This is useful for Fuchsia. PiperOrigin-RevId: 214619681 Change-Id: If5a60dd82365c2eae51a12bbc819e5aae8c76ee9
2018-09-24	runsc: All non-root bind mounts should be shared.	Nicolas Lacasse
	This CL changes the semantics of the "--file-access" flag so that it only affects the root filesystem. The default remains "exclusive" which is the common use case, as neither Docker nor K8s supports sharing the root. Keeping the root fs as "exclusive" means that the fs-intensive work done during application startup will mostly be cacheable, and thus faster. Non-root bind mounts will always be shared. This CL also removes some redundant FSAccessType validations. We validate this flag in main(), so we can assume it is valid afterwards. PiperOrigin-RevId: 214359936 Change-Id: I7e75d7bf52dbd7fa834d0aacd4034868314f3b51
2018-09-21	Remove unnecessary defer	Ian Gudger
	PiperOrigin-RevId: 214073949 Change-Id: I8fab916cd77362c13dac2c9dcf2ecc1710d87a5e
2018-09-21	Run gofmt -s on everything	Ian Gudger
	PiperOrigin-RevId: 214040901 Change-Id: I74d79497a053da3624921ad2b7c5193ca4a87942
2018-09-21	Extend tcpip.Address.String to ipv6 addresses	Tamir Duberstein
	PiperOrigin-RevId: 214039349 Change-Id: Ia7d09c5f85eddd1e5634f3c21b0bd60b10be6bd2
2018-09-21	The "action" in container.Signal should be "signal".	Nicolas Lacasse
	PiperOrigin-RevId: 214038776 Change-Id: I4ad212540ec4ef4fb5ab5fdcb7f0865c4f746895
2018-09-21	Deflake TestSimpleReceive	Tamir Duberstein
	...by increasing the allotted timeout and using direct comparison rather than reflect.DeepEqual (which should be faster). PiperOrigin-RevId: 214027024 Change-Id: I0a2690e65c7e14b4cc118c7312dbbf5267dc78bc
2018-09-21	Export read-only tcpip.Subnet.Mask	Tamir Duberstein
	PiperOrigin-RevId: 214023383 Change-Id: I5a7572f949840fb68a3ffb7342e6a3524bd00864
2018-09-21	runsc: Synchronize container metadata changes with a file lock.	Nicolas Lacasse
	Each container has associated metadata (particularly the container status) that is manipulated by various runsc commands. This metadata is stored in a file identified by the container id. Different runsc processes may manipulate the same container metadata, and each will read/write to the metadata file. This CL adds a file lock per container which must be held when reading the container metadata file, and when modifying and writing the container metadata. PiperOrigin-RevId: 214019179 Change-Id: Ice4390ad233bc7f216c9a9a6cf05fb456c9ec0ad
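One way to picture the per-container lock is the sketch below. It assumes a Unix host and uses `syscall.Flock`. The file layout (`<id>.lock`, `<id>.meta`) and the helpers `withContainerLock`, `setStatus`, `getStatus`, and `demo` are invented for illustration and are not runsc's actual implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// withContainerLock holds an exclusive advisory lock on a per-container lock
// file while fn reads or writes the metadata file. Paths are hypothetical.
func withContainerLock(dir, id string, fn func(metaPath string) error) error {
	lock, err := os.OpenFile(filepath.Join(dir, id+".lock"), os.O_CREATE|os.O_RDWR, 0600)
	if err != nil {
		return err
	}
	defer lock.Close()
	if err := syscall.Flock(int(lock.Fd()), syscall.LOCK_EX); err != nil {
		return err
	}
	// Deferred calls run last-in first-out, so the unlock precedes Close.
	defer syscall.Flock(int(lock.Fd()), syscall.LOCK_UN)
	return fn(filepath.Join(dir, id+".meta"))
}

// setStatus writes the container status while holding the lock.
func setStatus(dir, id, status string) error {
	return withContainerLock(dir, id, func(p string) error {
		return os.WriteFile(p, []byte(status), 0600)
	})
}

// getStatus reads the container status while holding the lock.
func getStatus(dir, id string) (string, error) {
	var status string
	err := withContainerLock(dir, id, func(p string) error {
		b, err := os.ReadFile(p)
		status = string(b)
		return err
	})
	return status, err
}

// demo runs two updates and a read against a temporary directory.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "meta-lock-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := setStatus(dir, "abc", "created"); err != nil {
		return "", err
	}
	if err := setStatus(dir, "abc", "stopped"); err != nil {
		return "", err
	}
	return getStatus(dir, "abc")
}

func main() {
	status, err := demo()
	fmt.Println(status, err)
}
```

Because `flock` locks are advisory, this only helps if every process that touches the metadata file goes through the same helper.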
2018-09-20	Set Sandbox.Chroot so it gets cleaned up upon destruction	Fabricio Voznika
	I've made several attempts to create a test, but the lack of permission from the test user makes it nearly impossible to test anything useful. PiperOrigin-RevId: 213922174 Change-Id: I5b502ca70cb7a6645f8836f028fb203354b4c625
2018-09-20	runsc: allow `runsc wait` on a container for multiple times.	Lantao Liu
	PiperOrigin-RevId: 213908919 Change-Id: I74eff99a5360bb03511b946f4cb5658bb5fc40c7
2018-09-20	Wait for all async fs operations to complete before returning from Destroy.	Nicolas Lacasse
	Destroy flushes dirent references, which triggers many async close operations. We must wait for those to finish before returning from Destroy, otherwise we may kill the gofer, causing a cascade of failing RPCs and leading to an inconsistent FS state. PiperOrigin-RevId: 213884637 Change-Id: Id054b47fc0f97adc5e596d747c08d3b97a1d1f71
2018-09-20	runsc: Fix a bug that `runsc wait` doesn't work after container exits.	Lantao Liu
	PiperOrigin-RevId: 213849165 Change-Id: I5120b2f568850c0c42a08e8706e7f8653ef1bd94
2018-09-19	runsc: Fix stdin/stdout/stderr in multi-container mode.	Kevin Krakauer
	The issue with the previous change was that the stdin/stdout/stderr passed to the sentry were dup'd by host.ImportFile. This left a dangling FD that was never closed, which caused containerd to time out waiting on container stop. PiperOrigin-RevId: 213753032 Change-Id: Ia5e4c0565c42c8610d3b59f65599a5643b0901e4
2018-09-19	Add container.Destroy urpc method.	Nicolas Lacasse
	This method will: 1. Stop the container process if it is still running. 2. Unmount all sandbox-internal mounts for the container. 3. Delete the container root directory inside the sandbox. Destroy is idempotent, and safe to call concurrently. This fixes a bug where after stopping a container, we cannot unmount the container root directory on the host. This bug occurred because the sandbox dirent cache was holding a dirent with a host fd corresponding to a file inside the container root on the host. The dirent cache did not know that the container had exited, and kept the FD open, preventing us from unmounting on the host. Now that we unmount (and flush) all container mounts inside the sandbox, any host FDs donated by the gofer will be closed, and we can unmount the container root on the host. PiperOrigin-RevId: 213737693 Change-Id: I28c0ff4cd19a08014cdd72fec5154497e92aacc9
2018-09-19	Update gocapability commit to get bug fix	Fabricio Voznika
	PiperOrigin-RevId: 213734203 Change-Id: I9cf5d3885fb88b41444c686168d4cab00f09988a
2018-09-19	runsc: Mark container_test flaky.	Kevin Krakauer
	PiperOrigin-RevId: 213732520 Change-Id: Ife292987ec8b1de4c2e7e3b7d4452b00c1582e91
2018-09-19	Fix data race on tcp.endpoint.hardError in tcp.(*endpoint).Read	Ian Gudger
	tcp.endpoint.hardError is protected by tcp.endpoint.mu. PiperOrigin-RevId: 213730698 Change-Id: I4e4f322ac272b145b500b1a652fbee0c7b985be2
2018-09-19	Fix sandbox and gofer capabilities	Fabricio Voznika
	Capabilities.Set() adds capabilities, but doesn't remove existing ones that might have been loaded. Fixed the code and added tests. PiperOrigin-RevId: 213726369 Change-Id: Id7fa6fce53abf26c29b13b9157bb4c6616986fba
2018-09-19	runsc: Don't create __runsc_containers__ unless we are in multi-container mode.	Nicolas Lacasse
	PiperOrigin-RevId: 213715511 Change-Id: I3e41b583c6138edbdeba036dfb9df4864134fc12
2018-09-19	Pass local link address to DeliverNetworkPacket	Bert Muthalaly
	This allows a NetworkDispatcher to implement transparent bridging, assuming all implementations of LinkEndpoint.WritePacket call eth.Encode with header.EthernetFields.SrcAddr set to the passed Route.LocalLinkAddress, if it is provided. PiperOrigin-RevId: 213686651 Change-Id: I446a4ac070970202f0724ef796ff1056ae4dd72a
2018-09-19	Add docker command line args support for --cpuset-cpus and --cpus	Lingfu
	`docker run --cpuset-cpus=/--cpus=` will generate cpu resource info in config.json (runtime spec file). When nginx worker_connections is configured as auto, the worker is generated according to the number of CPUs. If the cgroup is already set on the host, but it is not displayed correctly in the sandbox, performance may be degraded. This patch can get cpus info from spec file and apply to sentry on bootup, so the /proc/cpuinfo can show the correct cpu numbers. `lscpu` and other commands that rely on `/sys/devices/system/cpu/online` are also affected by this patch. e.g. --cpuset-cpus=2,3 -> cpu number:2 --cpuset-cpus=4-7 -> cpu number:4 --cpus=2.8 -> cpu number:3 --cpus=0.5 -> cpu number:1 Change-Id: Ideb22e125758d4322a12be7c51795f8018e3d316 PiperOrigin-RevId: 213685199
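The mapping in the examples above can be reproduced with a small sketch. `cpusFromCpuset` and `cpusFromQuota` are hypothetical helpers, not functions from runsc. The sketch assumes the cpuset list has no overlapping entries, and it takes the `--cpus` value directly as a float rather than reading it from the runtime spec.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// cpusFromCpuset counts the CPUs named by a list such as "2,3" or "4-7".
// Overlapping entries (e.g. "1-3,2") are not deduplicated in this sketch.
func cpusFromCpuset(set string) (int, error) {
	count := 0
	for _, part := range strings.Split(set, ",") {
		bounds := strings.SplitN(strings.TrimSpace(part), "-", 2)
		first, err := strconv.Atoi(bounds[0])
		if err != nil {
			return 0, fmt.Errorf("invalid cpuset entry %q: %v", part, err)
		}
		last := first
		if len(bounds) == 2 {
			if last, err = strconv.Atoi(bounds[1]); err != nil {
				return 0, fmt.Errorf("invalid cpuset entry %q: %v", part, err)
			}
		}
		if last < first {
			return 0, fmt.Errorf("invalid cpuset range %q", part)
		}
		count += last - first + 1
	}
	return count, nil
}

// cpusFromQuota rounds a fractional --cpus value up to a whole CPU count.
func cpusFromQuota(cpus float64) int {
	return int(math.Ceil(cpus))
}

func main() {
	for _, set := range []string{"2,3", "4-7"} {
		n, err := cpusFromCpuset(set)
		fmt.Println(set, n, err)
	}
	fmt.Println(cpusFromQuota(2.8), cpusFromQuota(0.5))
}
```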
2018-09-19	Fix RTT estimation when timestamp option is enabled.	Bhasker Hariharan
	From RFC7323#Section-4 The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an implicit assumption that at most one RTTM will be sampled per RTT. When multiple RTTMs per RTT are available to update the RTT estimator, an implementation SHOULD try to adhere to the spirit of the history specified in [RFC6298]. An implementation suggestion is detailed in Appendix G. From RFC7323#appendix-G Appendix G. RTO Calculation Modification Taking multiple RTT samples per window would shorten the history calculated by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a similar history as originally intended by [RFC6298]. It is roughly known how many samples a congestion window worth of data will yield, not accounting for ACK compression, and ACK losses. Such events will result in more history of the path being reflected in the final value for RTO, and are uncritical. This modification will ensure that a similar amount of time is taken into account for the RTO estimation, regardless of how many samples are taken per window: ExpectedSamples = ceiling(FlightSize / (SMSS * 2)) alpha' = alpha / ExpectedSamples beta' = beta / ExpectedSamples Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs". Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and beta' instead: RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'| SRTT <- (1 - alpha') * SRTT + alpha' * R' (for each sample R') PiperOrigin-RevId: 213644795 Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e
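The quoted update rule can be written out as a short sketch. It is not netstack's code: `rttEstimator`, `expectedSamples`, and `update` are hypothetical names, alpha = 1/8 and beta = 1/4 are the values RFC 6298 recommends, clamping ExpectedSamples to at least 1 is an added assumption for an empty flight, and the special handling of the very first sample is left out.

```go
package main

import (
	"fmt"
	"math"
)

// Weighting factors recommended by RFC 6298.
const (
	alpha = 1.0 / 8
	beta  = 1.0 / 4
)

// rttEstimator holds the smoothed RTT and its variance, in milliseconds.
type rttEstimator struct {
	srtt   float64
	rttvar float64
}

// expectedSamples computes ceiling(FlightSize / (SMSS * 2)). The clamp to 1
// is an assumption of this sketch, to avoid dividing by zero later.
func expectedSamples(flightSize, smss int) int {
	n := int(math.Ceil(float64(flightSize) / float64(smss*2)))
	if n < 1 {
		n = 1
	}
	return n
}

// update applies one RTT sample r using the scaled factors alpha' and beta'.
// RTTVAR is updated first because it depends on the previous SRTT.
func (e *rttEstimator) update(r float64, flightSize, smss int) {
	n := float64(expectedSamples(flightSize, smss))
	a := alpha / n
	b := beta / n
	e.rttvar = (1-b)*e.rttvar + b*math.Abs(e.srtt-r)
	e.srtt = (1-a)*e.srtt + a*r
}

func main() {
	e := &rttEstimator{srtt: 100, rttvar: 10}
	e.update(120, 14600, 1460) // ten segments in flight: five expected samples
	fmt.Printf("srtt=%.3f rttvar=%.3f\n", e.srtt, e.rttvar)
}
```

With more segments in flight each sample carries less weight, so the estimator covers roughly the same span of time regardless of how many timestamp samples arrive per window.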
2018-09-18	Added state machine checks for Container.Status	Fabricio Voznika
	For my own sanity when thinking about possible transitions and state. PiperOrigin-RevId: 213559482 Change-Id: I25588c86cf6098be4eda01f4e7321c102ceef33c
2018-09-18	Short-circuit Readdir calls on overlay files when the dirent is frozen.	Nicolas Lacasse
	If we have an overlay file whose corresponding Dirent is frozen, then we should not bother calling Readdir on the upper or lower files, since DirentReaddir will calculate children based on the frozen Dirent tree. A test was added that fails without this change. PiperOrigin-RevId: 213531215 Change-Id: I4d6c98f1416541a476a34418f664ba58f936a81d
2018-09-18	Handle children processes better in tests	Fabricio Voznika
	Reap children more systematically in container tests. Previously, container_test was taking ~5 mins to run because container.Destroy() would time out waiting for the sandbox process to exit. Now the test runs in less than a minute. Also made the contract around Container and Sandbox destroy clearer. PiperOrigin-RevId: 213527471 Change-Id: Icca84ee1212bbdcb62bdfc9cc7b71b12c6d1688d