summaryrefslogtreecommitdiffhomepage
path: root/g3doc/proposals
diff options
context:
space:
mode:
Diffstat (limited to 'g3doc/proposals')
-rw-r--r--g3doc/proposals/BUILD16
-rw-r--r--g3doc/proposals/gsoc-2021-ideas.md71
-rw-r--r--g3doc/proposals/runtime_dedicate_os_thread.md188
3 files changed, 0 insertions, 275 deletions
diff --git a/g3doc/proposals/BUILD b/g3doc/proposals/BUILD
deleted file mode 100644
index 3a7a377c3..000000000
--- a/g3doc/proposals/BUILD
+++ /dev/null
@@ -1,16 +0,0 @@
-load("//website:defs.bzl", "doc")
-
-package(
- default_visibility = ["//website:__pkg__"],
- licenses = ["notice"],
-)
-
-doc(
- name = "gsoc_2021",
- src = "gsoc-2021-ideas.md",
- category = "Project",
- include_in_menu = False,
- permalink = "/community/gsoc_2021",
- subcategory = "Community",
- weight = "99",
-)
diff --git a/g3doc/proposals/gsoc-2021-ideas.md b/g3doc/proposals/gsoc-2021-ideas.md
deleted file mode 100644
index 0c2e2afaa..000000000
--- a/g3doc/proposals/gsoc-2021-ideas.md
+++ /dev/null
@@ -1,71 +0,0 @@
-# Project Ideas for Google Summer of Code 2021
-
-This is a collection of project ideas for
-[Google Summer of Code 2021][gsoc-2021-site]. These projects are intended to be
-relatively self-contained and should be good starting projects for new
-contributors to gVisor. We expect individual contributors to be able to make
-reasonable progress on these projects over the course of several weeks.
-Familiarity with Golang and knowledge about systems programming in Linux will be
-helpful.
-
-If you're interested in contributing to gVisor through Google Summer of Code
-2021, but would like to propose your own idea for a project, please see our
-[roadmap](../roadmap.md) for areas of development, and get in touch through our
-[mailing list][gvisor-mailing-list] or [chat][gvisor-chat]!
-
-## Implement `io_uring`
-
-`io_uring` is the lastest asynchronous I/O API in Linux. This project will
-involve implementing the system interfaces required to support `io_uring` in
-gVisor. A successful implementation should have similar relatively performance
-and scalability characteristics compared to synchronous I/O syscalls, as in
-Linux.
-
-The core of the `io_uring` interface is deceptively simple, involving only three
-new syscalls:
-
-- `io_uring_setup(2)` creates a new `io_uring` instance represented by a file
- descriptor, including a set of request submission and completion queues
- backed by shared memory ring buffers.
-
-- `io_uring_register(2)` optionally binds kernel resources such as files and
- memory buffers to handles, which can then be passed to `io_uring`
- operations. Pre-registering resources in this way moves the cost of looking
- up and validating these resources to registration time rather than paying
- the cost during the operation.
-
-- `io_uring_enter(2)` is the syscall used to submit queued operations and wait
- for completions. This is the most complex part of the mechanism, requiring
- the kernel to process queued request from the submission queue, dispatching
- the appropriate I/O operation based on the request arguments and blocking
- for the requested number of operations to be completed before returning.
-
-An `io_uring` request is effectively an opcode specifying the I/O operation to
-perform, and corresponding arguments. The opcodes and arguments closely relate
-to the the corresponding synchronous I/O syscall. In addition, there are some
-`io_uring`-specific arguments that specify things like how to process requests,
-how to interpret the arguments and communicate the status of the ring buffers.
-
-For a detailed description of the `io_uring` interface, see the
-[design doc][io-uring-doc] by the `io_uring` authors.
-
-Due to the complexity of the full `io_uring` mechanism and the numerous
-supported operations, it should be implemented in two stages:
-
-In the first stage, a simplified version of the `io_uring_setup` and
-`io_uring_enter` syscalls should be implemented, which will only support a
-minimal set of arguments and just one or two simple opcodes. This simplified
-implementation can be used to figure out how to integreate `io_uring` with
-gVisor's virtual filesystem and memory management subsystems, as well as
-benchmark the implementation to ensure it has the desired performance
-characteristics. The goal in this stage should be to implement the smallest
-subset of features required to perform a basic operation through `io_uring`s.
-
-In the second stage, support can be added for all the I/O operations supported
-by Linux, as well as advanced `io_uring` features such as fixed files and
-buffers (via `io_uring_register`), polled I/O and kernel-side request polling.
-
-[gsoc-2021-site]: https://summerofcode.withgoogle.com
-[gvisor-chat]: https://gitter.im/gvisor/community
-[gvisor-mailing-list]: https://groups.google.com/g/gvisor-dev
-[io-uring-doc]: https://kernel.dk/io_uring.pdf
diff --git a/g3doc/proposals/runtime_dedicate_os_thread.md b/g3doc/proposals/runtime_dedicate_os_thread.md
deleted file mode 100644
index dc70055b0..000000000
--- a/g3doc/proposals/runtime_dedicate_os_thread.md
+++ /dev/null
@@ -1,188 +0,0 @@
-# `runtime.DedicateOSThread`
-
-Status as of 2020-09-18: Deprioritized; initial studies in #2180 suggest that
-this may be difficult to support in the Go runtime due to issues with GC.
-
-## Summary
-
-Allow goroutines to bind to kernel threads in a way that allows their scheduling
-to be kernel-managed rather than runtime-managed.
-
-## Objectives
-
-* Reduce Go runtime overhead in the gVisor sentry (#2184).
-
-* Minimize intrusiveness of changes to the Go runtime.
-
-## Background
-
-In Go, execution contexts are referred to as goroutines, which the runtime calls
-Gs. The Go runtime maintains a variably-sized pool of threads (called Ms by the
-runtime) on which Gs are executed, as well as a pool of "virtual processors"
-(called Ps by the runtime) of size equal to `runtime.GOMAXPROCS()`. Usually,
-each M requires a P in order to execute Gs, limiting the number of concurrently
-executing goroutines to `runtime.GOMAXPROCS()`.
-
-The `runtime.LockOSThread` function temporarily locks the invoking goroutine to
-its current thread. It is primarily useful for interacting with OS or non-Go
-library facilities that are per-thread. It does not reduce interactions with the
-Go runtime scheduler: locked Ms relinquish their P when they become blocked, and
-only continue execution after another M "chooses" their locked G to run and
-donates their P to the locked M instead.
-
-## Problems
-
-### Context Switch Overhead
-
-Most goroutines in the gVisor sentry are task goroutines, which back application
-threads. Task goroutines spend large amounts of time blocked on syscalls that
-execute untrusted application code. When invoking said syscall (which varies by
-gVisor platform), the task goroutine may interact with the Go runtime in one of
-three ways:
-
-* It can invoke the syscall without informing the runtime. In this case, the
- task goroutine will continue to hold its P during the syscall, limiting the
- number of application threads that can run concurrently to
- `runtime.GOMAXPROCS()`. This is problematic because the Go runtime scheduler
- is known to scale poorly with `GOMAXPROCS`; see #1942 and
- https://github.com/golang/go/issues/28808. It also means that preemption of
- application threads must be driven by sentry or runtime code, which is
- strictly slower than kernel-driven preemption (since the sentry must invoke
- another syscall to preempt the application thread).
-
-* It can call `runtime.entersyscallblock` before invoking the syscall, and
- `runtime.exitsyscall` after the syscall returns. In this case, the task
- goroutine will release its P while the syscall is executing. This allows the
- number of threads concurrently executing application code to exceed
- `GOMAXPROCS`. However, this incurs additional latency on syscall entry (to
- hand off the released P to another M, often requiring a `futex(FUTEX_WAKE)`
- syscall) and on syscall exit (to acquire a new P). It also drastically
- increases the number of threads that concurrently interact with the runtime
- scheduler, which is also problematic for performance (both in terms of CPU
- utilization and in terms of context switch latency); see #205.
-
-- It can call `runtime.entersyscall` before invoking the syscall, and
- `runtime.exitsyscall` after the syscall returns. In this case, the task
- goroutine "lazily releases" its P, allowing the runtime's "sysmon" thread to
- steal it on behalf of another M after a 20us delay. This mitigates the
- context switch latency problem when there are few task goroutines and the
- interval between switches to application code (i.e. the interval between
- application syscalls, page faults, or signal delivery) is short. (Cynically,
- this means that it's most effective in microbenchmarks). However, the delay
- before a P is stolen can also be problematic for performance when there are
- both many task goroutines switching to application code (lazily releasing
- their Ps) *and* many task goroutines switching to sentry code (contending
- for Ps), which is likely in larger heterogeneous workloads.
-
-### Blocking Overhead
-
-Task goroutines block on behalf of application syscalls like `futex` and
-`epoll_wait` by receiving from a Go channel. (Future work may convert task
-goroutine blocking to use the `syncevent` package to avoid overhead associated
-with channels and `select`, but this does not change how blocking interacts with
-the Go runtime scheduler.)
-
-If `runtime.LockOSThread()` is not in effect when a task goroutine blocks, then
-when the task goroutine is unblocked (by e.g. an application `FUTEX_WAKE`,
-signal delivery, or a timeout) by sending to the blocked channel,
-`runtime.ready` migrates the unblocked G to the unblocking P. In most cases,
-this implies that every application thread block/unblock cycle results in a
-migration of the thread between Ps, and therefore Ms, and therefore cores,
-resulting in reduced application performance due to loss of CPU caches.
-Furthermore, in most cases, the unblocking P cannot immediately switch to the
-unblocked G (instead resuming execution of its current application thread after
-completing the application's `futex(FUTEX_WAKE)`, `tgkill`, etc. syscall), often
-requiring that another P steal the unblocked G before it can resume execution.
-
-If `runtime.LockOSThread()` is in effect when a task goroutine blocks, then the
-G will remain locked to its M, avoiding the core migration described above;
-however, wakeup latency is significantly increased since, as described in
-"Background", the G still needs to be selected by the scheduler before it can
-run, and the M that selects the G then needs to transfer its P to the locked M,
-incurring an additional `FUTEX_WAKE` syscall and round of kernel scheduling.
-
-## Proposal
-
-We propose to add a function, tentatively called `DedicateOSThread`, to the Go
-`runtime` package, documented as follows:
-
-```go
-// DedicateOSThread wires the calling goroutine to its current operating system
-// thread, and exempts it from counting against GOMAXPROCS. The calling
-// goroutine will always execute in that thread, and no other goroutine will
-// execute in it, until the calling goroutine has made as many calls to
-// UndedicateOSThread as to DedicateOSThread. If the calling goroutine exits
-// without unlocking the thread, the thread will be terminated.
-//
-// DedicateOSThread should only be used by long-lived goroutines that usually
-// block due to blocking system calls, rather than interaction with other
-// goroutines.
-func DedicateOSThread()
-```
-
-Mechanically, `DedicateOSThread` implies `LockOSThread` (i.e. it locks the
-invoking G to a M), but additionally locks the invoking M to a P. Ps locked by
-`DedicateOSThread` are not counted against `GOMAXPROCS`; that is, the actual
-number of Ps in the system (`len(runtime.allp)`) is `GOMAXPROCS` plus the number
-of bound Ps (plus some slack to avoid frequent changes to `runtime.allp`).
-Corollaries:
-
-* If `runtime.ready` observes that a readied G is locked to a M locked to a P,
- it immediately wakes the locked M without migrating the G to the readying P
- or waiting for a future call to `runtime.schedule` to select the readied G
- in `runtime.findrunnable`.
-
-* `runtime.stoplockedm` and `runtime.reentersyscall` skip the release of
- locked Ps; the latter also skips sysmon wakeup. `runtime.stoplockedm` and
- `runtime.exitsyscall` skip re-acquisition of Ps if one is locked.
-
-* sysmon does not attempt to preempt Gs that are locked to Ps, avoiding
- fruitless overhead from `tgkill` syscalls and signal delivery.
-
-* `runtime.findrunnable`'s work stealing skips locked Ps (suggesting that
- unlocked Ps be tracked in a separate array). `runtime.findrunnable` on
- locked Ps skip the global run queue, work stealing, and possibly netpoll.
-
-* New goroutines created by goroutines with locked Ps are enqueued on the
- global run queue rather than the invoking P's local run queue.
-
-While gVisor's use case does not strictly require that the association is
-reversible (with `runtime.UndedicateOSThread`), such a feature is required to
-allow reuse of locked Ms, which is likely to be critical for performance.
-
-## Alternatives Considered
-
-* Make the runtime scale well with `GOMAXPROCS`. While we are also
- concurrently investigating this problem, this would not address the issues
- of increased preemption cost or blocking overhead.
-
-* Make the runtime scale well with number of Ms. It is unclear if this is
- actually feasible, and would not address blocking overhead.
-
-* Make P-locking part of `LockOSThread`'s behavior. This would likely
- introduce performance regressions in existing uses of `LockOSThread` that do
- not fit this usage pattern. In particular, since `DedicateOSThread`
- transitions the invoker's P from "counted against `GOMAXPROCS`" to "not
- counted against `GOMAXPROCS`", it may need to wake another M to run a new P
- (that is counted against `GOMAXPROCS`), and the converse applies to
- `UndedicateOSThread`.
-
-* Rewrite the gVisor sentry in a language that does not force userspace
- scheduling. This is a last resort due to the amount of code involved.
-
-## Related Issues
-
-The proposed functionality is directly analogous to `spawn_blocking` in Rust
-async runtimes
-[`async_std`](https://docs.rs/async-std/1.8.0/async_std/task/fn.spawn_blocking.html)
-and [`tokio`](https://docs.rs/tokio/0.3.5/tokio/task/fn.spawn_blocking.html).
-
-Outside of gVisor:
-
-* https://github.com/golang/go/issues/21827#issuecomment-595152452 describes a
- use case for this feature in go-delve, where the goroutine that would use
- this feature spends much of its time blocked in `ptrace` syscalls.
-
-* This feature may improve performance in the use case described in
- https://github.com/golang/go/issues/18237, given the prominence of
- syscall.Syscall in the profile given in that bug report.