From cc772f3d54d46b65c663c8cf7812103df31f17d3 Mon Sep 17 00:00:00 2001
From: Ian Lewis
Date: Thu, 22 Oct 2020 21:21:16 -0700
Subject: Add a platform portability blog post

Also fixes the docker_image Bazel rule and the website-server make target.

Fixes #3273

PiperOrigin-RevId: 338606668
---
 Makefile                                        |   2 +-
 images/defs.bzl                                 |  17 ++--
 website/BUILD                                   |   2 +-
 website/_config.yml                             |   6 ++
 website/blog/2020-10-22-platform-portability.md | 120 ++++++++++++++++++++++++
 website/blog/BUILD                              |  11 +++
 6 files changed, 149 insertions(+), 9 deletions(-)
 create mode 100644 website/blog/2020-10-22-platform-portability.md

diff --git a/Makefile b/Makefile
index afc25557e..88f23de8d 100644
--- a/Makefile
+++ b/Makefile
@@ -260,7 +260,7 @@ website-build: load-jekyll ## Build the site image locally.
 .PHONY: website-build
 
 website-server: website-build ## Run a local server for development.
-	@docker run -i -p 8080:8080 gvisor.dev/images/website
+	@docker run -i -p 8080:8080 $(WEBSITE_IMAGE)
 .PHONY: website-server
 
 website-push: website-build ## Push a new image and update the service.
diff --git a/images/defs.bzl b/images/defs.bzl
index 61d7bbf73..c1f96e312 100644
--- a/images/defs.bzl
+++ b/images/defs.bzl
@@ -2,30 +2,33 @@
 
 def _docker_image_impl(ctx):
     importer = ctx.actions.declare_file(ctx.label.name)
+
     importer_content = [
         "#!/bin/bash",
         "set -euo pipefail",
+        "source_file='%s'" % ctx.file.data.path,
+        "if [[ ! -f \"$source_file\" ]]; then",
+        "  source_file='%s'" % ctx.file.data.short_path,
+        "fi",
         "exec docker import " + " ".join([
             "-c '%s'" % attr
             for attr in ctx.attr.statements
-        ]) + " " + " ".join([
-            "'%s'" % f.path
-            for f in ctx.files.data
-        ]) + " $1",
+        ]) + " \"$source_file\" $1",
         "",
     ]
+
     ctx.actions.write(importer, "\n".join(importer_content), is_executable = True)
     return [DefaultInfo(
-        runfiles = ctx.runfiles(ctx.files.data),
+        runfiles = ctx.runfiles([ctx.file.data]),
         executable = importer,
     )]
 
 docker_image = rule(
     implementation = _docker_image_impl,
-    doc = "Tool to load a Docker image; takes a single parameter (image name).",
+    doc = "Tool to import a Docker image; takes a single parameter (image name).",
     attrs = {
         "statements": attr.string_list(doc = "Extra Dockerfile directives."),
-        "data": attr.label_list(doc = "All image data."),
+        "data": attr.label(doc = "Image filesystem tarball", allow_single_file = [".tgz", ".tar.gz"]),
     },
     executable = True,
 )
diff --git a/website/BUILD b/website/BUILD
index f3642b903..173d30ff0 100644
--- a/website/BUILD
+++ b/website/BUILD
@@ -6,7 +6,7 @@ package(licenses = ["notice"])
 
 docker_image(
     name = "website",
-    data = [":files"],
+    data = ":files",
     statements = [
         "EXPOSE 8080/tcp",
         'ENTRYPOINT ["/server"]',
diff --git a/website/_config.yml b/website/_config.yml
index 20fbb3d2d..51cb8e13c 100644
--- a/website/_config.yml
+++ b/website/_config.yml
@@ -37,3 +37,9 @@ authors:
   fvoznika:
     name: Fabricio Voznika
     email: fvoznika@google.com
+  ianlewis:
+    name: Ian Lewis
+    email: ianlewis@google.com
+  mpratt:
+    name: Michael Pratt
+    email: mpratt@google.com
diff --git a/website/blog/2020-10-22-platform-portability.md b/website/blog/2020-10-22-platform-portability.md
new file mode 100644
index 000000000..4d82940f9
--- /dev/null
+++ b/website/blog/2020-10-22-platform-portability.md
@@ -0,0 +1,120 @@
+# Platform Portability
+
+Hardware virtualization is often seen as a requirement to provide an additional
+isolation layer for untrusted applications. However, hardware virtualization
+requires expensive bare-metal machines or cloud instances to run safely with
+good performance, increasing cost and complexity for Cloud users. gVisor,
+however, takes a more flexible approach.
+
+One of the pillars of gVisor's architecture is portability, allowing it to run
+anywhere that runs Linux. Modern Cloud-Native applications run in containers in
+many different places, from bare metal to virtual machines, and can't always
+rely on nested virtualization. It is important for gVisor to be able to support
+the environments where you run containers.
+
+gVisor achieves portability through an abstraction called a _Platform_.
+Platforms can have many implementations, and each implementation can cover
+different environments, making use of available software or hardware features.
+
+## Background
+
+Before we can understand how gVisor achieves portability using platforms, we
+should take a step back and understand how applications interact with their
+host.
+
+Container sandboxes can provide an isolation layer between the host and
+application by virtualizing one of the layers below it, including the hardware
+or operating system. Many sandboxes virtualize the hardware layer by running
+applications in virtual machines. gVisor takes a different approach by
+virtualizing the OS layer.
+
+When an application runs normally, the host operating system loads
+the application into user memory and schedules it for execution. The operating
+system scheduler eventually schedules the application to a CPU and begins
+executing it. It then handles the application's requests, such as requests for
+memory, and manages its lifecycle. gVisor virtualizes these interactions, such as
+system calls and context switches, that happen between an application and the OS.
+
+[System calls](https://en.wikipedia.org/wiki/System_call) allow applications to
+ask the OS to perform some task for them. A system call looks like a normal
+function call in most programming languages, though it works a bit differently
+under the hood. When an application makes a system call, special processing
+takes place to do a
+[context switch](https://en.wikipedia.org/wiki/Context_switch) into kernel mode
+and begin executing code in the kernel before returning a result to the
+application. Context switching may happen in other situations as well, for
+example, to respond to an interrupt.
+
+## The Platform Interface
+
+gVisor provides a sandbox which implements the Linux OS interface, intercepting
+OS interactions such as system calls and implementing them in the sandbox kernel.
+
+It does this to limit interactions with the host and to protect the host from an
+untrusted application running in the sandbox. The Platform is the bottom layer
+of gVisor, which provides the environment necessary for gVisor to control and
+manage applications. In general, the Platform must:
+
+1. Provide the ability to create and manage memory address spaces.
+2. Provide execution contexts for running applications in those memory address
+   spaces.
+3. Provide the ability to change execution context and return control to gVisor
+   at specific times (e.g. system call, page fault).
+
+This interface is conceptually simple, but very powerful. Because the Platform
+interface requires only these three capabilities, it gives gVisor enough control
+to act as the application's OS, while still allowing the use of very
+different isolation technologies under the hood. You can learn more about the
+Platform interface in the
+[Platform Guide](https://gvisor.dev/docs/architecture_guide/platforms/).
+
+## Implementations of the Platform Interface
+
+While gVisor can make use of technologies like hardware virtualization, it
+doesn't necessarily rely on any one technology to provide a similar level of
+isolation. The flexibility of the Platform interface allows for implementations
+that use technologies other than hardware virtualization. This allows gVisor to
+run in VMs without nested virtualization, for example. By providing an
+abstraction for the underlying platform, each implementation can make various
+tradeoffs regarding performance or hardware requirements.
+
+Currently, gVisor provides two Platform implementations: the Ptrace
+Platform and the KVM Platform, each using very different methods to implement
+the Platform interface.
+
+![gVisor Platforms](../../../../../docs/architecture_guide/platforms/platforms.png "Platforms")
+
+The Ptrace Platform uses
+[PTRACE\_SYSEMU](http://man7.org/linux/man-pages/man2/ptrace.2.html) to trap
+syscalls, and uses the host for memory mapping and context switching. This
+platform can run anywhere that ptrace is available, which includes most Linux
+systems, VMs or otherwise.
+
+The KVM Platform uses virtualization, but in an unconventional way. gVisor runs
+in a virtual machine but as both guest OS and VMM, and presents no virtualized
+hardware layer. This provides a simpler interface that can avoid hardware
+initialization for fast startup, while taking advantage of hardware
+virtualization support to improve memory isolation and performance of context
+switching.
+
+The flexibility of the Platform interface allows for a lot of room to improve
+the existing KVM and ptrace platforms, as well as the ability to utilize new
+methods for improving gVisor's performance or portability in future Platform
+implementations.
+
+## Portability
+
+Through the Platform interface, gVisor is able to support bare metal, virtual
+machines, and Cloud environments while still providing a highly secure sandbox
+for running untrusted applications. This is especially important for Cloud and
+Kubernetes users because it allows gVisor to run anywhere that Kubernetes can
+run and provide similar experiences in multi-region, hybrid, multi-platform
+environments.
+
+Give gVisor's open source platforms a try. Using a Platform is as easy as
+providing the `--platform` flag to `runsc`. See the documentation on
+[changing platforms](https://gvisor.dev/docs/user_guide/platforms/) for how to
+use different platforms with Docker. We would love to hear about your experience,
+so come chat with us in our
+[Gitter channel](https://gitter.im/gvisor/community), or send us an
+[issue on GitHub](https://gvisor.dev/issue) if you run into any problems.
diff --git a/website/blog/BUILD b/website/blog/BUILD
index 865e403da..ab37bfef0 100644
--- a/website/blog/BUILD
+++ b/website/blog/BUILD
@@ -38,6 +38,17 @@ doc(
     permalink = "/blog/2020/09/18/containing-a-real-vulnerability/",
 )
 
+doc(
+    name = "platform_portability",
+    src = "2020-10-22-platform-portability.md",
+    authors = [
+        "ianlewis",
+        "mpratt",
+    ],
+    layout = "post",
+    permalink = "/blog/2020/10/22/platform-portability/",
+)
+
 docs(
     name = "posts",
     deps = [
-- 
cgit v1.2.3
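
Note on the `images/defs.bzl` change: the generated importer now tries the tarball's `path` first and falls back to its `short_path` when that file does not exist. Presumably this lets the script work both from the workspace root and from inside a runfiles tree, though the patch does not say so. The sketch below isolates that fallback as a plain `sh` function so it can be tried without Docker. The function name `resolve_source_file` and the example paths are hypothetical. The real script inlines the check using bash's `[[ ... ]]` and then runs `exec docker import`.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not part of the patch): mirrors the fallback that the
# generated importer performs inline.
#   $1: primary location of the tarball (ctx.file.data.path in the rule)
#   $2: fallback location (ctx.file.data.short_path in the rule)
resolve_source_file() {
  source_file="$1"
  if [ ! -f "$source_file" ]; then
    # Primary path is missing, so use the fallback path instead.
    source_file="$2"
  fi
  printf '%s\n' "$source_file"
}

# The generated script would then continue roughly as:
#   exec docker import -c '<statement>' ... "$source_file" "$1"
```

For example, `resolve_source_file /nonexistent/files.tgz website/files.tgz` prints the second argument because the first file is absent. Note that the fallback path itself is not checked, so a missing tarball only surfaces when `docker import` runs.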