diff options
Diffstat (limited to 'content/blog/1_security_basics/index.md')
-rw-r--r-- | content/blog/1_security_basics/index.md | 34 |
1 files changed, 23 insertions, 11 deletions
diff --git a/content/blog/1_security_basics/index.md b/content/blog/1_security_basics/index.md index afbd6ca7e..5d6e82be0 100644 --- a/content/blog/1_security_basics/index.md +++ b/content/blog/1_security_basics/index.md @@ -10,7 +10,7 @@ author: Jeremiah Spradlin & Zach Koopmans This blog is a space for engineers and community members to share perspectives and deep dives on technology and design within the gVisor project. -Though our logo suggests we’re in the business of space exploration (or perhaps fighting sea monsters), we’re actually in the business of sandboxing Linux containers. +Though our logo suggests we're in the business of space exploration (or perhaps fighting sea monsters), we're actually in the business of sandboxing Linux containers. When we created gVisor, we had three specific goals in mind; _container-native security_, _resource efficiency_, and _platform portability_. To put it simply, gVisor provides _efficient defense-in-depth for containers anywhere_. @@ -32,7 +32,7 @@ For example, if you are a cloud container hosting service, running containers fr gVisor was designed around the premise that any security boundary could potentially be compromised with enough time and resources. We tried to optimize for a solution that was as costly and time-consuming for an attacker as possible, at every layer. -Consequently, gVisor was built through a combination of intentional design principles and specific technology choices that work together to provide the security isolation needed for running hostile containers on a host. We’ll dig into it in the next section! +Consequently, gVisor was built through a combination of intentional design principles and specific technology choices that work together to provide the security isolation needed for running hostile containers on a host. We'll dig into it in the next section! # Design Principles @@ -73,7 +73,19 @@ It may seem strange that we would want our own software components to distrust e And this leads us to how Defense-in-Depth is applied to gVisor: no single vulnerability should compromise the host. -In the "Attacker’s Advantage / Defender’s Dilemm +In the "Attacker's Advantage / Defender's Dilemma," the defender must succeed all the time while the attacker only needs to succeed once. Defense in Depth inverts this principle: once the attacker successfully compromises any given software component, they are immediately faced with needing to compromise a subsequent, distinct layer in order to move laterally or acquire more privilege. + +For example, the untrusted container is isolated from the Sentry. The Sentry is isolated from host I/O operations by serving those requests in separate processes called Gofers. And both the untrusted container and its associated Gofers are isolated from the host process that is running the sandbox. + +An additional benefit is that this generally leads to more robust and stable software, forcing interfaces to be strictly defined and tested to ensure all inputs are properly parsed and bounds checked. + + +## Least-Privilege + +The principle of Least-Privilege implies that each software component has only the permissions it needs to function, and no more. + +Least-Privilege is applied throughout gVisor. Each component and more importantly, each interface between the components, is designed so that only the minimum level of permission is required for it to perform its function. Specifically, the closer you are to the untrusted application, the less privilege you have. + ____ ![Figure 2](./figure2.png) @@ -81,7 +93,7 @@ ____ Figure 2: runsc components and their privileges. ____ -This is evident in how runsc (the drop in gVisor binary for Docker/Kubernetes) constructs the sandbox. The Sentry has the least privilege possible (it can’t even open a file!). Gofers are only allowed file access, so even if it were compromised, the host network would be unavailable. Only the runsc binary itself has full access to the host OS, and even runsc’s access to the host OS is often limited through capabilities / chroot / namespacing. +This is evident in how runsc (the drop in gVisor binary for Docker/Kubernetes) constructs the sandbox. The Sentry has the least privilege possible (it can't even open a file!). Gofers are only allowed file access, so even if it were compromised, the host network would be unavailable. Only the runsc binary itself has full access to the host OS, and even runsc's access to the host OS is often limited through capabilities / chroot / namespacing. Designing a system with Defense-in-Depth and Least-Privilege in mind encourages small, separate, single-purpose components, each with very restricted privileges. @@ -104,7 +116,7 @@ Furthermore, any exploited vulnerabilities in the implemented syscalls (or Sentr ### Sentry/Host OS Interface: -The Sentry’s interactions with the Host OS are restricted in many ways. For instance, no syscall is "passed-through" from the untrusted application to the host OS. All syscalls are intercepted and interpreted. In the case where the Sentry needs to call the Host OS, we severely limit the syscalls that the Sentry itself is allowed to make to the host kernel[^6]. +The Sentry's interactions with the Host OS are restricted in many ways. For instance, no syscall is "passed-through" from the untrusted application to the host OS. All syscalls are intercepted and interpreted. In the case where the Sentry needs to call the Host OS, we severely limit the syscalls that the Sentry itself is allowed to make to the host kernel[^6]. For example, there are many file-system based attacks, where manipulation of files or their paths, can lead to compromise of the host[^7]. As a result, the Sentry does not allow any syscall that creates or opens a file descriptor. All file descriptors must be donated to the sandbox. By disallowing open or creation of file descriptors, we eliminate entire categories of these file-based attacks. @@ -153,16 +165,16 @@ For example, the Sentry and Gofers are implemented using Go. Go was chosen for a Using these features enabled safe array and pointer handling. This means entire classes of vulnerabilities were eliminated, such as buffer overflows and use-after-free. -Another example is our use of very strict syscall switching to ensure that the Sentry is always the first software component that parses and interprets the calls being made by the untrusted container. Here is an instance where different platforms use different solutions, but all of them share this common trait, whether it is through the use of ptrace "a la PTRACE_ATTACH"[^11] or kvm’s ring0[^12]. +Another example is our use of very strict syscall switching to ensure that the Sentry is always the first software component that parses and interprets the calls being made by the untrusted container. Here is an instance where different platforms use different solutions, but all of them share this common trait, whether it is through the use of ptrace "a la PTRACE_ATTACH"[^11] or kvm's ring0[^12]. Finally, one of the most restrictive choices was to use seccomp, to restrict the Sentry from being able to open or create a file descriptor on the host. All file I/O is required to go through Gofers. Preventing the opening or creation of file descriptions eliminates whole categories of bugs around file permissions [like this one](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4557)[^13]. # To be continued - Part 2 -In part 2 of this blog post, we will explore gVisor from an attacker’s point of view. We will use it as an opportunity to examine the specific strengths and weaknesses of each gVisor component. +In part 2 of this blog post, we will explore gVisor from an attacker's point of view. We will use it as an opportunity to examine the specific strengths and weaknesses of each gVisor component. -We will also use it to introduce Google’s Vulnerability Reward Program[^14], and other ways the community can contribute to help make gVisor safe, fast and stable. +We will also use it to introduce Google's Vulnerability Reward Program[^14], and other ways the community can contribute to help make gVisor safe, fast and stable. <!-- Footnotes themselves at the bottom. --> @@ -202,11 +214,11 @@ We will also use it to introduce Google’s Vulnerability Reward Program[^14], a [https://github.com/google/gvisor/blob/c7e901f47a09eaac56bd4813227edff016fa6bff/pkg/sentry/platform/ptrace/subprocess.go#L390](https://github.com/google/gvisor/blob/c7e901f47a09eaac56bd4813227edff016fa6bff/pkg/sentry/platform/ptrace/subprocess.go#L390) [^12]: - [https://github.com/google/gvisor/blob/c7e901f47a09eaac56bd4813227edff016fa6bff/pkg/sentry/platform/ring0/kernel_amd64.go#L182](https://github.com/google/gvisor/blob/c7e901f47a09eaac56bd4813227edff016fa6bff/pkg/sentry/platform/ring0/kernel_amd64.go#L182) + [https://github.com/google/gvisor/blob/c7e901f47a09eaac56bd4813227edff016fa6bff/pkg/sentry/platform/ring0/kernel_amd64.go#L182](https://github.com/google/gvisor/blob/c7e901f47a09eaac56bd4813227edff016fa6bff/pkg/sentry/platform/ring0/kernel_amd64.go#L182) [^13]: - [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4557](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4557) + [https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4557](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-4557) [^14]: - [https://www.google.com/about/appsecurity/reward-program/index.html](https://www.google.com/about/appsecurity/reward-program/index.html) + [https://www.google.com/about/appsecurity/reward-program/index.html](https://www.google.com/about/appsecurity/reward-program/index.html) |