Age | Commit message | Author
2018-06-06 | Split PCID implementation from page tables. | Adin Scannell
Instead of associating a single PCID with each set of page tables (which will reach the maximum quickly), allow a dynamic pool for each vCPU. This is the same way that Linux operates. We also split management of PCIDs out of the page tables themselves for simplicity. PiperOrigin-RevId: 199585631 Change-Id: I42f3486ada3cb2a26f623c65ac279b473ae63201
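The idea of a per-vCPU PCID pool can be sketched as follows. This is a minimal illustration only: `pcidPool`, `newPCIDPool`, `assign`, and the string keys standing in for page tables are hypothetical names, and recycling the oldest assignment is an assumed policy, not necessarily what the change implements.

```go
package main

import "fmt"

// pcidPool is a hypothetical per-vCPU pool of PCIDs. It is an illustration,
// not the actual gVisor type.
type pcidPool struct {
	assigned map[string]uint16 // page table key -> PCID
	order    []string          // keys, oldest assignment first
	free     []uint16          // PCIDs not currently assigned
}

// newPCIDPool builds a pool holding size PCIDs starting at start.
// size is assumed to be greater than zero.
func newPCIDPool(start, size uint16) *pcidPool {
	p := &pcidPool{assigned: make(map[string]uint16)}
	for i := uint16(0); i < size; i++ {
		p.free = append(p.free, start+i)
	}
	return p
}

// assign returns a PCID for the given page tables and reports whether TLB
// entries tagged with that PCID must be flushed before it is used.
func (p *pcidPool) assign(key string) (uint16, bool) {
	if id, ok := p.assigned[key]; ok {
		return id, false // Still assigned on this vCPU, no flush needed.
	}
	var id uint16
	if len(p.free) > 0 {
		id = p.free[0]
		p.free = p.free[1:]
	} else {
		// Pool exhausted: recycle the oldest assignment (assumed policy).
		oldest := p.order[0]
		p.order = p.order[1:]
		id = p.assigned[oldest]
		delete(p.assigned, oldest)
	}
	p.assigned[key] = id
	p.order = append(p.order, key)
	return id, true
}

func main() {
	pool := newPCIDPool(1, 2)
	for _, key := range []string{"a", "b", "a", "c"} {
		id, flush := pool.assign(key)
		fmt.Printf("%s -> pcid=%d flush=%t\n", key, id, flush)
	}
}
```

Because the pool is per vCPU, the number of live page tables is no longer bounded by the number of PCIDs; a recycled PCID only costs a flush.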
2018-06-06 | Add allocator abstraction for page tables. | Adin Scannell
In order to prevent possible garbage collection and reuse of page table pages prior to invalidation, introduce a formal allocator abstraction that can ensure entries are held during a single traversal. This also cleans up the abstraction and splits it out of the machine itself. PiperOrigin-RevId: 199581636 Change-Id: I2257d5d7ffd9c36f9b7ecd42f769261baeaf115c
2018-06-06 | runsc: Support abbreviated container IDs. | Kevin Krakauer
Just a UI/usability addition. It's a lot easier to type "60" than "60185c721d7e10c00489f1fa210ee0d35c594873d6376b457fb1815e4fdbfc2c". PiperOrigin-RevId: 199547932 Change-Id: I19011b5061a88aba48a9ad7f8cf954a6782de854
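Resolving an abbreviated ID amounts to a unique-prefix lookup. The sketch below assumes the full IDs are already available as a slice; `resolveID` and the two error values are hypothetical names and do not reflect how runsc actually stores or finds containers.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var (
	errNotFound  = errors.New("no container matches the given ID")
	errAmbiguous = errors.New("ID prefix matches more than one container")
)

// resolveID expands an abbreviated ID to the single full ID it prefixes.
// An exact match always wins, even if other IDs share the prefix.
func resolveID(prefix string, ids []string) (string, error) {
	var match string
	count := 0
	for _, id := range ids {
		if id == prefix {
			return id, nil
		}
		if strings.HasPrefix(id, prefix) {
			match = id
			count++
		}
	}
	switch count {
	case 0:
		return "", errNotFound
	case 1:
		return match, nil
	default:
		return "", errAmbiguous
	}
}

func main() {
	ids := []string{
		"60185c721d7e10c00489f1fa210ee0d35c594873d6376b457fb1815e4fdbfc2c",
		"7a3f",
		"7a90",
	}
	for _, p := range []string{"60", "7a", "ff"} {
		id, err := resolveID(p, ids)
		fmt.Println(p, id, err)
	}
}
```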
2018-06-06 | Add support for rpcinet ioctl(2). | Brian Geffon
This change will add support for ioctls that have previously been supported by netstack. LINE_LENGTH_IGNORE PiperOrigin-RevId: 199544114 Change-Id: I3769202c19502c3b7d05e06ea9552acfd9255893
2018-06-06 | Add runsc checkpoint command. | Googler
Checkpoint command is plumbed through container and sandbox. Restore has also been added but it is only a stub. None of this works yet. More changes to come. PiperOrigin-RevId: 199510105 Change-Id: Ibd08d57f4737847eb25ca20b114518e487320185
2018-06-06 | Added a function to the controller to checkpoint a container. | Googler
Functionality for checkpoint is not complete, more to come. PiperOrigin-RevId: 199500803 Change-Id: Iafb0fcde68c584270000fea898e6657a592466f7
2018-06-05 | Add support for rpcinet owned procfs files. | Brian Geffon
This change will add support for /proc/sys/net and /proc/net which will be managed and owned by rpcinet. This will allow these inodes to be forwarded as RPCs. PiperOrigin-RevId: 199370799 Change-Id: I2c876005d98fe55dd126145163bee5a645458ce4
2018-06-05 | netstack: make TCP endpoint closed and error state cleanup work synchronous. | Zhaozhong Ni
This ensures that when a TCP endpoint in one of these states is saved, there are no pending or background activities. Also lift the tcp network save rejection error to the tcpip package. PiperOrigin-RevId: 199370748 Change-Id: Ief7b45c2a7338d12414cd7c23db95de6a9c22700
2018-06-04 | Make fsgofer attach more strict | Fabricio Voznika
Refuse to mount paths with "." and ".." in the path to prevent a compromised Sentry from mounting "../../secrets". Only allow Attach to be called once per mount point. PiperOrigin-RevId: 199225929 Change-Id: I2a3eb7ea0b23f22eb8dde2e383e32563ec003bd5
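The two checks described here can be sketched roughly as below. `attachPoint`, `validatePath`, and `attach` are hypothetical names chosen for the example, and the sketch assumes a single goroutine calls `attach`; the real fsgofer code may differ in structure and error types.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errAlreadyAttached = errors.New("attach already called for this mount point")

// attachPoint is a hypothetical stand-in for one mount point served by the gofer.
type attachPoint struct {
	attached bool
}

// validatePath rejects any path that has "." or ".." as a component, so the
// caller cannot walk outside the directory it was given.
func validatePath(p string) error {
	for _, elem := range strings.Split(p, "/") {
		if elem == "." || elem == ".." {
			return fmt.Errorf("path %q contains relative component %q", p, elem)
		}
	}
	return nil
}

// attach succeeds at most once per attach point and only for clean paths.
func (a *attachPoint) attach(p string) error {
	if a.attached {
		return errAlreadyAttached
	}
	if err := validatePath(p); err != nil {
		return err
	}
	a.attached = true
	return nil
}

func main() {
	a := &attachPoint{}
	fmt.Println(a.attach("../../secrets")) // rejected: relative component
	fmt.Println(a.attach("/data"))         // accepted
	fmt.Println(a.attach("/data"))         // rejected: second attach
}
```

Checking components rather than searching for the substring ".." keeps names such as "..data" or "a.b" valid.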
2018-06-04 | Create destination mount dir if it doesn't exist | Fabricio Voznika
PiperOrigin-RevId: 199175296 Change-Id: I694ad1cfa65572c92f77f22421fdcac818f44630
2018-06-04 | Return 'running' if gofer is still alive | Fabricio Voznika
Containerd will start deleting the container and rootfs after the container is stopped. However, if the gofer is still running, rootfs cleanup will fail with a device-busy error. This CL makes sure that the gofer is not running when the container state is stopped. Change from: lantaol@google.com PiperOrigin-RevId: 199172668 Change-Id: I9d874eec3ecf74fd9c8edd7f62d9f998edef66fe
2018-06-04 | Fix leaky FD | Fabricio Voznika
9P socket was being created without CLOEXEC and was being inherited by the children. This would prevent the gofer from detecting that the sandbox had exited, because the socket would not be closed. PiperOrigin-RevId: 199168959 Change-Id: I3ee1a07cbe7331b0aeb1cf2b697e728ce24f85a7
2018-06-04 | Refactor container_test in preparation for sandbox_test | Fabricio Voznika
Common code to set up and run the sandbox is moved to testutil. Also, don't link "boot" and "gofer" commands with the test binary. Instead, use the runsc binary from the build. This not only makes the test setup simpler, but also resolves a dependency issue with sandbox_tests not depending on the container package. PiperOrigin-RevId: 199164478 Change-Id: I27226286ca3f914d4d381358270dd7d70ee8372f
2018-06-04 | Fix checksum file for today's build | Fabricio Voznika
PiperOrigin-RevId: 199153448 Change-Id: Ic1f0456191080117a8586f77dd2fb44dc53754ca
2018-06-02 | Add SHA512 pointer to README | Fabricio Voznika
PiperOrigin-RevId: 199008198 Change-Id: I6d1a0107ae1b11f160b42a2cabaf1fb8ce419edf
2018-06-01 | Fix refcount bug in rpcinet socketOperations.Accept. | Brian Geffon
PiperOrigin-RevId: 198931222 Change-Id: I69ee12318e87b9a6a4a94b18a9bf0ae4e39d7eaf
2018-06-01 | Move page tables lock into the address space. | Adin Scannell
This is necessary to prevent races with invalidation. It is currently possible that page tables are garbage collected while paging caches refer to them. We must ensure that pages are held until caches can be invalidated. This is not achieved by this change alone, but moving locking outside the page tables themselves is a prerequisite. PiperOrigin-RevId: 198920784 Change-Id: I66fffecd49cb14aa2e676a84a68cabfc0c8b3e9a
2018-06-01 | Add SyscallRules that supports argument filtering | Zhengyu He
PiperOrigin-RevId: 198919043 Change-Id: I7f1f0a3b3430cd0936a4ee4fc6859aab71820bdf
2018-06-01 | Ignores IPv6 addresses when configuring network | Fabricio Voznika
Closes #60 PiperOrigin-RevId: 198887885 Change-Id: I9bf990ee3fde9259836e57d67257bef5b85c6008
2018-05-31 | Add SHA512 file to nightly build | Fabricio Voznika
PiperOrigin-RevId: 198745666 Change-Id: I38d4163cd65f1236b09ce4f6481197a9a9fd29f2
2018-05-30 | Restore FS on resume. | Adin Scannell
Previously, the vCPU FS was always correct because it relied on the reset coming out of the switch. When that doesn't occur, for example, using bluepill directly, the FS value can be incorrect leading to strange corruption. This change is necessary for a subsequent change that enforces guest mode for page table modifications, and it may reduce test flakiness. (The problematic path may occur in tests, but does not occur in the actual platform.) PiperOrigin-RevId: 198648137 Change-Id: I513910a973dd8666c9a1d18cf78990964d6a644d
2018-05-30 | Change ring0 & page tables arguments to structs. | Adin Scannell
This is a refactor of ring0 and ring0/pagetables that changes from individual arguments to opts structures. This should involve no functional changes, but sets the stage for subsequent changes. PiperOrigin-RevId: 198627556 Change-Id: Id4460340f6a73f0c793cd879324398139cd58ae9
2018-05-29 | Suppress error when deleting non-existing container with --force | Fabricio Voznika
This addresses the first issue reported in #59. CRI-O expects runsc to return success to delete when --force is used with a non-existing container. PiperOrigin-RevId: 198487418 Change-Id: If7660e8fdab1eb29549d0a7a45ea82e20a1d4f4a
2018-05-29 | Automated rollback of changelist 196886839 | Fabricio Voznika
PiperOrigin-RevId: 198457660 Change-Id: I6ea5cf0b4cfe2b5ba455325a7e5299880e5a088a
2018-05-24 | Poll should wake up on ECONNREFUSED with no mask. | Brian Geffon
Today poll will not wake up on an ECONNREFUSED if no poll mask is specified. An empty mask is equivalent to POLLHUP | POLLERR, which are implicitly added during the poll syscall. PiperOrigin-RevId: 197967183 Change-Id: I668d0730c33701228913f2d0843b48491b642efb
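The implicit-mask rule can be sketched as follows. The constants carry the Linux values of POLLIN, POLLERR, and POLLHUP; `effectiveMask` and `shouldWake` are hypothetical helpers, and the example assumes a refused connection is reported to the waiter as POLLERR.

```go
package main

import "fmt"

// Linux poll(2) event bits.
const (
	pollIn  int16 = 0x001
	pollErr int16 = 0x008
	pollHup int16 = 0x010
)

// effectiveMask returns the events a waiter is really registered for:
// POLLERR and POLLHUP are always reported, whatever the caller asked for.
func effectiveMask(requested int16) int16 {
	return requested | pollErr | pollHup
}

// shouldWake reports whether a waiter with the requested mask must be woken
// when the ready events occur.
func shouldWake(requested, ready int16) bool {
	return effectiveMask(requested)&ready != 0
}

func main() {
	// A caller that passes an empty mask must still wake up on an error.
	fmt.Println(shouldWake(0, pollErr))
	// The same caller is not interested in readable data.
	fmt.Println(shouldWake(0, pollIn))
}
```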
2018-05-24 | rpcinet connect doesn't handle all errnos correctly. | Brian Geffon
These were causing non-blocking related errnos to be returned to the sentry when they were created as blocking FDs internally. PiperOrigin-RevId: 197962932 Change-Id: I3f843535ff87ebf4cb5827e9f3d26abfb79461b0
2018-05-24 | Configure sandbox as superuser | Fabricio Voznika
The container user might not have enough privilege to walk directories and mount filesystems. Instead, create a superuser to perform these steps of the configuration. PiperOrigin-RevId: 197953667 Change-Id: I643650ab654e665408e2af1b8e2f2aa12d58d4fb
2018-05-23 | Adding test case for RST acceptable ack panic | Brian Geffon
PiperOrigin-RevId: 197795613 Change-Id: I759dd04995d900cba6b984649fa48bbc880946d6
2018-05-23 | Fix typo in TCP transport | Ian Gudger
PiperOrigin-RevId: 197789418 Change-Id: I86b1574c8d3b8b321348d9b101ffaef7aa15f722
2018-05-22 | Remove offset check to match with Linux implementation. | Fabricio Voznika
PiperOrigin-RevId: 197644246 Change-Id: I63eb0a58889e69fbc4af2af8232f6fa1c399d43f
2018-05-22 | When sending a RST the acceptable ACK window shouldn't change. | Brian Geffon
Today when we transmit a RST it's happening during the time-wait flow. Because a FIN is allowed to advance the acceptable ACK window we're incorrectly doing that for a RST. PiperOrigin-RevId: 197637565 Change-Id: I080190b06bd0225326cd68c1fbf37bd3fdbd414e
2018-05-22 | Change length type, and let fadvise64 return ESPIPE if file is a pipe | Chanwit Kaewkasi
Kernels before 2.6.16 return EINVAL, but later kernels return ESPIPE for this case. Also change the type of "length" from Uint (uint32) to Int64, because the C header uses type "size_t" (unsigned long) or "off_t" (long) for length, and it makes more sense to check length < 0 with Int64 because Uint cannot be negative. Change-Id: Ifd7fea2dcded7577a30760558d0d31f479f074c4 PiperOrigin-RevId: 197616743
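A minimal sketch of the resulting argument checks, assuming the length check runs before the pipe check (the ordering is an assumption, as are the names `checkFadvise`, `errInvalid`, and `errPipe`). It also shows why the signed type matters: a register holding -1 only compares below zero once it is interpreted as int64.

```go
package main

import (
	"fmt"
	"syscall"
)

var (
	errInvalid error = syscall.EINVAL
	errPipe    error = syscall.ESPIPE
)

// checkFadvise validates fadvise64 arguments. isPipe stands in for whatever
// test the caller uses to detect that the file is a pipe.
func checkFadvise(length int64, isPipe bool) error {
	if length < 0 {
		return errInvalid
	}
	if isPipe {
		return errPipe
	}
	return nil
}

func main() {
	// Raw 64-bit argument as it would arrive in a register for length = -1.
	raw := uint64(0xffffffffffffffff)

	// Read as a signed 64-bit value, the negative length is detectable.
	fmt.Println(checkFadvise(int64(raw), false))

	// A pipe with a valid length is rejected with ESPIPE.
	fmt.Println(checkFadvise(4096, true))
}
```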
2018-05-22 | sentry: Add simple SIOCGIFFLAGS support (IFF_RUNNING and IFF_PROMISC). | Kevin Krakauer
Establishes a way of communicating interface flags between netstack and epsocket. More flags can be added over time. PiperOrigin-RevId: 197616669 Change-Id: I230448c5fb5b7d2e8d69b41a451eb4e1096a0e30
2018-05-22 | Clarify that syserr.New must only be called during init | Ian Gudger
PiperOrigin-RevId: 197599402 Change-Id: I23eb0336195ab0d3e5fb49c0c57fc9e0715a9b75
2018-05-21 | Fix test failure when user can't mount temp dir | Fabricio Voznika
PiperOrigin-RevId: 197491098 Change-Id: Ifb75bd4e4f41b84256b6d7afc4b157f6ce3839f3
2018-05-21 | Dramatically improve handling of KVM vCPU pool. | Adin Scannell
Especially in situations with small numbers of vCPUs, the existing system resulted in excessive thrashing. Now, execution contexts co-ordinate as smoothly as they can to share a small number of cores. PiperOrigin-RevId: 197483323 Change-Id: I0afc0c5363ea9386994355baf3904bf5fe08c56c
2018-05-18 | sentry: Get "ip link" working. | Kevin Krakauer
In Linux, many UDS ioctls are passed through to the NIC driver. We do the same here, passing ioctl calls to Unix sockets through to epsocket. In Linux you can see this path at net/socket.c:sock_ioctl, which calls sock_do_ioctl, which calls net/core/dev_ioctl.c:dev_ioctl. SIOCGIFNAME is also added. PiperOrigin-RevId: 197167508 Change-Id: I62c326a4792bd0a473e9c9108aafb6a6354f2b64
2018-05-17 | Move postgres to list of supported images | Fabricio Voznika
PiperOrigin-RevId: 197104043 Change-Id: I377c0727ebf0c44361ed221e1b197787825bfb7b
2018-05-17 | Cleanup docs | Michael Pratt
This brings the proc document more up-to-date. PiperOrigin-RevId: 197070161 Change-Id: Iae2cf9dc44e3e748a33f497bb95bd3c10d0c094a
2018-05-17 | Fix capability check for sysv semaphores. | Rahat Mahmood
Capabilities for sysv sem operations were being checked against the current task's user namespace. They should be checked against the user namespace owning the ipc namespace for the sems instead, per ipc/util.c:ipcperms(). PiperOrigin-RevId: 197063111 Change-Id: Iba29486b316f2e01ee331dda4e48a6ab7960d589
2018-05-17 | Implement sysv shm. | Rahat Mahmood
PiperOrigin-RevId: 197058289 Change-Id: I3946c25028b7e032be4894d61acb48ac0c24d574
2018-05-17 | Fix sendto for dual stack UDP sockets | Ian Gudger
Previously, dual stack UDP sockets bound to an IPv4 address could not use sendto to communicate with IPv4 addresses. Further, dual stack UDP sockets bound to an IPv6 address could use sendto to communicate with IPv4 addresses. Neither of these behaviors is consistent with Linux. PiperOrigin-RevId: 197036024 Change-Id: Ic3713efc569f26196e35bb41e6ad63f23675fc90
2018-05-17 | Push signal-delivery and wait into the sandbox. | Nicolas Lacasse
This is another step towards multi-container support. Previously, we delivered signals directly to the sandbox process (which then forwarded the signal to PID 1 inside the sandbox). Similarly, we waited on a container by waiting on the sandbox process itself. This approach will not work when there are multiple containers inside the sandbox, and we need to signal/wait on individual containers. This CL adds two new messages, ContainerSignal and ContainerWait. These messages include the id of the container to signal/wait. The controller inside the sandbox receives these messages and signals/waits on the appropriate process inside the sandbox. The container id is plumbed into the sandbox, but it currently is not used. We still end up signaling/waiting on PID 1 in all cases. Once we actually have multiple containers inside the sandbox, we will need to keep some sort of map of container id -> pid (or possibly pid namespace), and signal/kill the appropriate process for the container. PiperOrigin-RevId: 197028366 Change-Id: I07b4d5dc91ecd2affc1447e6b4bdd6b0b7360895
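The container id -> pid map mentioned as future work could look roughly like this. Everything here is hypothetical: `signalArgs`, `controller`, and `targetPID` are illustrative names, and the message layout is an assumption rather than the actual ContainerSignal definition.

```go
package main

import "fmt"

// signalArgs is a hypothetical payload for a ContainerSignal-style message.
type signalArgs struct {
	CID   string // id of the container to signal
	Signo int    // signal number to deliver
}

// controller is a hypothetical in-sandbox controller that tracks which
// process is the init process of each container.
type controller struct {
	pids map[string]int // container id -> pid inside the sandbox
}

// targetPID looks up the process to signal for a message. It reports false
// when the container id is unknown.
func (c *controller) targetPID(args signalArgs) (int, bool) {
	pid, ok := c.pids[args.CID]
	return pid, ok
}

func main() {
	c := &controller{pids: map[string]int{"root": 1, "sidecar": 42}}
	for _, cid := range []string{"root", "sidecar", "missing"} {
		pid, ok := c.targetPID(signalArgs{CID: cid, Signo: 15})
		fmt.Println(cid, pid, ok)
	}
}
```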
2018-05-16 | Fix another socket Dirent refcount. | Christopher Koch
PiperOrigin-RevId: 196893452 Change-Id: I5ea0f851fcabc5eac5859e61f15213323d996337
2018-05-16 | Verify that when offset address is not null, infile must be seekable | Chanwit Kaewkasi
Change-Id: Id247399baeac58f6cd774acabd5d1da05e5b5697 PiperOrigin-RevId: 196887768
2018-05-16 | netstack: make TCP endpoint closed and error state cleanup work synchronous. | Zhaozhong Ni
This ensures that when a TCP endpoint in one of these states is saved, there are no pending or background activities. Also lift the tcp network save rejection error to the tcpip package. PiperOrigin-RevId: 196886839 Change-Id: I0fe73750f2743ec7e62d139eb2cec758c5dd6698
2018-05-16 | Refcount socket Dirents correctly. | Christopher Koch
This should fix the socket Dirent memory leak. fs.NewFile takes a new reference. It should hold the *only* reference. DecRef that socket Dirent. Before the globalDirentMap was introduced, a mis-refcounted Dirent would be garbage collected when all references to it were gone. For socket Dirents, this meant that they would be garbage collected when the associated fs.Files disappeared. After the globalDirentMap, Dirents *must* be reference-counted correctly to be garbage collected, as Dirents remove themselves from the global map when their refcount goes to -1 (see Dirent.destroy). That removes the last pointer to that Dirent. PiperOrigin-RevId: 196878973 Change-Id: Ic7afcd1de97c7101ccb13be5fc31de0fb50963f0
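The leak is easier to see with a toy model of the counting scheme. The sketch below assumes a counter that starts at 0 for the creator's reference and destroys the object when it reaches -1, as the message describes; `dirent`, `globalDirents`, and the helper names are hypothetical and much simpler than the real types.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var (
	mu            sync.Mutex
	globalDirents = map[string]*dirent{} // stand-in for the global Dirent map
)

// dirent is a toy reference-counted object. refs == 0 means one reference is
// held; the object is destroyed when refs drops to -1.
type dirent struct {
	name string
	refs int64
}

func newDirent(name string) *dirent {
	d := &dirent{name: name}
	mu.Lock()
	globalDirents[name] = d
	mu.Unlock()
	return d
}

func (d *dirent) incRef() { atomic.AddInt64(&d.refs, 1) }

func (d *dirent) decRef() {
	if atomic.AddInt64(&d.refs, -1) == -1 {
		d.destroy()
	}
}

// destroy removes the last pointer to the dirent so it can be collected.
func (d *dirent) destroy() {
	mu.Lock()
	delete(globalDirents, d.name)
	mu.Unlock()
}

func isTracked(name string) bool {
	mu.Lock()
	defer mu.Unlock()
	_, ok := globalDirents[name]
	return ok
}

func main() {
	d := newDirent("socket") // creator holds one reference
	d.incRef()               // the file takes its own reference
	d.decRef()               // creator drops its reference (the fix)
	d.decRef()               // file is released; count reaches -1
	fmt.Println(isTracked("socket"))
}
```

If the creator never drops its reference, the final release leaves the count at 0, the entry stays in the map, and the object is never collected.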
2018-05-16 | Release mutex in BidirectionalConnect to avoid deadlock. | Brian Geffon
When doing a BidirectionalConnect we don't need to continue holding the ConnectingEndpoint's mutex when creating the NewConnectedEndpoint as it was held during the Connect. Additionally, we're not holding the baseEndpoint mutex while Unregistering an event. PiperOrigin-RevId: 196875557 Change-Id: Ied4ceed89de883121c6cba81bc62aa3a8549b1e9
2018-05-15 | Fix KVM EFAULT handling. | Adin Scannell
PiperOrigin-RevId: 196781718 Change-Id: I889766eed871929cdc247c6b9aa634398adea9c9
2018-05-15 | Simplify KVM invalidation logic. | Adin Scannell
PiperOrigin-RevId: 196780209 Change-Id: I89f39eec914ce54a7c6c4f28e1b6d5ff5a7dd38d