summaryrefslogtreecommitdiffhomepage
path: root/pkg
AgeCommit message (Collapse)Author
2018-06-11Make page tables split-safe.Adin Scannell
In order to minimize the likelihood of exit during page table modifications, make the full set of page table functions split-safe. This is not strictly necessary (and you may still incur splits due to allocations from the allocator pool) but should make retries a very rare occurance. PiperOrigin-RevId: 200146688 Change-Id: I8fa36aa16b807beda2f0b057be60038258e8d597
2018-06-11Handle all exception vectors.Adin Scannell
PiperOrigin-RevId: 200144655 Change-Id: I5a753c74b75007b7714d6fe34aa0d2e845dc5c41
2018-06-11Set CLOEXEC option to socketsFabricio Voznika
hostinet/socket.go: the Sentry doesn't spawn new processes, but it doesn't hurt to protect the socket from leaking. unet/unet.go: should be setting closing on exec. The FD is explicitly donated to children when needed. PiperOrigin-RevId: 200135682 Change-Id: Ia8a45ced1e00a19420c8611b12e7a8ee770f89cb
2018-06-11Rpcinet is incorrectly handling MSG_TRUNC with SOCK_STREAMBrian Geffon
SOCK_STREAM has special behavior with respect to MSG_TRUNC. Specifically, the data isn't actually copied back out to userspace when MSG_TRUNC is provided on a SOCK_STREAM. According to tcp(7): "Since version 2.4, Linux supports the use of MSG_TRUNC in the flags argument of recv(2) (and recvmsg(2)). This flag causes the received bytes of data to be discarded, rather than passed back in a caller-supplied buffer." PiperOrigin-RevId: 200134860 Change-Id: I70f17a5f60ffe7794c3f0cfafd131c069202e90d
2018-06-11rpcinet is treating EAGAIN and EWOULDBLOCK as different errnos.Brian Geffon
PiperOrigin-RevId: 200124614 Change-Id: I38a7b083f1464a2a586fe24db648e624c455fec5
2018-06-11Add O_TRUNC handling in openatFabricio Voznika
PiperOrigin-RevId: 200103677 Change-Id: I3efb565c30c64d35f8fd7b5c05ed78dcc2990c51
2018-06-11Sentry: split tty.queue into its own file.Kevin Krakauer
Minor refactor. line_discipline.go was home to 2 large structs (lineDiscipline and queue), and queue is now large enough IMO to get its own file. Also moves queue locks into the queue struct, making locking simpler. PiperOrigin-RevId: 200080301 Change-Id: Ia75a0e9b3d9ac8d7e5a0f0099a54e1f5b8bdea34
2018-06-08Fix kernel flags handling and add missing vectors.Adin Scannell
PiperOrigin-RevId: 199877174 Change-Id: I9d19ea301608c2b989df0a6123abb1e779427853
2018-06-08Add checks for short CopyOut in rpcinetBrian Geffon
PiperOrigin-RevId: 199864753 Change-Id: Ibace6a1fdf99ee6ce368ac12c390aa8a02dbdfb7
2018-06-08Fix sigaltstack semantics.Adin Scannell
Walking off the bottom of the sigaltstack, for example with recursive faults, results in forced signal delivery, not resetting the stack or pushing signal stack to whatever happens to lie below the signal stack. PiperOrigin-RevId: 199856085 Change-Id: I0004d2523f0df35d18714de2685b3eaa147837e0
2018-06-08Add a protocol option to set congestion control algorithm.Bhasker Hariharan
Also adds support to query available congestion control algorithms. PiperOrigin-RevId: 199826897 Change-Id: I2b338b709820ee9cf58bb56d83aa7b1a39f4eab2
2018-06-08rpcinet is not correctly handling MSG_TRUNC on recvmsg(2).Brian Geffon
MSG_TRUNC can cause recvmsg(2) to return a value larger than the buffer size. In this situation it's an indication that the buffer was completely filled and that the msg was truncated. Previously in rpcinet we were returning the buffer size but we should actually be returning the payload length as returned by the syscall. PiperOrigin-RevId: 199814221 Change-Id: If09aa364219c1bf193603896fcc0dc5c55e85d21
2018-06-07rpcinet should not block in read(2) rpcs.Brian Geffon
PiperOrigin-RevId: 199703609 Change-Id: I8153b0396b22a230a68d4b69c46652a5545f7630
2018-06-07Add missing rpcinet ioctls.Brian Geffon
PiperOrigin-RevId: 199669120 Change-Id: I0be88cdbba29760f967e9a5bb4144ca62c1ed7aa
2018-06-07Sentry: very basic terminal echo support.Kevin Krakauer
Adds support for echo to terminals. Echoing is just copying input back out to the user, e.g. when I type "foo" into a terminal, I expect "foo" to be echoed back to my terminal. Also makes the transform function part of the queue, eliminating the need to pass them around together and the possibility of using the wrong transform for a queue. PiperOrigin-RevId: 199655147 Change-Id: I37c490d4fc1ee91da20ae58ba1f884a5c14fd0d8
2018-06-06Ensure guest-mode for page table modifications.Adin Scannell
Because of the KVM shadow page table implementation, modifications made to guest page tables from host mode may not be syncronized correctly, resulting in undefined behavior. This is a KVM bug: page table pages should also be tracked for host modifications and resynced appropriately (e.g. the guest could "DMA" into a page table page in theory). However, since we can't rely on this being fixed everywhere, workaround the issue by forcing page table modifications to be in guest mode. This will generally be the case anyways, but now if an exit occurs during modifications, we will re-enter and perform the modifications again. PiperOrigin-RevId: 199587895 Change-Id: I83c20b4cf2a9f9fa56f59f34939601dd34538fb0
2018-06-06Split PCID implementation from page tables.Adin Scannell
Instead of associating a single PCID with each set of page tables (which will reach the maximum quickly), allow a dynamic pool for each vCPU. This is the same way that Linux operates. We also split management of PCIDs out of the page tables themselves for simplicity. PiperOrigin-RevId: 199585631 Change-Id: I42f3486ada3cb2a26f623c65ac279b473ae63201
2018-06-06Add allocator abstraction for page tables.Adin Scannell
In order to prevent possible garbage collection and reuse of page table pages prior to invalidation, introduce a former allocator abstraction that can ensure entries are held during a single traversal. This also cleans up the abstraction and splits it out of the machine itself. PiperOrigin-RevId: 199581636 Change-Id: I2257d5d7ffd9c36f9b7ecd42f769261baeaf115c
2018-06-06Add support for rpcinet ioctl(2).Brian Geffon
This change will add support for ioctls that have previously been supported by netstack. LINE_LENGTH_IGNORE PiperOrigin-RevId: 199544114 Change-Id: I3769202c19502c3b7d05e06ea9552acfd9255893
2018-06-06Added a function to the controller to checkpoint a container.Googler
Functionality for checkpoint is not complete, more to come. PiperOrigin-RevId: 199500803 Change-Id: Iafb0fcde68c584270000fea898e6657a592466f7
2018-06-05Add support for rpcinet owned procfs files.Brian Geffon
This change will add support for /proc/sys/net and /proc/net which will be managed and owned by rpcinet. This will allow these inodes to be forward as rpcs. PiperOrigin-RevId: 199370799 Change-Id: I2c876005d98fe55dd126145163bee5a645458ce4
2018-06-05netstack: make TCP endpoint closed and error state cleanup work synchronous.Zhaozhong Ni
So that when saving TCP endpoint in these states, there is no pending or background activities. Also lift tcp network save rejection error to tcpip package. PiperOrigin-RevId: 199370748 Change-Id: Ief7b45c2a7338d12414cd7c23db95de6a9c22700
2018-06-01Fix refcount bug in rpcinet socketOperations.Accept.Brian Geffon
PiperOrigin-RevId: 198931222 Change-Id: I69ee12318e87b9a6a4a94b18a9bf0ae4e39d7eaf
2018-06-01Move page tables lock into the address space.Adin Scannell
This is necessary to prevent races with invalidation. It is currently possible that page tables are garbage collected while paging caches refer to them. We must ensure that pages are held until caches can be invalidated. This is not achieved by this goal alone, but moving locking to outside the page tables themselves is a requisite. PiperOrigin-RevId: 198920784 Change-Id: I66fffecd49cb14aa2e676a84a68cabfc0c8b3e9a
2018-06-01Add SyscallRules that supports argument filteringZhengyu He
PiperOrigin-RevId: 198919043 Change-Id: I7f1f0a3b3430cd0936a4ee4fc6859aab71820bdf
2018-05-30Restore FS on resume.Adin Scannell
Previously, the vCPU FS was always correct because it relied on the reset coming out of the switch. When that doesn't occur, for example, using bluepill directly, the FS value can be incorrect leading to strange corruption. This change is necessary for a subsequent change that enforces guest mode for page table modifications, and it may reduce test flakiness. (The problematic path may occur in tests, but does not occur in the actual platform.) PiperOrigin-RevId: 198648137 Change-Id: I513910a973dd8666c9a1d18cf78990964d6a644d
2018-05-30Change ring0 & page tables arguments to structs.Adin Scannell
This is a refactor of ring0 and ring0/pagetables that changes from individual arguments to opts structures. This should involve no functional changes, but sets the stage for subsequent changes. PiperOrigin-RevId: 198627556 Change-Id: Id4460340f6a73f0c793cd879324398139cd58ae9
2018-05-29Automated rollback of changelist 196886839Fabricio Voznika
PiperOrigin-RevId: 198457660 Change-Id: I6ea5cf0b4cfe2b5ba455325a7e5299880e5a088a
2018-05-24Poll should wake up on ECONNREFUSED with no mask.Brian Geffon
Today poll will not wake up on a ECONNREFUSED if no poll mask is specified, which is equivalent to POLLHUP | POLLERR which are implicitly added during the poll syscall. PiperOrigin-RevId: 197967183 Change-Id: I668d0730c33701228913f2d0843b48491b642efb
2018-05-24rpcinet connect doesn't handle all errnos correctly.Brian Geffon
These were causing non-blocking related errnos to be returned to the sentry when they were created as blocking FDs internally. PiperOrigin-RevId: 197962932 Change-Id: I3f843535ff87ebf4cb5827e9f3d26abfb79461b0
2018-05-23Adding test case for RST acceptable ack panicBrian Geffon
PiperOrigin-RevId: 197795613 Change-Id: I759dd04995d900cba6b984649fa48bbc880946d6
2018-05-23Fix typo in TCP transportIan Gudger
PiperOrigin-RevId: 197789418 Change-Id: I86b1574c8d3b8b321348d9b101ffaef7aa15f722
2018-05-22Remove offset check to match with Linux implementation.Fabricio Voznika
PiperOrigin-RevId: 197644246 Change-Id: I63eb0a58889e69fbc4af2af8232f6fa1c399d43f
2018-05-22When sending a RST the acceptable ACK window shouldn't change.Brian Geffon
Today when we transmit a RST it's happening during the time-wait flow. Because a FIN is allowed to advance the acceptable ACK window we're incorrectly doing that for a RST. PiperOrigin-RevId: 197637565 Change-Id: I080190b06bd0225326cd68c1fbf37bd3fdbd414e
2018-05-22Change length type, and let fadvise64 return ESPIPE if file is a pipeChanwit Kaewkasi
Kernel before 2.6.16 return EINVAL, but later return ESPIPE for this case. Also change type of "length" from Uint(uint32) to Int64. Because C header uses type "size_t" (unsigned long) or "off_t" (long) for length. And it makes more sense to check length < 0 with Int64 because Uint cannot be negative. Change-Id: Ifd7fea2dcded7577a30760558d0d31f479f074c4 PiperOrigin-RevId: 197616743
2018-05-22sentry: Add simple SIOCGIFFLAGS support (IFF_RUNNING and IFF_PROMIS).Kevin Krakauer
Establishes a way of communicating interface flags between netstack and epsocket. More flags can be added over time. PiperOrigin-RevId: 197616669 Change-Id: I230448c5fb5b7d2e8d69b41a451eb4e1096a0e30
2018-05-22Clarify that syserr.New must only be called during initIan Gudger
PiperOrigin-RevId: 197599402 Change-Id: I23eb0336195ab0d3e5fb49c0c57fc9e0715a9b75
2018-05-21Dramatically improve handling of KVM vCPU pool.Adin Scannell
Especially in situations with small numbers of vCPUs, the existing system resulted in excessive thrashing. Now, execution contexts co-ordinate as smoothly as they can to share a small number of cores. PiperOrigin-RevId: 197483323 Change-Id: I0afc0c5363ea9386994355baf3904bf5fe08c56c
2018-05-18sentry: Get "ip link" working.Kevin Krakauer
In Linux, many UDS ioctls are passed through to the NIC driver. We do the same here, passing ioctl calls to Unix sockets through to epsocket. In Linux you can see this path at net/socket.c:sock_ioctl, which calls sock_do_ioctl, which calls net/core/dev_ioctl.c:dev_ioctl. SIOCGIFNAME is also added. PiperOrigin-RevId: 197167508 Change-Id: I62c326a4792bd0a473e9c9108aafb6a6354f2b64
2018-05-17Cleanup docsMichael Pratt
This brings the proc document more up-to-date. PiperOrigin-RevId: 197070161 Change-Id: Iae2cf9dc44e3e748a33f497bb95bd3c10d0c094a
2018-05-17Fix capability check for sysv semaphores.Rahat Mahmood
Capabilities for sysv sem operations were being checked against the current task's user namespace. They should be checked against the user namespace owning the ipc namespace for the sems instead, per ipc/util.c:ipcperms(). PiperOrigin-RevId: 197063111 Change-Id: Iba29486b316f2e01ee331dda4e48a6ab7960d589
2018-05-17Implement sysv shm.Rahat Mahmood
PiperOrigin-RevId: 197058289 Change-Id: I3946c25028b7e032be4894d61acb48ac0c24d574
2018-05-17Fix sendto for dual stack UDP socketsIan Gudger
Previously, dual stack UDP sockets bound to an IPv4 address could not use sendto to communicate with IPv4 addresses. Further, dual stack UDP sockets bound to an IPv6 address could use sendto to communicate with IPv4 addresses. Neither of these behaviors are consistent with Linux. PiperOrigin-RevId: 197036024 Change-Id: Ic3713efc569f26196e35bb41e6ad63f23675fc90
2018-05-16Fix another socket Dirent refcount.Christopher Koch
PiperOrigin-RevId: 196893452 Change-Id: I5ea0f851fcabc5eac5859e61f15213323d996337
2018-05-16Verify that when offset address is not null, infile must be seekableChanwit Kaewkasi
Change-Id: Id247399baeac58f6cd774acabd5d1da05e5b5697 PiperOrigin-RevId: 196887768
2018-05-16netstack: make TCP endpoint closed and error state cleanup work synchronous.Zhaozhong Ni
So that when saving TCP endpoint in these states, there is no pending or background activities. Also lift tcp network save rejection error to tcpip package. PiperOrigin-RevId: 196886839 Change-Id: I0fe73750f2743ec7e62d139eb2cec758c5dd6698
2018-05-16Refcount socket Dirents correctly.Christopher Koch
This should fix the socket Dirent memory leak. fs.NewFile takes a new reference. It should hold the *only* reference. DecRef that socket Dirent. Before the globalDirentMap was introduced, a mis-refcounted Dirent would be garbage collected when all references to it were gone. For socket Dirents, this meant that they would be garbage collected when the associated fs.Files disappeared. After the globalDirentMap, Dirents *must* be reference-counted correctly to be garbage collected, as Dirents remove themselves from the global map when their refcount goes to -1 (see Dirent.destroy). That removes the last pointer to that Dirent. PiperOrigin-RevId: 196878973 Change-Id: Ic7afcd1de97c7101ccb13be5fc31de0fb50963f0
2018-05-16Release mutex in BidirectionalConnect to avoid deadlock.Brian Geffon
When doing a BidirectionalConnect we don't need to continue holding the ConnectingEndpoint's mutex when creating the NewConnectedEndpoint as it was held during the Connect. Additionally, we're not holding the baseEndpoint mutex while Unregistering an event. PiperOrigin-RevId: 196875557 Change-Id: Ied4ceed89de883121c6cba81bc62aa3a8549b1e9
2018-05-15Fix KVM EFAULT handling.Adin Scannell
PiperOrigin-RevId: 196781718 Change-Id: I889766eed871929cdc247c6b9aa634398adea9c9
2018-05-15Simplify KVM invalidation logic.Adin Scannell
PiperOrigin-RevId: 196780209 Change-Id: I89f39eec914ce54a7c6c4f28e1b6d5ff5a7dd38d