summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/socket
AgeCommit message (Collapse)Author
2020-06-10{S,G}etsockopt for TCP_KEEPCNT option.Nayana Bidari
TCP_KEEPCNT is used to set the maximum keepalive probes to be sent before dropping the connection. WANT_LGTM=jchacon PiperOrigin-RevId: 315758094
2020-06-10socket/unix: handle sendto address argument for connected socketsAndrei Vagin
In case of SOCK_SEQPACKET, it has to be ignored. In case of SOCK_STREAM, EISCONN or EOPNOTSUPP has to be returned. PiperOrigin-RevId: 315755972
2020-06-09Implement flock(2) in VFS2Fabricio Voznika
LockFD is the generic implementation that can be embedded in FileDescriptionImpl implementations. Unique lock ID is maintained in vfs.FileDescription and is created on demand. Updates #1480 PiperOrigin-RevId: 315604825
2020-06-07netstack: parse incoming packet headers up-frontKevin Krakauer
Netstack has traditionally parsed headers on-demand as a packet moves up the stack. This is conceptually simple and convenient, but incompatible with iptables, where headers can be inspected and mangled before even a routing decision is made. This changes header parsing to happen early in the incoming packet path, as soon as the NIC gets the packet from a link endpoint. Even if an invalid packet is found (e.g. a TCP header of insufficient length), the packet is passed up the stack for proper stats bookkeeping. PiperOrigin-RevId: 315179302
2020-06-05Fix error code returned due to Port exhaustion.Bhasker Hariharan
For TCP sockets gVisor incorrectly returns EAGAIN when no ephemeral ports are available to bind during a connect. Linux returns EADDRNOTAVAIL. This change fixes gVisor to return the correct code and adds a test for the same. This change also fixes a minor bug for ping sockets where connect() would fail with EINVAL unless the socket was bound first. Also added tests for testing UDP Port exhaustion and Ping socket port exhaustion. PiperOrigin-RevId: 314988525
2020-06-05Fix copylocks error about copying IPTables.Ting-Yu Wang
IPTables.connections contains a sync.RWMutex. Copying it will trigger copylocks analysis. Tested by manually enabling nogo tests. sync.RWMutex is added to IPTables for the additional race condition discovered. PiperOrigin-RevId: 314817019
2020-06-03Pass PacketBuffer as pointer.Ting-Yu Wang
Historically we've been passing PacketBuffer by shallow copying through out the stack. Right now, this is only correct as the caller would not use PacketBuffer after passing into the next layer in netstack. With new buffer management effort in gVisor/netstack, PacketBuffer will own a Buffer (to be added). Internally, both PacketBuffer and Buffer may have pointers and shallow copying shouldn't be used. Updates #2404. PiperOrigin-RevId: 314610879
2020-06-02Check that two sockets with different types can't be connected to each otherAndrei Vagin
PiperOrigin-RevId: 314450191
2020-05-28Enable iptables source filtering (-s/--source)Kevin Krakauer
2020-05-19Fix flaky udp tests by polling before reading.Dean Deng
On native Linux, calling recv/read right after send/write sometimes returns EWOULDBLOCK, if the data has not made it to the receiving socket (even though the endpoints are on the same host). Poll before reading to avoid this. Making this change also uncovered a hostinet bug (gvisor.dev/issue/2726), which is noted in this CL. PiperOrigin-RevId: 312320587
2020-05-15Minor formatting updates for gvisor.dev.Adin Scannell
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense in one place. * Drop "user-space kernel" and use "application kernel". The term "user-space kernel" is confusing when some platform implementation do not run in user-space (instead running in guest ring zero). * Clear up the relationship between the Platform page in the user guide and the Platform page in the architecture guide, and ensure they are cross-linked. * Restore the call-to-action quick start link in the main page, and drop the GitHub link (which also appears in the top-right). * Improve image formatting by centering all doc and blog images, and move the image captions to the alt text. PiperOrigin-RevId: 311845158
2020-05-13Stub support for TCP_SYNCNT and TCP_WINDOW_CLAMP.Bhasker Hariharan
This change adds support for TCP_SYNCNT and TCP_WINDOW_CLAMP options in GetSockOpt/SetSockOpt. This change does not really change any behaviour in Netstack and only stores/returns the stored value. Actual honoring of these options will be added as required. Fixes #2626, #2625 PiperOrigin-RevId: 311453777
2020-05-12Merge pull request #2678 from nybidari:iptablesgVisor bot
PiperOrigin-RevId: 311203776
2020-05-12Merge pull request #2671 from kevinGC:skip-outputgVisor bot
PiperOrigin-RevId: 311181084
2020-05-12Don't call kernel.Task.Block() from netstack.SocketOperations.Write().Jamie Liu
kernel.Task.Block() requires that the caller is running on the task goroutine. netstack.SocketOperations.Write() uses kernel.TaskFromContext() to call kernel.Task.Block() even if it's not running on the task goroutine. Stop doing that. PiperOrigin-RevId: 311178335
2020-05-12iptables: support gid match for owner matching.Nayana Bidari
- Added support for matching gid owner and invert flag for uid and gid. $ iptables -A OUTPUT -p tcp -m owner --gid-owner root -j ACCEPT $ iptables -A OUTPUT -p tcp -m owner ! --uid-owner root -j ACCEPT $ iptables -A OUTPUT -p tcp -m owner ! --gid-owner root -j DROP - Added tests for uid, gid and invert flags.
2020-05-11iptables: check for truly unconditional rulesKevin Krakauer
We weren't properly checking whether the inserted default rule was unconditional.
2020-05-08iptables - filter packets using outgoing interface.gVisor bot
Enables commands with -o (--out-interface) for iptables rules. $ iptables -A OUTPUT -o eth0 -j ACCEPT PiperOrigin-RevId: 310642286
2020-05-07Allocate device numbers for VFS2 filesystems.Jamie Liu
Updates #1197, #1198, #1672 PiperOrigin-RevId: 310432006
2020-05-05Update vfs2 socket TODOs.Dean Deng
Three updates: - Mark all vfs2 socket syscalls as supported. - Use the same dev number and ino number generator for all types of sockets, unlike in VFS1. - Do not use host fd for hostinet metadata. Fixes #1476, #1478, #1484, 1485, #2017. PiperOrigin-RevId: 309994579
2020-05-01Support for connection tracking of TCP packets.Nayana Bidari
Connection tracking is used to track packets in prerouting and output hooks of iptables. The NAT rules modify the tuples in connections. The connection tracking code modifies the packets by looking at the modified tuples.
2020-05-01Automated rollback of changelist 308674219Kevin Krakauer
PiperOrigin-RevId: 309491861
2020-05-01Port netstack, hostinet, and netlink sockets to VFS2.Dean Deng
All three follow the same pattern: 1. Refactor VFS1 sockets into socketOpsCommon, so that most of the methods can be shared with VFS2. 2. Create a FileDescriptionImpl with the corresponding socket operations, rewriting the few that cannot be shared with VFS1. 3. Set up a VFS2 socket provider that creates a socket by setting up a dentry in the global Kernel.socketMount and connecting it with a new FileDescription. This mostly completes the work for porting sockets to VFS2, and many syscall tests can be enabled as a result. There are several networking-related syscall tests that are still not passing: 1. net gofer tests 2. socketpair gofer tests 2. sendfile tests (splice is not implemented in VFS2 yet) Updates #1478, #1484, #1485 PiperOrigin-RevId: 309457331
2020-04-29iptables: don't pollute logsKevin Krakauer
The netfilter package uses logs to make debugging the (de)serialization of structs easier. This generates a lot of (usually irrelevant) logs. Logging is now hidden behind a debug flag. PiperOrigin-RevId: 309087115
2020-04-28Fix Unix socket permissions.Dean Deng
Enforce write permission checks in BoundEndpointAt, which corresponds to the permission checks in Linux (net/unix/af_unix.c:unix_find_other). Also, create bound socket files with the correct permissions in VFS2. Fixes #2324. PiperOrigin-RevId: 308949084
2020-04-28Deduplicate unix socket Release() method.Dean Deng
PiperOrigin-RevId: 308932254
2020-04-28Support pipes and sockets in VFS2 gofer fs.Dean Deng
Named pipes and sockets can be represented in two ways in gofer fs: 1. As a file on the remote filesystem. In this case, all file operations are passed through 9p. 2. As a synthetic file that is internal to the sandbox. In this case, the dentry stores an endpoint or VFSPipe for sockets and pipes respectively, which replaces interactions with the remote fs through the gofer. In gofer.filesystem.MknodAt, we attempt to call mknod(2) through 9p, and if it fails, fall back to the synthetic version. Updates #1200. PiperOrigin-RevId: 308828161
2020-04-27Import host sockets.Dean Deng
The FileDescription implementation for hostfs sockets uses the standard Unix socket implementation (unix.SocketVFS2), but is also tied to a hostfs dentry. Updates #1672, #1476 PiperOrigin-RevId: 308716426
2020-04-27Automated rollback of changelist 308163542gVisor bot
PiperOrigin-RevId: 308674219
2020-04-24Port SCM Rights to VFS2.Dean Deng
Fixes #1477. PiperOrigin-RevId: 308317511
2020-04-23Remove View.First() and View.RemoveFirst()Kevin Krakauer
These methods let users eaily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when when all the headers fit in the first View. PiperOrigin-RevId: 308163542
2020-04-21Sentry metrics updates.Dave Bailey
Sentry metrics with nanoseconds units are labeled as such, and non-cumulative sentry metrics are supported. PiperOrigin-RevId: 307621080
2020-04-21Automated rollback of changelist 307477185gVisor bot
PiperOrigin-RevId: 307598974
2020-04-17Remove View.First() and View.RemoveFirst()Kevin Krakauer
These methods let users eaily break the VectorisedView abstraction, and allowed netstack to slip into pseudo-enforcement of the "all headers are in the first View" invariant. Removing them and replacing with PullUp(n) breaks this reliance and will make it easier to add iptables support and rework network buffer management. The new View.PullUp(n) method is low cost in the common case, when when all the headers fit in the first View.
2020-04-17Permit setting unknown optionsTamir Duberstein
This previously changed in 305699233, but this behaviour turned out to be load bearing. PiperOrigin-RevId: 307033802
2020-04-09Replace type assertion with TaskFromContext.Ting-Yu Wang
This should fix panic at aio callback. PiperOrigin-RevId: 305798549
2020-04-09Convert int and bool socket options to use GetSockOptInt and GetSockOptBoolAndrei Vagin
PiperOrigin-RevId: 305699233
2020-04-07Remove out-of-date TODOs.Ting-Yu Wang
We already have network namespace for netstack. PiperOrigin-RevId: 305341954
2020-04-04Record VFS2 sockets in global socket map.Dean Deng
Updates #1476, #1478, #1484, #1485. PiperOrigin-RevId: 304845354
2020-04-03Add FileDescriptionImpl for Unix sockets.Dean Deng
This change involves several steps: - Refactor the VFS1 unix socket implementation to share methods between VFS1 and VFS2 where possible. Re-implement the rest. - Override the default PRead, Read, PWrite, Write, Ioctl, Release methods in FileDescriptionDefaultImpl. - Add functions to create and initialize a new Dentry/Inode and FileDescription for a Unix socket file. Updates #1476 PiperOrigin-RevId: 304689796
2020-04-02Fix typo in TODO comments.Dean Deng
PiperOrigin-RevId: 304508083
2020-04-01Add FileDescription interface for socket files.Dean Deng
Refactor the existing socket interface to share methods between VFS1 and VFS2. The method signatures do not contain anything filesystem-related, so they don't need to be re-defined for VFS2. Updates #1476, #1478, #1484, #1485. PiperOrigin-RevId: 304184545
2020-03-26Support owner matching for iptables.Nayana Bidari
This feature will match UID and GID of the packet creator, for locally generated packets. This match is only valid in the OUTPUT and POSTROUTING chains. Forwarded packets do not have any socket associated with them. Packets from kernel threads do have a socket, but usually no owner.
2020-03-25Automated rollback of changelist 301837227Bhasker Hariharan
PiperOrigin-RevId: 302891559
2020-03-24Move tcpip.PacketBuffer and IPTables to stack package.Bhasker Hariharan
This is a precursor to be being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-03-23Support basic /proc/net/dev metrics for netstackIan Lewis
Fixes #506 PiperOrigin-RevId: 302540404
2020-03-23Fix data race in SetSockOpt.Bhasker Hariharan
PiperOrigin-RevId: 302539171
2020-03-19Change SocketOperations.readMu to an RWMutex.Bhasker Hariharan
Also get rid of the readViewHasData as it's not required anymore. Updates #231, #357 PiperOrigin-RevId: 301837227
2020-03-19Remove workMu from tcpip.Endpoint.Bhasker Hariharan
workMu is removed and e.mu is now a mutex that supports TryLock. The packet processing path tries to lock the mutex and if its locked it will just queue the packet and move on. The endpoint.UnlockUser() will process any backlog of packets before unlocking the socket. This simplifies the locking inside tcp endpoints a lot. Further the endpoint.LockUser() implements spinning as long as the lock is not held by another syscall goroutine. This ensures low latency as not spinning leads to the task thread being put to sleep if the lock is held by the packet dispatch path. This is suboptimal as the lower layer rarely holds the lock for long so implementing spinning here helps. If the lock is held by another task goroutine then we just proceed to call LockUser() and the task could be put to sleep. The protocol goroutines themselves just call e.mu.Lock() and block if the lock is currently not available. Updates #231, #357 PiperOrigin-RevId: 301808349
2020-03-16Merge pull request #1943 from kevinGC:ipt-filter-ipgVisor bot
PiperOrigin-RevId: 301197007