summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/socket/netstack
AgeCommit message (Collapse)Author
2020-05-15Minor formatting updates for gvisor.dev.Adin Scannell
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense in one place. * Drop "user-space kernel" and use "application kernel". The term "user-space kernel" is confusing when some platform implementation do not run in user-space (instead running in guest ring zero). * Clear up the relationship between the Platform page in the user guide and the Platform page in the architecture guide, and ensure they are cross-linked. * Restore the call-to-action quick start link in the main page, and drop the GitHub link (which also appears in the top-right). * Improve image formatting by centering all doc and blog images, and move the image captions to the alt text. PiperOrigin-RevId: 311845158
2020-05-13Stub support for TCP_SYNCNT and TCP_WINDOW_CLAMP.Bhasker Hariharan
This change adds support for TCP_SYNCNT and TCP_WINDOW_CLAMP options in GetSockOpt/SetSockOpt. This change does not really change any behaviour in Netstack and only stores/returns the stored value. Actual honoring of these options will be added as required. Fixes #2626, #2625 PiperOrigin-RevId: 311453777
2020-05-12Don't call kernel.Task.Block() from netstack.SocketOperations.Write().Jamie Liu
kernel.Task.Block() requires that the caller is running on the task goroutine. netstack.SocketOperations.Write() uses kernel.TaskFromContext() to call kernel.Task.Block() even if it's not running on the task goroutine. Stop doing that. PiperOrigin-RevId: 311178335
2020-05-07Allocate device numbers for VFS2 filesystems.Jamie Liu
Updates #1197, #1198, #1672 PiperOrigin-RevId: 310432006
2020-05-05Update vfs2 socket TODOs.Dean Deng
Three updates: - Mark all vfs2 socket syscalls as supported. - Use the same dev number and ino number generator for all types of sockets, unlike in VFS1. - Do not use host fd for hostinet metadata. Fixes #1476, #1478, #1484, 1485, #2017. PiperOrigin-RevId: 309994579
2020-05-01Port netstack, hostinet, and netlink sockets to VFS2.Dean Deng
All three follow the same pattern: 1. Refactor VFS1 sockets into socketOpsCommon, so that most of the methods can be shared with VFS2. 2. Create a FileDescriptionImpl with the corresponding socket operations, rewriting the few that cannot be shared with VFS1. 3. Set up a VFS2 socket provider that creates a socket by setting up a dentry in the global Kernel.socketMount and connecting it with a new FileDescription. This mostly completes the work for porting sockets to VFS2, and many syscall tests can be enabled as a result. There are several networking-related syscall tests that are still not passing: 1. net gofer tests 2. socketpair gofer tests 2. sendfile tests (splice is not implemented in VFS2 yet) Updates #1478, #1484, #1485 PiperOrigin-RevId: 309457331
2020-04-21Sentry metrics updates.Dave Bailey
Sentry metrics with nanoseconds units are labeled as such, and non-cumulative sentry metrics are supported. PiperOrigin-RevId: 307621080
2020-04-09Replace type assertion with TaskFromContext.Ting-Yu Wang
This should fix panic at aio callback. PiperOrigin-RevId: 305798549
2020-04-09Convert int and bool socket options to use GetSockOptInt and GetSockOptBoolAndrei Vagin
PiperOrigin-RevId: 305699233
2020-04-07Remove out-of-date TODOs.Ting-Yu Wang
We already have network namespace for netstack. PiperOrigin-RevId: 305341954
2020-04-03Add FileDescriptionImpl for Unix sockets.Dean Deng
This change involves several steps: - Refactor the VFS1 unix socket implementation to share methods between VFS1 and VFS2 where possible. Re-implement the rest. - Override the default PRead, Read, PWrite, Write, Ioctl, Release methods in FileDescriptionDefaultImpl. - Add functions to create and initialize a new Dentry/Inode and FileDescription for a Unix socket file. Updates #1476 PiperOrigin-RevId: 304689796
2020-04-02Fix typo in TODO comments.Dean Deng
PiperOrigin-RevId: 304508083
2020-03-26Support owner matching for iptables.Nayana Bidari
This feature will match UID and GID of the packet creator, for locally generated packets. This match is only valid in the OUTPUT and POSTROUTING chains. Forwarded packets do not have any socket associated with them. Packets from kernel threads do have a socket, but usually no owner.
2020-03-25Automated rollback of changelist 301837227Bhasker Hariharan
PiperOrigin-RevId: 302891559
2020-03-24Move tcpip.PacketBuffer and IPTables to stack package.Bhasker Hariharan
This is a precursor to be being able to build an intrusive list of PacketBuffers for use in queuing disciplines being implemented. Updates #2214 PiperOrigin-RevId: 302677662
2020-03-23Support basic /proc/net/dev metrics for netstackIan Lewis
Fixes #506 PiperOrigin-RevId: 302540404
2020-03-23Fix data race in SetSockOpt.Bhasker Hariharan
PiperOrigin-RevId: 302539171
2020-03-19Change SocketOperations.readMu to an RWMutex.Bhasker Hariharan
Also get rid of the readViewHasData as it's not required anymore. Updates #231, #357 PiperOrigin-RevId: 301837227
2020-03-19Remove workMu from tcpip.Endpoint.Bhasker Hariharan
workMu is removed and e.mu is now a mutex that supports TryLock. The packet processing path tries to lock the mutex and if its locked it will just queue the packet and move on. The endpoint.UnlockUser() will process any backlog of packets before unlocking the socket. This simplifies the locking inside tcp endpoints a lot. Further the endpoint.LockUser() implements spinning as long as the lock is not held by another syscall goroutine. This ensures low latency as not spinning leads to the task thread being put to sleep if the lock is held by the packet dispatch path. This is suboptimal as the lower layer rarely holds the lock for long so implementing spinning here helps. If the lock is held by another task goroutine then we just proceed to call LockUser() and the task could be put to sleep. The protocol goroutines themselves just call e.mu.Lock() and block if the lock is currently not available. Updates #231, #357 PiperOrigin-RevId: 301808349
2020-03-02Fix panic caused by invalid address for Bind in packet sockets.Nayana Bidari
PiperOrigin-RevId: 298476533
2020-03-02socket: take readMu to access readViewAndrei Vagin
DATA RACE in netstack.(*SocketOperations).fetchReadView Write at 0x00c001dca138 by goroutine 1001: gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).fetchReadView() pkg/sentry/socket/netstack/netstack.go:418 +0x85 gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).coalescingRead() pkg/sentry/socket/netstack/netstack.go:2309 +0x67 gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).nonBlockingRead() pkg/sentry/socket/netstack/netstack.go:2378 +0x183d Previous read at 0x00c001dca138 by goroutine 1111: gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).Ioctl() pkg/sentry/socket/netstack/netstack.go:2666 +0x533 gvisor.dev/gvisor/pkg/sentry/syscalls/linux.Ioctl() Reported-by: syzbot+d4c3885fcc346f08deb6@syzkaller.appspotmail.com PiperOrigin-RevId: 298387377
2020-02-27Internal change.Nayana Bidari
PiperOrigin-RevId: 297638665
2020-02-19Internal change.gVisor bot
PiperOrigin-RevId: 296088213
2020-02-18Enable IPV6_RECVTCLASS socket option for datagram socketsgVisor bot
Added the ability to get/set the IP_RECVTCLASS socket option on UDP endpoints. If enabled, traffic class from the incoming Network Header passed as ancillary data in the ControlMessages. Adding Get/SetSockOptBool to decrease the overhead of getting/setting simple options. (This was absorbed in a CL that will be landing before this one). Test: * Added unit test to udp_test.go that tests getting/setting as well as verifying that we receive expected TOS from incoming packet. * Added a syscall test for verifying getting/setting * Removed test skip for existing syscall test to enable end to end test. PiperOrigin-RevId: 295840218
2020-02-13Internal change.gVisor bot
PiperOrigin-RevId: 294952610
2020-02-05recv() on a closed TCP socket returns ENOTCONNEyal Soha
From RFC 793 s3.9 p58 Event Processing: If RECEIVE Call arrives in CLOSED state and the user has access to such a connection, the return should be "error: connection does not exist" Fixes #1598 PiperOrigin-RevId: 293494287
2020-02-04Support RTM_NEWADDR and RTM_GETLINK in (rt)netlink.Ting-Yu Wang
PiperOrigin-RevId: 293271055
2020-01-29Add support for TCP_DEFER_ACCEPT.Bhasker Hariharan
PiperOrigin-RevId: 292233574
2020-01-27Update package locations.Adin Scannell
Because the abi will depend on the core types for marshalling (usermem, context, safemem, safecopy), these need to be flattened from the sentry directory. These packages contain no sentry-specific details. PiperOrigin-RevId: 291811289
2020-01-27Standardize on tools directory.Adin Scannell
PiperOrigin-RevId: 291745021
2020-01-21Add a new TCP stat for current open connections.Mithun Iyer
Such a stat accounts for all connections that are currently established and not yet transitioned to close state. Also fix bug in double increment of CurrentEstablished stat. Fixes #1579 PiperOrigin-RevId: 290827365
2020-01-21Merge pull request #1558 from kevinGC:iptables-write-input-dropgVisor bot
PiperOrigin-RevId: 290793754
2020-01-17Filter out received packets with a local source IP address.Eyal Soha
CERT Advisory CA-96.21 III. Solution advises that devices drop packets which could not have correctly arrived on the wire, such as receiving a packet where the source IP address is owned by the device that sent it. Fixes #1507 PiperOrigin-RevId: 290378240
2020-01-14Implement {g,s}etsockopt(IP_RECVTOS) for UDP socketsTamir Duberstein
PiperOrigin-RevId: 289718534
2020-01-13Allow dual stack sockets to operate on AF_INETTamir Duberstein
Fixes #1490 Fixes #1495 PiperOrigin-RevId: 289523250
2020-01-13Merge branch 'master' into iptables-write-input-dropKevin Krakauer
2020-01-13Merge pull request #1528 from kevinGC:iptables-writegVisor bot
PiperOrigin-RevId: 289479774
2020-01-09New sync package.Ian Gudger
* Rename syncutil to sync. * Add aliases to sync types. * Replace existing usage of standard library sync package. This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations. Updates #1472 PiperOrigin-RevId: 289033387
2020-01-09Change BindToDeviceOption to store NICIDEyal Soha
This makes it possible to call the sockopt from go even when the NIC has no name. PiperOrigin-RevId: 288955236
2020-01-08Merge branch 'iptables-write' into iptables-write-input-dropKevin Krakauer
2020-01-08Addressed GH commentsKevin Krakauer
2020-01-08Getting a panic when running tests. For some reason the filter table isKevin Krakauer
ending up with the wrong chains and is indexing -1 into rules.
2020-01-08Introduce tcpip.SockOptBoolTamir Duberstein
...and port V6OnlyOption to it. PiperOrigin-RevId: 288789451
2020-01-08Rename tcpip.SockOpt{,Int}Tamir Duberstein
PiperOrigin-RevId: 288772878
2020-01-08Minor fixes to comments and loggingKevin Krakauer
2020-01-08Write simple ACCEPT rules to the filter table.Kevin Krakauer
This gets us closer to passing the iptables tests and opens up iptables so it can be worked on by multiple people. A few restrictions are enforced for security (i.e. we don't want to let users write a bunch of iptables rules and then just not enforce them): - Only the filter table is writable. - Only ACCEPT rules with no matching criteria can be added.
2019-12-26Automated rollback of changelist 287029703gVisor bot
PiperOrigin-RevId: 287217899
2019-12-24Enable IP_RECVTOS socket option for datagram socketsRyan Heacock
Added the ability to get/set the IP_RECVTOS socket option on UDP endpoints. If enabled, TOS from the incoming Network Header passed as ancillary data in the ControlMessages. Test: * Added unit test to udp_test.go that tests getting/setting as well as verifying that we receive expected TOS from incoming packet. * Added a syscall test PiperOrigin-RevId: 287029703
2019-12-12unix: allow to bind unix sockets only to AF_UNIX addressesAndrei Vagin
Reported-by: syzbot+2c0bcfd87fb4e8b7b009@syzkaller.appspotmail.com PiperOrigin-RevId: 285228312
2019-12-11Add support for TCP_USER_TIMEOUT option.Bhasker Hariharan
The implementation follows the linux behavior where specifying a TCP_USER_TIMEOUT will cause the resend timer to honor the user specified timeout rather than the default rto based timeout. Further it alters when connections are timedout due to keepalive failures. It does not alter the behavior of when keepalives are sent. This is as per the linux behavior. PiperOrigin-RevId: 285099795