summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/socket
AgeCommit message (Collapse)Author
2020-02-04VFS2 gofer clientJamie Liu
Updates #1198 Opening host pipes (by spinning in fdpipe) and host sockets is not yet complete, and will be done in a future CL. Major differences from VFS1 gofer client (sentry/fs/gofer), with varying levels of backportability: - "Cache policies" are replaced by InteropMode, which control the behavior of timestamps in addition to caching. Under InteropModeExclusive (analogous to cacheAll) and InteropModeWritethrough (analogous to cacheAllWritethrough), client timestamps are *not* written back to the server (it is not possible in 9P or Linux for clients to set ctime, so writing back client-authoritative timestamps results in incoherence between atime/mtime and ctime). Under InteropModeShared (analogous to cacheRemoteRevalidating), client timestamps are not used at all (remote filesystem clocks are authoritative). cacheNone is translated to InteropModeShared + new option filesystemOptions.specialRegularFiles. - Under InteropModeShared, "unstable attribute" reloading for permission checks, lookup, and revalidation are fused, which is feasible in VFS2 since gofer.filesystem controls path resolution. This results in a ~33% reduction in RPCs for filesystem operations compared to cacheRemoteRevalidating. For example, consider stat("/foo/bar/baz") where "/foo/bar/baz" fails revalidation, resulting in the instantiation of a new dentry: VFS1 RPCs: getattr("/") // fs.MountNamespace.FindLink() => fs.Inode.CheckPermission() => gofer.inodeOperations.check() => gofer.inodeOperations.UnstableAttr() walkgetattr("/", "foo") = fid1 // fs.Dirent.walk() => gofer.session.Revalidate() => gofer.cachePolicy.Revalidate() clunk(fid1) getattr("/foo") // CheckPermission walkgetattr("/foo", "bar") = fid2 // Revalidate clunk(fid2) getattr("/foo/bar") // CheckPermission walkgetattr("/foo/bar", "baz") = fid3 // Revalidate clunk(fid3) walkgetattr("/foo/bar", "baz") = fid4 // fs.Dirent.walk() => gofer.inodeOperations.Lookup getattr("/foo/bar/baz") // linux.stat() => gofer.inodeOperations.UnstableAttr() VFS2 RPCs: getattr("/") // gofer.filesystem.walkExistingLocked() walkgetattr("/", "foo") = fid1 // gofer.filesystem.stepExistingLocked() clunk(fid1) // No getattr: walkgetattr already updated metadata for permission check walkgetattr("/foo", "bar") = fid2 clunk(fid2) walkgetattr("/foo/bar", "baz") = fid3 // No clunk: fid3 used for new gofer.dentry // No getattr: walkgetattr already updated metadata for stat() - gofer.filesystem.unlinkAt() does not require instantiation of a dentry that represents the file to be deleted. Updates #898. - gofer.regularFileFD.OnClose() skips Tflushf for regular files under InteropModeExclusive, as it's nonsensical to request a remote file flush without flushing locally-buffered writes to that remote file first. - Symlink targets are cached when InteropModeShared is not in effect. - p9.QID.Path (which is already required to be unique for each file within a server, and is accordingly already synthesized from device/inode numbers in all known gofers) is used as-is for inode numbers, rather than being mapped along with attr.RDev in the client to yet another synthetic inode number. - Relevant parts of fsutil.CachingInodeOperations are inlined directly into gofer package code. This avoids having to duplicate part of its functionality in fsutil.HostMappable. PiperOrigin-RevId: 293190213
2020-01-29Add support for TCP_DEFER_ACCEPT.Bhasker Hariharan
PiperOrigin-RevId: 292233574
2020-01-28Prevent arbitrary size allocation when sending UDS messages.Dean Deng
Currently, Send() will copy data into a new byte slice without regard to the original size. Size checks should be performed before the allocation takes place. Note that for the sake of performance, we avoid putting the buffer allocation into the critical section. As a result, the size checks need to be performed again within Enqueue() in case the limit has changed. PiperOrigin-RevId: 292058147
2020-01-28netlink: add support for RTM_F_LOOKUP_TABLEJianfeng Tan
Test command: $ ip route get 1.1.1.1 Fixes: #1099 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/1121 from tanjianfeng:fix-1099 e6919f3d4ede5aa51a48b3d2be0d7a4b482dd53d PiperOrigin-RevId: 291990716
2020-01-27Update package locations.Adin Scannell
Because the abi will depend on the core types for marshalling (usermem, context, safemem, safecopy), these need to be flattened from the sentry directory. These packages contain no sentry-specific details. PiperOrigin-RevId: 291811289
2020-01-27Standardize on tools directory.Adin Scannell
PiperOrigin-RevId: 291745021
2020-01-23Merge pull request #1617 from kevinGC:iptables-write-filter-protogVisor bot
PiperOrigin-RevId: 291249314
2020-01-21Add a new TCP stat for current open connections.Mithun Iyer
Such a stat accounts for all connections that are currently established and not yet transitioned to close state. Also fix bug in double increment of CurrentEstablished stat. Fixes #1579 PiperOrigin-RevId: 290827365
2020-01-21More little fixes.Kevin Krakauer
2020-01-21Fixing stuffKevin Krakauer
2020-01-21Merge branch 'master' into iptables-write-filter-protoKevin Krakauer
2020-01-21Merge pull request #1558 from kevinGC:iptables-write-input-dropgVisor bot
PiperOrigin-RevId: 290793754
2020-01-17Filter out received packets with a local source IP address.Eyal Soha
CERT Advisory CA-96.21 III. Solution advises that devices drop packets which could not have correctly arrived on the wire, such as receiving a packet where the source IP address is owned by the device that sent it. Fixes #1507 PiperOrigin-RevId: 290378240
2020-01-16Remove unused rpcinet.Adin Scannell
PiperOrigin-RevId: 290198756
2020-01-14Implement {g,s}etsockopt(IP_RECVTOS) for UDP socketsTamir Duberstein
PiperOrigin-RevId: 289718534
2020-01-13Merge branch 'iptables-write-input-drop' into iptables-write-filter-protoKevin Krakauer
2020-01-13Allow dual stack sockets to operate on AF_INETTamir Duberstein
Fixes #1490 Fixes #1495 PiperOrigin-RevId: 289523250
2020-01-13Only allow INPUT modifications.Kevin Krakauer
2020-01-13Merge branch 'master' into iptables-write-input-dropKevin Krakauer
2020-01-13Merge pull request #1528 from kevinGC:iptables-writegVisor bot
PiperOrigin-RevId: 289479774
2020-01-10I think INPUT works with protocolKevin Krakauer
2020-01-09New sync package.Ian Gudger
* Rename syncutil to sync. * Add aliases to sync types. * Replace existing usage of standard library sync package. This will make it easier to swap out synchronization primitives. For example, this will allow us to use primitives from github.com/sasha-s/go-deadlock to check for lock ordering violations. Updates #1472 PiperOrigin-RevId: 289033387
2020-01-09Added a test that we don't pass yetKevin Krakauer
2020-01-09Change BindToDeviceOption to store NICIDEyal Soha
This makes it possible to call the sockopt from go even when the NIC has no name. PiperOrigin-RevId: 288955236
2020-01-08It works! It drops some packets.Kevin Krakauer
2020-01-08Merge branch 'iptables-write' into iptables-write-input-dropKevin Krakauer
2020-01-08More GH comments.Kevin Krakauer
2020-01-08Return correct length with MSG_TRUNC for unix sockets.Ian Lewis
This change calls a new Truncate method on the EndpointReader in RecvMsg for both netlink and unix sockets. This allows readers such as sockets to peek at the length of data without actually reading it to a buffer. Fixes #993 #1240 PiperOrigin-RevId: 288800167
2020-01-08Addressed GH commentsKevin Krakauer
2020-01-08Fix slice bounds out of range panic in parsing socket control message.Ting-Yu Wang
Panic found by syzakller. PiperOrigin-RevId: 288799046
2020-01-08Getting a panic when running tests. For some reason the filter table isKevin Krakauer
ending up with the wrong chains and is indexing -1 into rules.
2020-01-08Introduce tcpip.SockOptBoolTamir Duberstein
...and port V6OnlyOption to it. PiperOrigin-RevId: 288789451
2020-01-08Built dead-simple traversal, but now getting depedency cycle error :'(Kevin Krakauer
2020-01-08Rename tcpip.SockOpt{,Int}Tamir Duberstein
PiperOrigin-RevId: 288772878
2020-01-08First commit -- re-adding DROPKevin Krakauer
2020-01-08Comment cleanup.Kevin Krakauer
2020-01-08Minor fixes to comments and loggingKevin Krakauer
2020-01-08Write simple ACCEPT rules to the filter table.Kevin Krakauer
This gets us closer to passing the iptables tests and opens up iptables so it can be worked on by multiple people. A few restrictions are enforced for security (i.e. we don't want to let users write a bunch of iptables rules and then just not enforce them): - Only the filter table is writable. - Only ACCEPT rules with no matching criteria can be added.
2019-12-26Automated rollback of changelist 287029703gVisor bot
PiperOrigin-RevId: 287217899
2019-12-24Enable IP_RECVTOS socket option for datagram socketsRyan Heacock
Added the ability to get/set the IP_RECVTOS socket option on UDP endpoints. If enabled, TOS from the incoming Network Header passed as ancillary data in the ControlMessages. Test: * Added unit test to udp_test.go that tests getting/setting as well as verifying that we receive expected TOS from incoming packet. * Added a syscall test PiperOrigin-RevId: 287029703
2019-12-12unix: allow to bind unix sockets only to AF_UNIX addressesAndrei Vagin
Reported-by: syzbot+2c0bcfd87fb4e8b7b009@syzkaller.appspotmail.com PiperOrigin-RevId: 285228312
2019-12-11Add support for TCP_USER_TIMEOUT option.Bhasker Hariharan
The implementation follows the linux behavior where specifying a TCP_USER_TIMEOUT will cause the resend timer to honor the user specified timeout rather than the default rto based timeout. Further it alters when connections are timedout due to keepalive failures. It does not alter the behavior of when keepalives are sent. This is as per the linux behavior. PiperOrigin-RevId: 285099795
2019-12-10Deduplicate and simplify control message processing for recvmsg and sendmsg.Dean Deng
Also, improve performance by calculating how much space is needed before making an allocation for sendmsg in hostinet. PiperOrigin-RevId: 284898581
2019-12-10Let socket.ControlMessages Release() the underlying transport.ControlMessages.Dean Deng
PiperOrigin-RevId: 284804370
2019-12-10Make comments clearer for control message handling.Dean Deng
PiperOrigin-RevId: 284791600
2019-12-06Add TCP stats for connection close and keep-alive timeouts.Mithun Iyer
Fix bugs in updates to TCP CurrentEstablished stat. Fixes #1277 PiperOrigin-RevId: 284292459
2019-12-03Remove TODO for obsolete bug.Zach Koopmans
PiperOrigin-RevId: 283571456
2019-12-03Support IP_TOS and IPV6_TCLASS socket options for hostinet sockets.Dean Deng
There are two potential ways of sending a TOS byte with outgoing packets: including a control message in sendmsg, or setting the IP_TOS/IPV6_TCLASS socket options (for IPV4 and IPV6 respectively). This change lets hostinet support the latter. Fixes #1188 PiperOrigin-RevId: 283550925
2019-12-02Support sending IP_TOS and IPV6_TCLASS control messages with hostinet sockets.Dean Deng
There are two potential ways of sending a TOS byte with outgoing packets: including a control message in sendmsg, or setting the IP_TOS/IPV6_TCLASS socket options (for IPV4 and IPV6 respectively). This change lets hostinet support the former. PiperOrigin-RevId: 283346737
2019-11-27Add support for receiving TOS and TCLASS control messages in hostinet.Dean Deng
This involves allowing getsockopt/setsockopt for the corresponding socket options, as well as allowing hostinet to process control messages received from the actual recvmsg syscall. PiperOrigin-RevId: 282851425