summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/socket
AgeCommit message (Collapse)Author
2018-12-28Implement SO_REUSEPORT for TCP and UDP socketsAndrei Vagin
This option allows multiple sockets to be bound to the same port. Incoming packets are distributed to sockets using a hash based on source and destination addresses. This means that all packets from one sender will be received by the same server socket. PiperOrigin-RevId: 227153413 Change-Id: I59b6edda9c2209d5b8968671e9129adb675920cf
2018-12-26Plumb IP_MULTICAST_TTL to netstack.Ian Gudger
PiperOrigin-RevId: 226993086 Change-Id: I71757f231436538081d494da32ca69f709bc71c7
2018-12-21Stub out SO_OOBINLINE.Ian Gudger
We don't explicitly support out-of-band data and treat it like normal in-band data. This is equilivent to SO_OOBINLINE being enabled, so always report that it is enabled. PiperOrigin-RevId: 226572742 Change-Id: I4c30ccb83265e76c30dea631cbf86822e6ee1c1b
2018-12-21Implement SO_KEEPALIVE, TCP_KEEPIDLE, and TCP_KEEPINTVL.Ian Gudger
Within gVisor, plumb new socket options to netstack. Within netstack, fix GetSockOpt and SetSockOpt return value logic. PiperOrigin-RevId: 226532229 Change-Id: If40734e119eed633335f40b4c26facbebc791c74
2018-12-17Fix recv blocking for connectionless Unix sockets.Ian Gudger
Connectionless Unix sockets (DGRAM Unix sockets created with the socket system call) inherently only have a read queue. They do not establish bidirectional connections, instead, the connect system call only sets a default send location. Writes give the data to the other endpoint which has its own read queue. To simplify the code, connectionless Unix sockets still get read and write queues, but the write queue is a dummy and never waited on. The read queue is the connectionless endpoint's queue. This change fixes a bug where the dummy queue was incorrectly set as the read queue and the endpoint's queue was incorrectly set as the write queue. This meant that read notifications went to the dummy queue and were black holed. PiperOrigin-RevId: 225921042 Change-Id: I8d9059def787a2c3c305185b92d05093fbd2be2a
2018-12-14Move fdnotifier package to reduce internal confusion.Adin Scannell
PiperOrigin-RevId: 225632398 Change-Id: I909e7e2925aa369adc28e844c284d9a6108e85ce
2018-12-14Implement SO_SNDTIMEOIan Gudger
PiperOrigin-RevId: 225620490 Change-Id: Ia726107b3f58093a5f881634f90b071b32d2c269
2018-12-13Fix WAITALL and RCVTIMEO interactionIan Gudger
PiperOrigin-RevId: 225424296 Change-Id: I60fcc2b859339dca9963cb32227a287e719ab765
2018-12-10Implement MSG_WAITALLIan Gudger
MSG_WAITALL requests that recv family calls do not perform short reads. It only has an effect for SOCK_STREAM sockets, other types ignore it. PiperOrigin-RevId: 224918540 Change-Id: Id97fbf972f1f7cbd4e08eec0138f8cbdf1c94fe7
2018-12-09Stub out TCP_QUICKACKIan Gudger
PiperOrigin-RevId: 224696233 Change-Id: I45c425d9e32adee5dcce29ca7439a06567b26014
2018-12-06Fix tcpip.Endpoint.Write contract regarding short writesIan Gudger
* Clarify tcpip.Endpoint.Write contract regarding short writes. * Enforce tcpip.Endpoint.Write contract regarding short writes. * Update relevant users of tcpip.Endpoint.Write. PiperOrigin-RevId: 224377586 Change-Id: I24299ecce902eb11317ee13dae3b8d8a7c5b097d
2018-12-04Partial writes should loop in rpcinet.Brian Geffon
FileOperations.Write should return ErrWouldBlock to allow the upper layer to loop and sendmsg should continue writing where it left off on a partial write. PiperOrigin-RevId: 224081631 Change-Id: Ic61f6943ea6b7abbd82e4279decea215347eac48
2018-12-04Max link traversals should be for an entire path.Brian Geffon
The number of symbolic links that are allowed to be followed are for a full path and not just a chain of symbolic links. PiperOrigin-RevId: 224047321 Change-Id: I5e3c4caf66a93c17eeddcc7f046d1e8bb9434a40
2018-12-03Return an int32 for netlink SO_RCVBUFIan Gudger
Untyped integer constants default to type int and the binary package will panic if one tries to encode an int. PiperOrigin-RevId: 223890001 Change-Id: Iccc3afd6d74bad24c35d764508e450fd317b76ec
2018-11-27Save shutdown flags first.Brian Geffon
With rpcinet if shutdown flags are not saved before making the rpc a race is possible where blocked threads are woken up before the flags have been persisted. This would mean that threads can block indefinitely in a recvmsg after a shutdown(SHUT_RD) has happened. PiperOrigin-RevId: 223089783 Change-Id: If595e7add12aece54bcdf668ab64c570910d061a
2018-11-20Add unsupported syscall events for get/setsockoptFabricio Voznika
PiperOrigin-RevId: 222148953 Change-Id: I21500a9f08939c45314a6414e0824490a973e5aa
2018-11-13Implement TCP_NODELAY and TCP_CORKIan Gudger
Previously, TCP_NODELAY was always enabled and we would lie about it being configurable. TCP_NODELAY is now disabled by default (to match Linux) in the socket layer so that non-gVisor users don't automatically start using this questionable optimization. PiperOrigin-RevId: 221368472 Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-10-24Convert Unix transport to syserrIan Gudger
Previously this code used the tcpip error space. Since it is no longer part of netstack, it can use the sentry's error space (except for a few cases where there is still some shared code. This reduces the number of error space conversions required for hot Unix socket operations. PiperOrigin-RevId: 218541611 Change-Id: I3d13047006a8245b5dfda73364d37b8a453784bb
2018-10-23Track paths and provide a rename hook.Adin Scannell
This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-20Refcount Unix transport queueIan Gudger
This allows us to release messages in the queue when all users close. PiperOrigin-RevId: 218033550 Change-Id: I2f6e87650fced87a3977e3b74c64775c7b885c1b
2018-10-20Add more unimplemented syscall eventsFabricio Voznika
Added events for *ctl syscalls that may have multiple different commands. For runsc, each syscall event is only logged once. For *ctl syscalls, use the cmd as identifier, not only the syscall number. PiperOrigin-RevId: 218015941 Change-Id: Ie3c19131ae36124861e9b492a7dbe1765d9e5e59
2018-10-19Use correct company name in copyright headerIan Gudger
PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-17Use generic ilist in Unix transport queueIan Gudger
This should improve performance. PiperOrigin-RevId: 217610560 Change-Id: I370f196ea2396f1715a460b168ecbee197f94d6c
2018-10-17Merge queue into Unix transportIan Gudger
This queue only has a single user, so there is no need for it to use an interface. Merging it into the same package as its sole user allows us to avoid a circular dependency. This simplifies the code and should slightly improve performance. PiperOrigin-RevId: 217595889 Change-Id: Iabbd5164240b935f79933618c61581bc8dcd2822
2018-10-17Move Unix transport out of netstackIan Gudger
PiperOrigin-RevId: 217557656 Change-Id: I63d27635b1a6c12877279995d2d9847b6a19da9b
2018-10-12runsc: Support retrieving MTU via netdevice ioctl.Kevin Krakauer
This enables ifconfig to display MTU. PiperOrigin-RevId: 216917021 Change-Id: Id513b23d9d76899bcb71b0b6a25036f41629a923
2018-10-10Enforce message size limits and avoid host calls with too many iovecsMichael Pratt
Currently, in the face of FileMem fragmentation and a large sendmsg or recvmsg call, host sockets may pass > 1024 iovecs to the host, which will immediately cause the host to return EMSGSIZE. When we detect this case, use a single intermediate buffer to pass to the kernel, copying to/from the src/dst buffer. To avoid creating unbounded intermediate buffers, enforce message size checks and truncation w.r.t. the send buffer size. The same functionality is added to netstack unix sockets for feature parity. PiperOrigin-RevId: 216590198 Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
2018-10-09Add new netstack metrics to the sentryIan Gudger
PiperOrigin-RevId: 216431260 Change-Id: Ia6e5c8d506940148d10ff2884cf4440f470e5820
2018-09-28Block for link address resolutionSepehr Raissian
Previously, if address resolution for UDP or Ping sockets required sending packets using Write in Transport layer, Resolve would return ErrWouldBlock and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution would run in background. Further calls to Write using same address would also return ErrNoLinkAddress until resolution has been completed successfully. Since Write is not allowed to block and System Calls need to be interruptible in System Call layer, the caller to Write is responsible for blocking upon return of ErrWouldBlock. Now, when startAddressResolution is called a notification channel for the completion of the address resolution is returned. The channel will traverse up to the calling function of Write as well as ErrNoLinkAddress. Once address resolution is complete (success or not) the channel is closed. The caller would call Write again to send packets and check if address resolution was compeleted successfully or not. Fixes google/gvisor#5 Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93 PiperOrigin-RevId: 214962200
2018-08-20Fix handling of abstract Unix socket addressesIan Gudger
* Don't truncate abstract addresses at second null. * Properly handle abstract addresses with length < 108 bytes. PiperOrigin-RevId: 209502703 Change-Id: I49053f2d18b5a78208c3f640c27dbbdaece4f1a9
2018-08-14Reduce map lookups in syserrIan Gudger
PiperOrigin-RevId: 208755352 Change-Id: Ia24630f452a4a42940ab73a8113a2fd5ea2cfca2
2018-08-14Enforce Unix socket address length limitIan Gudger
PiperOrigin-RevId: 208720936 Change-Id: Ic943a88b6efeff49574306d4d4e1f113116ae32e
2018-08-10Enable checkpoint/restore in cases of UDS use.Brielle Broder
Previously, processes which used file-system Unix Domain Sockets could not be checkpoint-ed in runsc because the sockets were saved with their inode numbers which do not necessarily remain the same upon restore. Now, the sockets are also saved with their paths so that the new inodes can be determined for the sockets based on these paths after restoring. Tests for cases with UDS use are included. Test cleanup to come. PiperOrigin-RevId: 208268781 Change-Id: Ieaa5d5d9a64914ca105cae199fd8492710b1d7ec
2018-08-08Basic support for ip link/addr and ifconfigFabricio Voznika
Closes #94 PiperOrigin-RevId: 207997580 Change-Id: I19b426f1586b5ec12f8b0cd5884d5b401d334924
2018-08-08Enable SACK in runscFabricio Voznika
SACK is disabled by default and needs to be manually enabled. It not only improves performance, but also fixes hangs downloading files from certain websites. PiperOrigin-RevId: 207906742 Change-Id: I4fb7277b67bfdf83ac8195f1b9c38265a0d51e8b
2018-08-02Automated rollback of changelist 207037226Zhaozhong Ni
PiperOrigin-RevId: 207125440 Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
2018-08-01Automated rollback of changelist 207007153Michael Pratt
PiperOrigin-RevId: 207037226 Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
2018-08-01stateify: convert all packages to use explicit mode.Zhaozhong Ni
PiperOrigin-RevId: 207007153 Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9
2018-07-27stateify: support explicit annotation mode; convert refs and stack packages.Zhaozhong Ni
We have been unnecessarily creating too many savable types implicitly. PiperOrigin-RevId: 206334201 Change-Id: Idc5a3a14bfb7ee125c4f2bb2b1c53164e46f29a8
2018-06-22Add rpcinet support for SIOCGIFCONF.Brian Geffon
The interfaces and their addresses are already available via the stack Intefaces and InterfaceAddrs. Also add some tests as we had no tests around SIOCGIFCONF. I also added the socket_netgofer lifecycle for IOCTL tests. PiperOrigin-RevId: 201744863 Change-Id: Ie0a285a2a2f859fa0cafada13201d5941b95499a
2018-06-22Netstack should return EOF on closed read.Brian Geffon
The shutdown behavior where we return EAGAIN for sockets which are non-blocking is only correct for packet based sockets. SOCK_STREAM sockets should return EOF. PiperOrigin-RevId: 201703055 Change-Id: I20b25ceca7286c37766936475855959706fc5397
2018-06-19Epsocket has incorrect recv(2) behavior after SHUT_RD.Brian Geffon
After shutdown(SHUT_RD) calls to recv /w MSG_DONTWAIT or with O_NONBLOCK should result in a EAGAIN and not 0. Blocking sockets should return 0 as they would have otherwise blocked indefinitely. PiperOrigin-RevId: 201271123 Change-Id: If589b69c17fa5b9ff05bcf9e44024da9588c8876
2018-06-19Rpcinet is racy around shutdown flags.Brian Geffon
Correct a data race in rpcinet where a shutdown and recvmsg can race around shutown flags. PiperOrigin-RevId: 201238366 Change-Id: I5eb06df4a2b4eba331eeb5de19076213081d581f
2018-06-19Rpcinet needs to track shutdown state for blocking sockets.Brian Geffon
Because rpcinet will emulate a blocking socket backed by an rpc based non-blocking socket. In the event of a shutdown(SHUT_RD) followed by a read a non-blocking socket is allowed to return an EWOULDBLOCK however since a blocking socket knows it cannot receive anymore data it would block indefinitely and in this situation linux returns 0. We have to track this on the rpcinet sentry side so we can emulate that behavior because the remote side has no way to know if the socket is actually blocking within the sentry. PiperOrigin-RevId: 201201618 Change-Id: I4ac3a7b74b5dae471ab97c2e7d33b83f425aedac
2018-06-17Add rpcinet support for control messages.Brian Geffon
Add support for control messages, but at this time the only control message that the sentry will support here is SO_TIMESTAMP. PiperOrigin-RevId: 200922230 Change-Id: I63a852d9305255625d9df1d989bd46a66e93c446
2018-06-13Fix missing returns in rpcinet.Brian Geffon
PiperOrigin-RevId: 200472634 Change-Id: I3f0fb9e3b2f8616e6aa1569188258f330bf1ed31
2018-06-12Rpcinet doensn't handle SO_RCVTIMEO properly.Brian Geffon
Rpcinet already inherits socket.ReceiveTimeout; however, it's never set on setsockopt(2). The value is currently forwarded as an RPC and ignored as all sockets will be non-blocking on the RPC side. PiperOrigin-RevId: 200299260 Change-Id: I6c610ea22c808ff6420c63759dccfaeab17959dd
2018-06-11Set CLOEXEC option to socketsFabricio Voznika
hostinet/socket.go: the Sentry doesn't spawn new processes, but it doesn't hurt to protect the socket from leaking. unet/unet.go: should be setting closing on exec. The FD is explicitly donated to children when needed. PiperOrigin-RevId: 200135682 Change-Id: Ia8a45ced1e00a19420c8611b12e7a8ee770f89cb
2018-06-11Rpcinet is incorrectly handling MSG_TRUNC with SOCK_STREAMBrian Geffon
SOCK_STREAM has special behavior with respect to MSG_TRUNC. Specifically, the data isn't actually copied back out to userspace when MSG_TRUNC is provided on a SOCK_STREAM. According to tcp(7): "Since version 2.4, Linux supports the use of MSG_TRUNC in the flags argument of recv(2) (and recvmsg(2)). This flag causes the received bytes of data to be discarded, rather than passed back in a caller-supplied buffer." PiperOrigin-RevId: 200134860 Change-Id: I70f17a5f60ffe7794c3f0cfafd131c069202e90d
2018-06-11rpcinet is treating EAGAIN and EWOULDBLOCK as different errnos.Brian Geffon
PiperOrigin-RevId: 200124614 Change-Id: I38a7b083f1464a2a586fe24db648e624c455fec5