summaryrefslogtreecommitdiffhomepage
path: root/pkg/sentry/socket/netstack
AgeCommit message (Collapse)Author
2021-02-09Add support for setting SO_SNDBUF for unix domain sockets.Bhasker Hariharan
The limits for snd/rcv buffers for unix domain socket is controlled by the following sysctls on linux - net.core.rmem_default - net.core.rmem_max - net.core.wmem_default - net.core.wmem_max Today in gVisor we do not expose these sysctls but we do support setting the equivalent in netstack via stack.Options() method. But AF_UNIX sockets in gVisor can be used without netstack, with hostinet or even without any networking stack at all. Which means ideally these sysctls need to live as globals in gVisor. But rather than make this a big change for now we hardcode the limits in the AF_UNIX implementation itself (which in itself is better than where we were before) where it SO_SNDBUF was hardcoded to 16KiB. Further we bump the initial limit to a default value of 208 KiB to match linux from the paltry 16 KiB we use today. Updates #5132 PiperOrigin-RevId: 356665498
2021-02-05Replace TaskFromContext(ctx).Kernel() with KernelFromContext(ctx)Ting-Yu Wang
Panic seen at some code path like control.ExecAsync where ctx does not have a Task. Reported-by: syzbot+55ce727161cf94a7b7d6@syzkaller.appspotmail.com PiperOrigin-RevId: 355960596
2021-02-01Refactor HandleControlPacket/SockErrorGhanan Gowripalan
...to remove the need for the transport layer to deduce the type of error it received. Rename HandleControlPacket to HandleError as HandleControlPacket only handles errors. tcpip.SockError now holds a tcpip.SockErrorCause interface that different errors can implement. PiperOrigin-RevId: 354994306
2021-01-28Change tcpip.Error to an interfaceTamir Duberstein
This makes it possible to add data to types that implement tcpip.Error. ErrBadLinkEndpoint is removed as it is unused. PiperOrigin-RevId: 354437314
2021-01-28Propagate reader error in ReadFromTamir Duberstein
This was removed in 6c0e1d9cfe6adbfbb32e7020d6426608ac63ad37 but turns out to be crucial to prevent flaky behaviour in sendfile. PiperOrigin-RevId: 354434144
2021-01-27Add support for more fields in netstack for TCP_INFONayana Bidari
This CL adds support for the following fields: - RTT, RTTVar, RTO - send congestion window (sndCwnd) and send slow start threshold (sndSsthresh) - congestion control state(CaState) - ReorderSeen PiperOrigin-RevId: 354195361
2021-01-26Initialize the send buffer handler in endpoint creation.Nayana Bidari
- This CL will initialize the function handler used for getting the send buffer size limits during endpoint creation and does not require the caller of SetSendBufferSize(..) to know the endpoint type(tcp/udp/..) PiperOrigin-RevId: 353992634
2021-01-26Move SO_SNDBUF to socketops.Nayana Bidari
This CL moves {S,G}etsockopt of SO_SNDBUF from all endpoints to socketops. For unix sockets, we do not support setting of this option. PiperOrigin-RevId: 353871484
2021-01-25Add per endpoint ARP statisticsArthur Sfez
The ARP stat NetworkUnreachable was removed, and was replaced by InterfaceHasNoLocalAddress. No stats are recorded when dealing with an missing endpoint (ErrNotConnected) (because if there is no endpoint, there is no valid per-endpoint stats). PiperOrigin-RevId: 353759462
2021-01-22Define tcpip.Payloader in terms of io.ReaderTamir Duberstein
Fixes #1509. PiperOrigin-RevId: 353295589
2021-01-20Remove unimplemented message for SO_LINGERNayana Bidari
- Removes the unimplemented message for SO_LINGER - Fix the length for IP_PKTINFO option PiperOrigin-RevId: 352917611
2021-01-20Move Lock/UnlockPOSIX into LockFD util.Dean Deng
PiperOrigin-RevId: 352904728
2021-01-15Remove count argument from tcpip.Endpoint.ReadTamir Duberstein
The same intent can be specified via the io.Writer. PiperOrigin-RevId: 352098747
2021-01-14Add stats for ARPArthur Sfez
Fixes #4963 Startblock: has LGTM from sbalana and then add reviewer ghanan PiperOrigin-RevId: 351886320
2021-01-13Do not resolve remote link address at transport layerGhanan Gowripalan
Link address resolution is performed at the link layer (if required) so we can defer it from the transport layer. When link resolution is required, packets will be queued and sent once link resolution completes. If link resolution fails, the transport layer will receive a control message indicating that the stack failed to route the packet. tcpip.Endpoint.Write no longer returns a channel now that writes do not wait for link resolution at the transport layer. tcpip.ErrNoLinkAddress is no longer used so it is removed. Removed calls to stack.Route.ResolveWith from the transport layer so that link resolution is performed when a route is created in response to an incoming packet (e.g. to complete TCP handshakes or send a RST). Tests: - integration_test.TestForwarding - integration_test.TestTCPLinkResolutionFailure Fixes #4458 RELNOTES: n/a PiperOrigin-RevId: 351684158
2021-01-12Remove useless cached stateTamir Duberstein
Simplify some logic while I'm here. PiperOrigin-RevId: 351491593
2021-01-07netstack: Refactor tcpip.Endpoint.ReadTing-Yu Wang
Read now takes a destination io.Writer, count, options. Keeping the method name Read, in contrast to the Write method. This enables: * direct transfer of views under VV * zero copy It also eliminates the need for sentry to keep a slice of view because userspace had requested a read that is smaller than the view returned, removing the complexity there. Read/Peek/ReadPacket are now consolidated together and some duplicate code is removed. PiperOrigin-RevId: 350636322
2021-01-06Support add/remove IPv6 multicast group sock optGhanan Gowripalan
IPv4 was always supported but UDP never supported joining/leaving IPv6 multicast groups via socket options. Add: IPPROTO_IPV6, IPV6_JOIN_GROUP/IPV6_ADD_MEMBERSHIP Remove: IPPROTO_IPV6, IPV6_LEAVE_GROUP/IPV6_DROP_MEMBERSHIP Test: integration_test.TestUDPAddRemoveMembershipSocketOption PiperOrigin-RevId: 350396072
2020-12-22Move SO_BINDTODEVICE to socketops.Nayana Bidari
PiperOrigin-RevId: 348696094
2020-12-17[netstack] Implement IP(V6)_RECVERR socket option.Ayush Ranjan
PiperOrigin-RevId: 348055514
2020-12-17[netstack] Implement MSG_ERRQUEUE flag for recvmsg(2).Ayush Ranjan
Introduces the per-socket error queue and the necessary cmsg mechanisms. PiperOrigin-RevId: 348028508
2020-12-14Move SO_LINGER option to socketops.Nayana Bidari
PiperOrigin-RevId: 347437786
2020-12-14Move SO_ERROR and SO_OOBINLINE option to socketops.Nayana Bidari
SO_OOBINLINE option is set/get as boolean value, which is the same as linux. As we currently do not support disabling this option, we always return it as true. PiperOrigin-RevId: 347413905
2020-12-11[netstack] Decouple tcpip.ControlMessages from the IP control messges.Ayush Ranjan
tcpip.ControlMessages can not contain Linux specific structures which makes it painful to convert back and forth from Linux to tcpip back to Linux when passing around control messages in hostinet and raw sockets. Now we convert to the Linux version of the control message as soon as we are out of tcpip. PiperOrigin-RevId: 347027065
2020-12-09Add support for IP_RECVORIGDSTADDR IP option.Bhasker Hariharan
Fixes #5004 PiperOrigin-RevId: 346643745
2020-12-07Export IGMP statsArthur Sfez
PiperOrigin-RevId: 346197760
2020-12-02Extract ICMPv4/v6 specific stats to their own typesArthur Sfez
This change lets us split the v4 stats from the v6 stats, which will be useful when adding stats for each network endpoint. PiperOrigin-RevId: 345322615
2020-12-02[netstack] Refactor common utils out of netstack to socket package.Ayush Ranjan
Moved AddressAndFamily() and ConvertAddress() to socket package from netstack. This helps because these utilities are used by sibling netstack packages. Such sibling dependencies can later cause circular dependencies. Common utils shared between siblings should be moved up to the parent. PiperOrigin-RevId: 345275571
2020-11-26[netstack] Add SOL_TCP options to SocketOptions.Ayush Ranjan
Ports the following options: - TCP_NODELAY - TCP_CORK - TCP_QUICKACK Also deletes the {Get/Set}SockOptBool interface methods from all implementations PiperOrigin-RevId: 344378824
2020-11-25[netstack] Add SOL_IP and SOL_IPV6 options to SocketOptions.Ayush Ranjan
We will use SocketOptions for all kinds of options, not just SOL_SOCKET options because (1) it is consistent with Linux which defines all option variables on the top level socket struct, (2) avoid code complexity. Appropriate checks have been added for matching option level to the endpoint type. Ported the following options to this new utility: - IP_MULTICAST_LOOP - IP_RECVTOS - IPV6_RECVTCLASS - IP_PKTINFO - IP_HDRINCL - IPV6_V6ONLY Changes in behavior (these are consistent with what Linux does AFAICT): - Now IP_MULTICAST_LOOP can be set for TCP (earlier it was a noop) but does not affect the endpoint itself. - We can now getsockopt IP_HDRINCL (earlier we would get an error). - Now we return ErrUnknownProtocolOption if SOL_IP or SOL_IPV6 options are used on unix sockets. - Now we return ErrUnknownProtocolOption if SOL_IPV6 options are used on non AF_INET6 endpoints. This change additionally makes the following modifications: - Add State() uint32 to commonEndpoint because both tcpip.Endpoint and transport.Endpoint interfaces have it. It proves to be quite useful. - Gets rid of SocketOptionsHandler.IsListening(). It was an anomaly as it was not a handler. It is now implemented on netstack itself. - Gets rid of tcp.endpoint.EndpointInfo and directly embeds stack.TransportEndpointInfo. There was an unnecessary level of embedding which served no purpose. - Removes some checks dual_stack_test.go that used the errors from GetSockOptBool(tcpip.V6OnlyOption) to confirm some state. This is not consistent with the new design and also seemed to be testing the implementation instead of behavior. PiperOrigin-RevId: 344354051
2020-11-18[netstack] Move SO_KEEPALIVE and SO_ACCEPTCONN option to SocketOptions.Ayush Ranjan
PiperOrigin-RevId: 343217712
2020-11-18[netstack] Move SO_REUSEPORT and SO_REUSEADDR option to SocketOptions.Ayush Ranjan
This changes also introduces: - `SocketOptionsHandler` interface which can be implemented by endpoints to handle endpoint specific behavior on SetSockOpt. This is analogous to what Linux does. - `DefaultSocketOptionsHandler` which is a default implementation of the above. This is embedded in all endpoints so that we don't have to uselessly implement empty functions. Endpoints with specific behavior can override the embedded method by manually defining its own implementation. PiperOrigin-RevId: 343158301
2020-11-18[netstack] Move SO_NO_CHECK option to SocketOptions.Ayush Ranjan
PiperOrigin-RevId: 343146856
2020-11-18[netstack] Move SO_PASSCRED option to SocketOptions.Ayush Ranjan
This change also makes the following fixes: - Make SocketOptions use atomic operations instead of having to acquire/drop locks upon each get/set option. - Make documentation more consistent. - Remove tcpip.SocketOptions from socketOpsCommon because it already exists in transport.Endpoint. - Refactors get/set socket options tests to be easily extendable. PiperOrigin-RevId: 343103780
2020-11-17Fix SO_ERROR behavior for TCP in gVisor.Bhasker Hariharan
Fixes the behaviour of SO_ERROR for tcp sockets where in linux it returns sk->sk_err and if sk->sk_err is 0 then it returns sk->sk_soft_err. In gVisor TCP we endpoint.HardError is the equivalent of sk->sk_err and endpoint.LastError holds soft errors. This change brings this into alignment with Linux such that both hard/soft errors are cleared when retrieved using getsockopt(.. SO_ERROR) is called on a socket. Fixes #3812 PiperOrigin-RevId: 342868552
2020-11-12Refactor SOL_SOCKET optionsNayana Bidari
Store all the socket level options in a struct and call {Get/Set}SockOpt on this struct. This will avoid implementing socket level options on all endpoints. This CL contains implementing one socket level option for tcp and udp endpoints. PiperOrigin-RevId: 342203981
2020-10-27Add basic address deletion to netlinkIan Lewis
Updates #3921 PiperOrigin-RevId: 339195417
2020-10-23Support VFS2 save/restore.Jamie Liu
Inode number consistency checks are now skipped in save/restore tests for reasons described in greatest detail in StatTest.StateDoesntChangeAfterRename. They pass in VFS1 due to the bug described in new test case SimpleStatTest.DifferentFilesHaveDifferentDeviceInodeNumberPairs. Fixes #1663 PiperOrigin-RevId: 338776148
2020-10-23[vfs] kernfs: Implement remaining InodeAttr fields.Ayush Ranjan
Added the following fields in kernfs.InodeAttr: - blockSize - atime - mtime - ctime Also resolved all TODOs for #1193. Fixes #1193 PiperOrigin-RevId: 338714527
2020-10-23Support getsockopt for SO_ACCEPTCONN.Nayana Bidari
The SO_ACCEPTCONN option is used only on getsockopt(). When this option is specified, getsockopt() indicates whether socket listening is enabled for the socket. A value of zero indicates that socket listening is disabled; non-zero that it is enabled. PiperOrigin-RevId: 338703206
2020-10-15sockets: ignore io.EOF from view.ReadAtAndrei Vagin
Reported-by: syzbot+5466463b7604c2902875@syzkaller.appspotmail.com PiperOrigin-RevId: 337451896
2020-09-30ip6tables: redirect supportKevin Krakauer
Adds support for the IPv6-compatible redirect target. Redirection is a limited form of DNAT, where the destination is always the localhost. Updates #3549. PiperOrigin-RevId: 334698344
2020-09-29Don't allow broadcast/multicast source addressGhanan Gowripalan
As per relevant IP RFCS (see code comments), broadcast (for IPv4) and multicast addresses are not allowed. Currently checks for these are done at the transport layer, but since it is explicitly forbidden at the IP layers, check for them there. This change also removes the UDP.InvalidSourceAddress stat since there is no longer a need for it. Test: ip_test.TestSourceAddressValidation PiperOrigin-RevId: 334490971
2020-09-29iptables: refactor to make targets extendableKevin Krakauer
Like matchers, targets should use a module-like register/lookup system. This replaces the brittle switch statements we had before. The only behavior change is supporing IPT_GET_REVISION_TARGET. This makes it much easier to add IPv6 redirect in the next change. Updates #3549. PiperOrigin-RevId: 334469418
2020-09-24Fix socket record leak in VFS2Tiwei Bie
VFS2 socket record is not removed from the system-wide socket table when the socket is released, which will lead to a memory leak. This patch fixes this issue. Fixes: #3874 Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-09-20Merge pull request #3651 from ianlewis:ip-forwardinggVisor bot
PiperOrigin-RevId: 332760843
2020-09-18Count packets dropped by iptables in IPStatsKevin Krakauer
PiperOrigin-RevId: 332486383
2020-09-17{Set,Get} SO_LINGER on all endpoints.Nayana Bidari
SO_LINGER is a socket level option and should be stored on all endpoints even though it is used to linger only for TCP endpoints. PiperOrigin-RevId: 332369252
2020-09-17Return ENOPROTOOPT for all SOL_PACKET options.Bhasker Hariharan
This is required to make tcpdump work. tcpdump falls back to not using things like PACKET_RX_RING if setsockopt returns ENOPROTOOPT. This used to be the case before https://github.com/google/gvisor/commit/6f8fb7e0db2790ff1f5ba835780c03fe245e437f. Fixes #3981 PiperOrigin-RevId: 332326517
2020-09-16Automated rollback of changelist 329526153Nayana Bidari
PiperOrigin-RevId: 332097286