summaryrefslogtreecommitdiffhomepage
path: root/pkg/tcpip/transport/udp
AgeCommit message (Collapse)Author
2019-11-14Use PacketBuffers for outgoing packets.Kevin Krakauer
PiperOrigin-RevId: 280455453
2019-11-12Add tests for SO_REUSEADDR and SO_REUSEPORT.Ian Gudger
* Basic tests for the SO_REUSEADDR and SO_REUSEPORT options. * SO_REUSEADDR functional tests for TCP and UDP. * SO_REUSEADDR and SO_REUSEPORT interaction tests for UDP. * Stubbed support for UDP getsockopt(SO_REUSEADDR). PiperOrigin-RevId: 280049265
2019-11-06Rename nicid to nicID to follow go-readability initialismsGhanan Gowripalan
https://github.com/golang/go/wiki/CodeReviewComments#initialisms This change does not introduce any new functionality. It just renames variables from `nicid` to `nicID`. PiperOrigin-RevId: 278992966
2019-11-06Use PacketBuffers, rather than VectorisedViews, in netstack.Kevin Krakauer
PacketBuffers are analogous to Linux's sk_buff. They hold all information about a packet, headers, and payload. This is important for: * iptables to access various headers of packets * Preventing the clutter of passing different net and link headers along with VectorisedViews to packet handling functions. This change only affects the incoming packet path, and a future change will change the outgoing path. Benchmark Regular PacketBufferPtr PacketBufferConcrete -------------------------------------------------------------------------------- BM_Recvmsg 400.715MB/s 373.676MB/s 396.276MB/s BM_Sendmsg 361.832MB/s 333.003MB/s 335.571MB/s BM_Recvfrom 453.336MB/s 393.321MB/s 381.650MB/s BM_Sendto 378.052MB/s 372.134MB/s 341.342MB/s BM_SendmsgTCP/0/1k 353.711MB/s 316.216MB/s 322.747MB/s BM_SendmsgTCP/0/2k 600.681MB/s 588.776MB/s 565.050MB/s BM_SendmsgTCP/0/4k 995.301MB/s 888.808MB/s 941.888MB/s BM_SendmsgTCP/0/8k 1.517GB/s 1.274GB/s 1.345GB/s BM_SendmsgTCP/0/16k 1.872GB/s 1.586GB/s 1.698GB/s BM_SendmsgTCP/0/32k 1.017GB/s 1.020GB/s 1.133GB/s BM_SendmsgTCP/0/64k 475.626MB/s 584.587MB/s 627.027MB/s BM_SendmsgTCP/0/128k 416.371MB/s 503.434MB/s 409.850MB/s BM_SendmsgTCP/0/256k 323.449MB/s 449.599MB/s 388.852MB/s BM_SendmsgTCP/0/512k 243.992MB/s 267.676MB/s 314.474MB/s BM_SendmsgTCP/0/1M 95.138MB/s 95.874MB/s 95.417MB/s BM_SendmsgTCP/0/2M 96.261MB/s 94.977MB/s 96.005MB/s BM_SendmsgTCP/0/4M 96.512MB/s 95.978MB/s 95.370MB/s BM_SendmsgTCP/0/8M 95.603MB/s 95.541MB/s 94.935MB/s BM_SendmsgTCP/0/16M 94.598MB/s 94.696MB/s 94.521MB/s BM_SendmsgTCP/0/32M 94.006MB/s 94.671MB/s 94.768MB/s BM_SendmsgTCP/0/64M 94.133MB/s 94.333MB/s 94.746MB/s BM_SendmsgTCP/0/128M 93.615MB/s 93.497MB/s 93.573MB/s BM_SendmsgTCP/0/256M 93.241MB/s 95.100MB/s 93.272MB/s BM_SendmsgTCP/1/1k 303.644MB/s 316.074MB/s 308.430MB/s BM_SendmsgTCP/1/2k 537.093MB/s 584.962MB/s 529.020MB/s BM_SendmsgTCP/1/4k 882.362MB/s 939.087MB/s 892.285MB/s BM_SendmsgTCP/1/8k 1.272GB/s 1.394GB/s 1.296GB/s BM_SendmsgTCP/1/16k 1.802GB/s 2.019GB/s 1.830GB/s BM_SendmsgTCP/1/32k 2.084GB/s 2.173GB/s 2.156GB/s BM_SendmsgTCP/1/64k 2.515GB/s 2.463GB/s 2.473GB/s BM_SendmsgTCP/1/128k 2.811GB/s 3.004GB/s 2.946GB/s BM_SendmsgTCP/1/256k 3.008GB/s 3.159GB/s 3.171GB/s BM_SendmsgTCP/1/512k 2.980GB/s 3.150GB/s 3.126GB/s BM_SendmsgTCP/1/1M 2.165GB/s 2.233GB/s 2.163GB/s BM_SendmsgTCP/1/2M 2.370GB/s 2.219GB/s 2.453GB/s BM_SendmsgTCP/1/4M 2.005GB/s 2.091GB/s 2.214GB/s BM_SendmsgTCP/1/8M 2.111GB/s 2.013GB/s 2.109GB/s BM_SendmsgTCP/1/16M 1.902GB/s 1.868GB/s 1.897GB/s BM_SendmsgTCP/1/32M 1.655GB/s 1.665GB/s 1.635GB/s BM_SendmsgTCP/1/64M 1.575GB/s 1.547GB/s 1.575GB/s BM_SendmsgTCP/1/128M 1.524GB/s 1.584GB/s 1.580GB/s BM_SendmsgTCP/1/256M 1.579GB/s 1.607GB/s 1.593GB/s PiperOrigin-RevId: 278940079
2019-10-30Deep copy dispatcher views.Kevin Krakauer
When VectorisedViews were passed up the stack from packet_dispatchers, we were passing a sub-slice of the dispatcher's views fields. The dispatchers then immediately set those views to nil. This wasn't caught before because every implementer copied the data in these views before returning. PiperOrigin-RevId: 277615351
2019-10-30Store endpoints inside multiPortEndpoint in a sorted orderAndrei Vagin
It is required to guarantee the same order of endpoints after save/restore. PiperOrigin-RevId: 277598665
2019-10-29Allow waiting for Endpoint worker goroutines to finish.Ian Gudger
Updates #837 PiperOrigin-RevId: 277325162
2019-10-21AF_PACKET support for netstack (aka epsocket).Kevin Krakauer
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With runsc, you'll need to pass `--net-raw=true` to enable them. Binding isn't supported yet. PiperOrigin-RevId: 275909366
2019-10-18Fix typo while initializing protocol for UDP endpoints.Mithun Iyer
Fixes #763 PiperOrigin-RevId: 275563222
2019-10-14Internal change.gVisor bot
PiperOrigin-RevId: 274700093
2019-10-14Reorder BUILD license and load functions in netstack.Kevin Krakauer
PiperOrigin-RevId: 274672346
2019-10-09Internal change.gVisor bot
PiperOrigin-RevId: 273861936
2019-10-07Implement IP_TTL.Ian Gudger
Also change the default TTL to 64 to match Linux. PiperOrigin-RevId: 273430341
2019-10-03Implement proper local broadcast behaviorChris Kuiper
The behavior for sending and receiving local broadcast (255.255.255.255) traffic is as follows: Outgoing -------- * A broadcast packet sent on a socket that is bound to an interface goes out that interface * A broadcast packet sent on an unbound socket follows the route table to select the outgoing interface + if an explicit route entry exists for 255.255.255.255/32, use that one + else use the default route * Broadcast packets are looped back and delivered following the rules for incoming packets (see next). This is the same behavior as for multicast packets, except that it cannot be disabled via sockopt. Incoming -------- * Sockets wishing to receive broadcast packets must bind to either INADDR_ANY (0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives broadcast packets. * Broadcast packets are multiplexed to all sockets matching it. This is the same behavior as for multicast packets. * A socket can bind to 255.255.255.255:<port> and then receive its own broadcast packets sent to 255.255.255.255:<port> In addition, this change implicitly fixes an issue with multicast reception. If two sockets want to receive a given multicast stream and one is bound to ANY while the other is bound to the multicast address, only one of them will receive the traffic. PiperOrigin-RevId: 272792377
2019-09-27Implement SO_BINDTODEVICE sockoptgVisor bot
PiperOrigin-RevId: 271644926
2019-09-25Remove centralized registration of protocols.Kevin Krakauer
Also removes the need for protocol names. PiperOrigin-RevId: 271186030
2019-09-23netstack: convert more socket options to {Set,Get}SockOptIntAndrei Vagin
PiperOrigin-RevId: 270763208
2019-09-12Implement splice methods for pipes and sockets.Adin Scannell
This also allows the tee(2) implementation to be enabled, since dup can now be properly supported via WriteTo. Note that this change necessitated some minor restructoring with the fs.FileOperations splice methods. If the *fs.File is passed through directly, then only public API methods are accessible, which will deadlock immediately since the locking is already done by fs.Splice. Instead, we pass through an abstract io.Reader or io.Writer, which elide locks and use the underlying fs.FileOperations directly. PiperOrigin-RevId: 268805207
2019-09-12Remove go_test from go_stateify and go_marshalMichael Pratt
They are no-ops, so the standard rule works fine. PiperOrigin-RevId: 268776264
2019-09-09Fix ephemeral port leak.Ian Gudger
Fix a bug where udp.(*endpoint).Disconnect [accessible in gVisor via epsocket.(*SocketOperations).Connect with AF_UNSPEC] would leak a port reservation if the socket/endpoint had an ephemeral port assigned to it. glibc's getaddrinfo uses connect with AF_UNSPEC, causing each call of getaddrinfo to leak a port. Call getaddrinfo too many times and you run out of ports (shows up as connect returning EAGAIN and getaddrinfo returning EAI_NONAME "Name or service not known"). PiperOrigin-RevId: 268071160
2019-09-06Remove reundant global tcpip.LinkEndpointID.Ian Gudger
PiperOrigin-RevId: 267709597
2019-09-03Make UDP traceroute work.Bhasker Hariharan
Adds support to generate Port Unreachable messages for UDP datagrams received on a port for which there is no valid endpoint. Fixes #703 PiperOrigin-RevId: 267034418
2019-08-29Implement /proc/net/udp.Rahat Mahmood
PiperOrigin-RevId: 266229756
2019-08-21Support binding to multicast and broadcast addressesChris Kuiper
This fixes the issue of not being able to bind to either a multicast or broadcast address as well as to send and receive data from it. The way to solve this is to treat these addresses similar to the ANY address and register their transport endpoint ID with the global stack's demuxer rather than the NIC's. That way there is no need to require an endpoint with that multicast or broadcast address. The stack's demuxer is in fact the only correct one to use, because neither broadcast- nor multicast-bound sockets care which NIC a packet was received on (for multicast a join is still needed to receive packets on a NIC). I also took the liberty of refactoring udp_test.go to consolidate a lot of duplicate code and make it easier to create repetitive tests that test the same feature for a variety of packet and socket types. For this purpose I created a "flowType" that represents two things: 1) the type of packet being sent or received and 2) the type of socket used for the test. E.g., a "multicastV4in6" flow represents a V4-mapped multicast packet run through a V6-dual socket. This allows writing significantly simpler tests. A nice example is testTTL(). PiperOrigin-RevId: 264766909
2019-08-21Use tcpip.Subnet in tcpip.RouteTamir Duberstein
This is the first step in replacing some of the redundant types with the standard library equivalents. PiperOrigin-RevId: 264706552
2019-08-16netstack: disconnect an unix socket only if the address family is AF_UNSPECAndrei Vagin
Linux allows to call connect for ANY and the zero port. PiperOrigin-RevId: 263892534
2019-08-15Don't dereference errors passed to panic()Tamir Duberstein
These errors are always pointers; there's no sense in dereferencing them in the panic call. Changed one false positive for clarity. PiperOrigin-RevId: 263611579
2019-08-15netstack: move resumption logic into *_state.goTamir Duberstein
13a98df rearranged some of this code in a way that broke compilation of the netstack-only export at github.com/google/netstack because *_state.go files are not included in that export. This commit moves resumption logic back into *_state.go, fixing the compilation breakage. PiperOrigin-RevId: 263601629
2019-08-14Replace uinptr with int64 when returning lengthsTamir Duberstein
This is in accordance with newer parts of the standard library. PiperOrigin-RevId: 263449916
2019-08-08netstack: Don't start endpoint goroutines too soon on restore.Rahat Mahmood
Endpoint protocol goroutines were previously started as part of loading the endpoint. This is potentially too soon, as resources used by these goroutine may not have been loaded. Protocol goroutines may perform meaningful work as soon as they're started (ex: incoming connect) which can cause them to indirectly access resources that haven't been loaded yet. This CL defers resuming all protocol goroutines until the end of restore. PiperOrigin-RevId: 262409429
2019-08-02Plumbing for iptables sockopts.Kevin Krakauer
PiperOrigin-RevId: 261413396
2019-07-31Test connecting UDP sockets to the ANY addressTamir Duberstein
This doesn't currently pass on gVisor. While I'm here, fix a bug where connecting to the v6-mapped v4 address doesn't work in gVisor. PiperOrigin-RevId: 260923961
2019-07-24Add support for a subnet prefix length on interface network addressesChris Kuiper
This allows the user code to add a network address with a subnet prefix length. The prefix length value is stored in the network endpoint and provided back to the user in the ProtocolAddress type. PiperOrigin-RevId: 259807693
2019-07-19Handle interfaceAddr and NIC options separately for IP_MULTICAST_IFChris Kuiper
This tweaks the handling code for IP_MULTICAST_IF to ignore the InterfaceAddr if a NICID is given. PiperOrigin-RevId: 258982541
2019-07-18net/tcp/setockopt: impelment setsockopt(fd, SOL_TCP, TCP_INQ)Andrei Vagin
PiperOrigin-RevId: 258859507
2019-07-03netstack/udp: connect with the AF_UNSPEC address family means disconnectAndrei Vagin
PiperOrigin-RevId: 256433283
2019-06-13Add support for TCP receive buffer auto tuning.Bhasker Hariharan
The implementation is similar to linux where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accomodate for the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the linux receive buffer auto- tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283
2019-06-13Update canonical repository.Adin Scannell
This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620
2019-06-06Track and export socket state.Rahat Mahmood
This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850
2019-05-24Remove unused wakersTamir Duberstein
These wakers are uselessly allocated and passed around; nothing ever listens for notifications on them. The code here appears to be vestigial, so removing it and allowing a nil waker to be passed seems appropriate. PiperOrigin-RevId: 249879320 Change-Id: Icd209fb77cc0dd4e5c49d7a9f2adc32bf88b4b71
2019-05-22UDP and TCP raw socket support.Kevin Krakauer
PiperOrigin-RevId: 249511348 Change-Id: I34539092cc85032d9473ff4dd308fc29dc9bfd6b
2019-05-02Support reception of multicast data on more than one socketChris Kuiper
This requires two changes: 1) Support for more than one socket to join a given multicast group. 2) Duplicate delivery of incoming multicast packets to all sockets listening for it. In addition, I tweaked the code (and added a test) to disallow duplicates IP_ADD_MEMBERSHIP calls for the same group and NIC. This is how Linux does it. PiperOrigin-RevId: 246437315 Change-Id: Icad8300b4a8c3f501d9b4cd283bd3beabef88b72
2019-04-29Change copyright notice to "The gVisor Authors"Michael Pratt
Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-18tcpip/transport/udp: add Forwarder typeBen Burkert
Add a UDP forwarder for intercepting and forwarding UDP sessions. Change-Id: I2d83c900c1931adfc59a532dd4f6b33a0db406c9 PiperOrigin-RevId: 244293576
2019-04-09Add TCP checksum verification.Bhasker Hariharan
PiperOrigin-RevId: 242704699 Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f
2019-04-02Add a raw socket transport endpoint and use it for raw ICMP sockets.Kevin Krakauer
Having raw socket code together will make it easier to add support for other raw network protocols. Currently, only ICMP uses the raw endpoint. However, adding support for other protocols such as UDP shouldn't be much more difficult than adding a few switch cases. PiperOrigin-RevId: 241564875 Change-Id: I77e03adafe4ce0fd29ba2d5dfdc547d2ae8f25bf
2019-03-29Fix incorrect checksums in TCP and UDP tests.Bhasker Hariharan
PiperOrigin-RevId: 241025361 Change-Id: I292e7aea9a4b294b11e4f736e107010d9524586b
2019-03-28netstack/fdbased: add generic segmentation offload (GSO) supportAndrei Vagin
The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 240809103 Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-26netstack: Don't exclude length when a pseudo-header checksum is calculatedAndrei Vagin
This is a preparation for GSO changes (cl/234508902). RELNOTES[gofers]: Refactor checksum code to include length, which it already did, but in a convoluted way. Should be a no-op. PiperOrigin-RevId: 240460794 Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2
2019-03-13Remove unused function.Kevin Krakauer
PiperOrigin-RevId: 238336475 Change-Id: I8131e04699028246ebc233953ebb3feca5673940