summaryrefslogtreecommitdiffhomepage
path: root/conn
AgeCommit message (Collapse)Author
2023-03-10conn: set SO_{SND,RCV}BUF to 7MB on the Bind UDP socketJordan Whited
The conn.Bind UDP sockets' send and receive buffers are now being sized to 7MB, whereas they were previously inheriting the system defaults. The system defaults are considerably small and can result in dropped packets on high speed links. By increasing the size of these buffers we are able to achieve higher throughput in the aforementioned case. The iperf3 results below demonstrate the effect of this commit between two Linux computers with 32-core Xeon Platinum CPUs @ 2.9Ghz. There is roughly ~125us of round trip latency between them. The first result is from commit 792b49c which uses the system defaults, e.g. net.core.{r,w}mem_max = 212992. The TCP retransmits are correlated with buffer full drops on both sides. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 4.74 GBytes 4.08 Gbits/sec 2742 285 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 4.74 GBytes 4.08 Gbits/sec 2742 sender [ 5] 0.00-10.04 sec 4.74 GBytes 4.06 Gbits/sec receiver The second result is after increasing SO_{SND,RCV}BUF to 7MB, i.e. applying this commit. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0 sender [ 5] 0.00-10.04 sec 6.14 GBytes 5.25 Gbits/sec receiver The specific value of 7MB is chosen as it is the max supported by a default configuration of macOS. A value greater than 7MB may further benefit throughput for environments with higher network latency and lower CPU clocks, but will also increase latency under load (bufferbloat). Some platforms will silently clamp the value to other maximums. On Linux, we use SO_{SND,RCV}BUFFORCE in case 7MB is beyond net.core.{r,w}mem_max. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn, device, tun: implement vectorized I/O on LinuxJordan Whited
Implement TCP offloading via TSO and GRO for the Linux tun.Device, which is made possible by virtio extensions in the kernel's TUN driver. Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind. conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All platforms now fall under conn.StdNetBind, except for Windows, which remains in conn.WinRingBind, which still needs to be adjusted to handle multiple packets. Also refactor sticky sockets support to eventually be applicable on platforms other than just Linux. However Linux remains the sole platform that fully implements it for now. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10conn, device, tun: implement vectorized I/O plumbingJordan Whited
Accept packet vectors for reading and writing in the tun.Device and conn.Bind interfaces, so that the internal plumbing between these interfaces now passes a vector of packets. Vectors move untouched between these interfaces, i.e. if 128 packets are received from conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is no internal buffering. Currently, existing implementations are only adjusted to have vectors of length one. Subsequent patches will improve that. Also, as a related fixup, use the unix and windows packages rather than the syscall package when possible. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-07global: bump copyright yearJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-09-20global: bump copyright yearJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-09-04all: use Go 1.19 and its atomic typesBrad Fitzpatrick
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-04conn, device, tun: set CLOEXEC on fdsBrad Fitzpatrick
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-04-07conn: remove the final alloc per packet receiveJosh Bleecher Snyder
This does bind_std only; other platforms remain. The remaining alloc per iteration in the Throughput benchmark comes from the tuntest package, and should not appear in regular use. name old time/op new time/op delta Latency-10 25.2µs ± 1% 25.0µs ± 0% -0.58% (p=0.006 n=10+10) Throughput-10 2.44µs ± 3% 2.41µs ± 2% ~ (p=0.140 n=10+8) name old alloc/op new alloc/op delta Latency-10 854B ± 5% 741B ± 3% -13.22% (p=0.000 n=10+10) Throughput-10 265B ±34% 267B ±39% ~ (p=0.670 n=10+10) name old allocs/op new allocs/op delta Latency-10 16.0 ± 0% 14.0 ± 0% -12.50% (p=0.000 n=10+10) Throughput-10 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) name old packet-loss new packet-loss delta Throughput-10 0.01 ±82% 0.01 ±282% ~ (p=0.321 n=9+8) Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-03-17conn: use netip for std bindJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-03-16all: update to Go 1.18Josh Bleecher Snyder
Bump go.mod and README. Switch to upstream net/netip. Use strings.Cut. Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-12-09global: apply gofumptJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-23global: use netip where possible nowJason A. Donenfeld
There are more places where we'll need to add it later, when Go 1.18 comes out with support for it in the "net" package. Also, allowedips still uses slices internally, which might be suboptimal. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-12global: remove old-style build tagsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-11conn,wintun: use unsafe.Slice instead of unsafeSliceJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-09-05global: add new go 1.17 build commentsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-20conn: linux: protect read fdsJason A. Donenfeld
The -1 protection was removed and the wrong error was returned, causing us to read from a bogus fd. As well, remove the useless closures that aren't doing anything, since this is all synchronized anyway. Fixes: 10533c3 ("all: make conn.Bind.Open return a slice of receive functions") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-11conn: windows: set count=0 on retryJason A. Donenfeld
When retrying, if count is not 0, we forget to dequeue another request, and so the ring fills up and errors out. Reported-by: Sascha Dierberg <dierberg@dresearch-fe.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-26conn: windows: do not error out when receiving UDP jumbogramJason A. Donenfeld
If we receive a large UDP packet, don't return an error to receive.go, which then terminates the receive loop. Instead, simply retry. Considering Winsock's general finickiness, we might consider other places where an attacker on the wire can generate error conditions like this. Reported-by: Sascha Dierberg <sascha.dierberg@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12conn: reconstruct v4 vs v6 receive function based on symtabJason A. Donenfeld
This is kind of gross but it's better than the alternatives. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-09conn: windows: reset ring to starting position after freeJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-09conn: windows: compare head and tail properlyJason A. Donenfeld
By not comparing these with the modulo, the ring became nearly never full, resulting in completion queue buffers filling up prematurely. Reported-by: Joshua Sjoding <joshua.sjoding@scjalliance.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-09winrio: test that IOCP-based RIO is supportedJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-02all: make conn.Bind.Open return a slice of receive functionsJosh Bleecher Snyder
Instead of hard-coding exactly two sources from which to receive packets (an IPv4 source and an IPv6 source), allow the conn.Bind to specify a set of sources. Beneficial consequences: * If there's no IPv6 support on a system, conn.Bind.Open can choose not to return a receive function for it, which is simpler than tracking that state in the bind. This simplification removes existing data races from both conn.StdNetBind and bindtest.ChannelBind. * If there are more than two sources on a system, the conn.Bind no longer needs to add a separate muxing layer. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-04-02conn: winrio: pass key parameter into structJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-03-30conn: document retry loop in StdNetBind.OpenJosh Bleecher Snyder
It's not obvious on a first read what the loop is doing. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30conn: use local ipvN vars in StdNetBind.OpenJosh Bleecher Snyder
This makes it clearer that they are fresh on each attempt, and avoids the bookkeeping required to clearing them on failure. Also, remove an unnecessary err != nil. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30conn: unify code in StdNetBind.SendJosh Bleecher Snyder
The sending code is identical for ipv4 and ipv6; select the conn, then use it. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-08conn: linux: unexport mutexJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-25conn: implement RIO for fast Windows UDP socketsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-23device: test up/down using virtual connJason A. Donenfeld
This prevents port clashing bugs. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-23conn: make binds replacableJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-16conn: bump to 1.16 and get rid of NetErrClosed hackJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-10conn: close old fd before trying againJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-09conn: use errors.Is for unwrappingJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-02-09conn: try harder to have v4 and v6 ports agreeJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-28global: bump copyrightJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-26conn: fix interface parameter name in Bind interface docsBrad Fitzpatrick
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2021-01-20device: allow compiling with Go 1.15Jason A. Donenfeld
Until we depend on Go 1.16 (which isn't released yet), alias our own variable to the private member of the net package. This will allow an easy find replace to make this go away when we eventually switch to 1.16. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-20conn: remove _ method receiverJosh Bleecher Snyder
Minor style fix. Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-01-08device: receive: do not exit immediately on transient UDP receive errorsJason A. Donenfeld
Some users report seeing lines like: > Routine: receive incoming IPv4 - stopped Popping up unexpectedly. Let's sleep and try again before failing, and also log the error, and perhaps we'll eventually understand this situation better in future versions. Because we have to distinguish between the socket being closed explicitly and whatever error this is, we bump the module to require Go 1.16. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-07conn: linux: do not allow ReceiveIPvX to race with CloseJason A. Donenfeld
If Close is called after ReceiveIPvX, then ReceiveIPvX will block on an invalid or potentially reused fd. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-01-07conn: do not SO_REUSEADDR on linuxJason A. Donenfeld
SO_REUSEADDR does not make sense for unicast UDP sockets. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-06-22conn: add comments saying what uses these interfacesDavid Crawshaw
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>
2020-06-07conn: unbreak boundif on androidJason A. Donenfeld
Another thing never tested ever. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-06-07conn: remove useless commentJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-06-07conn: fix windows situation with boundifJason A. Donenfeld
This was evidently never tested before committing. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-05-02global: update header comments and modulesJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-05-02conn: introduce new package that splits out the Bind and Endpoint typesDavid Crawshaw
The sticky socket code stays in the device package for now, as it reaches deeply into the peer list. This is the first step in an effort to split some code out of the very busy device package. Signed-off-by: David Crawshaw <crawshaw@tailscale.com>