wireguard-go - WireGuard Go

Age	Commit message (Collapse)	Author
2023-10-21	conn: set unused OOB to zero length	Jason A. Donenfeld
	Otherwise in the event that we're using GSO without sticky sockets, we pass garbage OOB buffers to sendmmsg, making a EINVAL, when GSO doesn't set its header. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-21	conn: fix cmsg data padding calculation for gso	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-21	conn: separate gso and sticky control	Jason A. Donenfeld
	Android wants GSO but not sticky. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-18	conn: harmonize GOOS checks between "linux" and "android"	Jason A. Donenfeld
	Otherwise GRO gets enabled on Android, but the conn doesn't use it, resulting in bundled packets being discarded. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-18	conn: simplify supportsUDPOffload	Jason A. Donenfeld
	This allows a kernel to support UDP_GRO while not supporting UDP_SEGMENT. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-10-10	conn, device: use UDP GSO and GRO on Linux	Jordan Whited
	StdNetBind probes for UDP GSO and GRO support at runtime. UDP GSO is dependent on checksum offload support on the egress netdev. UDP GSO will be disabled in the event sendmmsg() returns EIO, which is a strong signal that the egress netdev does not support checksum offload. The iperf3 results below demonstrate the effect of this commit between two Linux computers with i5-12400 CPUs. There is roughly ~13us of round trip latency between them. The first result is from commit 052af4a without UDP GSO or GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 9.85 GBytes 8.46 Gbits/sec 1139 3.01 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 9.85 GBytes 8.46 Gbits/sec 1139 sender [ 5] 0.00-10.04 sec 9.85 GBytes 8.42 Gbits/sec receiver The second result is with UDP GSO and GRO. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 12.3 GBytes 10.6 Gbits/sec 232 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 12.3 GBytes 10.6 Gbits/sec 232 sender [ 5] 0.00-10.04 sec 12.3 GBytes 10.6 Gbits/sec receiver Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-07-04	all: adjust build tags for wasip1/wasm	Brad Fitzpatrick
	Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-06-27	conn: windows: add missing return statement in DstToString AF_INET	springhack
	Signed-off-by: SpringHack <springhack@live.cn> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-06-27	conn: store IP_PKTINFO cmsg in StdNetendpoint src	James Tucker
	Replace the src storage inside StdNetEndpoint with a copy of the raw control message buffer, to reduce allocation and perform less work on a per-packet basis. Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-24	conn: move booleans to bottom of StdNetBind struct	Jason A. Donenfeld
	This results in a more compact structure. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-24	conn: use ipv6 message pool for ipv6 receiving	Jason A. Donenfeld
	Looks like a simple copy&paste error. Fixes: 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-24	conn: fix StdNetEndpoint data race by dynamically allocating endpoints	Jordan Whited
	In 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux"), the Linux-specific Bind implementation was collapsed into StdNetBind. This introduced a race on StdNetEndpoint from getSrcFromControl() and setSrcControl(). Remove the sync.Pool involved in the race, and simplify StdNetBind's receive path to allocate StdNetEndpoint on the heap instead, with the intent for it to be cleaned up by the GC, later. This essentially reverts ef5c587 ("conn: remove the final alloc per packet receive"), adding back that allocation, unfortunately. This does slightly increase resident memory usage in higher throughput scenarios. StdNetBind is the only Bind implementation that was using this Endpoint recycling technique prior to this commit. This is considered a stop-gap solution, and there are plans to replace the allocation with a better mechanism. Reported-by: lsc <lsc@lv6.tw> Link: https://lore.kernel.org/wireguard/ac87f86f-6837-4e0e-ec34-1df35f52540e@lv6.tw/ Fixes: 9e2f386 ("conn, device, tun: implement vectorized I/O on Linux") Cc: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-23	conn: disable sticky sockets on Android	Jason A. Donenfeld
	We can't have the netlink listener socket, so it's not possible to support it. Plus, android networking stack complexity makes it a bit tricky anyway, so best to leave it disabled. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-23	global: remove old style build tags	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-16	conn: fix getSrcFromControl() iteration	Jordan Whited
	We only expect a single control message in the normal case, but this would loop infinitely if there were more. Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-16	conn: use CmsgSpace() for ancillary data buf sizing	Jordan Whited
	CmsgLen() does not account for data alignment. Reviewed-by: Adrian Dewhurst <adrian@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-13	global: buff -> buf	Jason A. Donenfeld
	This always struck me as kind of weird and non-standard. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn: use right cmsghdr len types on 32-bit in sticky test	Jason A. Donenfeld
	Cmsghdr uses uint32 and uint64 on 32-bit and 64-bit respectively for the Len member, which makes assignments and comparisons slightly more irksome than usual. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn: make StdNetBind.BatchSize() return 1 for non-Linux	Jordan Whited
	This commit updates StdNetBind.BatchSize() to return 1 instead of IdealBatchSize for non-Linux platforms. Non-Linux platforms do not yet benefit from values > 1, which only serves to increase memory consumption. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn: ensure control message size is respected in StdNetBind	Jordan Whited
	This commit re-slices received control messages in StdNetBind to the value the OS reports on a successful read. Previously, the len of this slice would always be srcControlSize, which could result in control message values leaking through a sync.Pool round trip. This is unlikely with the IP_PKTINFO socket option set successfully, but should be guarded against. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn: fix StdNetBind fallback on Windows	Jordan Whited
	If RIO is unavailable, NewWinRingBind() falls back to StdNetBind. StdNetBind uses x/net/ipv{4,6}.PacketConn for sending and receiving datagrams, specifically via the {Read,Write}Batch methods. These methods are unimplemented on Windows and will return runtime errors as a result. Additionally, only Linux benefits from these x/net types for reading and writing, so we update StdNetBind to fall back to the standard library net package for all platforms other than Linux. Reviewed-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn: inch BatchSize toward being non-dynamic	Jason A. Donenfeld
	There's not really a use at the moment for making this configurable, and once bind_windows.go behaves like bind_std.go, we'll be able to use constants everywhere. So begin that simplification now. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn: set SO_{SND,RCV}BUF to 7MB on the Bind UDP socket	Jordan Whited
	The conn.Bind UDP sockets' send and receive buffers are now being sized to 7MB, whereas they were previously inheriting the system defaults. The system defaults are considerably small and can result in dropped packets on high speed links. By increasing the size of these buffers we are able to achieve higher throughput in the aforementioned case. The iperf3 results below demonstrate the effect of this commit between two Linux computers with 32-core Xeon Platinum CPUs @ 2.9Ghz. There is roughly ~125us of round trip latency between them. The first result is from commit 792b49c which uses the system defaults, e.g. net.core.{r,w}mem_max = 212992. The TCP retransmits are correlated with buffer full drops on both sides. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 4.74 GBytes 4.08 Gbits/sec 2742 285 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 4.74 GBytes 4.08 Gbits/sec 2742 sender [ 5] 0.00-10.04 sec 4.74 GBytes 4.06 Gbits/sec receiver The second result is after increasing SO_{SND,RCV}BUF to 7MB, i.e. applying this commit. Starting Test: protocol: TCP, 1 streams, 131072 byte blocks [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0 3.15 MBytes - - - - - - - - - - - - - - - - - - - - - - - - - Test Complete. Summary Results: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 6.14 GBytes 5.27 Gbits/sec 0 sender [ 5] 0.00-10.04 sec 6.14 GBytes 5.25 Gbits/sec receiver The specific value of 7MB is chosen as it is the max supported by a default configuration of macOS. A value greater than 7MB may further benefit throughput for environments with higher network latency and lower CPU clocks, but will also increase latency under load (bufferbloat). Some platforms will silently clamp the value to other maximums. On Linux, we use SO_{SND,RCV}BUFFORCE in case 7MB is beyond net.core.{r,w}mem_max. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn, device, tun: implement vectorized I/O on Linux	Jordan Whited
	Implement TCP offloading via TSO and GRO for the Linux tun.Device, which is made possible by virtio extensions in the kernel's TUN driver. Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind. conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All platforms now fall under conn.StdNetBind, except for Windows, which remains in conn.WinRingBind, which still needs to be adjusted to handle multiple packets. Also refactor sticky sockets support to eventually be applicable on platforms other than just Linux. However Linux remains the sole platform that fully implements it for now. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-03-10	conn, device, tun: implement vectorized I/O plumbing	Jordan Whited
	Accept packet vectors for reading and writing in the tun.Device and conn.Bind interfaces, so that the internal plumbing between these interfaces now passes a vector of packets. Vectors move untouched between these interfaces, i.e. if 128 packets are received from conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is no internal buffering. Currently, existing implementations are only adjusted to have vectors of length one. Subsequent patches will improve that. Also, as a related fixup, use the unix and windows packages rather than the syscall package when possible. Co-authored-by: James Tucker <james@tailscale.com> Signed-off-by: James Tucker <james@tailscale.com> Signed-off-by: Jordan Whited <jordan@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2023-02-07	global: bump copyright year	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-09-20	global: bump copyright year	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-09-04	all: use Go 1.19 and its atomic types	Brad Fitzpatrick
	Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-04	conn, device, tun: set CLOEXEC on fds	Brad Fitzpatrick
	Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-04-07	conn: remove the final alloc per packet receive	Josh Bleecher Snyder
	This does bind_std only; other platforms remain. The remaining alloc per iteration in the Throughput benchmark comes from the tuntest package, and should not appear in regular use. name old time/op new time/op delta Latency-10 25.2µs ± 1% 25.0µs ± 0% -0.58% (p=0.006 n=10+10) Throughput-10 2.44µs ± 3% 2.41µs ± 2% ~ (p=0.140 n=10+8) name old alloc/op new alloc/op delta Latency-10 854B ± 5% 741B ± 3% -13.22% (p=0.000 n=10+10) Throughput-10 265B ±34% 267B ±39% ~ (p=0.670 n=10+10) name old allocs/op new allocs/op delta Latency-10 16.0 ± 0% 14.0 ± 0% -12.50% (p=0.000 n=10+10) Throughput-10 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10) name old packet-loss new packet-loss delta Throughput-10 0.01 ±82% 0.01 ±282% ~ (p=0.321 n=9+8) Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-03-17	conn: use netip for std bind	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-03-16	all: update to Go 1.18	Josh Bleecher Snyder
	Bump go.mod and README. Switch to upstream net/netip. Use strings.Cut. Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
2021-12-09	global: apply gofumpt	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-11-23	global: use netip where possible now	Jason A. Donenfeld
	There are more places where we'll need to add it later, when Go 1.18 comes out with support for it in the "net" package. Also, allowedips still uses slices internally, which might be suboptimal. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-12	global: remove old-style build tags	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-10-11	conn,wintun: use unsafe.Slice instead of unsafeSlice	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-09-05	global: add new go 1.17 build comments	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-20	conn: linux: protect read fds	Jason A. Donenfeld
	The -1 protection was removed and the wrong error was returned, causing us to read from a bogus fd. As well, remove the useless closures that aren't doing anything, since this is all synchronized anyway. Fixes: 10533c3 ("all: make conn.Bind.Open return a slice of receive functions") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-05-11	conn: windows: set count=0 on retry	Jason A. Donenfeld
	When retrying, if count is not 0, we forget to dequeue another request, and so the ring fills up and errors out. Reported-by: Sascha Dierberg <dierberg@dresearch-fe.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-26	conn: windows: do not error out when receiving UDP jumbogram	Jason A. Donenfeld
	If we receive a large UDP packet, don't return an error to receive.go, which then terminates the receive loop. Instead, simply retry. Considering Winsock's general finickiness, we might consider other places where an attacker on the wire can generate error conditions like this. Reported-by: Sascha Dierberg <sascha.dierberg@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-12	conn: reconstruct v4 vs v6 receive function based on symtab	Jason A. Donenfeld
	This is kind of gross but it's better than the alternatives. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-09	conn: windows: reset ring to starting position after free	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-09	conn: windows: compare head and tail properly	Jason A. Donenfeld
	By not comparing these with the modulo, the ring became nearly never full, resulting in completion queue buffers filling up prematurely. Reported-by: Joshua Sjoding <joshua.sjoding@scjalliance.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-09	winrio: test that IOCP-based RIO is supported	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-04-02	all: make conn.Bind.Open return a slice of receive functions	Josh Bleecher Snyder
	Instead of hard-coding exactly two sources from which to receive packets (an IPv4 source and an IPv6 source), allow the conn.Bind to specify a set of sources. Beneficial consequences: * If there's no IPv6 support on a system, conn.Bind.Open can choose not to return a receive function for it, which is simpler than tracking that state in the bind. This simplification removes existing data races from both conn.StdNetBind and bindtest.ChannelBind. * If there are more than two sources on a system, the conn.Bind no longer needs to add a separate muxing layer. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-04-02	conn: winrio: pass key parameter into struct	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-03-30	conn: document retry loop in StdNetBind.Open	Josh Bleecher Snyder
	It's not obvious on a first read what the loop is doing. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30	conn: use local ipvN vars in StdNetBind.Open	Josh Bleecher Snyder
	This makes it clearer that they are fresh on each attempt, and avoids the bookkeeping required to clearing them on failure. Also, remove an unnecessary err != nil. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-30	conn: unify code in StdNetBind.Send	Josh Bleecher Snyder
	The sending code is identical for ipv4 and ipv6; select the conn, then use it. Signed-off-by: Josh Bleecher Snyder <josharian@gmail.com>
2021-03-08	conn: linux: unexport mutex	Jason A. Donenfeld
	Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>