Age | Commit message (Collapse) | Author |
|
Implement UDP GSO and GRO for the Linux tun.Device, which is made
possible by virtio extensions in the kernel's TUN driver starting in
v6.2.
secnetperf, a QUIC benchmark utility from microsoft/msquic@8e1eb1a, is
used to demonstrate the effect of this commit between two Linux
computers with i5-12400 CPUs. There is roughly ~13us of round trip
latency between them. secnetperf was invoked with the following command
line options:
-stats:1 -exec:maxtput -test:tput -download:10000 -timed:1 -encrypt:0
The first result is from commit 2e0774f without UDP GSO/GRO on the TUN.
[conn][0x55739a144980] STATS: EcnCapable=0 RTT=3973 us
SendTotalPackets=55859 SendSuspectedLostPackets=61
SendSpuriousLostPackets=59 SendCongestionCount=27
SendEcnCongestionCount=0 RecvTotalPackets=2779122
RecvReorderedPackets=0 RecvDroppedPackets=0
RecvDuplicatePackets=0 RecvDecryptionFailures=0
Result: 3654977571 bytes @ 2922821 kbps (10003.972 ms).
The second result is with UDP GSO/GRO on the TUN.
[conn][0x56493dfd09a0] STATS: EcnCapable=0 RTT=1216 us
SendTotalPackets=165033 SendSuspectedLostPackets=64
SendSpuriousLostPackets=61 SendCongestionCount=53
SendEcnCongestionCount=0 RecvTotalPackets=11845268
RecvReorderedPackets=25267 RecvDroppedPackets=0
RecvDuplicatePackets=0 RecvDecryptionFailures=0
Result: 15574671184 bytes @ 12458214 kbps (10001.222 ms).
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The length of a packet read from the underlying TUN device may exceed
the length of a supplied buffer when MTU exceeds device.MaxMessageSize.
Reviewed-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Close closes the events channel, resulting in a panic from send on
closed channel.
Reported-By: Brad Fitzpatrick <brad@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
IPv4 header and pseudo header checksums were being computed on every
merge operation. Additionally, virtioNetHdr was being written at the
same time. This delays those operations until after all coalescing has
occurred.
Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
$ benchstat old.txt new.txt
goos: linux
goarch: amd64
pkg: golang.zx2c4.com/wireguard/tun
cpu: 12th Gen Intel(R) Core(TM) i5-12400
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
Checksum/64-12 10.670n ± 2% 4.769n ± 0% -55.30% (p=0.000 n=10)
Checksum/128-12 19.665n ± 2% 8.032n ± 0% -59.16% (p=0.000 n=10)
Checksum/256-12 37.68n ± 1% 16.06n ± 0% -57.37% (p=0.000 n=10)
Checksum/512-12 76.61n ± 3% 32.13n ± 0% -58.06% (p=0.000 n=10)
Checksum/1024-12 160.55n ± 4% 64.25n ± 0% -59.98% (p=0.000 n=10)
Checksum/1500-12 231.05n ± 7% 94.12n ± 0% -59.26% (p=0.000 n=10)
Checksum/2048-12 309.5n ± 3% 128.5n ± 0% -58.48% (p=0.000 n=10)
Checksum/4096-12 603.8n ± 4% 257.2n ± 0% -57.41% (p=0.000 n=10)
Checksum/8192-12 1185.0n ± 3% 515.5n ± 0% -56.50% (p=0.000 n=10)
Checksum/9000-12 1328.5n ± 5% 564.8n ± 0% -57.49% (p=0.000 n=10)
Checksum/9001-12 1340.5n ± 3% 564.8n ± 0% -57.87% (p=0.000 n=10)
geomean 185.3n 77.99n -57.92%
Reviewed-by: Adrian Dewhurst <adrian@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Dimitri Papadopoulos Orfanos <3234522+DimitriPapadopoulos@users.noreply.github.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
tcpGRO() was using an incorrect IPv4 more fragments bit mask.
tcpPacketsCanCoalesce() was not distinguishing tcp6 from tcp4, and TTL
values were not compared. TTL values should be equal at the IP layer,
otherwise the packets should not coalesce. This tracks with the kernel.
Reviewed-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
IP options were not being compared prior to coalescing. They are not
commonly used. Disqualification due to nonzero options is in line with
the kernel.
Reviewed-by: Denton Gentry <dgentry@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Reviewed-by: Maisem Ali <maisem@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This always struck me as kind of weird and non-standard.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Enable TCP SACK for the gVisor Stack used in tun/netstack. This can
improve throughput by an order of magnitude in the presence of packet
loss.
Reviewed-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
There's not really a use at the moment for making this configurable, and
once bind_windows.go behaves like bind_std.go, we'll be able to use
constants everywhere. So begin that simplification now.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Implement TCP offloading via TSO and GRO for the Linux tun.Device, which
is made possible by virtio extensions in the kernel's TUN driver.
Delete conn.LinuxSocketEndpoint in favor of a collapsed conn.StdNetBind.
conn.StdNetBind makes use of recvmmsg() and sendmmsg() on Linux. All
platforms now fall under conn.StdNetBind, except for Windows, which
remains in conn.WinRingBind, which still needs to be adjusted to handle
multiple packets.
Also refactor sticky sockets support to eventually be applicable on
platforms other than just Linux. However Linux remains the sole platform
that fully implements it for now.
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Accept packet vectors for reading and writing in the tun.Device and
conn.Bind interfaces, so that the internal plumbing between these
interfaces now passes a vector of packets. Vectors move untouched
between these interfaces, i.e. if 128 packets are received from
conn.Bind.Read(), 128 packets are passed to tun.Device.Write(). There is
no internal buffering.
Currently, existing implementations are only adjusted to have vectors
of length one. Subsequent patches will improve that.
Also, as a related fixup, use the unix and windows packages rather than
the syscall package when possible.
Co-authored-by: James Tucker <james@tailscale.com>
Signed-off-by: James Tucker <james@tailscale.com>
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This seems like a much better demonstration as it removes the need for
external components.
Signed-off-by: Søren L. Hansen <sorenisanerd@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Bump gVisor to a recent known-good version.
Signed-off-by: Colin Adler <colin1adler@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Without this, `device.Close()` will deadlock.
Signed-off-by: Colin Adler <colin1adler@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Now that the gvisor deps aren't insane, we can just do this in the main
module.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
To build with go1.19, gvisor needs
99325baf ("Bump gVisor build tags to go1.19").
However gvisor.dev/gvisor/pkg/tcpip/buffer is no longer available,
so refactor to use gvisor.dev/gvisor/pkg/tcpip/link/channel directly.
Signed-off-by: Shengjing Zhu <i@zhsj.me>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Use unix.ByteSliceToString in (*NativeTun).nameSlice to convert the
TUNGETIFF ioctl result []byte to a string.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Bump go.mod and README.
Switch to upstream net/netip.
Use strings.Cut.
Signed-off-by: Josh Bleecher Snyder <josh@tailscale.com>
|
|
Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: don't wrap deadline error.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This commit fixes all callsites of netip.AddrFromSlice(), which has
changed its signature and now returns two values.
Signed-off-by: Alexander Neumann <alexander.neumann@redteam-pentesting.de>
[Jason: remove error handling from AddrFromSlice.]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
I'm not 100% sure this is correct, but it certainly is a lot simpler.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Provide a PacketConn interface for netstack's ICMP endpoint; netstack
currently only provides EchoRequest/EchoResponse ICMP support, so this
code exposes only an interface for doing ping.
Signed-off-by: Thomas Ptacek <thomas@sockpuppet.org>
[Jason: rework structure, match std go interfaces, add example code]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
There are more places where we'll need to add it later, when Go 1.18
comes out with support for it in the "net" package. Also, allowedips
still uses slices internally, which might be suboptimal.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Update gvisor to v0.0.0-20211020211948-f76a604701b6, which requires some
changes to tun.go:
WriteRawPacket: Add function with not implemented error.
CreateNetTUN: Replace stack.AddAddress with stack.AddProtocolAddress, and
fix IPv6 address in error message.
Signed-off-by: Mikael Magnusson <mikma@users.sourceforge.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Only wireguard-windows used this, and it's moving to wgnt exclusively.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
We'll eventually be getting rid of it here, but keep it sync'd up for
now.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
(*NativeTun).operateOnFd is only used on darwin and freebsd. Adjust the
build tags accordingly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
At these points, the socket file descriptor is not yet wrapped in an
*os.File, so it needs to be closed explicitly on error.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Otherwise recent WDK binaries fail on ARM64, where an exception handler
is used for trapping an illegal instruction when ARMv8.1 atomics are
being tested for functionality.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|