summaryrefslogtreecommitdiffhomepage
path: root/src/send.c
AgeCommit message (Collapse)Author
2020-05-19noise: separate receive counter from send counterJason A. Donenfeld
In "queueing: preserve flow hash across packet scrubbing", we were required to slightly increase the size of the receive replay counter to something still fairly small, but an increase nonetheless. It turns out that we can recoup some of the additional memory overhead by splitting up the prior union type into two distinct types. Before, we used the same "noise_counter" union for both sending and receiving, with sending just using a simple atomic64_t, while receiving used the full replay counter checker. This meant that most of the memory being allocated for the sending counter was being wasted. Since the old "noise_counter" type increased in size in the prior commit, now is a good time to split up that union type into a distinct "noise_replay_ counter" for receiving and a boring atomic64_t for sending, each using neither more nor less memory than required. Also, since sometimes the replay counter is accessed without necessitating additional accesses to the bitmap, we can reduce cache misses by hoisting the always-necessary lock above the bitmap in the struct layout. We also change a "noise_replay_counter" stack allocation to kmalloc in a -DDEBUG selftest so that KASAN doesn't trigger a stack frame warning. All and all, removing a bit of abstraction in this commit makes the code simpler and smaller, in addition to the motivating memory usage recuperation. For example, passing around raw "noise_symmetric_key" structs is something that really only makes sense within noise.c, in the one place where the sending and receiving keys can safely be thought of as the same type of object; subsequent to that, it's important that we uniformly access these through keypair->{sending,receiving}, where their distinct roles are always made explicit. So this patch allows us to draw that distinction clearly as well. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-05-19queueing: preserve flow hash across packet scrubbingJason A. Donenfeld
It's important that we clear most header fields during encapsulation and decapsulation, because the packet is substantially changed, and we don't want any info leak or logic bug due to an accidental correlation. But, for encapsulation, it's wrong to clear skb->hash, since it's used by fq_codel and flow dissection in general. Without it, classification does not proceed as usual. This change might make it easier to estimate the number of innerflows by examining clustering of out of order packets, but this shouldn't open up anything that can't already be inferred otherwise (e.g. syn packet size inference), and fq_codel can be disabled anyway. Furthermore, it might be the case that the hash isn't used or queried at all until after wireguard transmits the encrypted UDP packet, which means skb->hash might still be zero at this point, and thus no hash taken over the inner packet data. In order to address this situation, we force a calculation of skb->hash before encrypting packet data. Of course this means that fq_codel might transmit packets slightly more out of order than usual. Toke did some testing on beefy machines with high quantities of parallel flows and found that increasing the reply-attack counter to 8192 takes care of the most pathological cases pretty well. Reported-by: Dave Taht <dave.taht@gmail.com> Reviewed-and-tested-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-05-05send/receive: use explicit unlikely branch instead of implicit coalescingJason A. Donenfeld
It's very unlikely that send will become true. It's nearly always false between 0 and 120 seconds of a session, and in most cases becomes true only between 120 and 121 seconds before becoming false again. So, unlikely(send) is clearly the right option here. What happened before was that we had this complex boolean expression with multiple likely and unlikely clauses nested. Since this is evaluated left-to-right anyway, the whole thing got converted to unlikely. So, we can clean this up to better represent what's going on. The generated code is the same. Suggested-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-05-04send: cond_resched() when processing tx ringbuffersJason A. Donenfeld
Users with pathological hardware reported CPU stalls on CONFIG_ PREEMPT_VOLUNTARY=y, because the ringbuffers would stay full, meaning these workers would never terminate. That turned out not to be okay on systems without forced preemption. This commit adds a cond_resched() to the bottom of each loop iteration, so that these workers don't hog the core. We don't do this on encryption/decryption because the compat module here uses simd_relax, which already includes a call to schedule in preempt_enable. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-03-18send: use normaler alignment formula from upstreamJason A. Donenfeld
Slightly more meh, but upstream likes it better, and I'd rather minimize the delta between trees. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-02-14send: cleanup skb padding calculationJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-02-14send: account for mtu=0 devicesJason A. Donenfeld
It turns out there's an easy way to get packets queued up while still having an MTU of zero, and that's via persistent keep alive. This commit makes sure that in whatever condition, we don't wind up dividing by zero. Note that an MTU of zero for a wireguard interface is something quasi-valid, so I don't think the correct fix is to limit it via min_mtu. This can be reproduced easily with: ip link add wg0 type wireguard ip link add wg1 type wireguard ip link set wg0 up mtu 0 ip link set wg1 up wg set wg0 private-key <(wg genkey) wg set wg1 listen-port 1 private-key <(wg genkey) peer $(wg show wg0 public-key) wg set wg0 peer $(wg show wg1 public-key) persistent-keepalive 1 endpoint 127.0.0.1:1 However, while min_mtu=0 seems fine, it makes sense to restrict the max_mtu. This commit also restricts the maximum MTU to the greatest number for which rounding up to the padding multiple won't overflow a signed integer. Packets this large were always rejected anyway eventually, due to checks deeper in, but it seems more sound not to even let the administrator configure something that won't work anyway. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-12-05chacha20poly1305: port to sgmitter for 5.5Jason A. Donenfeld
I'm not totally comfortable with these changes yet, and it'll require some more scrutiny. But it's a start. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-12-05device: prepare skb_list_walk_safe for upstreamingJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-12-05send: use kfree_skb_listJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-06-25global: switch to coarse ktimeJason A. Donenfeld
Coarse ktime is broken until [1] in 5.2 and kernels without the backport, so we use fallback code there. The fallback code has also been improved significantly. It now only uses slower clocks on kernels < 3.17, at the expense of some accuracy we're not overly concerned about. [1] https://lore.kernel.org/lkml/tip-e3ff9c3678b4d80e22d2557b68726174578eaf52@git.kernel.org/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-01-07global: update copyrightJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-27send: calculate inner checksums for all protocolsAndrejs Hanins
I'm using GRE tunnel (transparent Ethernet bridging flavor) over WireGuard interface to be able to bridge L2 network segments. The typical protocol chain looks like this IP->GRE->EthernetHeader->IP->UDP. UDP here is the packet sent from the L2 network segment which is tunneled using GRE over Wireguard. Indeed, there is a checksum inside UDP header which is, as a rule, kept partially calculated while packet travels through network stack and outer protocols are added until the packet reaches WG device which exposes NETIF_F_HW_CSUM feature meaning it can handle checksum offload for all protocols. But the problem here is that skb_checksum_setup called from encrypt_packet handles only TCP/UDP protocols under top level IP, but in my case there is a GRE protocol there, so skb_checksum_help is not called and packet continues its life with unfinished (broken) checksum and gets encrypted as-is. When such packet is received by other side and reaches L2 networks it's seen there with a broken checksum inside the UDP header. The fact that Wireguard on the receiving side sets skb->ip_summed to CHECKSUM_UNNECESSARY partially mitigates the problem by telling network stack on the receiving side that validation of the checksum is not necessary, so local TCP stack, for example, works fine. But it doesn't help in situations when packet needs to be forwarded further (sent out from the box). In this case there is no way we can tell next hop that checksum verification for this packet is not necessary, we just send it out with bad checksum and packet gets dropped on the next hop box. I think the issue of the original code was the wrong usage of skb_checksum_setup, simply because it's not needed in this case. Instead, we can just rely on ip_summed skb field to see if partial checksum needs to be finalized or not. Note that many other drivers in kernel follow this approach. In summary: - skb_checksum_setup can only handle TCP/UDP protocols under top level IP header, packets with other protocols (like GRE) are sent out by Wireguard with unfinished partial checksums which causes problems on receiving side (bad checksums). - encrypt_packet gets skb prepared by network stack, so there is no need to setup the checksum from scratch, but just perform hw checksum offload using software helper skb_checksum_help for packet which explicitly require it as denoted by CHECKSUM_PARTIAL. Signed-off-by: Andrejs Hanins <ahanins@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-27send: consider dropped stage packets to be droppedJason A. Donenfeld
Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-25global: do not allow compiler to reorder is_valid or is_deadJason A. Donenfeld
Suggested-by: Jann Horn <jann@thejh.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-09global: give if statements brackets and other cleanupsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-08global: more nitsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-08global: rename struct wireguard_ to struct wg_Jason A. Donenfeld
This required a bit of pruning of our christmas trees. Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-08global: prefix functions used in callbacks with wg_Jason A. Donenfeld
Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-07global: style nitsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-02global: prefix all functions with wg_Jason A. Donenfeld
I understand why this must be done, though I'm not so happy about having to do it. In some places, it puts us over 80 chars and we have to break lines up in further ugly ways. And in general, I think this makes things harder to read. Yet another thing we must do to please upstream. Maybe this can be replaced in the future by some kind of automatic module namespacing logic in the linker, or even combined with LTO and aggressive symbol stripping. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-20global: put SPDX identifier on its own lineJason A. Donenfeld
The kernel has very specific rules correlating file type with comment type, and also SPDX identifiers can't be merged with other comments. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-17crypto: pass simd by referenceJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-16global: remove non-essential inline annotationsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-16send/receive: reduce number of sg entriesJason A. Donenfeld
This reduces stack usage to quell warnings on powerpc. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-04global: prefer sizeof(*pointer) when possibleJason A. Donenfeld
Suggested-by: Sultan Alsawaf <sultanxda@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-03crypto: import zincJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28global: run through clang-formatJason A. Donenfeld
This is the worst commit in the whole repo, making the code much less readable, but so it goes with upstream maintainers. We are now woefully wrapped at 80 columns. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-06crypto: move simd context to specific typeJason A. Donenfeld
Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-04send: switch handshake stamp to an atomicJason A. Donenfeld
Rather than abusing the handshake lock, we're much better off just using a boring atomic64 for this. It's simpler and performs better. Also, while we're at it, we set the handshake stamp both before and after the calculations, in case the calculations block for a really long time waiting for the RNG to initialize. Otherwise it's possible that when the RNG finally initializes, two handshakes are sent back to back, which isn't sensible. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-03peer: ensure destruction doesn't raceJason A. Donenfeld
Completely rework peer removal to ensure peers don't jump between contexts and create races. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-02queueing: ensure strictly ordered loads and storesJason A. Donenfeld
We don't want a consumer to read plaintext when it's supposed to be reading ciphertext, which means we need to synchronize across cores. Suggested-by: Jann Horn <jann@thejh.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-31send: address of variable is never nullJason A. Donenfeld
Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-31peer: simplify rcu reference countsJason A. Donenfeld
Use RCU reference counts only when we must, and otherwise use a more reasonably named function. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-23global: use fast boottime instead of normal boottimeJason A. Donenfeld
Generally if we're inaccurate by a few nanoseconds, it doesn't matter. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-23global: use ktime boottime instead of jiffiesJason A. Donenfeld
Since this is a network protocol, expirations need to be accounted for, even across system suspend. On real systems, this isn't a problem, since we're clearing all keys before suspend. But on Android, where we don't do that, this is something of a problem. So, we switch to using boottime instead of jiffies. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-22global: fix a few typosJonathan Neuschäfer
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-17simd: encapsulate fpu amortization into nice functionsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-16queueing: re-enable preemption periodically to lower latencyJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-06-16queueing: remove useless spinlocks on scJason A. Donenfeld
Since these are the only consumers, there's no need for locking. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-19timers: clear send_keepalive timer on sending handshake responseJason A. Donenfeld
We reorganize this into also doing so on sending keepalives itself, which means the state machine is much more consistent, even if this was already implied. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-16send: simplify skb_padding with nice macroJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-04-15send: account for route-based MTUJason A. Donenfeld
It might be that a particular route has a different MTU than the interface, via `ip route add ... dev wg0 mtu 1281`, for example. In this case, it's important that we don't accidently pad beyond the end of the MTU. We accomplish that in this patch by carrying forward the MTU from the dst if it exists. We also add a unit test for this issue. Reported-by: Roman Mamedov <rm.wg@romanrm.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-03global: year bumpJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-09global: add SPDX tags to all filesGreg Kroah-Hartman
It's good to have SPDX identifiers in all files as the Linux kernel developers are working to add these identifiers to all files. Update all files with the correct SPDX license identifier based on the license text of the project or based on the license in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Modified-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-10-31global: style nitsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-10-31global: infuriating kernel iterator styleJason A. Donenfeld
One types: for (i = 0 ... So one should also type: for_each_obj (obj ... But the upstream kernel style guidelines are insane, and so we must instead do: for_each_obj(obj ... Ugly, but one must choose his battles wisely. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-10-31global: accept decent check_patch.pl suggestionsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-10-31stats: more robust accountingJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-10-17send: improve dead packet control flowJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>