summaryrefslogtreecommitdiffhomepage
path: root/src/allowedips.c
AgeCommit message (Collapse)Author
2021-06-04allowedips: add missing __rcu annotation to satisfy sparseJason A. Donenfeld
A __rcu annotation got lost during refactoring, which caused sparse to become enraged. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-04allowedips: free empty intermediate nodes when removing single nodeJason A. Donenfeld
When removing single nodes, it's possible that that node's parent is an empty intermediate node, in which case, it too should be removed. Otherwise the trie fills up and never is fully emptied, leading to gradual memory leaks over time for tries that are modified often. There was originally code to do this, but was removed during refactoring in 2016 and never reworked. Now that we have proper parent pointers from the previous commits, we can implement this properly. In order to reduce branching and expensive comparisons, we want to keep the double pointer for parent assignment (which lets us easily chain up to the root), but we still need to actually get the parent's base address. So encode the bit number into the last two bits of the pointer, and pack and unpack it as needed. This is a little bit clumsy but is the fastest and less memory wasteful of the compromises. Note that we align the root struct here to a minimum of 4, because it's embedded into a larger struct, and we're relying on having the bottom two bits for our flag, which would only be 16-bit aligned on m68k. The existing macro-based helpers were a bit unwieldy for adding the bit packing to, so this commit replaces them with safer and clearer ordinary functions. We add a test to the randomized/fuzzer part of the selftests, to free the randomized tries by-peer, refuzz it, and repeat, until it's supposed to be empty, and then then see if that actually resulted in the whole thing being emptied. That combined with kmemcheck should hopefully make sure this commit is doing what it should. Along the way this resulted in various other cleanups of the tests and fixes for recent graphviz. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-04allowedips: allocate nodes in kmem_cacheJason A. Donenfeld
The previous commit moved from O(n) to O(1) for removal, but in the process introduced an additional pointer member to a struct that increased the size from 60 to 68 bytes, putting nodes in the 128-byte slab. With deployed systems having as many as 2 million nodes, this represents a significant doubling in memory usage (128 MiB -> 256 MiB). Fix this by using our own kmem_cache, that's sized exactly right. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2021-06-04allowedips: remove nodes in O(1)Jason A. Donenfeld
Previously, deleting peers would require traversing the entire trie in order to rebalance nodes and safely free them. This meant that removing 1000 peers from a trie with a half million nodes would take an extremely long time, during which we're holding the rtnl lock. Large-scale users were reporting 200ms latencies added to the networking stack as a whole every time their userspace software would queue up significant removals. That's a serious situation. This commit fixes that by maintaining a double pointer to the parent's bit pointer for each node, and then using the already existing node list belonging to each peer to go directly to the node, fix up its pointers, and free it with RCU. This means removal is O(1) instead of O(n), and we don't use gobs of stack. The removal algorithm has the same downside as the code that it fixes: it won't collapse needlessly long runs of fillers. We can enhance that in the future if it ever becomes a problem. This commit documents that limitation with a TODO comment in code, a small but meaningful improvement over the prior situation. Currently the biggest flaw, which the next commit addresses, is that because this increases the node size on 64-bit machines from 60 bytes to 68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up using twice as much memory per node, because of power-of-two allocations, which is a big bummer. We'll need to figure something out there. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-02-05allowedips: remove previously added list item when OOM failEric Dumazet
In the unlikely case a new node could not be allocated, we need to remove @newnode from @peer->allowedips_list before freeing it. syzbot reported: BUG: KASAN: use-after-free in __list_del_entry_valid+0xdc/0xf5 lib/list_debug.c:54 Read of size 8 at addr ffff88809881a538 by task syz-executor.4/30133 CPU: 0 PID: 30133 Comm: syz-executor.4 Not tainted 5.5.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135 __list_del_entry_valid+0xdc/0xf5 lib/list_debug.c:54 __list_del_entry include/linux/list.h:132 [inline] list_del include/linux/list.h:146 [inline] root_remove_peer_lists+0x24f/0x4b0 drivers/net/wireguard/allowedips.c:65 wg_allowedips_free+0x232/0x390 drivers/net/wireguard/allowedips.c:300 wg_peer_remove_all+0xd5/0x620 drivers/net/wireguard/peer.c:187 wg_set_device+0xd01/0x1350 drivers/net/wireguard/netlink.c:542 genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline] genl_family_rcv_msg net/netlink/genetlink.c:717 [inline] genl_rcv_msg+0x67d/0xea0 net/netlink/genetlink.c:734 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 genl_rcv+0x29/0x40 net/netlink/genetlink.c:745 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45b399 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f99a9bcdc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f99a9bce6d4 RCX: 000000000045b399 RDX: 0000000000000000 RSI: 0000000020001340 RDI: 0000000000000003 RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004 R13: 00000000000009ba R14: 00000000004cb2b8 R15: 0000000000000009 Allocated by task 30103: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] __kasan_kmalloc mm/kasan/common.c:513 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527 kmem_cache_alloc_trace+0x158/0x790 mm/slab.c:3551 kmalloc include/linux/slab.h:556 [inline] kzalloc include/linux/slab.h:670 [inline] add+0x70a/0x1970 drivers/net/wireguard/allowedips.c:236 wg_allowedips_insert_v4+0xf6/0x160 drivers/net/wireguard/allowedips.c:320 set_allowedip drivers/net/wireguard/netlink.c:343 [inline] set_peer+0xfb9/0x1150 drivers/net/wireguard/netlink.c:468 wg_set_device+0xbd4/0x1350 drivers/net/wireguard/netlink.c:591 genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline] genl_family_rcv_msg net/netlink/genetlink.c:717 [inline] genl_rcv_msg+0x67d/0xea0 net/netlink/genetlink.c:734 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 genl_rcv+0x29/0x40 net/netlink/genetlink.c:745 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 30103: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] kasan_set_free_info mm/kasan/common.c:335 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483 __cache_free mm/slab.c:3426 [inline] kfree+0x10a/0x2c0 mm/slab.c:3757 add+0x12d2/0x1970 drivers/net/wireguard/allowedips.c:266 wg_allowedips_insert_v4+0xf6/0x160 drivers/net/wireguard/allowedips.c:320 set_allowedip drivers/net/wireguard/netlink.c:343 [inline] set_peer+0xfb9/0x1150 drivers/net/wireguard/netlink.c:468 wg_set_device+0xbd4/0x1350 drivers/net/wireguard/netlink.c:591 genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline] genl_family_rcv_msg net/netlink/genetlink.c:717 [inline] genl_rcv_msg+0x67d/0xea0 net/netlink/genetlink.c:734 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477 genl_rcv+0x29/0x40 net/netlink/genetlink.c:745 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff88809881a500 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 56 bytes inside of 64-byte region [ffff88809881a500, ffff88809881a540) The buggy address belongs to the page: page:ffffea0002620680 refcount:1 mapcount:0 mapping:ffff8880aa400380 index:0x0 raw: 00fffe0000000200 ffffea000250b748 ffffea000254bac8 ffff8880aa400380 raw: 0000000000000000 ffff88809881a000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88809881a400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88809881a480: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc >ffff88809881a500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ^ ffff88809881a580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88809881a600: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: wireguard@lists.zx2c4.com Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-11-26allowedips: safely dereference rcu rootsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-04-06allowedips: initialize list head when removing intermediate nodesJason A. Donenfeld
Otherwise if this list item is later reused, we'll crash on list poison or worse. Also, add a version of Mimka's reproducer to netns.sh to catch these types of bugs in the future. Reported-by: Mimka <mikma.wg@lists.m7n.se> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-03-25allowedips: do not use __always_inlineJason A. Donenfeld
DaveM doth forbid. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-03-17global: the _bh variety of rcu helpers have been unifiedJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-02-26allowedips: maintain per-peer list of allowedipsJason A. Donenfeld
This makes `wg show` and `wg showconf` and the like significantly faster, since we don't have to iterate through every node of the trie for every single peer. It also makes netlink cursor resumption much less problematic, since we're just iterating through a list, rather than having to save a traversal stack. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2019-01-07global: update copyrightJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-11-13global: various formatting tweeksJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-25allowedips: fix up macros and annotationsJason A. Donenfeld
Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-09global: give if statements brackets and other cleanupsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-08allowedips: swap endianness early onArnd Bergmann
Otherwise if gcc's optimizer is able to look far in but not overly far in, we wind up with "warning: 'key' may be used uninitialized in this function [-Wmaybe-uninitialized]". Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-08global: rename struct wireguard_ to struct wg_Jason A. Donenfeld
This required a bit of pruning of our christmas trees. Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-06global: rename include'd C files to be .cJason A. Donenfeld
This is done by 259 other files in the kernel tree: linux $ rg '#include.*\.c' -l | wc -l 259 Suggested-by: Sultan Alsawaf <sultanxda@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-02global: change BUG_ON to WARN_ONJason A. Donenfeld
Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-10-02global: prefix all functions with wg_Jason A. Donenfeld
I understand why this must be done, though I'm not so happy about having to do it. In some places, it puts us over 80 chars and we have to break lines up in further ugly ways. And in general, I think this makes things harder to read. Yet another thing we must do to please upstream. Maybe this can be replaced in the future by some kind of automatic module namespacing logic in the linker, or even combined with LTO and aggressive symbol stripping. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-20global: put SPDX identifier on its own lineJason A. Donenfeld
The kernel has very specific rules correlating file type with comment type, and also SPDX identifiers can't be merged with other comments. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-20allowedips: change from BUG_ON to WARN_ONJason A. Donenfeld
This is never going to hit anyway, and if it does, it's a development problem that will be caught with the selftests anyway. So don't make Andrew Lunn upset, and just change it to a WARN_ON. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-16global: remove non-essential inline annotationsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-04global: always find OOM unlikelyJason A. Donenfeld
Suggested-by: Sultan Alsawaf <sultanxda@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-09-03global: satisfy check_patch.pl errorsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-28global: run through clang-formatJason A. Donenfeld
This is the worst commit in the whole repo, making the code much less readable, but so it goes with upstream maintainers. We are now woefully wrapped at 80 columns. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-03allowedips: use different macro names so as to avoid confusionJason A. Donenfeld
A mailing list interlocutor argues that sharing the same macro name might lead to errors down the road. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-01allowedips: free root inside of RCU callbackJason A. Donenfeld
This reduces the amount of call_rcu invocations considerably. Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-01allowedips: avoid window of disappeared peerJann Horn
If a peer is removed, it's possible for a lookup to momentarily return NULL, resulting in needless -ENOKEY returns. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-08-01allowedips: prevent double read in krefJason A. Donenfeld
Blocks like: if (node_placement(*trie, key, cidr, bits, &node, lock)) { node->peer = peer; return 0; } May result in a double read when adjusting the refcount, in the highly unlikely case of LTO and an overly smart compiler. While we're at it, replace rcu_assign_pointer(X, NULL); with RCU_INIT_POINTER. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-07-31peer: simplify rcu reference countsJason A. Donenfeld
Use RCU reference counts only when we must, and otherwise use a more reasonably named function. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-22allowedips: set pointer to null before freeingJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-10allowedips: simplify arithmeticJason A. Donenfeld
Suggested-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-10allowedips: produce better assembly with unsigned arithmeticJason A. Donenfeld
Suggested-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-05-10allowedips: use native endian on lookupJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-02-21allowedips: fix comment styleJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-02-14allowedips: indicate to clang-analyzer that trie is non-nullJason A. Donenfeld
We check it in the block just above the only call to node_placement, so we're certain this is the case. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2018-01-03global: year bumpJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-12-09global: add SPDX tags to all filesGreg Kroah-Hartman
It's good to have SPDX identifiers in all files as the Linux kernel developers are working to add these identifiers to all files. Update all files with the correct SPDX license identifier based on the license text of the project or based on the license in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Modified-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-25allowedips: simplifyJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-25allowedips: optimizeJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-13allowedips: do not write out of boundsJason A. Donenfeld
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2017-11-10allowedips: rename from routingtableJason A. Donenfeld
Makes it more clear that this _not_ a routing table replacement. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>