summaryrefslogtreecommitdiff
path: root/nest
AgeCommit message (Collapse)Author
2022-12-10CLI: Fix for long-lived sessions during high loadsOndrej Zajicek
When there is a continuos stream of CLI commands, cli_get_command() always returns 1 (there is a new command). Anyway, the socket receive buffer was reset only when there was no command at all, leading to a strange behavior: after a while, the CLI receive buffer came to its end, then read() was called with zero size buffer, it returned 0 which was interpreted as EOF. The patch fixes that by resetting the buffer position after each command and moving remaining data at the beginning of buffer. Thanks to Maria Matejka for examining the bug and for the original bugfix.
2022-12-06Nest: Avoid spurious announcements triggered by filtered routesOndrej Zajicek
When filtered routes (enabled by 'import keep filtered' option) are updated, they trigger announcements by rte_announce(). For regular channels (e.g. type RA_OPTIMAL or RA_ANY) such announcement is just ignored, but in case of RA_ACCEPTED (BGP peer with 'secondary' option) it just reannounces the old (and still valid) best route. The patch ensures that such no-change is ignored even for these channels.
2022-11-01Moved config-related allocations to config_pool and showing its size in ↵Maria Matejka
memory usage
2022-10-18Doc: Add documentation for "show route (import|export) table"Alexander Zubkov
2022-10-10BGP: Add option 'next hop prefer global'Ondrej Zajicek
Add BGP channel option 'next hop prefer global' that modifies BGP recursive next hop resolution to use global next hop IPv6 address instead of link-local next hop IPv6 address for immediate next hop of received routes.
2022-10-03Nest: Add channel config flag to distinguish new or copyOndrej Zajicek
It is useful to distinguish whehter channel config returned from channel_config_get() was allocated new, or existing from template. Caller may want to initialize new ones.
2022-10-03Filter: Add some minor functions for f_tree and ECOndrej Zajicek
Add some supportive functions for f_tree and EC. These functions are used by L3VPN code.
2022-10-03BGP: Do not assume that all channels are struct bgp_channelOndrej Zajicek
In principle, the channel list is a list of parent struct proto and can contain general structures of type struct channel, That is useful e.g. for adding MPLS channels to BGP.
2022-09-06Better profylaction recursive route loopsMaria Matejka
In some specific configurations, it was possible to send BIRD into an infinite loop of recursive next hop resolution. This was caused by route priority inversion. To prevent priority inversions affecting other next hops, we simply refuse to resolve any next hop if the best route for the matching prefix is recursive or any other route with the same preference is recursive. Next hop resolution doesn't change route priority, therefore it is perfectly OK to resolve BGP next hops e.g. by an OSPF route, yet if the same (or covering) prefix is also announced by iBGP, by retraction of the OSPF route we would get a possible priority inversion.
2022-08-18Simplified the protocol hookup code in MakefilesMaria Matejka
2022-07-22Revert "Export table: Delay freeing of old stored route."Maria Matejka
This reverts commit cee0cd148c9b71bf47d007c850193b5fbf9486c1. This change is not needed in version 2 and the surrounding code has disappeared mostly in version 3.
2022-07-11Added forgotten route source locking in flowspec validationMaria Matejka
2022-07-11Merge remote-tracking branch 'origin/master' into backportMaria Matejka
2022-07-11Merge commit 'beb5f78a' into backportMaria Matejka
2022-07-10Merge version 2.0.10 into backportMaria Matejka
2022-06-27Filter: Implement for loopsOndrej Zajicek (work)
For loops allow to iterate over elements in compound data like BGP paths or community lists. The syntax is: for [ <type> ] <variable> in <expr> do <command-body>
2022-06-27Filter: Improve handling of stack frames in filter bytecodeOndrej Zajicek (work)
When f_line is done, we have to pop the stack frame. The old code just removed nominal number of args/vars. Change it to use stored ventry value modified by number of returned values. This allows to allocate variables on a stack frame during execution of f_lines instead of just at start. But we need to know the number of returned values for a f_line. It is 1 for term, 0 for cmd. Store that to f_line during linearization.
2022-06-27Nest: Cleanups in as_path_filter()Ondrej Zajicek (work)
Use struct f_val as a common argument for as_path_filter(), as suggested by Alexander Zubkov. That allows to use NULL sets as valid arguments.
2022-06-27Preexport callback now takes the channel instead of protocol as argumentMaria Matejka
Passing protocol to preexport was in fact a historical relic from the old times when channels weren't a thing. Refactoring that to match current extensibility needs.
2022-06-04Nest: Improve GC strategy for rtablesOndrej Zajicek
Use timer (configurable as 'gc period') to schedule routing table GC/pruning to ensure that prune is done on time but not too often. Randomize GC timers to avoid concentration of GC events from different tables in one loop cycle. Fix a bug that caused minimum inter-GC interval be 5 us instead of 5 s. Make default 'gc period' adaptive based on number of routing tables, from 10 s for small setups to 600 s for large ones. In marge multi-table RS setup, the patch improved time of flushing a downed peer from 20-30 min to <2 min and removed 40s latencies.
2022-05-30Merge remote-tracking branch 'origin/master' into haugesund-to-2.0Maria Matejka
2022-05-15BGP: Improve tx performance during feed/flushOndrej Zajicek
The prefix hash table in BGP used the same hash function as the rtable. When a batch of routes are exported during feed/flush to the BGP, they all have similar hash values, so they are all crowded in a few slots in the BGP prefix table (which is much smaller - around the size of the batch - and uses higher bits from hash values), making it much slower due to excessive collisions. Use a different hash function to avoid this. Also, increase the batch size to fill 4k BGP packets and increase minimum BGP bucket and prefix hash sizes to avoid back and forth resizing during flushes. This leads to order of magnitude faster flushes (on my test data).
2022-04-13RIP: fixed the EA_RIP_FROM attributeMaria Matejka
The interface pointer was improperly converted to u32 and back. Fixing this by explicitly allocating an adata structure for it. It's not so memory efficient, we'll optimize this later.
2022-04-06All linpools use pages to allocate regular blocksMaria Matejka
2022-04-06Protocols have their own explicit init routinesMaria Matejka
2022-04-06Unsetting route attributes without messing with type systemMaria Matejka
2022-04-06Eattr flags (originated and fresh) get their own struct fieldsMaria Matejka
2022-04-06Minor fix: f_val literals should always have named struct fieldsMaria Matejka
2022-04-06Slab allocator can free the blocks without knowing the parent structureMaria Matejka
2022-03-15Printf variant with a result allocated inside a pool / linpoolMaria Matejka
2022-03-09Merge commit '60880b539b8886f76961125d89a265c6e1112b7a' into haugesundMaria Matejka
2022-03-09BGP Flowspec validation: Removed in-route optimization for multithreading ↵Maria Matejka
compatibility
2022-03-09Merge commit 'e42eedb9' into haugesundMaria Matejka
2022-03-09Merge commit '5cff1d5f' into haugesundMaria Matejka
Conflicts: proto/bgp/attrs.c proto/pipe/pipe.c
2022-03-09Merge commit 'd5a32563' into haugesundMaria Matejka
2022-03-09Fixed resource initialization in unit testsMaria Matejka
2022-03-09Single-threaded version of sark-branch memory page managementMaria Matejka
2022-03-02Replaced custom linpools in tests for the common tmp_linpoolMaria Matejka
2022-02-06Merge branch 'oz-trie-table'Ondrej Zajicek (work)
2022-02-06Nest: Implement locking of prefix tries during walksOndrej Zajicek (work)
The prune loop may may rebuild the prefix trie and therefore invalidate walk state for asynchronous walks (used in 'show route in' cmd). Fix it by adding locking that keeps the old trie in memory until current walks are done. In future this could be improved by rebuilding trie walk states (by lookup for last found prefix) after the prefix trie rebuild.
2022-02-06Nest: Implement prefix trie pruningOndrej Zajicek (work)
When rtable is pruned and network fib nodes are removed, we also need to prune prefix trie. Unfortunately, rebuilding prefix trie takes long time (got about 400 ms for 1M networks), so must not be atomic, we have to rebuild a new trie while current one is still active. That may require some considerable amount of temporary memory, so we do that only if we expect significant trie size reduction.
2022-02-06BGP: Implement flowspec validation procedureOndrej Zajicek (work)
Implement flowspec validation procedure as described in RFC 8955 sec. 6 and RFC 9117. The Validation procedure enforces that only routers in the forwarding path for a network can originate flowspec rules for that network. The patch adds new mechanism for tracking inter-table dependencies, which is necessary as the flowspec validation depends on IP routes, and flowspec rules must be revalidated when best IP routes change. The validation procedure is disabled by default and requires that relevant IP table uses trie, as it uses interval queries for subnets.
2022-02-06Nest: Add routing table configuration blocksOndrej Zajicek (work)
Allow to specify sorted flag, trie fla, and min/max settle time. Also do not enable trie by default, it must be explicitly enabled.
2022-02-06Nest: Add convenience functions to check rtable net typeOndrej Zajicek (work)
2022-02-06Nest: Avoid unnecessary net_format() in 'show route' commandOndrej Zajicek (work)
When output of 'show route' command was generated, the net_format() was called for each network prematurely, even if the result was not needed. Fix the code to call net_format() only when needed. This makes queries that process many networks but show only few (e.g. 'show route where ..', or 'show route count') much faster (like 5x - 10x faster).
2022-02-06Nest: Add trie iteration code to 'show route'Ondrej Zajicek (work)
Add trie iteration code to rt_show_cont() CLI hook and use it to accelerate 'show route in <addr>' commands using interval queries.
2022-02-06Nest: Implement 'show route in <addr>' commandOndrej Zajicek (work)
Implement 'show route in <addr>' command, which shows all routes in networks that are subnets of given network. Currently limited to IP network types.
2022-02-06Nest: Attach prefix trie to rtable for faster LPM and interval queriesOndrej Zajicek (work)
Attach a prefix trie to IP/VPN/ROA tables. Use it for net_route() and net_roa_check(). This leads to 3-5x speedups for IPv4 and 5-10x speedup for IPv6 of these calls. TODO: - Rebuild the trie during rt_prune_table() - Better way to avoid trie_add_prefix() in net_get() for existing tables - Make it configurable (?)
2021-12-28Filter: Add operators to find minimum and maximum element of setsAlexander Zubkov
Add operators .min and .max to find minumum or maximum element in sets of types: clist, eclist, lclist. Example usage: bgp_community.min bgp_ext_community.max filter(bgp_large_community, [(as1, as2, *)]).min Signed-off-by: Alexander Zubkov <green@qrator.net>
2021-12-18Nest: Do not ignore secondary flag changes in ifa updatesOndrej Zajicek (work)
Compare all IA_* flags that are set by sysdep iface code. The old code ignores IA_SECONDARY flag when comparing whether iface address updates from kernel changed anything. This is usually not an issue as kernel removes all secondary addresses due to removal of the primary one, but it breaks when sysctl 'promote_secondaries' is enabled and kernel promotes secondary addresses to primary ones. Thanks to 'Alexander' for the bugreport.