summaryrefslogtreecommitdiff
path: root/nest/rt-table.c
AgeCommit message (Collapse)Author
2022-06-27Preexport callback now takes the channel instead of protocol as argumentMaria Matejka
Passing protocol to preexport was in fact a historical relic from the old times when channels weren't a thing. Refactoring that to match current extensibility needs.
2022-06-04Nest: Improve GC strategy for rtablesOndrej Zajicek
Use timer (configurable as 'gc period') to schedule routing table GC/pruning to ensure that prune is done on time but not too often. Randomize GC timers to avoid concentration of GC events from different tables in one loop cycle. Fix a bug that caused minimum inter-GC interval be 5 us instead of 5 s. Make default 'gc period' adaptive based on number of routing tables, from 10 s for small setups to 600 s for large ones. In marge multi-table RS setup, the patch improved time of flushing a downed peer from 20-30 min to <2 min and removed 40s latencies.
2022-05-15BGP: Improve tx performance during feed/flushOndrej Zajicek
The prefix hash table in BGP used the same hash function as the rtable. When a batch of routes are exported during feed/flush to the BGP, they all have similar hash values, so they are all crowded in a few slots in the BGP prefix table (which is much smaller - around the size of the batch - and uses higher bits from hash values), making it much slower due to excessive collisions. Use a different hash function to avoid this. Also, increase the batch size to fill 4k BGP packets and increase minimum BGP bucket and prefix hash sizes to avoid back and forth resizing during flushes. This leads to order of magnitude faster flushes (on my test data).
2022-02-06Nest: Implement locking of prefix tries during walksOndrej Zajicek (work)
The prune loop may may rebuild the prefix trie and therefore invalidate walk state for asynchronous walks (used in 'show route in' cmd). Fix it by adding locking that keeps the old trie in memory until current walks are done. In future this could be improved by rebuilding trie walk states (by lookup for last found prefix) after the prefix trie rebuild.
2022-02-06Nest: Implement prefix trie pruningOndrej Zajicek (work)
When rtable is pruned and network fib nodes are removed, we also need to prune prefix trie. Unfortunately, rebuilding prefix trie takes long time (got about 400 ms for 1M networks), so must not be atomic, we have to rebuild a new trie while current one is still active. That may require some considerable amount of temporary memory, so we do that only if we expect significant trie size reduction.
2022-02-06BGP: Implement flowspec validation procedureOndrej Zajicek (work)
Implement flowspec validation procedure as described in RFC 8955 sec. 6 and RFC 9117. The Validation procedure enforces that only routers in the forwarding path for a network can originate flowspec rules for that network. The patch adds new mechanism for tracking inter-table dependencies, which is necessary as the flowspec validation depends on IP routes, and flowspec rules must be revalidated when best IP routes change. The validation procedure is disabled by default and requires that relevant IP table uses trie, as it uses interval queries for subnets.
2022-02-06Nest: Add routing table configuration blocksOndrej Zajicek (work)
Allow to specify sorted flag, trie fla, and min/max settle time. Also do not enable trie by default, it must be explicitly enabled.
2022-02-06Nest: Attach prefix trie to rtable for faster LPM and interval queriesOndrej Zajicek (work)
Attach a prefix trie to IP/VPN/ROA tables. Use it for net_route() and net_roa_check(). This leads to 3-5x speedups for IPv4 and 5-10x speedup for IPv6 of these calls. TODO: - Rebuild the trie during rt_prune_table() - Better way to avoid trie_add_prefix() in net_get() for existing tables - Make it configurable (?)
2021-06-14Nest: Fix export of tmpattrs through pipesOndrej Zajicek (work)
Pipes copy the original rte with old values, so they require rte to be exported with stored tmpattrs. Other protocols access stored attributes using eattr list, so they require rte to be exported with expanded tmpattrs. This is temporary hack, we plan to remove whoe tmpattr mechanism. Thanks to Paul Donohue for the bugreport.
2021-06-14Revert "Nest: Fix export of tmpattrs through pipes"Ondrej Zajicek (work)
This reverts commit f8e273b5e7a3c721f4a30cf27a0b4fe54602e83f.
2021-06-14Nest: Fix export of tmpattrs through pipesOndrej Zajicek (work)
In most cases of export there is no need to store back temporary attributes to rte, as receivers (protocols) access eattr list anyway. But pipe copies the original rte with old values, so we should store tmpattrs also during export. Thanks to Paul Donohue for the bugreport.
2021-04-19Internal route tables have a reduced cleanup routineMaria Matejka
This fixes an internal table cleanup bug introduced in ff397df7edcbe7a8abca5b419729b9c64c063847.
2021-03-30Routing table is now a resource allocated from its own poolMaria Matejka
This also fixes memory leaks from import/export tables being never cleaned up and freed.
2021-03-30Routing tables list iteration should use explicit node struct positionMaria Matejka
2021-02-10Nest: Automatic channel reloads based on RPKI changesOndrej Zajicek (work)
If there are roa_check() calls in channel filters, then the channel subscribes to ROA table notifications, which are sent when ROA tables are updated (subject to settle time) and trigger channel reload or refeed.
2020-12-29Nest: Read Babel metric as IGP metricJames Lu
(Minor syntactic changes by committer)
2020-12-07Nest: Per-channel debug flagsOndrej Zajicek (work)
The patch add support for per-channel debug flags, currently just 'states', 'routes', and 'filters'. Flag 'states' is used for channel state changes, remaining two for routes passed through the channel. The per-protocol debug flags 'routes'/'filters' still enable reporting of routes for all channels, to keep existing behavior. The patch causes minor changes in some log messages.
2020-11-15Nest: Fix crash in receive limit handling in import tableOndrej Zajicek (work)
Logging as a result of triggered receive limit in import table code accesset rte->net, which was not filed yet. Thanks to Pier Carlo Chiodi for the bugreport.
2020-07-16Nest: Keep route ordering during route updatesOndrej Zajicek (work)
Put new non-best routes to the end of list instead of the second position. Put updated routes to their old position. Position is changed just by best route selection.
2020-05-01Uninitialized list nodes fixesMaria Matejka
2020-03-26Filter: Optimize IPv4 prefix setsOndrej Zajicek (work)
Use separate IPv4 and IPv6 implementation of prefix sets. Just this change makes IPv4 prefix sets 60% smaller and 50% faster.
2020-01-07KRT: Improve syncer code to avoid using temporary data in rtableOndrej Zajicek (work)
The old code stored route verdicts and temporary routes directly in rtable. The new code do not store received routes (it immediately compares them with exported routes and resolves conflicts) and uses internal bitmap to keep track of which routes were received and which needs to be reinstalled. By not putting 'invalid' temporary routes to rtable, we keep rtable in consistent state, therefore scan no longer needs to be atomic operation and could be splitted to multiple events.
2019-11-26Nest: Use bitmaps to keep track of exported routesOndrej Zajicek (work)
Use a hierarchical bitmap in a routing table to assign ids to routes, and then use bitmaps (indexed by route id) in channels to keep track whether routes were exported. This avoids unreliable and inefficient re-evaluation of filters for old routes in order to determine whether they were exported.
2019-11-03Nest: Fix bug in export tableOndrej Zajicek (work)
For regular channels do not compare src in export table, as we want to keep here only the best (exported) route per network.
2019-10-10Nest: Handle non-MPLS on MPLS case in recursive route updateOndrej Zajicek (work)
When non-MPLS recursive route resolves to MPLS underlying route, then it should get MPLS labels from the the underlying route.
2019-10-10Nest: Handle PtP links in recursive route updateOndrej Zajicek (work)
Underlying (IGP) route may lead to PtP link, in this case it does not need gateway. Which is different than direct route without gateway. When recursive (BGP) route uses PtP route, it should not use recursive next hop as immediate next hop, while for direct routes it should.
2019-10-10Nest: Fix recursive route updateOndrej Zajicek (work)
Missing cleanup can lead to dangling pointer to old next hops.
2019-10-09BGP: AIGP metric support (RFC 7311)Ondrej Zajicek (work)
2019-09-24Nest: Fix bug in export tableOndrej Zajicek (work)
Exported route may be in modified state, we need to get cached one for rte_same() and rta_clone() to work properly.
2019-08-27Channel refeed with import table splitting between routes for one prefixMaria Matejka
2019-08-14BGP: implement Adj-RIB-OutOndrej Zajicek (work)
The patch implements optional internal export table to a channel and hooks it to BGP so it can be used as Adj-RIB-Out. When enabled, all exported (post-filtered) routes are stored there. An export table can be examined using e.g. 'show route export table bgp1.ipv4'.
2019-03-22Fixed one warning and one undefined value.Maria Matejka
2019-03-18Merge branch 'master' into HEADMaria Matejka
2019-03-14Nest: Update handling of temporary attributesOndrej Zajicek (work)
The temporary atttributes are no longer removed by ea_do_prune(), but they are undefined by store_tmp_attrs() protocol hooks. This fixes several bugs where temporary attributes were removed when they should not or not removed when they should be. The flag EAF_TEMP is no longer needed and was removed. Update all protocol make_tmp_attrs() / store_tmp_attrs() hooks to use helper functions and to handle unset attributes properly. Also fix some related bugs like improper handling of empty eattr list.
2019-03-06OSPF: Improved handling of tmpattrsOndrej Zajicek (work)
Keep track of whether OSPF tmpattrs are actually defined for given route (using flags in rte->pflags). That makes them behave more like real eattrs so a protocol can define just a subset of them or they can be undefined by filters. Do not set ospf_metric2 for other than type 2 external OSPF routes and do not set ospf_tag for non-external OSPF routes. That also fixes a bug where internal/inter-area route propagated from one OSPF instance to another is initiated with infinity ospf_metric2. Thanks to Yaroslav Dronskii for the bugreport.
2019-02-22Nest: Do not compare rte.flags during rte_update()Ondrej Zajicek (work)
Route flags are mosty internal state of rtable, they are not significant to whether a route has changed. With the old code, all routes received as a part of enhanced route refresh are always re-announced to other peers due to change in REF_STALE.
2019-02-20Conf: Symbol implementation converted from void pointers to unionMaria Matejka
... and consted some declarations.
2019-02-20Filter: merged filter instruction constructors, counting line size on ↵Maria Matejka
instruction construct
2019-02-20Filters: split the large filter.h file to smaller files.Maria Matejka
This should be revised, there are still ugly things in the filter API.
2019-02-19Nest: Prevent withdraws from propagation back to source protocol (for ↵Ondrej Zajicek (work)
accepted mode) Update for one of previous patches, handles the the issue for first-accepted mode of route propagation.
2019-02-09Merge remote-tracking branch 'origin/mq-opt'Ondrej Zajicek (work)
2019-02-05Nest: Improve export counter handlingOndrej Zajicek (work)
One of previous workarounds for phantom route avoidance breaks export counters by expanding sending of spurious withdraws, which are send when we are not sure whether we have advertised that routes in the past. If not, then export counter is decreased, but it was not increased before, so it overflows under zero. The patch fixes that by sendung spurious withdraws, but not counting them on export counter. That may lead to error in the other direction, but that happens only as a race condition (i.e., in normal operation filters return proper values about old route export state).
2019-02-02Nest: Reestablish preferred countersOndrej Zajicek (work)
2019-01-31Nest: Don't lookup net in table before filters are run.Maria Matejka
Using dummy net instead. This should help with performance on rejected routes.
2019-01-30Nest: Prevent withdraws from propagation back to source protocolOndrej Zajicek (work)
The earlier fix loosen conditions for not running filters on old route when deciding about route propagation to a protocol to avoid issues with ghost routes in some race conditions. Unfortunately, the fix also caused back-propagation of withdraws. For regular updates, back-propagation is prevented in import_control hooks, but these are not called on withdraws. For them, import_control hooks are called on old routes instead, changing (old, NULL) notification to (NULL, NULL), which is ignored. By not calling export processing in some cases, the withdraw is not ignored and is back-propagated. This patch fixes that by contract conditions so the earlier fix is not applied to back-propagated updates.
2019-01-17Nest: Don't make tmp_attr before preexport is calledJan Maria Matejka
2018-12-12Nest: Update statistics and rx-limit for Adj-RIB-InOndrej Zajicek (work)
2018-12-12BGP: implement Adj-RIB-InOndrej Zajicek (work)
The patch implements optional internal import table to a channel and hooks it to BGP so it can be used as Adj-RIB-In. When enabled, all received (pre-filtered) routes are stored there and import filters can be re-evaluated without explicit route refresh. An import table can be examined using e.g. 'show route import table bgp1.ipv4'.
2018-12-04Terminology cleanup: The import_control hook is now called preexport.Jan Maria Matejka
Once upon a time, far far away, there were the old Bird developers discussing what direction of route flow shall be called import and export. They decided to say "import to protocol" and "export to table" when speaking about a protocol. When speaking about a table, they spoke about "importing to table" and "exporting to protocol". The latter terminology was adopted in configuration, then also the bird CLI in commit ea2ae6dd0 started to use it (in year 2009). Now it's 2018 and the terminology is the latter. Import is from protocol to table, export is from table to protocol. Anyway, there was still an import_control hook which executed right before route export. One thing is funny. There are two commits in April 1999 with just two minutes between them. The older announces the final settlement on config terminology, the newer uses the other definition. Let's see their commit messages as the git-log tool shows them (the newer first): commit 9e0e485e50ea74c4f1c5cb65bdfe6ce819c2cee2 Author: Martin Mares <mj@ucw.cz> Date: Mon Apr 5 20:17:59 1999 +0000 Added some new protocol hooks (look at the comments for better explanation): make_tmp_attrs Convert inline attributes to ea_list store_tmp_attrs Convert ea_list to inline attributes import_control Pre-import decisions commit 5056c559c4eb253a4eee10cf35b694faec5265eb Author: Martin Mares <mj@ucw.cz> Date: Mon Apr 5 20:15:31 1999 +0000 Changed syntax of attaching filters to protocols to hopefully the final version: EXPORT <filter-spec> for outbound routes (i.e., those announced by BIRD to the rest of the world). IMPORT <filter-spec> for inbound routes (i.e., those imported by BIRD from the rest of the world). where <filter-spec> is one of: ALL pass all routes NONE drop all routes FILTER <name> use named filter FILTER { <filter> } use explicitly defined filter For all protocols, the default is IMPORT ALL, EXPORT NONE. This includes the kernel protocol, so that you need to add EXPORT ALL to get the previous configuration of kernel syncer (as usually, see doc/bird.conf.example for a bird.conf example :)). Let's say RIP to this almost 19-years-old inconsistency. For now, if you import a route, it is always from protocol to table. If you export a route, it is always from table to protocol. And they lived happily ever after.
2018-11-20The MRT protocolOndrej Zajicek (work)
The new MRT protocol is responsible for periodic RIB table dumps in the MRT format (RFC 6396). Also the existing code for BGP4MP MRT dumps is refactored and splitted between BGP to MRT protocols, will be more integrated into MRT in the future. Example: protocol mrt { table "*"; filename "%N_%F_%T.mrt"; period 60; } It is partially based on the old MRT code from Pavel Tvrdik.