summaryrefslogtreecommitdiff
path: root/nest/rt-table.c
AgeCommit message (Collapse)Author
2023-10-16MPLS: Fix issue with recursive MPLS routesOndrej Zajicek
Recursive MPLS routes used hostentry from the original route, which triggered different table than MPLS table, and therefore were not updated.
2023-10-04MPLS: Handle label allocation failuresOndrej Zajicek
2023-10-04MPLS subsystemOndrej Zajicek
The MPLS subsystem manages MPLS labels and handles their allocation to MPLS-aware routing protocols. These labels are then attached to IP or VPN routes representing label switched paths -- LSPs. There was already a preliminary MPLS support consisting of MPLS label net_addr, MPLS routing tables with static MPLS routes, remote labels in next hops, and kernel protocol support. This patch adds the MPLS domain as a basic structure representing local label space with dynamic label allocator and configurable label ranges. To represent LSPs, allocated local labels can be attached as route attributes to IP or VPN routes with local labels as attributes. There are several steps for handling LSP routes in routing protocols -- deciding to which forwarding equivalence class (FEC) the LSP route belongs, allocating labels for new FECs, announcing MPLS routes for new FECs, attaching labels to LSP routes. The FEC map structure implements basic code for managing FECs in routing protocols, therefore existing protocols can be made MPLS-aware by adding FEC map and delegating most work related to local label management to it.
2023-10-02Nest: Expand rte_src.private_id to u64Ondrej Zajicek
In general, private_id is sparse and protocols may want to map some internal values directly into it. For example, L3VPN needs to map VPN route discriminators to private_id. OTOH, u32 is enough for global_id, as these identifiers are dense.
2023-09-26Basic route aggregationIgor Putovny
Add a new protocol offering route aggregation. User can specify list of route attributes in the configuration file and run route aggregation on the export side of the pipe protocol. Routes are sorted and for every group of equivalent routes new route is created and exported to the routing table. It is also possible to specify filter which will run for every route before aggregation. Furthermore, it will be possible to set attributes of new routes according to attributes of the aggregated routes. This is a work in progress. Original work by Igor Putovny, subsequent cleanups and finalization by Maria Matejka.
2023-09-12Conf: Symbol manipulation gets its context explicitlyMaria Matejka
2023-08-21Nest: Use generic rte_announce() also for import tablesOndrej Zajicek
Remove special rte_announce_in(), so we can use generic rte_announce() for bot feed and notifications.
2023-08-18BMP: Refactor route monitoringOndrej Zajicek
- Manage BMP state through bmp_peer, bmp_stream, bmp_table structures - Use channels and rt_notify() hook for route announcements - Add support for post-policy monitoring - Send End-of-RIB even when there is no routes - Remove rte_update_in_notify() hook from import tables - Update import tables to support channels - Add bmp_hack (no feed / no flush) flag to channels
2023-04-16BMP: Remove duplicate functions for update encodingOndrej Zajicek (work)
Use existing BGP functions also for BMP update encoding.
2023-04-16BMP protocol supportPawel Maslanka
Initial implementation of a basic subset of the BMP (BGP Monitoring Protocol, RFC 7854) from Akamai team. Submitted for further review and improvement.
2023-01-01Nest: Fix several issues with pflagsOndrej Zajicek
There were some confusion about validity and usage of pflags, which caused incorrect usage after some flags from (now removed) protocol- specific area were moved to pflags. We state that pflags: - Are secondary data used by protocol-specific hooks - Can be changed on an existing route (in contrast to copy-on-write for primary data) - Are irrelevant for propagation (not propagated when changed) - Are specific to a routing table (not propagated by pipe) The patch did these fixes: - Do not compare pflags in rte_same(), as they may keep cached values like BGP_REF_STALE, causing spurious propagation. - Initialize pflags to zero in rte_get_temp(), avoid initialization in protocol code, fixing at least two forgotten initializations (krt and one case in babel). - Improve documentation about pflags
2022-12-06Nest: Avoid spurious announcements triggered by filtered routesOndrej Zajicek
When filtered routes (enabled by 'import keep filtered' option) are updated, they trigger announcements by rte_announce(). For regular channels (e.g. type RA_OPTIMAL or RA_ANY) such announcement is just ignored, but in case of RA_ACCEPTED (BGP peer with 'secondary' option) it just reannounces the old (and still valid) best route. The patch ensures that such no-change is ignored even for these channels.
2022-10-10BGP: Add option 'next hop prefer global'Ondrej Zajicek
Add BGP channel option 'next hop prefer global' that modifies BGP recursive next hop resolution to use global next hop IPv6 address instead of link-local next hop IPv6 address for immediate next hop of received routes.
2022-09-06Better profylaction recursive route loopsMaria Matejka
In some specific configurations, it was possible to send BIRD into an infinite loop of recursive next hop resolution. This was caused by route priority inversion. To prevent priority inversions affecting other next hops, we simply refuse to resolve any next hop if the best route for the matching prefix is recursive or any other route with the same preference is recursive. Next hop resolution doesn't change route priority, therefore it is perfectly OK to resolve BGP next hops e.g. by an OSPF route, yet if the same (or covering) prefix is also announced by iBGP, by retraction of the OSPF route we would get a possible priority inversion.
2022-07-22Revert "Export table: Delay freeing of old stored route."Maria Matejka
This reverts commit cee0cd148c9b71bf47d007c850193b5fbf9486c1. This change is not needed in version 2 and the surrounding code has disappeared mostly in version 3.
2022-07-11Added forgotten route source locking in flowspec validationMaria Matejka
2022-07-11Merge commit 'beb5f78a' into backportMaria Matejka
2022-07-10Merge version 2.0.10 into backportMaria Matejka
2022-06-27Preexport callback now takes the channel instead of protocol as argumentMaria Matejka
Passing protocol to preexport was in fact a historical relic from the old times when channels weren't a thing. Refactoring that to match current extensibility needs.
2022-06-04Nest: Improve GC strategy for rtablesOndrej Zajicek
Use timer (configurable as 'gc period') to schedule routing table GC/pruning to ensure that prune is done on time but not too often. Randomize GC timers to avoid concentration of GC events from different tables in one loop cycle. Fix a bug that caused minimum inter-GC interval be 5 us instead of 5 s. Make default 'gc period' adaptive based on number of routing tables, from 10 s for small setups to 600 s for large ones. In marge multi-table RS setup, the patch improved time of flushing a downed peer from 20-30 min to <2 min and removed 40s latencies.
2022-05-30Merge remote-tracking branch 'origin/master' into haugesund-to-2.0Maria Matejka
2022-05-15BGP: Improve tx performance during feed/flushOndrej Zajicek
The prefix hash table in BGP used the same hash function as the rtable. When a batch of routes are exported during feed/flush to the BGP, they all have similar hash values, so they are all crowded in a few slots in the BGP prefix table (which is much smaller - around the size of the batch - and uses higher bits from hash values), making it much slower due to excessive collisions. Use a different hash function to avoid this. Also, increase the batch size to fill 4k BGP packets and increase minimum BGP bucket and prefix hash sizes to avoid back and forth resizing during flushes. This leads to order of magnitude faster flushes (on my test data).
2022-04-06All linpools use pages to allocate regular blocksMaria Matejka
2022-04-06Slab allocator can free the blocks without knowing the parent structureMaria Matejka
2022-03-15Printf variant with a result allocated inside a pool / linpoolMaria Matejka
2022-03-09Merge commit '60880b539b8886f76961125d89a265c6e1112b7a' into haugesundMaria Matejka
2022-03-09BGP Flowspec validation: Removed in-route optimization for multithreading ↵Maria Matejka
compatibility
2022-03-09Merge commit 'e42eedb9' into haugesundMaria Matejka
2022-03-09Merge commit '5cff1d5f' into haugesundMaria Matejka
Conflicts: proto/bgp/attrs.c proto/pipe/pipe.c
2022-03-09Merge commit 'd5a32563' into haugesundMaria Matejka
2022-02-06Nest: Implement locking of prefix tries during walksOndrej Zajicek (work)
The prune loop may may rebuild the prefix trie and therefore invalidate walk state for asynchronous walks (used in 'show route in' cmd). Fix it by adding locking that keeps the old trie in memory until current walks are done. In future this could be improved by rebuilding trie walk states (by lookup for last found prefix) after the prefix trie rebuild.
2022-02-06Nest: Implement prefix trie pruningOndrej Zajicek (work)
When rtable is pruned and network fib nodes are removed, we also need to prune prefix trie. Unfortunately, rebuilding prefix trie takes long time (got about 400 ms for 1M networks), so must not be atomic, we have to rebuild a new trie while current one is still active. That may require some considerable amount of temporary memory, so we do that only if we expect significant trie size reduction.
2022-02-06BGP: Implement flowspec validation procedureOndrej Zajicek (work)
Implement flowspec validation procedure as described in RFC 8955 sec. 6 and RFC 9117. The Validation procedure enforces that only routers in the forwarding path for a network can originate flowspec rules for that network. The patch adds new mechanism for tracking inter-table dependencies, which is necessary as the flowspec validation depends on IP routes, and flowspec rules must be revalidated when best IP routes change. The validation procedure is disabled by default and requires that relevant IP table uses trie, as it uses interval queries for subnets.
2022-02-06Nest: Add routing table configuration blocksOndrej Zajicek (work)
Allow to specify sorted flag, trie fla, and min/max settle time. Also do not enable trie by default, it must be explicitly enabled.
2022-02-06Nest: Attach prefix trie to rtable for faster LPM and interval queriesOndrej Zajicek (work)
Attach a prefix trie to IP/VPN/ROA tables. Use it for net_route() and net_roa_check(). This leads to 3-5x speedups for IPv4 and 5-10x speedup for IPv6 of these calls. TODO: - Rebuild the trie during rt_prune_table() - Better way to avoid trie_add_prefix() in net_get() for existing tables - Make it configurable (?)
2021-11-09Extended route trace: logging Path IdentifiersMaria Matejka
2021-10-13Dropping the unused rte_same hookMaria Matejka
2021-10-13Dropping rte-local dumper entriesMaria Matejka
2021-10-13Route: moved rte_src pointer from rta to rteMaria Matejka
It is an auxiliary key in the routing table, not a route attribute.
2021-10-13Preexport: No route modification, no linpool neededMaria Matejka
2021-10-13RIP fixup + dropping the tmp_attrs mechanism as obsoleteMaria Matejka
2021-10-13Dropping the RTS_DUMMY temporary route storage.Maria Matejka
Kernel route sync is done by other ways now and this code is not used currently.
2021-10-13Preference moved to RTA and set explicitly in protocolsMaria Matejka
2021-10-13Export table: Delay freeing of old stored route.Maria Matejka
This is needed to provide the protocols the full old route after filters when export table is enabled.
2021-10-13IGP metric getter refactoring to protocol callbackMaria Matejka
Direct protocol hooks for IGP metric inside nest/rt-table.c make the protocol API unnecessarily complex. Instead, we use a proper callback.
2021-06-14Nest: Fix export of tmpattrs through pipesOndrej Zajicek (work)
Pipes copy the original rte with old values, so they require rte to be exported with stored tmpattrs. Other protocols access stored attributes using eattr list, so they require rte to be exported with expanded tmpattrs. This is temporary hack, we plan to remove whoe tmpattr mechanism. Thanks to Paul Donohue for the bugreport.
2021-06-14Revert "Nest: Fix export of tmpattrs through pipes"Ondrej Zajicek (work)
This reverts commit f8e273b5e7a3c721f4a30cf27a0b4fe54602e83f.
2021-06-14Nest: Fix export of tmpattrs through pipesOndrej Zajicek (work)
In most cases of export there is no need to store back temporary attributes to rte, as receivers (protocols) access eattr list anyway. But pipe copies the original rte with old values, so we should store tmpattrs also during export. Thanks to Paul Donohue for the bugreport.
2021-04-19Internal route tables have a reduced cleanup routineMaria Matejka
This fixes an internal table cleanup bug introduced in ff397df7edcbe7a8abca5b419729b9c64c063847.
2021-03-30Routing table is now a resource allocated from its own poolMaria Matejka
This also fixes memory leaks from import/export tables being never cleaned up and freed.