summaryrefslogtreecommitdiff
path: root/sysdep/linux/netlink.c
AgeCommit message (Collapse)Author
2022-08-05Merge commit '082905a8' into thread-nextMaria Matejka
2022-08-05Merge commit '534d0a4b' into thread-nextMaria Matejka
2022-07-27Merge branch 'master' into backportOndrej Zajicek
2022-07-26Netlink: Restrict route replace for IPv6Ondrej Zajicek
Seems like the previous patch was too optimistic, as route replace is still broken even in Linux 4.19 LTS (but fixed in Linux 5.10 LTS) for: ip route add 2001:db8::/32 via fe80::1 dev eth0 ip route replace 2001:db8::/32 dev eth0 It ends with two routes instead of just the second. The issue is limited to direct and special type (e.g. unreachable) routes, the patch restricts route replace for cases when the new route is a regular route (with a next hop address).
2022-07-25Netlink: Simplify handling of IPv6 ECMP routesOndrej Zajicek
When IPv6 ECMP support first appeared in Linux kernel, it used different API than IPv4 ECMP. Individual next hops were updated and announced separately, instead of using RTA_MULTIPATH as in IPv4. This has several drawbacks and requires complex code to merge received notifications to one multipath route. When Linux came with IPv6 RTA_MULTIPATH support, the initial versions were somewhat buggy, so we kept using the old API for updates (splitting multipath routes to sequences of route updates), while accepting both old-style routes and RTA_MULTIPATH routes in scans / notifications. As IPv6 RTA_MULTIPATH support is here for a long time, this patch fully switches Netlink to the IPv6 RTA_MULTIPATH API and removes old complex code for handling individual next hop announces. The required Linux version is at least 4.11 for reliable operation. Thanks to Daniel Gröber for the original patch.
2022-07-24KRT: Scan routing tables separetely on linux to avoid congestionOndrej Zajicek
Remove compile-time sysdep option CONFIG_ALL_TABLES_AT_ONCE, replace it with runtime ability to run either separate table scans or shared scan. On Linux, use separate table scans by default when the netlink socket option NETLINK_GET_STRICT_CHK is available, but retreat to shared scan when it fails. Running separate table scans has advantages where some routing tables are managed independently, e.g. when multiple routing daemons are running on the same machine, as kernel routing table modification performance is significantly reduced when the table is modified while it is being scanned. Thanks Daniel Gröber for the original patch and Toke Høiland-Jørgensen for suggestions.
2022-07-11Dropped the internal kernel protocol table for learnt routes.Maria Matejka
The learnt routes are now pushed all into the connected table, not only the best one. This shouldn't do any damage in well managed setups, yet it should be noted that it is a change of behavior. If anybody misses a feature which they implemented by misusing this internal learn table, let us know, we'll consider implementing it in a better way.
2022-06-08Merge commit '938742decc6e1d6d3a0375dd012b75172e747bbc' into haugesundMaria Matejka
2022-06-08Merge commit '950775f6fa3d569a9d7cd05e33538d35e895d688' into haugesundMaria Matejka
There were quite a lot of conflicts in flowspec validation code which ultimately led to some code being a bit rewritten, not only adapted from this or that branch, yet it is still in a limit of a merge.
2022-05-30Merge commit 'f15f2fcee7eeb5a100bd204a0e67018e25953420' into haugesundMaria Matejka
2022-05-30Merge commit '1c30b689ddd032ef8000fb7836348a48ba3184ff' into haugesundMaria Matejka
2022-05-30Merge commit '702c04fbef222e802ca4dfac645dc75ede522db6' into haugesundMaria Matejka
2022-05-30Merge commit '17f91f9e6e70f7e3f29502e854823c0d48571eaa' into haugesundMaria Matejka
2022-05-30Merge commit 'cf07d8ad79273a3bbf0617c17e438602e4b64ece' into haugesundMaria Matejka
2022-05-30Merge commit 'ef6a903e6f44b467f9606018446095521ad01ef1' into haugesundMaria Matejka
2022-05-30Merge commit '0d0f6554a5c233bf2bf830ae319191c4b1808d49' into haugesundMaria Matejka
2022-05-30Merge commit '80272d4b64a38ee6f04a1c4e8566cac3a2293176' into haugesundMaria Matejka
2022-05-30Squashing the route attribute structure into one level.Maria Matejka
For now, all route attributes are stored as eattrs in ea_list. This should make route manipulation easier and it also allows for a layered approach of route attributes where updates from filters will be stored as an overlay over the previous version.
2022-05-30Route destination field merged with nexthop attribute; splitting flowspec ↵Maria Matejka
validation result out. As there is either a nexthop or another destination specification (or othing in case of ROAs and Flowspec), it may be merged together. This code is somehow quirky and should be replaced in future by better implementation of nexthop. Also flowspec validation result has its own attribute now as it doesn't have anything to do with route nexthop.
2022-05-26Moved nexthop from struct rta to extended attribute.Maria Matejka
This doesn't do anything more than to put the whole structure inside adata. The overall performance is certainly going downhill; we'll optimize this later. Anyway, this is one of the latest items inside rta and in several commits we may drop rta completely and move to eattrs-only routes.
2022-05-04Moved route source attribute (RTS_*) to eattrsMaria Matejka
2022-05-04Removing the route scope attribute. Use custom attributes instead.Maria Matejka
The route scope attribute was used for simple user route marking. As there is a better tool for this (custom attributes), the old and limited way can be dropped.
2022-05-04Explicit definition structures of route attributesMaria Matejka
Changes in internal API: * Every route attribute must be defined as struct ea_class somewhere. * Registration of route attributes known at startup must be done by ea_register_init() from protocol build functions. * Every attribute has now its symbol registered in a global symbol table defined as SYM_ATTRIBUTE * All attribute ID's are dynamically allocated. * Attribute value custom formatting hook is defined in the ea_class. * Attribute names are the same for display and filters, always prefixed by protocol name. Also added some unit testing code for filters with route attributes.
2022-05-04Replaced boilerplate eattr allocation by ea_set_attr()Maria Matejka
2022-05-04Splitting route data structures out to libMaria Matejka
2022-05-04Unified attribute and filter typesMaria Matejka
This commit removes the EAF_TYPE_* namespace completely and also for route attributes, filter-based types T_* are used. This simplifies fetching and setting route attributes from filters. Also, there is now union bval which serves as an universal value holder instead of private unions held separately by eattr and filter code.
2022-05-04Filter: Bitfield eattrs reading / writing moved to filter codeMaria Matejka
Before this change, fetch-update-write and bitmasking was hardcoded in attribute access code cased by the attribute type. Several filter instructions are used to do it instead. As this is certainly going to be a little bit slower than before, the switch block in attribute access code should be completely removed in near future, helping with both performance and code cleanliness. The user interface should have stayed intact.
2022-03-09Merge commit '1b9189d5' into haugesundMaria Matejka
2022-03-09Merge commit 'e42eedb9' into haugesundMaria Matejka
2022-03-09Merge commit '5cff1d5f' into haugesundMaria Matejka
Conflicts: proto/bgp/attrs.c proto/pipe/pipe.c
2022-02-08Netlink: Minor cleanupOndrej Zajicek (work)
2022-01-17Netlink: Add option to specify netlink socket receive buffer sizeOndrej Zajicek (work)
Add option 'netlink rx buffer' to specify netlink socket receive buffer size. Uses SO_RCVBUFFORCE, so it can override rmem_max limit. Thanks to Trisha Biswas and Michal for the original patches.
2022-01-15Netlink: Add another workaround for older kernel headersOndrej Zajicek (work)
Unfortunately, SOL_NETLINK is both recently added and arch-dependent, so we cannot just define it.
2022-01-14Netlink: Add workaround for older kernel headersOndrej Zajicek (work)
2022-01-14Netlink: Enable strict checking for KRT dumpsOndrej Zajicek (work)
Add strict checking for netlink KRT dumps to avoid PMTU cache records from FNHE table dump along with KRT. Linux Kernel added FNHE table dump to the netlink API in patch: https://patchwork.ozlabs.org/project/netdev/patch/8d3b68cd37fb5fddc470904cdd6793fcf480c6c1.1561131177.git.sbrivio@redhat.com/ Therefore, since Linux 5.3 these route cache entries are dumped together with regular routes during periodic KRT scans, which in some cases may be huge amount of useless data. This can be avoided by using strict checking for netlink dumps: https://lore.kernel.org/netdev/20181008031644.15989-1-dsahern@kernel.org/ The patch mitigates the risk of receiving unknown and potentially large number of FNHE records that would block BIRD I/O in each sync. There is a known issue caused by the GRE tunnels on Linux that seems to be creating one FNHE record for each destination IP address that is routed through the tunnel, even when the PMTU equals to GRE interface MTU. Thanks to Tomas Hlavacek for the original patch.
2022-01-14Netlink: Explicitly skip received cloned routesOndrej Zajicek (work)
Kernel uses cloned routes to keep route cache entries, but reports them together with regular routes. They were skipped implicitly as they do not have rtm_protocol filled. Add explicit check for cloned flag and skip such routes explicitly. Also, improve debug logs of skipped routes.
2022-01-05Netlink: Do not ignore dead routes from BIRDOndrej Zajicek (work)
Currently, BIRD ignores dead routes to consider them absent. But it also ignores its own routes and thus it can not correctly manage such routes in some cases. This patch makes an exception for routes with proto bird when ignoring dead routes, so they can be properly updated or removed. Thanks to Alexander Zubkov for the original patch.
2022-01-05Netlink: Improve multipath parsing errorsOndrej Zajicek (work)
Function nl_parse_multipath() should handle errors internally.
2021-11-09Split route data structure to storage (ro) / manipulation (rw) structures.Maria Matejka
Routes are now allocated only when they are just to be inserted to the table. Updating a route needs a locally allocated route structure. Ownership of the attributes is also now not transfered from protocols to tables and vice versa but just borrowed which should be easier to handle in a multithreaded environment.
2021-10-13Kernel: Convert the rte-local attributes to extended attributes and flags to ↵Maria Matejka
pflags
2021-10-13Route: moved rte_src pointer from rta to rteMaria Matejka
It is an auxiliary key in the routing table, not a route attribute.
2021-10-13Dropping the RTS_DUMMY temporary route storage.Maria Matejka
Kernel route sync is done by other ways now and this code is not used currently.
2021-01-14Netlink: Ignore dead routesOndrej Zajicek (work)
With net.ipv4.conf.XXX.ignore_routes_with_linkdown sysctl, a user can ensure the kernel does not use a route whose target interface is down. Such route is marked with a 'dead' / RTNH_F_DEAD flag. Ignore these routes or multipath nexthops during scan. Thanks to Vincent Bernat for the original patch.
2021-01-06Kernel: Fix handling of krt_realm with ECMP routesOndrej Zajicek (work)
For ECMP routes, RTA_FLOW attribute must be set per-nexthop, not per-route. Our corresponding krt_realm attribute is per-route. Thanks to Mikhail Petrov for the bugreport.
2020-06-03Netlink: Fix parsing of MPLS multipath routesKazuki Yamaguchi
Add support for RTA_MULTIPATH attribute parsing for AF_MPLS routes. BIRD is capable of installing a multipath route into kernel on Linux, but it would not be seen because parsing fails. This made BIRD attempt to install the same route repeatedly. (The patch minorly updated by committer)
2020-05-01Nest: Added const to ea_show just to declare that this shouldn't really ↵Maria Matejka
change anything
2020-03-07Netlink: Handle interfaces with missing broadcast addressesOndrej Zajicek (work)
2019-12-19KRT: Remove KRF_SYNC_ERROR flagOndrej Zajicek (work)
This info is now stored in an internal bmap. Unfortunately, net.flags is still needed for temporary kernel data.
2019-11-12Netlink: Handle IPv4 routes with IPv6 nexthopsOndrej Zajicek
Accept RTA_VIA attribute in all cases. The old code always used RTA_GATEWAY for IPv4 / IPv6 and RTA_VIA for MPLS. The new code uses RTA_VIA in cases where AF of network and AF of nexthop differs.
2019-07-24Merge remote-tracking branch 'origin/mq-filter-stack'Ondrej Zajicek (work)