  1. Feb 07, 2022
    • Lib: Update alignment of slabs · edc1a240
      Ondřej Zajíček authored
      Alignment of slabs should be at least sizeof(ptr) to avoid unaligned
      pointers in slab structures. FIXME: Use a proper way to choose alignment
      for internal allocators.
      edc1a240
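The alignment rule above can be illustrated with a minimal sketch. The helper names `demo_align_up` and `demo_slab_align` are hypothetical and are not taken from the BIRD sources; the sketch only assumes that alignments are powers of two.

```c
#include <stddef.h>

/* Hypothetical helper: round size up to a multiple of align.
 * Assumes align is a power of two. */
static size_t demo_align_up(size_t size, size_t align)
{
  return (size + align - 1) & ~(align - 1);
}

/* Hypothetical helper: pick an object alignment that is never smaller
 * than the size of a pointer, so pointers stored inside slab objects
 * stay aligned. */
static size_t demo_slab_align(size_t requested)
{
  size_t min = sizeof(void *);
  return (requested < min) ? min : requested;
}
```

With these helpers, an object of 5 bytes with a requested alignment of 4 would be padded to 8 bytes on both 32-bit and 64-bit platforms.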
  2. Feb 06, 2022
    • Merge branch 'oz-trie-table' · 53a25406
      Ondřej Zajíček authored
      53a25406
    • Trie: Fix trie format · 24600c64
      Ondřej Zajíček authored
      After switching to 16-way tries, trie format ignored unaligned / internal
      prefixes and only reported the primary prefix of a trie node.
      
      Fix trie format by showing internal prefixes based on the 'local' bitmask
      of a node. Also do basic (intra-node) reconstruction of prefix patterns
      by finding common subtrees in 'local' bitmask.
      
      In future, we could improve that by doing inter-node reconstruction, so
      prefixes entered as one pattern for a subtree (e.g. 192.168.0.0/18+)
      would be reported as such, like with aligned prefixes.
      24600c64
    • Nest: Implement locking of prefix tries during walks · 5a89edc6
      Ondřej Zajíček authored
      The prune loop may rebuild the prefix trie and therefore invalidate
      walk state for asynchronous walks (used in 'show route in' cmd). Fix it
      by adding locking that keeps the old trie in memory until current walks
      are done.
      
      In future this could be improved by rebuilding trie walk states (by
      lookup for last found prefix) after the prefix trie rebuild.
      5a89edc6
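The locking idea in this commit can be sketched as a simple walker count on the old trie. This is an illustrative model only, not the BIRD implementation: the structure `demo_trie` and all function names are hypothetical, and the `released` field stands in for actually freeing the memory.

```c
/* Hypothetical model of a trie that may be replaced while walks run. */
struct demo_trie {
  unsigned walkers;   /* number of asynchronous walks still using it */
  int retired;        /* set once a rebuilt trie has replaced this one */
  int released;       /* stands in for freeing the trie memory */
};

/* A walk takes a lock before it starts iterating. */
static void demo_trie_lock(struct demo_trie *t)
{
  t->walkers++;
}

/* The prune loop retires the old trie after building a new one.
 * The old trie is released at once only if nobody is walking it. */
static void demo_trie_retire(struct demo_trie *t)
{
  t->retired = 1;
  if (t->walkers == 0)
    t->released = 1;
}

/* The last walk to finish releases a retired trie. */
static void demo_trie_unlock(struct demo_trie *t)
{
  if (--t->walkers == 0 && t->retired)
    t->released = 1;
}
```

The design choice here is that the prune loop never waits for walks; it only marks the old trie, and the cost of cleanup moves to whichever walk finishes last.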
    • Nest: Implement prefix trie pruning · de6318f7
      Ondřej Zajíček authored
      When rtable is pruned and network fib nodes are removed, we also need to
      prune the prefix trie. Unfortunately, rebuilding the prefix trie takes a
      long time (about 400 ms for 1M networks), so it must not be atomic: we
      have to build a new trie while the current one is still active. That may
      require a considerable amount of temporary memory, so we do that only if
      we expect a significant trie size reduction.
      de6318f7
    • Trie: Add prefix counter · ba5aec94
      Ondřej Zajíček authored
      Add counter of prefixes stored in trie. Works only for 'restricted' tries
      composed of explicit prefixes (pxlen == l == h), like ones used in rtables.
      ba5aec94
    • Doc: Describe routing table options · d0f9a77f
      Ondřej Zajíček authored
      d0f9a77f
    • BGP: Implement flowspec validation procedure · 1f2eb2ac
      Ondřej Zajíček authored
      Implement flowspec validation procedure as described in RFC 8955 sec. 6
      and RFC 9117. The Validation procedure enforces that only routers in the
      forwarding path for a network can originate flowspec rules for that
      network.
      
      The patch adds new mechanism for tracking inter-table dependencies, which
      is necessary as the flowspec validation depends on IP routes, and flowspec
      rules must be revalidated when best IP routes change.
      
      The validation procedure is disabled by default and requires that
      relevant IP table uses trie, as it uses interval queries for subnets.
      1f2eb2ac
    • Nest: Add routing table configuration blocks · 1ae42e52
      Ondřej Zajíček authored
      Allow specifying the sorted flag, trie flag, and min/max settle time.
      
      Also do not enable trie by default, it must be explicitly enabled.
      1ae42e52
    • Nest: Avoid unnecessary net_format() in 'show route' command · 61375bd0
      Ondřej Zajíček authored
      When output of 'show route' command was generated, the net_format() was
      called for each network prematurely, even if the result was not needed.
      
      Fix the code to call net_format() only when needed. This makes queries
      that process many networks but show only few (e.g. 'show route where ..',
      or 'show route count') much faster (like 5x - 10x faster).
      61375bd0
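The optimization described in this commit amounts to filtering before formatting. The sketch below is a simplified model, not BIRD code: `demo_net`, `demo_net_format()` and `demo_show()` are hypothetical names, the filter is a plain minimum prefix length, and the `format_calls` counter exists only to make the saved work visible.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical IPv4 network record. */
struct demo_net { uint32_t addr; int pxlen; };

/* Counts how often the (comparatively expensive) formatter runs. */
static unsigned format_calls;

/* Hypothetical stand-in for a formatter like net_format(). */
static void demo_net_format(char *buf, size_t len, const struct demo_net *n)
{
  format_calls++;
  snprintf(buf, len, "%u.%u.%u.%u/%d",
           (unsigned) (n->addr >> 24), (unsigned) ((n->addr >> 16) & 0xff),
           (unsigned) ((n->addr >> 8) & 0xff), (unsigned) (n->addr & 0xff),
           n->pxlen);
}

/* Apply the filter first and format only the networks that are shown. */
static size_t demo_show(const struct demo_net *nets, size_t count,
                        int min_pxlen, char out[][32], size_t out_cap)
{
  size_t shown = 0;
  for (size_t i = 0; i < count && shown < out_cap; i++) {
    if (nets[i].pxlen < min_pxlen)
      continue;                       /* rejected: no formatting needed */
    demo_net_format(out[shown], 32, &nets[i]);
    shown++;
  }
  return shown;
}
```

When most networks are rejected by the filter, the formatter runs only for the few that remain, which is the effect the commit message attributes the speedup to.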
    • Nest: Add trie iteration code to 'show route' · 9ac16df3
      Ondřej Zajíček authored
      Add trie iteration code to rt_show_cont() CLI hook and use it to
      accelerate 'show route in <addr>' commands using interval queries.
      9ac16df3
    • Nest: Implement 'show route in <addr>' command · ea97b890
      Ondřej Zajíček authored
      Implement 'show route in <addr>' command, which shows all routes in
      networks that are subnets of given network. Currently limited to IP
      network types.
      ea97b890
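The subnet test behind 'show route in <addr>' can be sketched for IPv4 as follows. This is a minimal illustration with hypothetical function names; it does not reproduce BIRD's own network-handling routines and covers IPv4 only.

```c
#include <stdint.h>

/* Hypothetical helper: IPv4 netmask for a prefix length of 0..32. */
static uint32_t demo_ip4_mask(int pxlen)
{
  return pxlen ? (~(uint32_t) 0 << (32 - pxlen)) : 0;
}

/* Returns 1 if net/pxlen is a subnet of (or equal to) super/super_pxlen:
 * it must be at least as specific and agree on all bits covered by the
 * mask of the enclosing network. */
static int demo_ip4_net_in(uint32_t net, int pxlen,
                           uint32_t super, int super_pxlen)
{
  return pxlen >= super_pxlen &&
         ((net ^ super) & demo_ip4_mask(super_pxlen)) == 0;
}
```

For example, 192.168.1.0/24 lies in 192.168.0.0/16, while 192.168.0.0/16 does not lie in 192.168.1.0/24 because it is less specific.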
    • Nest: Attach prefix trie to rtable for faster LPM and interval queries · 836a87b8
      Ondřej Zajíček authored
      Attach a prefix trie to IP/VPN/ROA tables. Use it for net_route() and
      net_roa_check(). This leads to 3-5x speedups for IPv4 and 5-10x
      speedup for IPv6 of these calls.
      
      TODO:
       - Rebuild the trie during rt_prune_table()
       - Better way to avoid trie_add_prefix() in net_get() for existing tables
       - Make it configurable (?)
      836a87b8
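To show why a prefix trie helps longest-prefix-match (LPM) queries, here is a minimal sketch of a one-bit-per-level IPv4 trie. It is deliberately simpler than BIRD's trie (which the log above describes as 16-way), and all names in it are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical binary trie node: one level per address bit. */
struct demo_node {
  struct demo_node *child[2];
  int has_prefix;              /* a prefix ends exactly at this node */
};

static struct demo_node *demo_node_new(void)
{
  return calloc(1, sizeof(struct demo_node));
}

/* Insert addr/pxlen; returns 0 on success, -1 on allocation failure. */
static int demo_trie_add(struct demo_node *root, uint32_t addr, int pxlen)
{
  struct demo_node *n = root;
  for (int i = 0; i < pxlen; i++) {
    int bit = (addr >> (31 - i)) & 1;
    if (!n->child[bit] && !(n->child[bit] = demo_node_new()))
      return -1;
    n = n->child[bit];
  }
  n->has_prefix = 1;
  return 0;
}

/* Return the length of the longest stored prefix covering addr,
 * or -1 if no stored prefix covers it. */
static int demo_trie_lpm(const struct demo_node *root, uint32_t addr)
{
  int best = -1;
  const struct demo_node *n = root;
  for (int i = 0; n; i++) {
    if (n->has_prefix)
      best = i;                /* node at depth i is a prefix of length i */
    if (i == 32)
      break;
    n = n->child[(addr >> (31 - i)) & 1];
  }
  return best;
}
```

A single walk from the root finds the answer in at most 32 steps, instead of probing a hash table once per candidate prefix length.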
  3. Jan 28, 2022
    • BGP: Make routing loops silent · 4c6ee53f
      Ondřej Zajíček authored
      One of the previous commits added error logging of invalid routes. This
      also inadvertently caused error logging of route loops, which should
      be ignored silently. Fix that.
      4c6ee53f
    • BGP: Use proper class in attribute error messages · 963b2c7c
      Ondřej Zajíček authored
      Most error messages in attribute processing are in rx/decode step and
      these use L_REMOTE log class. But there are a few that are in tx/export
      step and these should use L_ERR log class.
      
      Use tx-specific macro (REJECT()) in tx/export code and rename field
      err_withdraw to err_reject in struct bgp_export_state to ensure that
      appropriate error reporting macros are called in proper contexts.
      963b2c7c
    • BGP: Improve 'invalid next hop' error reporting · 75d01ecc
      Ondřej Zajíček authored
      Distinguish multiple causes of 'invalid next hop' message and report
      the relevant next hop address.
      
      Thanks to Simon Ruderich for the original patch.
      75d01ecc
  4. Jan 24, 2022
  5. Jan 17, 2022
  6. Jan 15, 2022
  7. Jan 14, 2022
  8. Jan 09, 2022
    • BGP: Add option 'free bind' · 60e9def9
      Ondřej Zajíček authored
      The BGP 'free bind' option applies the IP_FREEBIND/IPV6_FREEBIND
      socket option for the BGP listening socket.
      
      Thanks to Alexander Zubkov for the idea.
      60e9def9
  9. Jan 08, 2022
    • IO: Support nonlocal bind in socket interface · 87a02489
      Alexander Zubkov authored and Ondřej Zajíček committed
      Add option to socket interface for nonlocal binding, i.e. binding to an
      IP address that is not present on interfaces. This behaviour is enabled
      when SKF_FREEBIND socket flag is set. For Linux systems, it is
      implemented by IP_FREEBIND socket flag.
      
      Minor changes done by committer.
      87a02489
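On Linux, nonlocal binding of this kind is requested with `setsockopt()` before calling `bind()`. The sketch below assumes a Linux system with glibc headers; the wrapper name `demo_set_freebind` is hypothetical and is not the BIRD socket interface. The `#ifdef` guards make it return -1 on platforms where the options are not defined.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical wrapper: enable nonlocal bind on socket fd for the given
 * address family. Returns 0 on success, -1 on failure or when the
 * platform headers do not define the option. */
static int demo_set_freebind(int fd, int af)
{
  int one = 1;
  (void) fd;
  (void) one;

#ifdef IP_FREEBIND
  if (af == AF_INET)
    return setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &one, sizeof(one));
#endif

#ifdef IPV6_FREEBIND
  if (af == AF_INET6)
    return setsockopt(fd, IPPROTO_IPV6, IPV6_FREEBIND, &one, sizeof(one));
#endif

  return -1;
}
```

After a successful call, a subsequent `bind()` to an address that is not (yet) configured on any interface is expected to succeed instead of failing with `EADDRNOTAVAIL`.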
  10. Jan 05, 2022
  11. Dec 28, 2021
  12. Dec 27, 2021
    • BSD: Assume onlink flag on ifaces with only host addresses · a39cd2cc
      Ondřej Zajíček authored
      The BSD kernel does not support the onlink flag and BIRD does not use
      direct routes for next hop validation; instead, it depends on interface
      address ranges. We would like to handle PtMP cases with only host
      addresses configured, like:
      
        ifconfig wg0 192.168.0.10/32
        route add 192.168.0.4 -iface wg0
        route add 192.168.0.8 -iface wg0
      
      and to accept BIRD routes with an onlink next hop, like:
      
        route 192.168.42.0/24 via 192.168.0.4%wg0 onlink
      
      BIRD would dismiss the route when receiving from the kernel, as the
      next-hop 192.168.0.4 is not part of any interface subnet and onlink
      flag is not kept by the BSD kernel.
      
      The commit fixes this by assuming that for routes received from the
      kernel, any next-hop is onlink on ifaces with only host addresses.
      
      Thanks to Stefan Haller for the original patch.
      a39cd2cc
  13. Dec 18, 2021
    • RPKI: Add contextual out-of-bound checks in RTR Prefix PDU handler · b9f38727
      Job Snijders authored and Ondřej Zajíček committed
      RFC 6810 and RFC 8210 specify that the "Max Length" value MUST NOT be
      less than the Prefix Length element (underflow). On the other side,
      overflow of the Max Length element is also possible: as an 8-bit
      unsigned integer, it allows values larger than 32 (IPv4) or 128 (IPv6).
      Checking this also implicitly ensures there is no overflow of the
      "Length" value.
      
      When a PDU is received where the Max Length field is corrupted, the RTR
      client (BIRD) should immediately terminate the session, flush all data
      learned from that cache, and log an error for the operator.
      
      Minor changes done by committer.
      b9f38727
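The two bounds described in this commit can be sketched as a single validation function. This is an illustration of the checks only, with a hypothetical function name; it is not the BIRD RTR handler, which would additionally terminate the session and flush the cache data on failure.

```c
#include <stdint.h>

/* Hypothetical check of the length fields in an RTR Prefix PDU.
 * Returns 1 if the fields are consistent, 0 if the PDU is corrupted. */
static int demo_rtr_lengths_ok(uint8_t pxlen, uint8_t max_len, int is_ipv6)
{
  uint8_t limit = is_ipv6 ? 128 : 32;

  if (max_len > limit)    /* overflow: beyond the address width */
    return 0;

  if (max_len < pxlen)    /* underflow: Max Length below Prefix Length */
    return 0;

  return 1;               /* implies pxlen <= max_len <= limit */
}
```

Because the prefix length is bounded by the Max Length, and the Max Length by the address width, the prefix length needs no separate upper-bound test.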
    • Doc: bgp: remove "advertise ipv4" · 00410fd6
      Simon Ruderich authored and Ondřej Zajíček committed
      The option was removed in d15b0b0a ("BGP redesign", 2016-12-07)
      but the documentation wasn't updated.
      00410fd6
    • Nest: Do not ignore secondary flag changes in ifa updates · b21104c9
      Ondřej Zajíček authored
      Compare all IA_* flags that are set by sysdep iface code.
      
      The old code ignores IA_SECONDARY flag when comparing whether iface
      address updates from kernel changed anything. This is usually not an
      issue as kernel removes all secondary addresses due to removal of the
      primary one, but it breaks when sysctl 'promote_secondaries' is enabled
      and kernel promotes secondary addresses to primary ones.
      
      Thanks to 'Alexander' for the bugreport.
      b21104c9
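The comparison described here can be sketched as a masked XOR over the flag word. The flag names and values below are illustrative assumptions (prefixed `DEMO_` to mark them as such) and are not copied from the BIRD headers; the point is only that the secondary flag is part of the compared mask.

```c
#include <stdint.h>

/* Illustrative flag values, assumed for this sketch only. */
#define DEMO_IA_PRIMARY    0x01u
#define DEMO_IA_SECONDARY  0x02u
#define DEMO_IA_PEER       0x04u
#define DEMO_IA_HOST       0x08u
#define DEMO_IA_UP         0x10u   /* internal state, not set by sysdep code */

/* All flags assumed to be set by the platform-specific iface code. */
#define DEMO_IA_SYSDEP_MASK \
  (DEMO_IA_PRIMARY | DEMO_IA_SECONDARY | DEMO_IA_PEER | DEMO_IA_HOST)

/* Returns nonzero if any platform-provided flag differs between the
 * old and the new address record. */
static int demo_ifa_flags_changed(uint32_t old_flags, uint32_t new_flags)
{
  return ((old_flags ^ new_flags) & DEMO_IA_SYSDEP_MASK) != 0;
}
```

With the secondary flag in the mask, a promotion from secondary to primary is reported as a change, while a difference in internal state alone is not.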
  14. Dec 02, 2021