[ixpmanager] Order of customers' config in the RS configuration of the IXP manager

Barry O'Donovan barry.odonovan at inex.ie
Mon Dec 15 14:43:06 GMT 2014


Hi Andreas,

First of all, apologies for the delay in replying. You've sent a few
mails I've wanted to reply to - including to the Euro-IX Bird list - but
I've just been swamped.

Starting with the end of your email:

> Before filing a feature request, I wonder how other IXP-manager users
> handle this: Do you use a different procedure to update the route
> servers? Do humans need to review the changes before getting applied
> (or at least they get notified)? Do you solve the problem in a
> different approach?

At INEX we've been running route servers for >7 years and have been
auto-generating them from the very beginning. We've never had any issues.

Human review is not necessary nor do we /require/ emails of changes. Our 
Quagga based route server is on our RANCID configuration management 
system which does email us. Bird is not (but see below re backups).

All changes are deterministic (* - see below also!) - if we haven't 
enabled / disabled / changed a vlan interface for the route server, then 
the configuration doesn't change. It's also important to say that there 
are never 'human' changes to the generated script. The deterministic 
nature of the configuration is the basis for the Travis-CI tests also 
(more below).

Any changes in the configuration go via IXP Manager templates (with
testing, including Travis CI tests that compare the generated output
against known good / expected output). This is all documented in the
wiki - http://git.io/Evv2MQ and http://git.io/GrAnqg.
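
To illustrate the known good versus generated comparison, here is a
minimal Python sketch. It is not the actual Travis CI test code from the
IXP Manager repository; the function name and the sample configuration
text are made up for the example.

```python
import difflib


def config_diff(expected: str, generated: str) -> list[str]:
    """Return unified-diff lines; an empty list means the configs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            generated.splitlines(),
            fromfile="known-good",
            tofile="generated",
            lineterm="",
        )
    )


# Hypothetical snippets standing in for real route server configurations.
KNOWN_GOOD = "protocol bgp pb_as65001 {\n    neighbor 192.0.2.1 as 65001;\n}\n"
CHANGED = KNOWN_GOOD.replace("192.0.2.1", "192.0.2.2")
```

Because the generated configuration is deterministic, a test can simply
require that config_diff() returns an empty list. The same function
could produce the diff of new versus old for a notification email.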

That's not to say we have blind faith that every step of the process
just works! I.e. there's no point reloading Bird if the API call results
in an error.

> Also, if someone would be interested to share scripts, we would
> certainly appreciate it. And we would be glad to upload ours if such
>  a repository exists somewhere.

See: http://git.io/dsFP9w (documented in the wiki at: http://git.io/bv_i0g)

This is the script we use to download a new route server configuration 
and it:

  - ensures wget terminated successfully
  - ensures the new file exists and has content
  - ensures we have a minimum number of neighbors
  - runs Bird parser against the new file

Only if these succeed do we reload Bird. At this point we could also 
email a diff of new versus old.

If any step fails, the script produces output which cron emails to us.
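
The checks listed above could be sketched roughly as follows. This is
not the script linked above, only a Python illustration of the same
idea: the download step is left out, the neighbor count is a crude line
match, and the function names and the parse_cmd parameter are
assumptions made for the example. It also assumes the installed Bird
supports parse-only mode (bird -p -c <file>).

```python
import os
import subprocess


def count_neighbors(config_text: str) -> int:
    """Count lines that look like BGP neighbor statements (crude heuristic)."""
    return sum(
        1 for line in config_text.splitlines()
        if line.strip().startswith("neighbor ")
    )


def validate_config(path: str, min_neighbors: int, parse_cmd: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the file may be loaded."""
    # The new file must exist and have content.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return [f"{path} is missing or empty"]
    # The new file must define a minimum number of neighbors.
    with open(path, encoding="utf-8") as fh:
        found = count_neighbors(fh.read())
    if found < min_neighbors:
        return [f"only {found} neighbors, expected at least {min_neighbors}"]
    # parse_cmd is an assumption, e.g. ["bird", "-p", "-c"]; the path is appended.
    result = subprocess.run(parse_cmd + [path], capture_output=True, text=True)
    if result.returncode != 0:
        return [f"parser rejected config: {result.stderr.strip()}"]
    return []
```

A cron wrapper would print whatever validate_config() returns and exit
without reloading if the list is non-empty; since cron mails any output,
a silent run means the reload went ahead.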

Note also that Bird / Quagga are monitored on another level by our 
Nagios implementation.

> The IXP manager does a great job creating the bird configuration
> through the skinned templates. What is missing, however, is a way to
> push the configuration to the route servers, check it, load it, and
> maybe notify the admins.

Please see script referenced above :-)

> For that reason we are writing a script that:
> a) updates the IXP manager ASN and prefixes database
> b) produces the bird configuration
> c) compares with the old bird config, notify us about changes
> d) push the configuration to the route servers
> Then, the route servers need to get reloaded; manually or
> semi-automatically at the beginning, automatically later.

We've discussed this script with Rowan also. For (a) above, please note 
that we have documentation at http://git.io/7jnKcQ

We have put a lot of effort into decoupling (a) from the RS build. 
Chaining the update of the as/prefix table to the rs build can lead to 
~1 hour of processing time (YMMV depending on members, AS-SET unwrapping,
etc). What we have now is a transaction safe process where they can be 
run independently and on separate servers.

Note two things mentioned in the documentation for (a):

* we use transactions to update the database so, even in the middle of a 
(route server configuration) refresh, a full set of prefixes for all 
customers will still be available.
* The command will rigorously validate the return code and output of 
BGPQ3 and it will throw an alert rather than removing prefixes when/if 
BGPQ3 returns an empty prefix list where prefixes already exist in the 
database.
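
The two safeguards can be sketched in a few lines. The example below is
not IXP Manager's code: it uses Python with an in-memory SQLite database
as a stand-in, and the table layout and function names are invented for
the illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical prefixes table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prefixes (customer_id INTEGER, prefix TEXT NOT NULL)")
    return conn


def replace_prefixes(conn: sqlite3.Connection, customer_id: int,
                     new_prefixes: list[str]) -> bool:
    """Swap a customer's prefixes in one transaction.

    Returns False (and changes nothing) if the new list is empty while
    prefixes already exist, so a failed lookup cannot wipe a working filter.
    """
    existing = conn.execute(
        "SELECT COUNT(*) FROM prefixes WHERE customer_id = ?", (customer_id,)
    ).fetchone()[0]
    if not new_prefixes and existing:
        return False  # the caller should raise an alert here
    with conn:  # commits on success, rolls back on any exception
        conn.execute("DELETE FROM prefixes WHERE customer_id = ?", (customer_id,))
        conn.executemany(
            "INSERT INTO prefixes (customer_id, prefix) VALUES (?, ?)",
            [(customer_id, p) for p in new_prefixes],
        )
    return True
```

Since the delete and the inserts commit together, a reader of the table
sees either the complete old set or the complete new set, never a
half-updated one.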

I suggest, in the UNIX way, a tool should do one job and do it well - a 
complicated script trying to (unnecessarily) couple together different 
jobs can be confusing and difficult to manage.

Rowan's script also had Git functionality. I'm not sure this is 
necessary as:

  - for a known database and version of IXP Manager, the configuration 
is always reproducible.
  - standard server backups should maintain all three of the above 
without the additional need for Git.
  - in 7 years, we have never needed to revisit an older configuration.

> And now our problem: If a customer has more than one connection
> (i.e, there are at least two different #vliid}, the order that these
> will be processed by the IXP manager is random.

Well, not random but rather based on the order returned by the database. 
Granted, that may effectively be random ;-)

Interestingly, our test database always returned data in the same order; 
even prefix and ASN lists - so we were never hit with this issue.

I've just tested and pushed the following diff:

--- a/application/Repositories/VlanInterface.php
+++ b/application/Repositories/VlanInterface.php
@@ -72,7 +72,7 @@ class VlanInterface extends EntityRepository
                          AND " . Customer::DQL_CUST_TRAFFICING . "
                          AND pi.status = :pistatus";

-        $qstr .= " ORDER BY c.autsys ASC";
+        $qstr .= " ORDER BY c.autsys ASC, vli.id ASC";

This will ensure deterministic ordering in all cases.
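
To see why the extra sort key matters, here is a small Python sketch
(the rows and function names are hypothetical, and Python's stable sort
stands in for whatever order the database happens to return ties in):

```python
from itertools import permutations

# Hypothetical (autsys, vli_id) rows; AS 65001 has two connections.
ROWS = [(65001, 12), (65002, 7), (65001, 3)]


def order_by_autsys(rows):
    """Sort on autsys only: rows with the same AS keep their input order."""
    return sorted(rows, key=lambda r: r[0])


def order_by_autsys_and_id(rows):
    """Sort on (autsys, vli_id): the tie-breaker fixes the order completely."""
    return sorted(rows, key=lambda r: (r[0], r[1]))


def distinct_orderings(sort_fn):
    """Count the distinct results over every possible input order."""
    return len({tuple(sort_fn(list(p))) for p in permutations(ROWS)})
```

Sorting on autsys alone leaves the two AS 65001 rows in whichever order
they arrived, so distinct_orderings(order_by_autsys) finds more than one
possible result, while the two-key sort always yields a single one.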

...

> This causes diffs to appear, which require unnecessary human
> attention. (You can find our neighbor.cfg file at the end of this
> email)

I've also pushed changes to the select queries for routes / asn acls to 
ensure a deterministic ordering of those.

Good feedback, thanks!


  - Barry




