Saving and evaluating network paths in Neo4j

Feeding Machine

Of course, it would be extremely tedious to type in the data manually for a large network. That's why the routers.yml file in Listing 1 [2] defines the key data for all routers in the readable YAML format. The links between the routers, which later become relations in the graph, follow implicitly from matching the gateway address of one router with the LAN IP of the next one.

Listing 1



The script in Listing 2 reads the YAML records of all the routers from the routers.yml file. Starting in line 28, it iterates through the list and stores each router in the database as a node, using the REST::Neo4p module from CPAN. While doing so, it saves references to the node objects it creates in the %lans hash, using the LAN IPs as keys.

Listing 2



The script also stores the gateway IPs encountered in the @gateways array along with the routers that use these gateway IPs. Starting in line 46, a for loop iterates through @gateways, uses the %lans hash to discover the target object, and defines one gateway relation from the start to the destination router in the database using the method relate_to().
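The two-pass logic described here can be sketched without a database. The following Python sketch is not the Perl code from Listing 2; it uses plain dictionaries in place of Neo4j nodes, and the field names (name, lan_ip, gateway) as well as the sample addresses are assumptions made purely for illustration.

```python
# Hypothetical router records, standing in for the parsed routers.yml data.
# Field names and addresses are assumptions, not taken from Listing 1.
routers = [
    {"name": "internet", "lan_ip": "192.168.0.1", "gateway": None},
    {"name": "office", "lan_ip": "192.168.1.1", "gateway": "192.168.0.1"},
    {"name": "lab", "lan_ip": "192.168.2.1", "gateway": "192.168.1.1"},
]


def build_gateway_links(records):
    """Return (start, destination) name pairs for every gateway relation."""
    lans = {}      # LAN IP -> router record (the role of %lans)
    gateways = []  # (gateway IP, router record) pairs (the role of @gateways)

    # Pass 1: register every router under its LAN IP and remember
    # which gateway IP it points to.
    for router in records:
        lans[router["lan_ip"]] = router
        if router["gateway"] is not None:
            gateways.append((router["gateway"], router))

    # Pass 2: resolve each gateway IP to the router that owns it.
    links = []
    for gateway_ip, router in gateways:
        target = lans.get(gateway_ip)
        if target is None:
            continue  # gateway lies outside the known network
        links.append((router["name"], target["name"]))
    return links


links = build_gateway_links(routers)
print(links)
```

Splitting the work into two passes means the order of the records does not matter: a router can name a gateway whose own record appears later in the file, because all LAN IPs are known before any relation is created.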

As soon as the script finishes, a glance at the browser interface provided through port 7474 shows that the data is properly stored in Neo4j (Figure 2).

Figure 2: The Neo4j browser interface on port 7474 renders Cypher queries graphically.

In response to a query such as MATCH (n) RETURN n, which returns all previously stored nodes, the database browser in graph mode shows the nodes as numbered circles and their relationships as labeled arrows that point from one node to the next.

When you click on a node, its attributes are displayed in a dialog box that pops up. In text mode, the browser instead displays the boxes shown in Figure 2, which list the attribute values of the nodes.

Gordian Knot

To be able to replace the network structure in the database without leftovers when the YAML data changes, Listing 2 starts by deleting all the previously defined nodes and relations in the graph. This step is not as easy as you might think because, to keep the data model intact, Neo4j refuses to delete nodes that still have relationships. The Cypher query for purging the database therefore reads:


The query matches all nodes n, as well as any relationships emanating from them to another (anonymous) node. The delete statement that follows then deletes the entry for each node, including its outgoing relations.
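To illustrate why relationships have to go along with their nodes, here is a minimal Python sketch of a toy in-memory graph that enforces the same rule. The TinyGraph class and its methods are hypothetical and have nothing to do with the Neo4j API. In Cypher, one common way to phrase such a purge is MATCH (n) OPTIONAL MATCH (n)-[r]-() DELETE n, r, and newer Neo4j versions also accept MATCH (n) DETACH DELETE n; whether Listing 2 uses exactly either wording is an assumption.

```python
class TinyGraph:
    """Toy graph that refuses to delete nodes with attached relationships."""

    def __init__(self):
        self.nodes = set()
        self.relations = set()  # (start, type, end) triples

    def add_node(self, name):
        self.nodes.add(name)

    def relate(self, start, rel_type, end):
        self.relations.add((start, rel_type, end))

    def delete_node(self, name):
        # Refuse the deletion while any relation still touches the node.
        if any(name in (start, end) for start, _, end in self.relations):
            raise ValueError(f"node {name!r} still has relationships")
        self.nodes.discard(name)

    def purge(self):
        # Remove all relationships first; only then can the nodes go.
        self.relations.clear()
        for name in list(self.nodes):
            self.delete_node(name)


graph = TinyGraph()
graph.add_node("office")
graph.add_node("internet")
graph.relate("office", "gateway", "internet")

# Deleting a node that is still the target of a relation is refused.
try:
    graph.delete_node("internet")
    refused = False
except ValueError:
    refused = True

# Purging removes relations and nodes together and leaves an empty graph.
graph.purge()
print(refused, graph.nodes, graph.relations)
```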

To show all the routers stored in the database with the Neo4j shell, you would just run the query shown in Figure 3. A plain RETURN n would output every attribute defined for each node, which could be hard to read. Instead, the query's return clause names three attributes of interest (n.hardware_vendor among them) and thus hides any values that are not of interest.

Figure 3: This query searches for and finds all the devices stored in the database.

Walking Paths

Of course, the advantages of a graph database are not found in menial services, such as finding nodes by their attributes, but in finding connection paths between nodes. It's actually quite simple to find out which network connections exist in the graph: The query in Figure 4 uses MATCH(n) to consider all the routers on the network. The relation description -[r:gateway*]-> matches one or more relations of the type gateway; the (m) in the query after the relation description stands for the terminating node in the found path.

Figure 4: This query shows all possible paths in the graph via gateway relations.

Because the Cypher query assigns the matched path to the variable p by prepending p=, a subsequent RETURN p would output all routing paths with all the routers considered by the query, including their attributes. That would be a real mess of data. The return statement therefore restricts the output to the name of the router at the beginning of the route and uses the collect() function to gather the names of all subsequent routers on the path into a list.

The output in Figure 4 thus has two columns: The first contains the identified start router, and the second contains a list with the names of routers passed through on the way to the open Internet.
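What the variable-length match and the grouping by start router amount to can be sketched in a few lines of Python. This is a simplified model, not how Neo4j evaluates the query: the gateway_of dictionary and the router names are hypothetical, and cycles are avoided by not revisiting nodes, whereas Cypher only forbids reusing a relationship within one path.

```python
# Hypothetical gateway relations: router name -> routers it uses as gateway.
gateway_of = {
    "lab": ["office"],
    "office": ["internet"],
    "internet": [],
}


def paths_from(start, edges):
    """Yield every path of one or more gateway hops that begins at start."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in edges.get(path[-1], []):
            if nxt in path:
                continue  # simplification: never revisit a node
            new_path = path + [nxt]
            yield new_path
            stack.append(new_path)


def collect_targets(edges):
    """Group the end node of each path by the path's start node."""
    result = {}
    for start in edges:
        ends = [path[-1] for path in paths_from(start, edges)]
        if ends:  # routers without an outgoing gateway relation match nothing
            result[start] = ends
    return result


routes = collect_targets(gateway_of)
print(routes)
```

In this linear sample network, every start router ends up with the list of routers that lie between it and the last hop, which mirrors the two-column layout described for Figure 4.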
