The graphalgo package

A graph provides a very attractive solution when you want to model real-world data. As they are more flexible than RDBMS, they offer an intuitive approach and are practically relevant to the way we think of stuff. The graph world revolves around several featured algorithms that are used to process graphs and for route calculation, detection of loops, calculation of the shortest path, subgraph and pattern matching being a few of them. Although you can implement your own collection of algorithms and tweaks, Neo4j also includes a set of predefined algorithms that you use most for rapid application development, even for scenarios that involve large volumes of data. They are packaged in a library called the graphalgo that you can use directly in your Java code fragments. The REST API also exposes a few of these algorithms such as dijkstra's and A* to be used with requests sent to the REST server. The graphalgo interfaces can be accessed and used in your programs using methods of the GraphAlgoFactory class. Some of the methods that can prove quite useful at times are:

  • allPaths(PathExpander, int): This method returns an algorithm that can be used to calculate all possible paths between two specified nodes. Paths with loops can also be calculated using this method.
  • allSimplePaths(PathExpander, int): This method returns an algorithm that does a similar job as the allPaths algorithm, except that it returns paths that do not contain a loop.
  • aStar(PathExpander, CostEvaluator<Double>, EstimateEvaluator<Double>): This method returns a variable of type PathFinder that can use the A* algorithm to calculate the path with the minimum weight/cost between two specified nodes.
  • dijkstra(PathExpander, CostEvaluator<Double>): This method returns a variable of type PathFinder that operates similar to the one in the previous method, except it uses the Dijkstra's algorithm instead of A*.
  • pathsWithLength(PathExpander, int): It returns an algorithm that can be used to a specific weighted path between two nodes.
  • shortestPath(PathExpander, int): From this method, you get an algorithm to calculate all possible shortest paths that exist between a given pair of nodes.

All the preceding algorithms use an instance of PathExpander as a parameter, which contains the logic to decide which relationship to expand on, or select the next relationship in the process of traversal. Alternatively, the preceding methods also allow the use of RelationshipExpander, which is a similarly flexible way of getting the relationships from a particular node. All the preceding methods return a value of the type PathFinder, which you can use to retrieve the paths from the algorithms. Let's see the use of some of these through an example:

The graphalgo package

The preceding diagram illustrates a graphical scenario of interconnected cities that we will be using for our example. The following code shows how the graph can be created and operated upon:

//Create a sample graph
Node cityA = createNode( "city", "London", "x", 0d, "y", 0d );
Node cityB = createNode( "city", "New York", "x", 7d, "y", 0d );
Node cityC = createNode( "city", "Bangalore", "x", 2d, "y", 1d );
Relationship distAB = createRelationship( cityA, cityC, "distance",20d );
Relationship distBC = createRelationship( cityC, cityB, "distance",30d );
Relationship distAC = createRelationship( cityA, cityB, "distance",100d );

EstimateEvaluator<Double> estimateEvaluator = new EstimateEvaluator<Double>()
{
    @Override
    public Double getCost( final Node node, final Node goal )
    {
        double costx = (Double) node.getProperty( "x" ) - (Double) goal.getProperty( "x" );
        double costy = (Double) node.getProperty( "y" ) - (Double) goal.getProperty( "y" );
        double answer = Math.sqrt( Math.pow( costx, 2 ) + Math.pow( costy, 2 ) );
        return answer;
    }
};

//Use the A* algorithm
PathFinder<WeightedPath> astarFinder = GraphAlgoFactory.aStar(PathExpanders.allTypesAndDirections(),
    CommonEvaluators.doubleCostEvaluator( "distance" ), estimateEvaluator );
WeightedPath astarPath = astarFinder.findSinglePath( cityA, cityB );


//Using the Dijkstra's algorithm
PathFinder<WeightedPath> dijkstraFinder = GraphAlgoFactory.dijkstra(
    PathExpanders.forTypeAndDirection( ExampleTypes.REL_TYPE, Direction.BOTH ), "distance" );

WeightedPath shortestPath = dijkstraFinder.findSinglePath( cityA, cityC );
//print the weight of this path
System.out.println(shortestPath.weight());


//Using the find all paths method
PathFinder<Path> allPathFinder = GraphAlgoFactory.shortestPath(
    PathExpanders.forTypeAndDirection( ExampleTypes.REL_TYPE, Direction.OUTGOING ), 15 );
Iterable<Path> all_paths = allPathFinder.findAllPaths( cityA, cityC );

The REST interface of the Neo4j server is also capable of executing these graph algorithms, where you can define the start node and the type of algorithm in the body of the POST request, as illustrated here:

POST http://localhost:7474/db/data/node/264/path
Accept: application/json; charset=UTF-8
Content-Type: application/json
{
  "to" : "http://localhost:7474/db/data/node/261",
  "cost_property" : "cost",
  "relationships" : {
    "type" : "to",
    "direction" : "out"
  },
  "algorithm" : "dijkstra"
}

These algorithms operate efficiently in moderate to large graphs. However, when you are processing graphs that require visiting billions of vertices that are highly connected in the graph, you can tweak these algorithms to improve performance. For example, if your graph cannot fit on a single instance and you resort to a cluster, then you need to modify your algorithm to traverse across machines, or to calculate in parts. However, for most big data scenarios today, they provide optimal performance.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset