Cypher graph operations

Cypher is a whiteboard-friendly language. Like the data it operates on, Cypher queries follow a diagrammatic approach in their syntax. This makes graph databases accessible to a wider audience, including database administrators, developers, business professionals, and even non-technical users. Let's take a look at some Cypher queries before diving into the best practices and optimizations for Cypher.

The following pattern depicts three entities related to one another through NEEDS dependencies. It is written in a form of ASCII art:

(A)-[:NEEDS]->(B)-[:NEEDS]->(C), (A)-[:NEEDS]->(C)

The previous statement describes a path that links entity A to B, then B to C, and finally A to C. A directed relationship is denoted with the -> operator. As is evident, patterns in Cypher mirror the way graphs are drawn on a whiteboard. It is worth noting that although a graph diagram can branch in several directions, a query is written as linear text that reads in one direction, from left to right as in the preceding case. Shapes that cannot be drawn as a single line, such as the triangle here, are therefore expressed as a list of patterns separated with commas. Cypher queries fundamentally make use of such ASCII art patterns. A Cypher query anchors one part of its pattern to some starting point in the graph and then uses the remaining parts of the pattern to search for matching entities in the surrounding graph.
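
As a minimal sketch of how this pattern might appear inside complete statements, the following snippet first builds the three entities and then matches the same triangle. The Entity label and the name property are assumptions made for illustration only; the pattern above does not specify them. The clauses used here are described in the sections that follow.

```cypher
// Build the three-node dependency triangle.
// The Entity label and name property are hypothetical.
CREATE (a:Entity {name: 'A'}),
       (b:Entity {name: 'B'}),
       (c:Entity {name: 'C'}),
       (a)-[:NEEDS]->(b),
       (b)-[:NEEDS]->(c),
       (a)-[:NEEDS]->(c);

// Match the same shape: two comma-separated patterns that share
// the identifiers a and c, so both must hold for the same nodes.
MATCH (a:Entity {name: 'A'})-[:NEEDS]->(b)-[:NEEDS]->(c),
      (a)-[:NEEDS]->(c)
RETURN b.name, c.name;
```

Because the second pattern reuses a and c, it does not start a separate search; it adds a further constraint on the nodes already bound by the first pattern.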

Cypher clauses

Being a language for querying data, Cypher consists of several clauses that perform different tasks. A basic Cypher operation uses the START clause to anchor to a starting point, followed by a MATCH clause that traverses the desired nodes in the graph according to a pattern, and finally a RETURN clause that outputs the matching values or the result of some computation. In the following query, we find connecting flight paths from a location named Alabama using Cypher:

START city1=node:location(name='Alabama')
MATCH (city1)-[:CONNECTS]->(city2)-[:CONNECTS]->(city3), (city1)-[:CONNECTS]->(city3)
RETURN city2, city3

The preceding snippet contains the following three clauses:

  • The START clause: This clause indicates one or more starting points in the graph under consideration. The starting points can be nodes or relationships. We can look them up with the help of an index or, occasionally, access them directly through the IDs of nodes or relationships. In the previous query, we obtain the initial node from an index called location, which is asked to find a place whose name property is set to 'Alabama'. The lookup returns a reference that we bind to an identifier called city1.
  • The MATCH clause: This clause tells Cypher to take the given pattern, starting from the initial identifier, and search the rest of the graph to find matches for it. This is how we retrieve the data that we want. Nodes are drawn with a pair of parentheses, and relationships are drawn with the --> and <-- symbols, where the arrowhead shows the direction of the relationship. Between the dashes, we can insert details about the relationship within a set of [ … ]: an optional identifier for the relationship, followed by a colon and the relationship type.

    Since the pattern in the MATCH clause can occur in many places in the graph, a large dataset can produce a very large set of matched results. To avoid this, we anchor a part of the pattern with the help of the START clause.

    The Cypher engine can then match the rest of the querying pattern in the graph surrounding the initiating points or nodes.

  • The RETURN clause: This clause specifies which of the matched nodes, relationships, and properties are returned, by naming their identifiers; in the previous example, these are the matched instances of city2 and city3. Identifiers are bound lazily: each identifier named in the query is bound to matching nodes as the traversal proceeds through the graph.
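
From Neo4j 2.0 onward, labels and schema indexes allow the anchor to be written directly in the MATCH clause instead of in a separate START clause. A rough equivalent of the previous query might look like the following sketch. It assumes a hypothetical Location label and a name property on each place, ideally backed by an index; neither appears in the original example.

```cypher
// Anchor city1 by label and property (Location and name are assumed),
// then match the rest of the pattern in the graph around that node.
MATCH (city1:Location {name: 'Alabama'})-[:CONNECTS]->(city2)-[:CONNECTS]->(city3),
      (city1)-[:CONNECTS]->(city3)
RETURN city2, city3;
```

The three roles described above are unchanged: the property lookup on city1 anchors the query, the pattern describes what to search for around the anchor, and RETURN names the identifiers to output.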

More useful clauses

Some other essential clauses that Cypher supports for the construction of complex queries in the graph are listed as follows:

  • CREATE: You can use this clause to define a new node or a new relationship. If you want only unique occurrences of nodes or relationships in the graph, you can use the CREATE UNIQUE clause to avoid creating duplicate entities.
  • MERGE: This clause behaves like a combination of MATCH and CREATE: it matches the given pattern if it already exists in the graph and creates it otherwise. It can be used together with indexes and unique constraints to find an existing entity or else create a new one.
  • WHERE: This clause provides a specification of conditions that can be used to filter nodes and relationships based on their stored properties.
  • SET: This clause is used to assign values to properties of nodes or relationships.
  • WITH: This clause pipes the output of one part of a query into the next part as its input, thereby making it possible to chain query parts together.
  • UNION: This clause combines the results of multiple queries into a single result set. UNION removes duplicate rows, whereas UNION ALL keeps them.
  • DELETE: This clause removes nodes and relationships from the graph. A node can be deleted only once its relationships are gone, and individual properties are removed with the REMOVE clause rather than with DELETE.
  • FOREACH: This is an action clause that performs an update on each element of a collection in turn, for example, on every node in a list.
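
The updating clauses listed above can be sketched with a few short statements. This is an illustration under assumptions rather than a recommended schema: the Location label and the name, visited, and checked properties are all hypothetical, and the syntax shown is that of Neo4j 2.0 and later.

```cypher
// CREATE: add two nodes and a relationship between them.
CREATE (a:Location {name: 'Alabama'})-[:CONNECTS]->(b:Location {name: 'Boston'});

// MERGE: reuse the Location named 'Chicago' if it exists, otherwise create it.
// SET: assign a property on whichever node MERGE found or created.
MERGE (c:Location {name: 'Chicago'})
SET c.visited = true;

// FOREACH: apply an update to every element of a collection.
MATCH (n:Location)
WITH collect(n) AS places
FOREACH (p IN places | SET p.checked = true);

// DELETE: remove a relationship together with the node it pointed to.
MATCH (a:Location {name: 'Alabama'})-[r:CONNECTS]->(b:Location {name: 'Boston'})
DELETE r, b;

// An individual property is dropped with the REMOVE clause.
MATCH (c:Location {name: 'Chicago'})
REMOVE c.visited;
```

Running the CREATE statement twice would produce duplicate nodes, whereas repeating the MERGE statement would not, which is the practical difference between the two clauses.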

Some of these clauses closely resemble their counterparts in SQL. Cypher is intended to be simple enough for developers to grasp easily and quickly, while its clauses make it clear that the operations apply to graphs rather than to relational data stores. We'll deal with more clause-based examples in due course in the chapter.
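
To give a feel for the resemblance to SQL, the following sketch uses WHERE, WITH, and UNION on the flight example. The Location label and the population property are assumptions introduced for illustration.

```cypher
// WHERE filters on stored properties, much like WHERE in SQL.
MATCH (city:Location)-[:CONNECTS]->(dest:Location)
WHERE city.name = 'Alabama' AND dest.population > 100000
RETURN dest.name;

// WITH pipes intermediate results into the next part of the query,
// here to filter on an aggregate (comparable to GROUP BY ... HAVING in SQL).
MATCH (city:Location)-[:CONNECTS]->(dest:Location)
WITH city, count(dest) AS routes
WHERE routes > 2
RETURN city.name, routes
ORDER BY routes DESC;

// UNION merges the rows of two queries and drops duplicates.
// Both parts must return the same column names.
MATCH (city:Location)-[:CONNECTS]->(:Location {name: 'Alabama'})
RETURN city.name AS name
UNION
MATCH (:Location {name: 'Alabama'})-[:CONNECTS]->(city:Location)
RETURN city.name AS name;
```

The main difference from SQL is visible in every statement: instead of joining tables on keys, each query describes the shape of the connections it is looking for.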
