Chapter 3. Efficient Data Modeling with Graphs

Databases are not dumps for data; they are planned, strategically organized stores built to serve a particular objective. This is where modeling comes into the picture. Modeling is driven by a specific need or goal, so that the relevant facets of the available information are brought together in a form that is easy to structure and manipulate. The world cannot be represented exactly as it is; instead, we form simplified abstractions in accordance with a particular goal. The same is true for graph data representations, which are close logical representations of physical-world objects. Systems that manage relational data have storage structures very different from those of systems that represent data in a form close to natural language. Transforming data to fit such structures can lead to semantic dissonance between how we conceptualize the real world and how the data is stored. Graph databases, however, overcome this issue. In this chapter, we will look at how to model data efficiently for graphs. The topics covered in this chapter are:

  • Data models and property graphs
  • Neo4j design constraints
  • Techniques for modeling graphs
  • Designing schemas
  • Modeling across multiple domains

Data models

A data model describes the logical structure of a database. Data models are the fundamental means of introducing abstraction in a DBMS. They define how data items are connected to one another and how they are processed and stored inside the system. Related data can be modeled with two basic types of data model: aggregated and connected.

The aggregated data model

The aggregated data model uses aggregation to simulate relationships when you can't define them explicitly. You might store different objects in separate sections of the data store and derive the relationships on the fly with the help of foreign keys or any other related fields between the objects. In this sense, you are aggregating data from different sources. For example, as depicted in the following diagram, a company might contain a collection of the people who work there, and a person might in turn be associated with several products within the company, from which you can extract the related data. Aggregation, or grouping, forms the link between entities in place of a well-defined relationship.

The aggregated data model
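
The idea of deriving relationships on the fly can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the collections, the field names (`company_id`, `owner_id`), and the sample records are hypothetical and do not come from any particular data store. Each kind of object lives in its own collection, and the link from a company to its products exists only as matching key values that have to be joined at query time.

```python
# Hypothetical sample data: every collection is stored separately and
# objects refer to each other only through id fields (foreign keys).
companies = {1: {"name": "Acme"}}
people = [
    {"id": 10, "name": "Alice", "company_id": 1},
    {"id": 11, "name": "Bob", "company_id": 1},
]
products = [
    {"id": 100, "name": "Widget", "owner_id": 10},
    {"id": 101, "name": "Gadget", "owner_id": 10},
    {"id": 102, "name": "Gizmo", "owner_id": 11},
]


def products_for_company(company_id):
    """Derive company -> person -> product links by matching key fields."""
    # Step 1: find the people whose foreign key points at the company.
    staff_ids = {p["id"] for p in people if p["company_id"] == company_id}
    # Step 2: find the products whose foreign key points at those people.
    return sorted(
        prod["name"] for prod in products if prod["owner_id"] in staff_ids
    )


print(products_for_company(1))
```

Note that nothing in the stored data states that a company is related to a product; that link is recomputed by scanning and matching keys every time the question is asked.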

Connected data models

Connected data is about the relationships that exist between different entities. We explicitly specify how two entities are connected with a well-defined relationship, along with the properties of that relationship, so we do not need to derive relationships at query time. This makes access to related data faster and makes this the prominent data model for most graph databases, including Neo4j. An example would be a PLAYS_FOR relationship between a player and a team, as shown in the following diagram:

Connected data models
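
A minimal in-memory sketch of the same idea in Python follows. It is not Neo4j's API or storage format; the `Node`, `Relationship`, `connect`, and `teams_of` names, the `since` property, and the sample player and team are all hypothetical and chosen only for illustration. The point is that the PLAYS_FOR relationship is stored as a first-class object with its own type and properties, so finding a player's team means following a stored reference, with no key matching.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    label: str
    properties: dict
    outgoing: list = field(default_factory=list)  # relationships that start here


@dataclass(eq=False)
class Relationship:
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


def connect(start, rel_type, end, **properties):
    """Create an explicit, typed relationship and attach it to its start node."""
    rel = Relationship(rel_type, start, end, properties)
    start.outgoing.append(rel)
    return rel


def teams_of(player_node):
    """Follow stored PLAYS_FOR relationships; nothing is derived from keys."""
    return [
        rel.end.properties["name"]
        for rel in player_node.outgoing
        if rel.type == "PLAYS_FOR"
    ]


# Hypothetical sample data.
player = Node("Player", {"name": "Sample Player"})
team = Node("Team", {"name": "Sample Team"})
connect(player, "PLAYS_FOR", team, since=2012)

print(teams_of(player))
```

Because the relationship carries its own properties (here the assumed `since` value), facts about the connection itself have a natural home and do not need to be squeezed into either entity.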