Modeling across multiple domains

Most organizations use Neo4j to back business applications. When designing the underlying data model, some developers argue for multiple graphs, one per subdivided domain. Others insist on keeping all of the data in a single large graph. Both approaches come with trade-offs.

If the subdivided domain datasets are frequently queried in ways that make traversals span multiple domains, then the developer who suggested a single large graph is right. However, if you are confident that the subdivided domains will rarely need to interact, multiple small graphs are the more effective choice: the system becomes more robust, and query response times go down.

That said, the more richly interconnected your graph is, messy as it may look, the more complex the queries you can run over it, and the more interesting the relationships you can derive to make your application more intelligent. For example, if your comprehension of music were completely independent of your ability to follow soccer, the movements of a goalkeeper could never look like dancing to you. Humans, however, mix such different domains seamlessly: think of music while watching a match and you may well interpret those movements as a dance.

Domains, and subdomains for that matter, are rarely distinct in the truest sense. They overlap at times, and that overlap can benefit your application if traversals can follow explicit connections across it. Performance depends on how frequently you traverse and on the methods you use to do so. If you do decide to subdivide your graph into chunks, the data size and the density of related data have comparatively little effect on performance.

Subdividing a graph by domain is not the preferred choice on a single machine, but if you need to do it, keep in mind that Neo4j graphs are essentially directories in the filesystem. To link them, you would have to create a class that dynamically loads the path of the required database into memory for the duration of a query and removes it once the result has been obtained. In a real-time scenario where large numbers of queries need to be processed on the fly, this is not a good idea. However, for scenarios where you warehouse your data for analysis, it works well.
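The load-query-release pattern described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the class DomainGraphLoader, the helper write_domain, and the graph.json file are all hypothetical, and a JSON adjacency map stands in for a real Neo4j store directory. With an actual embedded database, the load and release steps would be the database's own open and shutdown calls.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path


class DomainGraphLoader:
    """Loads one domain graph at a time from its own directory.

    Hypothetical stand-in: each directory holds a graph.json adjacency map
    instead of a real Neo4j store.
    """

    def __init__(self, root):
        self.root = Path(root)
        self.loaded = {}  # domain name -> adjacency map currently in memory

    @contextmanager
    def session(self, domain):
        # Load the domain's graph from its directory only when a query needs it.
        path = self.root / domain / "graph.json"
        with path.open() as handle:
            self.loaded[domain] = json.load(handle)
        try:
            yield self.loaded[domain]
        finally:
            # Drop the graph from memory as soon as the query has its result.
            del self.loaded[domain]

    def neighbours(self, domain, node):
        # The "query": look up a node's neighbours inside one domain graph.
        with self.session(domain) as graph:
            return sorted(graph.get(node, []))


def write_domain(root, domain, adjacency):
    """Create a directory for one domain and store its adjacency map there."""
    directory = Path(root) / domain
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "graph.json").write_text(json.dumps(adjacency))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        write_domain(root, "music", {"alice": ["jazz", "blues"]})
        loader = DomainGraphLoader(root)
        print(loader.neighbours("music", "alice"))
        print(loader.loaded)  # empty again once the query has finished
```

Every query pays the cost of reading the domain from disk, which is why the pattern suits batch analysis far better than a stream of real-time queries.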

Business insights require us to understand the underlying network of actions in a complex value chain. This might require joins across several domains with little or no distortion in the details of each domain. Property graphs simplify this process: we can model a value chain as a forest (that is, a graph of graphs) that includes interdomain relationships on rare occasions.
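As a minimal sketch of the forest idea, the following Python snippet keeps each domain as its own small graph and stores the rare interdomain relationships separately. The domain names, node names, relationship types, and the value_chain function are all made up for illustration; in Neo4j the same structure would be expressed with nodes, relationships, and properties rather than dictionaries.

```python
from collections import deque

# Each domain is its own small graph: node -> list of (relationship, target).
forest = {
    "sales": {"order-1": [("CONTAINS", "item-7")], "item-7": []},
    "logistics": {"shipment-3": [("HANDLED_BY", "carrier-2")], "carrier-2": []},
}

# The rare interdomain relationships, keyed by (domain, node).
bridges = {
    ("sales", "item-7"): [("SHIPPED_IN", ("logistics", "shipment-3"))],
}


def value_chain(start_domain, start_node):
    """Breadth-first walk that follows edges inside a domain and across bridges."""
    start = (start_domain, start_node)
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        domain, node = queue.popleft()
        order.append((domain, node))
        # Edges that stay inside the current domain.
        local = [(rel, (domain, target)) for rel, target in forest[domain].get(node, [])]
        # Edges that cross into another domain, if this node has any.
        crossing = bridges.get((domain, node), [])
        for _rel, nxt in local + crossing:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    for domain, node in value_chain("sales", "order-1"):
        print(domain, node)
```

Starting from order-1, the walk stays within the sales domain until it reaches item-7, crosses the single bridge into logistics, and continues there, so each domain keeps its own details while the chain remains traversable end to end.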
