This section covers Conceptual Data Modeling (Chapter 5), Logical Data Modeling (Chapter 6), and Physical Data Modeling (Chapter 7). Notice the “ing” at the end of each of these chapter titles. We focus on the process of building each of these models, which is where we gain essential business knowledge. This pyramid summarizes the four levels of design:
At the highest level, we have the Conceptual Data Model (CDM), which captures the satellite view of the business solution. The CDM provides the context for and scope of the Logical Data Model (LDM), which provides the detailed business solution. The LDM becomes the starting point for applying the impact of technology in the Physical Data Model (PDM). The PDM represents the actual MongoDB database structures.
In addition to the conceptual, logical, and physical levels of detail, there are also two different modeling mindsets: relational and dimensional.
Relational data modeling is the process of capturing how the business works by precisely representing business rules, while dimensional data modeling is the process of capturing how the business is monitored by precisely representing business questions.
The major difference between relational and dimensional data models is in the meaning of the relationships. On a relational data model, a relationship communicates a business rule, while on a dimensional data model, the relationship communicates a navigation path. On a relational data model, for example, we can represent the business rule “A Customer must have at least one Account.” On a dimensional data model, we can display the measure Gross Sales Amount and all of the navigation paths from which a user needs to see Gross Sales Amount such as by day, month, year, region, account, and customer. The dimensional data model is all about answering business questions by viewing measures at different levels of granularity.
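The contrast can be made concrete with a small sketch in plain JavaScript. The sample data, the field names (`accounts`, `region`, `month`, `grossSalesAmount`), and both helper functions are hypothetical, invented here for illustration; they are not part of MongoDB or of any model built in this book. The first function checks the relational rule “A Customer must have at least one Account,” while the second treats a field as a navigation path and totals Gross Sales Amount along it.

```javascript
// Hypothetical sample data; all names and values are illustrative assumptions.
const customers = [
  { customerId: 1, lastName: "Smith", accounts: ["A-100", "A-101"] },
  { customerId: 2, lastName: "Jones", accounts: [] },
];

const sales = [
  { customerId: 1, region: "East", year: 2014, month: "2014-01", grossSalesAmount: 100 },
  { customerId: 1, region: "East", year: 2014, month: "2014-02", grossSalesAmount: 50 },
  { customerId: 2, region: "West", year: 2014, month: "2014-01", grossSalesAmount: 75 },
];

// Relational mindset: the relationship is a business rule.
// Returns the customers that violate "A Customer must have at least one Account."
function customersWithoutAccount(customerList) {
  return customerList.filter(
    (c) => !Array.isArray(c.accounts) || c.accounts.length === 0
  );
}

// Dimensional mindset: the relationship is a navigation path.
// Totals the measure at the level of granularity named by `path`.
function grossSalesBy(path, salesList) {
  const totals = {};
  for (const sale of salesList) {
    const key = String(sale[path]);
    totals[key] = (totals[key] || 0) + sale.grossSalesAmount;
  }
  return totals;
}

const violations = customersWithoutAccount(customers);
const byRegion = grossSalesBy("region", sales);
const byMonth = grossSalesBy("month", sales);

console.log(violations.map((c) => c.customerId));
console.log(byRegion);
console.log(byMonth);
```

The rule check answers a yes-or-no question about how the business works, whereas `grossSalesBy` can be pointed at any navigation path (`"region"`, `"month"`, `"year"`, `"customerId"`) to view the same measure at a different level of granularity.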
The following table summarizes these three levels of design and two modeling mindsets, leading to six different types of models:
| Level of Design | Relational Mindset | Dimensional Mindset |
| --- | --- | --- |
| CDM | Key concepts and their business rules, such as “Each Customer may place one or many Orders.” | Key concepts focused on answering a set of business questions, such as “Can I see Gross Sales Amount by Customer?” |
| LDM | All attributes required for a given application, neatly organized into entities according to strict business rules, and independent of technology, such as “Each Customer ID value must return, at most, one Customer Last Name.” | All attributes required for a given analytical application, focused on answering a set of business questions and independent of technology, such as “Can I see Gross Sales Amount by Customer and view the customer’s first and last name?” |
| PDM | The LDM modified to perform well in MongoDB. For example, “To improve retrieval speed, we need a non-unique index on Customer Last Name.” | The LDM modified to perform well in MongoDB. For example, “Because there is a need to view Gross Sales Amount at a Day level, and then by Month and Year, we should consider embedding all calendar fields into a single collection.” |
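The two physical design decisions quoted in the PDM row could look roughly like the following in JavaScript. This is a sketch under assumptions: the collection name `customer`, the field names, and the `withCalendar` helper are hypothetical, and embedding the calendar fields inside each sales document is only one possible reading of the dimensional example. The `createIndex` call appears only as a comment because it needs a live `mongosh` session or driver connection.

```javascript
// Relational PDM: a non-unique, ascending index on customer last name.
// The field name customerLastName is an assumption for illustration.
const lastNameIndexKeys = { customerLastName: 1 };
const lastNameIndexOptions = { unique: false };
// In mongosh, with a hypothetical `customer` collection, this would be applied as:
//   db.customer.createIndex(lastNameIndexKeys, lastNameIndexOptions);

// Dimensional PDM: embed the calendar fields in each sales document so that
// day, month, and year roll-ups can all be read from a single collection.
function withCalendar(grossSalesAmount, isoDate) {
  const [year, month] = isoDate.split("-");
  return {
    grossSalesAmount,
    calendar: {
      day: isoDate,              // full date, e.g. "2014-03-15"
      month: `${year}-${month}`, // year and month, e.g. "2014-03"
      year: Number(year),        // numeric year, e.g. 2014
    },
  };
}

const saleDocument = withCalendar(125.5, "2014-03-15");
console.log(JSON.stringify(saleDocument, null, 2));
```

Keeping day, month, and year together means a question such as “Gross Sales Amount by Month” can be answered without joining to a separate calendar structure, at the cost of repeating the calendar values in every document.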
This may seem like a lot of work, because we need to go through all three phases: conceptual, logical, and physical. Wouldn’t it be easier to just jump straight to building a MongoDB database and be done with it?
Going through the proper levels of design will take more time than just jumping into building a MongoDB database. However, the thought process we go through in building the application should ideally cover the steps in these three levels of design anyway. For example, if we jump straight into building a MongoDB database, we would still need to ask at least some of the questions about definitions and business rules; we would just ask them all at once instead of in separate phases. Also, if we don’t follow these modeling steps proactively, we will be asking the questions during support, where fixing things can be much more expensive in terms of time, money, and reputation. Believe me, I know: many of my consulting assignments involve fixing problems caused by skipping levels of design (e.g., jumping right to the physical). I can’t tell you how many times during my assignments I have heard a manager use the phrase “technical debt” to summarize the high maintenance cost and poor performance of applications built without conceptual and logical data models.
For example, take the MongoDB document we created in the previous chapter:
{
    titleName : "Extreme Scoping",
    subtitleName : "An Agile Approach to Enterprise Data Warehousing and Business Intelligence",
    pageCount : 300
}
This is a very simple document with just three fields: the book’s title name, subtitle name, and page count. However, even with just these three fields, there are conceptual, logical, and physical questions that need to be answered.
During conceptual data modeling, we would address questions such as these:
During logical data modeling, we would address questions such as these:
During physical data modeling, we would address questions such as these:
By the end of this section, you will be able to appreciate, understand, and complete the three different phases of modeling for MongoDB applications.