CHAPTER 22

How can PowerDesigner help me improve data model quality?

Quality, correct

Requirements, complete and right

Take all into account

In this chapter, we will discuss the factors that influence the quality of your data models, and identify the ways in which PowerDesigner can help you to achieve and maintain the quality of your models. The PowerDesigner features we refer to are all explained elsewhere; we will guide you to the other parts of the book that are relevant.

A frequently overlooked aspect of data quality management is that of data model quality. We often build data models quickly, in the midst of a development project, and with the singular goal of database design. Yet the implications of those models are far-reaching and long-lasting. They affect the structure of implemented data, the ability to adapt to change, understanding of and communication about data, definition of data quality rules, and much more. In many ways, high-quality data begins with high-quality data models. A good data model can lead to a good application, and similarly, a bad data model can lead to a bad application.

There are three reasons why sound data design is a requirement for application success:

·         Leverage. All application development, including interfaces, extracts, screen designs, and functionality are built upon the database design. The database design is built from the data model. Therefore, the data model can make or break an entire application. Once an application is built based on a good data model, it becomes easier to support and expand, leading to a long and stable application life.

·         Data quality. The data model can enforce many rules that can dramatically improve data quality and catch many data errors before they are loaded into the application. There are other areas where the model can help with data quality, such as the definitions. Clear and concise definitions can help people make better decisions and identify issues before development begins.

·         ‘Big picture’. Although a data model is usually built to support a particular application, the terminology and structures in the model need to have a level of consistency across the enterprise. This broader view is easier to achieve with sound modeling practices, such as using generic structures and consistent naming standards.

After reviewing hundreds of data models, Steve formalized the criteria he had been using into the Data Model Scorecard®. The Scorecard provides an objective way of measuring what is good or bad about a data model, using ten assessment categories. For data models in your organization to consistently achieve high scores; you will need more than good luck or a competent modeler. The quality of your data models will be influenced greatly by the level of business knowledge of your data modelers (and the people supporting the modelers), the presence and application of modeling standards of various types, the functionality and flexibility provided by your data modeling tools, and how the modeling processes integrate with other analysis, design, and development activities.

For a very detailed examination of the Data Model Scorecard, see Steve’s Modeling Master Class Training Manual (Hoberman, 2011).

In this section, we describe each of the Data Model Scorecard categories, and identify the support provided by PowerDesigner. Look in the index to track down further information.

·         Category 1 – How well do the characteristics of the model support the type of model?

This category ensures that the type of model (subject area, logical, or physical – and then either relational or dimensional) has the appropriate level of detail. In general terms, the subject area model should contain a well-defined scope, the logical data model should be application-independent and represent a business solution, and the physical data model should be tuned for performance, security, and take into consideration hardware and software. The physical data model should represent a technical solution; a dimensional model is built when there is a need to play with numbers, while a relational model is built for everything else. PowerDesigner support for this category is shown in Table 22.1.

Table 22.1 PowerDesigner support for Data Model Scorecard Category 1

PowerDesigner Support

Full support for all 6 types of data models

Full support for over 60 database platforms and 3 XML Schema standards

Automated creation of associative entities to resolve many-to-many relationships.

Drag and drop to move or copy objects and sub-objects.

Ability to extend the types of information we can record against a model.

Denormalization capabilities in PDM.

Domains and Data Items

 

·         Category 2 – How well does the model capture the requirements?

This is the correctness category. That is, we need to understand the content of what is being modeled. This can be the most difficult of all 10 categories to grade because we really need to understand how the business works and what they want from their application(s). If we are modeling a sales data mart, for example, we need to understand both how the invoicing process works in our company, and what reports and queries will be needed to answer key sales questions from the business.

We need to ensure that our model represents the data requirements, as the costs can be devastating if there is even a slight difference between what was required and what was delivered. Besides not delivering what was expected, there is the potential for the IT/business relationship to suffer. The model must support business expectations. PowerDesigner support for this category is shown in Table 22.2.

Table 22.2 PowerDesigner support for Data Model Scorecard Category 2

PowerDesigner Support

Full support for all 6 types of data model

Import of Requirements documents and linking to data models.

Import of analysis material from Excel directly into

·         Data Models

·         Glossary

·         Data Items

Reporting and web portal

Documenting Business Rules

Alternate Identifiers and Keys

Ability to extend the types of information we can record against a model.

Matrix Editor – e.g. show requirements not satisfied by entities

Mapping Editor

Export to Excel for editing and review

Cross-reference to Business Process Models

 

·         Category 3 – How complete is the model?

This is the completeness category. This category checks for data model components that are not in the requirements, or requirements that are not represented on the model. If the scope of the model is greater than the requirements, we have a situation known as scope creep, where we are planning on delivering more than what was originally required. This may not necessarily be a bad thing, as long as this additional scope has been factored into the project plan. If the model scope is less than the requirements, we will be leaving information out of the resulting application, usually leading to an enhancement or Phase II shortly after the application is in production. For completeness, we need to make sure the scope of the project and model match. PowerDesigner support for this category is shown in Table 22.3.

Table 22.3 PowerDesigner support for Data Model Scorecard Category 3

PowerDesigner Support

Model Checking

Matrix Editor – e.g. show requirements not satisfied by entities, or entities not mapped to requirements

List Reports - Ensure that all the necessary data model descriptive information is provided

Export to Excel for editing and review

List Reports – e.g. unreferenced Business Rules, or attributes using default format

Ability to extend the types of information we can record against a model.

Reporting and Portal

Model Comparison

Reverse Engineering

 

·         Category 4 – How structurally sound is the model?

This is the structure category. This category validates the design practices employed in building the model to ensure we can eventually build a database from our data model. Good design practices include avoiding design errors such as having two data elements with the same exact name in the same entity, a null data element in a primary key, and/or partial key relationships[12]. Traditionally, when we review a data model, the violations we catch fall into this category, because we don’t need to understand the content of the model to score this category. Even if the reviewer knows nothing about the industry or subject matter represented by the model, it can still be graded accurately in this category. PowerDesigner support for this category is shown in Table 22.4.

Table 22.4 PowerDesigner support for Data Model Scorecard Category 4

PowerDesigner Support

Domains and Data Items

Model Checking

Model Generation options and tailoring

Auto-creation of Associative entities

Converting relationships into entities

 

·         Category 5 – How well does the model leverage generic structures?

This is the abstraction category. In this category we confirm an appropriate use of generic structures on the model. One of the most powerful tools a data modeler has at her disposal is abstraction, the ability to increase the types of information a design can accommodate using generic concepts. Recall our earlier discussion on abstraction in Chapter 16. Going from Customer Location to a more generic Location, for example, allows the design to handle other types of locations, such as warehouses and distribution centers, more easily. Abstraction can be properly applied (or abused!) at the entity, relationship, and data element levels. PowerDesigner support for this category is shown in Table 22.5.

Table 22.5 PowerDesigner support for Data Model Scorecard Category 5

PowerDesigner Support

Reference Models

Mapping Editor – map generic models to less generic models

Subtypes

Denormalization and Normalization in the PDM

Business Rules

 

·         Category 6 – How well does the model follow naming standards?

This is the standards category. Correct and consistent naming standards are extremely helpful for knowledge transfer and integration. New team members who are familiar with similar naming conventions on other projects will not need to take time to learn a new set of naming standards. Efforts to bring together information from multiple systems will be less painful if the data elements are named consistently across projects. This category focuses on naming standard structure, terms, and syntax. PowerDesigner support for this category is shown in Table 22.6. See Chapter 6 for a more detailed discussion of the importance of names and definitions.

Table 22.6 PowerDesigner support for Data Model Scorecard Category 6

PowerDesigner Support

Name to code conversion

Naming Conventions

Glossary

Naming Templates

PDM Naming Conventions

Domains and Data Items

Model Checking

List Reports

Excel Import

Ability to extend the types of information we can record against a model.

·         Category 7 – How well has the model been arranged for readability?

In this category, we check whether the model is visually easy to follow. This question is definitely the least important category, as it is solely concerned with presentation. However, if your entities, data elements, and relationships are difficult to read, you may not accurately address the more important categories on the Scorecard. PowerDesigner support for this category is shown in Table 22.7.

Table 22.7 PowerDesigner support for Data Model Scorecard Category 7

PowerDesigner Support

Auto Layout

User Profiles

Display Preferences

Multiple Diagrams

Packages

Embellishing your diagrams

Resequencing attributes and columns

Aligning Symbols

Ability to extend the types of information we can record against a model.

 

·         Category 8 – How good are the definitions?

This category deals with whether definitions are clear, complete, and correct.

o        Clear – a reader can understand the meaning of a term by reading the definition only once.

o        Complete – it is at the appropriate level of detail, and includes all the necessary components, such as derivations and examples.

o        Correct – a definition that totally matches what the term means, and is consistent with the rest of the business.

PowerDesigner support for this category is shown in Table 22.8. See Chapter 6 for a more detailed discussion of the importance of names and definitions.

Table 22.8 PowerDesigner support for Data Model Scorecard Category 8

PowerDesigner Support

Accessing data from a database

Reporting and Portal

Export to Excel for review and revision

 

·         Category 9 – How consistent is the model with the enterprise?

This is the consistency category. Does this model complement the big picture? This category ensures that the information is presented in a broad and consistent context, so that the organization uses one set of terminology and rules. The structures that appear in a data model should be consistent in terminology and usage with structures that appear in related data models, and ideally, with the enterprise data model, if one exists. In this way there will be consistency across projects.

An enterprise data model is a subject-oriented and integrated data model containing all of the data produced and consumed across an entire organization. Subject-oriented means that the concepts on a data model fit together as the CEO sees the company, as opposed to how individual functional or department heads see the company. There is one Customer entity, one Order entity, etc. Integration goes hand-in-hand with subject-orientation. Integration means that all of the data and rules in an organization are depicted once and fit together seamlessly. PowerDesigner support for this category is shown in Table 22.9.

Table 22.9 PowerDesigner support for Data Model Scorecard Category 9

PowerDesigner Support

Glossary

Reference Models

Mapping Editor

Model Generation

Excel Import

Model Compare and Merge

Ability to extend the types of information we can record against a model.

 

·         Category 10 – How well does the meta data match the data?

This criterion ensures that the model and the actual data that will be stored within the resulting database or documents are consistent with each other; this reduces the risk of surprises during software development.

This might be very difficult to do early in a project's life cycle, but the earlier the better so you can avoid later surprises, which can be much more costly. PowerDesigner support for this category is shown in Table 22.10.

Table 22.10 PowerDesigner support for Data Model Scorecard Category 10

PowerDesigner Support

Displaying data from a database

Ability to extend the types of information we can record against a model.

Excel Import

Excel Export

 

Key Points

·         The Data Model Scorecard® is a collection of ten categories for verifying the quality of a data model.

·         Applying the Scorecard early in the modeling process saves rework later and increases the chances that your comments on the model will be incorporated.

·         The Scorecard can be customized for a particular organization. Categories and point scores can be changed to make it work for you.

·         The simplest techniques can provide invaluable support.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset