APPENDIX C
Glossary

abstraction

Abstraction brings flexibility to your data models by redefining and combining some of the attributes, entities, and relationships within the model into more generic terms. For example, we may abstract Employee and Consumer into the more generic concept of Person. A Person can play many Roles, two of which are Employee and Consumer.

ACID

ACID stands for Atomic, Consistent, Isolated, and Durable. Atomic means everything within a transaction succeeds or the entire transaction is rolled back. Consistent means that the data accurately reflects any changes up to a certain point in time. A transaction cannot leave the database in an inconsistent state. Isolated means transactions cannot interfere with each other. That is, transactions are independent. Durable means completed transactions persist even when servers restart or there are power failures.
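
The Atomic property can be sketched with Python's built-in sqlite3 module. This is a minimal illustration and not tied to any particular product; the account table, the transfer function, and the balances are hypothetical names and values chosen for the example.

```python
import sqlite3

def transfer(conn, from_id, to_id, amount):
    """Move `amount` between two accounts as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a later failure has something to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # both updates succeed and are committed
failed = transfer(conn, 1, 2, 500)  # the debit violates the CHECK constraint
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the second call the credit to account 2 has already been applied when the debit fails, so the rollback has to undo it. Neither account changes.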

alternate key

An alternate key is a candidate key that, although unique, was not chosen as the primary key but still can be used to find specific entity instances.

architect

An architect is an experienced and skilled designer responsible for system and/or data architecture supporting a broad scope of requirements over time beyond the scope of a single project.

associative entity

An associative entity is an entity that resolves a many-to-many relationship.
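
As a minimal sketch, assuming a hypothetical rule that a student can take many courses and a course can have many students, the associative entity below (enrollment) resolves the many-to-many relationship into two one-to-many relationships. All table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- The associative entity: one row per (student, course) pairing.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Bob'), (2, 'Jane');
INSERT INTO course VALUES (10, 'Data Modeling'), (20, 'Statistics');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# Navigate the many-to-many relationship through the associative entity.
rows = conn.execute("""
SELECT s.name, c.title
FROM enrollment e
JOIN student s ON s.student_id = e.student_id
JOIN course  c ON c.course_id  = e.course_id
ORDER BY s.name, c.title
""").fetchall()
```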

attribute

Also known as a “data element,” an attribute is a property of importance to the business. Its values contribute to identifying, describing, or measuring instances of an entity. The attribute Claim Number identifies each claim. The attribute Student Last Name describes the last name of each student. The attribute Gross Sales Amount measures the monetary value of a transaction.

BASE

BASE is short for Basically Available, Soft-state, Eventual consistency. Basically Available means there is a response to every query, but that response could be a response saying there was a failure in getting the data, or the response that comes back may be in an inconsistent or changing state. Soft-state means the NoSQL database plays “catch up,” updating the database continuously even with changes that occurred earlier in the day. Eventual consistency means that the system will eventually become consistent once it stops receiving input.

business analyst

A business analyst is an IT or business professional responsible for understanding the business processes and the information needs of an organization, for serving as a liaison between IT and business units, and for acting as a facilitator of organizational and cultural change.

candidate key

A candidate key is one or more attributes that uniquely identify an entity instance.
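
The sketch below looks for possible candidate keys in a small sample of student data. The function names and sample rows are hypothetical. Uniqueness in a sample only suggests a candidate key; the business must still confirm that the attributes will always be unique.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows share the same values for `attrs`."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    """Return the minimal attribute sets that are unique in this sample."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key we already found; they are not minimal.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_unique(rows, attrs):
                found.append(attrs)
    return found

students = [
    {"student_number": 1, "email": "bob@example.com",
     "last_name": "Smith", "first_name": "Bob"},
    {"student_number": 2, "email": "jane@example.com",
     "last_name": "Smith", "first_name": "Jane"},
    {"student_number": 3, "email": "bob.j@example.com",
     "last_name": "Jones", "first_name": "Bob"},
]
keys = candidate_keys(
    students, ["student_number", "email", "last_name", "first_name"]
)
```

Here the combination of last name and first name happens to be unique, but only because the sample is small. This is why sample data alone cannot settle the question.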

cardinality

Cardinality defines the number of instances of each entity that can participate in a relationship. It is represented by the symbols that appear on both ends of a relationship line.

classword

A classword is the last term in an attribute name such as Amount, Code, and Name. Classwords allow for the assignment of common domains.

collection

A collection is a set of one or more documents. If we had a million orders, we could store all of these order documents in one Order collection.

column-oriented

Column-oriented databases, such as Cassandra, are NoSQL databases that can work with very complex data types such as unformatted text and imagery, and this data can also be defined on the fly.

concept

A concept is a key idea that is both basic and critical to your audience. “Basic” means this term is probably mentioned many times a day in conversations with the people who are the audience for the model, which includes the people who need to validate the model as well as the people who need to use the model. “Critical” means the business would be very different or non-existent without this concept.

conceptual data model (CDM)

A conceptual data model is a set of symbols and text representing the key concepts and rules binding these key concepts for a specific business or application scope. The CDM represents the high level business solution.

conformed dimension

A conformed dimension is one that is shared across the business intelligence environment. Customer, Account, Employee, Product, Time, and Geography are examples of conformed dimensions.

data model

A data model is a set of symbols and text which precisely explains a business information landscape. A box with the word “Customer” within it represents the concept of a real Customer, such as Bob, IBM, or Walmart, on a data model. A line represents a relationship between two concepts such as capturing that a Customer may own one or many Accounts.

data modeler

A data modeler is one who confirms and documents data requirements. This role performs the data modeling process.

data modeling

Data modeling is the process of learning about the data, and regardless of technology, this process must be performed for a successful application.

database administrator (DBA)

The DBA is the data professional role responsible for database administration: managing the physical aspects of data resources, including database design and integrity, backup and recovery, and performance tuning.

denormalization

Denormalization is the process of selectively violating normalization rules and reintroducing redundancy into the model (and therefore the database). This extra redundancy can reduce data retrieval time, which is the primary reason for denormalizing. We can also denormalize to create a more user-friendly model.

developer

A developer is a person who designs, codes and/or tests software. Synonymous with software developer, systems developer, application developer, software engineer, and application engineer.

dimension

A dimension is a subject area whose purpose is to add meaning to the measures. All of the different ways of filtering, sorting, and summing measures make use of dimensions. Dimensions are often, but not exclusively, hierarchies.

dimensional model

A dimensional model (also called “dimensional data model”) focuses on easily answering business questions such as “What is our Gross Sales Amount by day, product, and region?” Dimensional models are built for the three S’s of Simplicity, Speed, and Security.

document

A set of somewhat related data often viewed together. The MongoDB document is analogous to the concept of a record in a relational database.

document-oriented

Document-oriented databases frequently store the business subject in one structure called a “document.” For example, instead of storing title and author information in two distinct relational structures, title, author, and other title-related information can all be stored in a single document called Title. Document-oriented is much more application focused, as opposed to table oriented which is more data focused. MongoDB is a document-based database.
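
To make the Title example concrete, the sketch below folds rows from two relational structures into a single document, shown here as a plain Python dictionary. The field names and sample values are hypothetical, and no MongoDB driver is involved.

```python
# Two relational structures: one row per title, one row per title/author pair.
title_rows = [{"title_id": 1, "title_name": "Example Title"}]
author_rows = [
    {"title_id": 1, "author_name": "A. Author"},
    {"title_id": 1, "author_name": "B. Writer"},
]

def to_document(title, authors):
    """Embed a title's authors inside one document for that title."""
    return {
        "_id": title["title_id"],
        "title_name": title["title_name"],
        "authors": [
            a["author_name"]
            for a in authors
            if a["title_id"] == title["title_id"]
        ],
    }

title_document = to_document(title_rows[0], author_rows)
```

An application that displays a title can now read everything it needs from one structure instead of joining two.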

domain

A domain is the complete set of all possible values that an attribute may be assigned. A domain is a set of validation criteria that can be applied to more than one attribute.

entity

An entity represents a collection of information about something that the business deems important and worthy of capture. A noun or noun phrase identifies a specific entity. It fits into one of several categories: who, what, when, where, why, or how.

entity instance

Entity instances are the occurrences or values of a particular entity. The entity Customer may have multiple customer instances with names Bob, Joe, Jane, and so forth. The entity Account can have instances of Bob’s checking account, Bob’s savings account, Joe’s brokerage account, and so on.

field

The concept of a physical attribute (also called a column or field) in relational databases is equivalent to the concept of a field in MongoDB. MongoDB fields contain two parts, a field name and a field value.

foreign key

A foreign key is an attribute that provides a link to another entity. A foreign key allows a database management system to navigate from one entity to another.
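
A minimal sketch using Python's built-in sqlite3 module: the account table carries customer_id as a foreign key back to customer. Table and column names are hypothetical. Note that SQLite enforces foreign keys only after the PRAGMA shown is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE account ("
    "account_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Bob')")
conn.execute("INSERT INTO account VALUES (100, 1)")

# An account pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO account VALUES (101, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key lets us navigate from an account to its customer.
owner = conn.execute(
    "SELECT c.name FROM account a "
    "JOIN customer c ON c.customer_id = a.customer_id "
    "WHERE a.account_id = 100"
).fetchone()[0]
```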

forward engineer

The process of building a new application by starting from the conceptual data model and ending with a database.

grain

The grain is the lowest level of detail available in the meter on a dimensional data model.

graph

A graph database is a NoSQL database designed for data whose relations are well represented as a set of nodes with an undetermined number of connections between those nodes. Graph databases are ideal for capturing social relations (where nodes are people), public transport links (where nodes could be bus or train stations), or road maps (where nodes could be street intersections or highway exits).
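
The public transport example can be sketched without a graph database by holding the nodes and connections in a Python dictionary and walking them breadth first. The station names and the fewest_stops function are hypothetical; a real graph database would offer its own query language for this kind of traversal.

```python
from collections import deque

# Each station (node) maps to the stations it connects to directly.
links = {
    "Central": ["Park Street", "Harbor"],
    "Park Street": ["Central", "Museum"],
    "Harbor": ["Central"],
    "Museum": ["Park Street", "Airport"],
    "Airport": ["Museum"],
}

def fewest_stops(graph, start, goal):
    """Breadth-first search: return a route with the fewest connections."""
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None  # no route exists

route = fewest_stops(links, "Harbor", "Airport")
```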

index

An index is a pointer to something that needs to be retrieved. The index points directly to the place on the disk where the data is stored, thus reducing retrieval time. Indexes work best on attributes whose values are requested frequently but rarely updated.
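
As a rough sketch, we can ask SQLite how it plans to run a query before and after an index is created. The customer table, the index name, and the plan helper are hypothetical. The exact wording of the plan text varies between SQLite versions, so the example only checks whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (last_name) VALUES (?)",
    [("Smith",), ("Jones",), ("Lee",)],
)

def plan(conn, sql, params=()):
    """Return the descriptive text of SQLite's query plan for `sql`."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

query = "SELECT customer_id FROM customer WHERE last_name = ?"

before = plan(conn, query, ("Lee",))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_customer_last_name ON customer (last_name)")
after = plan(conn, query, ("Lee",))   # the plan now refers to the index

uses_index_before = any("idx_customer_last_name" in step for step in before)
uses_index_after = any("idx_customer_last_name" in step for step in after)
```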

key-value

A key-value database is a NoSQL database that allows the application to store its data in only two columns (“key” and “value”), with more complex information sometimes stored within the “value” columns.

logical data model (LDM)

A logical data model (LDM) is the detailed business solution to a business problem. It is how the modeler captures the business requirements without complicating the model with implementation concerns such as software and hardware.

measure

A measure is an attribute in a dimensional data model’s meter that helps answer one or more business questions.

metadata

Metadata is text, voice, or image that describes what the audience wants or needs to see or experience. The audience could be a person, group, or software program.

meter

A meter is an entity containing a related set of measures. It is a bucket of common measures. As a group, common measures address a business concern such as Profitability, Employee Satisfaction, or Sales.

natural key

Also known as a business key, a natural key is what the business sees as the unique identifier for an entity.

NoSQL

NoSQL is a name for the category of databases built on non-relational technology. NoSQL is not a good name for what it represents as it is less about how to query the database (which is where SQL comes in) and more about how the data is stored (which is where relational structures come in).

normalization

Normalization is the process of applying a set of rules with the goal of organizing something. With respect to attributes, normalization ensures that every attribute is single valued and provides a fact completely and only about its primary key.

object

Object includes any data model component such as entities, attributes, and relationships. Objects also include any MongoDB component such as fields, documents, and collections.

partition

A partition is a structure that divides or separates. Specific to the physical design, partitioning breaks a table apart by rows, by columns, or both. There are two types of partitioning: vertical and horizontal. Vertical partitioning means separating the columns (the attributes) into separate tables. Horizontal partitioning means separating the rows (the entity instances) into separate tables. In MongoDB, horizontal partitioning of a collection across servers is known as sharding.
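
The two types of partitioning can be sketched on a small in-memory table. The customer rows and both function names are hypothetical, and the example ignores where each partition would physically be stored.

```python
customers = [
    {"customer_id": 1, "name": "Bob", "region": "East", "credit_limit": 500},
    {"customer_id": 2, "name": "Joe", "region": "West", "credit_limit": 900},
    {"customer_id": 3, "name": "Jane", "region": "East", "credit_limit": 700},
]

def vertical_partition(rows, key, column_groups):
    """Split columns into separate tables; each table repeats the key."""
    return [
        [{col: row[col] for col in [key] + list(cols)} for row in rows]
        for cols in column_groups
    ]

def horizontal_partition(rows, column):
    """Split rows into separate tables according to the value of `column`."""
    parts = {}
    for row in rows:
        parts.setdefault(row[column], []).append(row)
    return parts

# Vertical: identifying columns in one table, credit details in another.
identity, credit = vertical_partition(
    customers, "customer_id", [["name", "region"], ["credit_limit"]]
)

# Horizontal: one table of rows per region.
by_region = horizontal_partition(customers, "region")
```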

physical data model (PDM)

The physical data model (PDM) represents the detailed technical solution. The PDM is the logical data model modified for a specific set of software or hardware. The PDM often gives up perfection for practicality, factoring in real concerns such as speed, space, and security.

primary key

A primary key is a candidate key that has been chosen to be the unique identifier for an entity.

program

A program is a large, centrally organized initiative that contains multiple projects. It has a start date and, if successful, no end date. Programs can be very complex and require long-term modeling assignments. Examples include a data warehouse and a customer relationship management system.

project

A project is a plan to complete a software development effort, often defined by a set of deliverables with due dates. Examples include a sales data mart, broker trading application, reservations system, and an enhancement to an existing application.

recursive relationship

A recursive relationship is a relationship between instances of the same entity. For instance, one organization can report to another organization.
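
A minimal sketch of the organization example: the table refers back to itself through reports_to_id, and a recursive query walks up the reporting line. The table, column, and organization names are hypothetical, and the query assumes a SQLite version that supports WITH RECURSIVE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organization (
    organization_id INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    -- The recursive relationship: an organization reports to an organization.
    reports_to_id   INTEGER REFERENCES organization(organization_id)
);
INSERT INTO organization VALUES
    (1, 'Corporate', NULL),
    (2, 'Sales', 1),
    (3, 'East Region Sales', 2),
    (4, 'Finance', 1);
""")

# Start at one organization and follow reports_to_id until it runs out.
chain = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE reporting_line(organization_id, name, reports_to_id, level) AS (
            SELECT organization_id, name, reports_to_id, 0
            FROM organization
            WHERE organization_id = ?
            UNION ALL
            SELECT o.organization_id, o.name, o.reports_to_id, r.level + 1
            FROM organization o
            JOIN reporting_line r ON o.organization_id = r.reports_to_id
        )
        SELECT name FROM reporting_line ORDER BY level
        """,
        (3,),
    )
]
```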

Relational Database Management System

The Relational Database Management System (RDBMS) is the traditional database technology built on the relational model, which E. F. Codd introduced at IBM in 1970. The first commercially available RDBMS was Oracle, released in 1979 [Wikipedia].

relational model

A relational model (also called “relational data model”) captures how the business works and contains business rules such as “A Customer must have at least one Account” or “A Product must have a Product Short Name.”

relationship

Rules are captured on our data model through relationships. A relationship is displayed as a line connecting two entities.

reverse engineer

The process of understanding an existing application by starting with its database and working up through the modeling levels until a conceptual data model is built.

secondary key

A secondary key is one or more data elements (if more than one data element, it is called a composite secondary key) that are accessed frequently and need to be retrieved quickly. A secondary key does not have to be unique, or stable, or always contain a value.

slowly changing dimension (SCD)

A Slowly Changing Dimension (SCD) is a term for any reference entity where we need to consider how to handle data changes. There are four basic ways to manage history. An SCD of Type 0 means we are only interested in the original state, an SCD of Type 1 means only the most current state, an SCD of Type 2 means the most current along with all history, and an SCD of Type 3 means the most current and some history will be stored.
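
The differences between the types can be sketched with a customer who moves from one city to another. The function names, field names, and dates are hypothetical, and the Type 3 version keeps just one previous value as its "some history." Type 0 needs no function, because the original row is simply left alone.

```python
def apply_type1(row, attr, new_value):
    """Type 1: overwrite, keeping only the most current state."""
    return {**row, attr: new_value}

def apply_type2(history, key, attr, new_value, change_date):
    """Type 2: end-date the current row and add a new current row."""
    result = []
    for row in history:
        if row["customer_id"] == key and row["end_date"] is None:
            result.append({**row, "end_date": change_date})
            result.append(
                {**row, attr: new_value,
                 "start_date": change_date, "end_date": None}
            )
        else:
            result.append(row)
    return result

def apply_type3(row, attr, new_value):
    """Type 3: keep the current value plus one previous value."""
    return {**row, "previous_" + attr: row[attr], attr: new_value}

current = {"customer_id": 1, "city": "Boston"}
history = [
    {"customer_id": 1, "city": "Boston",
     "start_date": "2020-01-01", "end_date": None}
]

type1 = apply_type1(current, "city", "Denver")
type2 = apply_type2(history, 1, "city", "Denver", "2024-06-01")
type3 = apply_type3(current, "city", "Denver")
```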

spreadsheet

A spreadsheet is a representation of a paper worksheet containing a grid defined by rows and columns, where each cell in the grid can contain text or numbers. The columns often contain different types of information.

stakeholder

A stakeholder is a person who has an interest in the successful completion of a project.

star schema

A star schema is the most common physical dimensional data model structure. A star schema results when each set of structures that make up a dimension is flattened into a single structure. The fact table is in the center of the model and each of the dimensions relates to the fact table at the lowest level of detail.
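
A minimal star schema can be sketched with one fact table and two flattened dimensions. All table names, column names, and sales figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each dimension is flattened into a single structure.
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT
);
-- The fact table sits in the center and references every dimension.
CREATE TABLE fact_sales (
    date_id            INTEGER REFERENCES dim_date(date_id),
    product_id         INTEGER REFERENCES dim_product(product_id),
    gross_sales_amount INTEGER
);
INSERT INTO dim_product VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO dim_date VALUES
    (1, '2024-01-15', '2024-01'), (2, '2024-02-10', '2024-02');
INSERT INTO fact_sales VALUES (1, 1, 100), (1, 3, 20), (2, 1, 50), (2, 2, 70);
""")

# "What is our Gross Sales Amount by month and product category?"
sales_by_month_category = conn.execute("""
SELECT d.month, p.category, SUM(f.gross_sales_amount)
FROM fact_sales f
JOIN dim_date d    ON d.date_id = f.date_id
JOIN dim_product p ON p.product_id = f.product_id
GROUP BY d.month, p.category
ORDER BY d.month, p.category
""").fetchall()
```

Every business question follows the same shape: join the fact table to the dimensions of interest, then group and sum.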

subject matter expert (SME)

A person with significant experience and knowledge of a given topic or function.

surrogate key

A surrogate key is a primary key that substitutes for a natural key, which is what the business sees as the unique identifier for an entity. It has no embedded intelligence and is used by IT (and not the business) for integration or performance reasons.

user

A user is a person who enters information into an application or queries the application to answer business questions and produce reports.

view

A view is a virtual table. It is a dynamic “view” or window into one or more tables (or other views) where the actual data is stored.
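
The sketch below shows why a view is called dynamic: it stores no data of its own, so a change to the underlying table shows up the next time the view is queried. The account table and the customer_balance view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (account_id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER);
INSERT INTO account VALUES (1, 'Bob', 100), (2, 'Bob', 250), (3, 'Joe', 40);
-- The view holds only the query, not the rows.
CREATE VIEW customer_balance AS
    SELECT owner, SUM(balance) AS total_balance
    FROM account
    GROUP BY owner;
""")

view_query = "SELECT owner, total_balance FROM customer_balance ORDER BY owner"

before = conn.execute(view_query).fetchall()
conn.execute("UPDATE account SET balance = 60 WHERE account_id = 3")
after = conn.execute(view_query).fetchall()  # reflects the update immediately
```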
