Glossary

activity diagram (UML)—A process workflow model (diagram) showing the flow from one activity to the next.

aggregation—A special type of abstraction relationship that defines a higher-level entity that is an aggregate of several lower-level entities; a “part-of” type relationship. For example, a bicycle entity would be an aggregate of wheel, handlebar, and seat entities.

association—A relationship between classes (in UML); associations can be binary, n-ary, reflexive, or qualified.

attribute—A primitive data element that provides descriptive detail about an entity; a data field or data item in a record. For example, lastname would be an attribute for the entity customer. Attributes may also be used as descriptive elements for certain relationships among entities.

automatic summary table (AST)—Materialized (summary) views or aggregates of data saved by OLAP for future use to reduce query time.

binary recursive relationship—A relationship between one occurrence of an entity with another occurrence of the same entity.

binary relationship—A relationship between occurrences of two entities.

Boyce Codd normal form (BCNF)—A table is in Boyce Codd normal form if and only if for every functional dependency X->A, where X and A are either simple or composite attributes (data items), X must be a superkey in that table. This is a strong form of 3NF and is the basis for most practical normalization methodologies.

candidate key—Any subset of the attributes (data items) in a superkey that is also a superkey and is not reducible to another superkey.

CASE tool—Computer-aided software engineering tool, a software design tool to assist in the logical design of large or complex databases. examples include ERwin Data Modeler and Rational Rose using UML.

class—A concept in a real-world system, represented by a noun in UML; similar to an entity in the ER model.

class diagram (UML)—A conceptual data model; a model of the static relationships between data elements of a system (similar to an ER diagram).

completeness constraint—Double-line symbol connecting a supertype entity with the subtypes to designate that the listed subtype entities represent a complete set of possible subtypes.

composition—A relationship between one class and a group of other classes in UML; the class at the diamond (aggregate) end of the relationship is composed of the class(es) at the small (component) end; similar to aggregation in the ER model.

conceptual data model—An organization of data that describes the relationships among the primitive data elements. For example, in the ER model, it is a diagram of the entities, their relationships, and their attributes.

connectivity of a relationship—A constraint on the count of the number of associated entity occurrences in a relationship, either one or many.

data item—The basic component of a data record in a file or database table; the smallest unit of information that has meaning in the real world. Examples include customer lastname, address, and identification number.

data model—An organization of data that describes the relationships among primitive and composite data elements.

data warehouse—A large repository of historical data that can be integrated for decision support.

database—A collection of interrelated stored data that serves the needs of multiple users; a collection of tables in the relational model.

database administrator (DBA)—The person in a software organization who is in charge of designing, creating, and maintaining the databases of an enterprise. The DBA makes use of a variety of software tools provided by a DBMS.

database life cycle—An enumeration and definition of the basic steps in the requirements analysis, design, creation, and maintenance of a database as it evolves over time.

database management system (DBMS)—A generalized software system for storing and manipulating databases. Examples include Oracle, IBM’s DB2, Microsoft SQL Server, or Access.

data mining—A way of extracting knowledge from a database by searching for correlations in the data and presenting promising hypotheses to the user for analysis and consideration.

DBA—See database administrator.

degree of a relationship—The number of entities associated in the relationship: recursive binary (1 entity), binary (2 entities), ternary (3 entities), n-ary (n entities).

denormalization—The consolidation of database tables to increase performance in data retrieval (query), despite the potential loss of data integrity. Decisions on when to denormalize tables are based on cost/benefit analysis by the DBA.

deployment diagram (UML)—Shows the physical nodes on which a system executes. This is more closely associated with physical database design.

dimension table—The smaller tables used in a data warehouse to denote the attributes of a particular dimension, such as time, location, customer characteristics, product characteristics, etc.

disjointness constraint (d)—A symbol in an ER diagram to designate that the lower-level entities in a generalization relationship have nonoverlapping (disjoint) occurrences. If the occurrences overlap, then use the designation (o) in the ER diagram.

entity—A data object that represents a person, place, thing, or event if informational interest; it corresponds to a record in a file when stored. For example, you could define employee, customer, project, team, and department as entities.

entity cluster—The result of a grouping operation on a collection of entities and relationships in an ER model to form a higher-level abstraction, which can be used to more easily keep track of the major components of a large-scale global schema.

entity instance (or occurrence)—A particular occurrence of an entity. For example, an instance of the entity actor would be Johnny Depp.

entity-relationship (ER) diagram—A diagram (or graph) of entities and their relationships, and possibly the attributes of those entities.

entity-relationship (ER) model—A conceptual data model involving entities, relationships among entities, and attributes of those entities.

exclusion constraint—A symbol (+) between two relationships in the ER model with a common entity that implies that either one relationship must hold at a given point in time, or the other must hold, but not both.

existence dependency—A dependency between two entities such that one is dependent upon the other for its existence, and cannot exist alone. For example, an employee work-history entity cannot exist without the corresponding employee entity. Also refers to the connectivity between two entities as being mandatory or optional.

fact table—The dominating table in a data warehouse and its star schema, containing dimension attributes and data measures at the individual data level.

fifth normal form (5NF)—A table is in fifth normal form (5NF) if and only if there are no possible lossless decompositions into any subset of tables; in other words, if there is no possible lossless decomposition, then the table is in 5NF (see Section 6.5).

file—A collection of records of the same type. For example, an employee file is a collection of employee records.

first normal form (1NF)—A table is in first normal form (1NF) if and only if there are no repeating columns of data taken from the same domain and having the same meaning.

foreign key—Any attribute in a SQL table (key or nonkey) that is taken from the same domain of values as the primary key in another SQL table and can be used to join the two tables (without loss of data integrity) as part of a SQL query.

fourth normal form (4NF)—A table is in fourth normal form (4NF) if and only if it is at least in BCNF and if whenever there exists a nontrivial multivalued dependency of the form X->>Y, then X must be a superkey in the table.

functional dependency (FD)—The property of one or more attributes (data items) that uniquely determines the value of one or more other attributes (data items). Given a table R, a set of attributes B is functionally dependent on another set of attributes A if, at each instant of time, each A value is associated with only one B value.

generalization—A special type of abstraction relationship that specifies that several types of entities with certain common attributes can be generalized (or abstractly defined) with a higher-level entity type, a supertype entity; an “is-a” type relationship. For example, employee is a generalization of engineer, manager, and administrative assistant, based on the common attribute job-title. A tool often used to make view integration possible.

global schema—A conceptual data model that shows all the data and their relationships in the context of an entire database.

key—A generic term for a set of one or more attributes (data items) that, taken collectively, enables one to identify uniquely an entity or a record in a SQL table; a superkey.

logical design—The steps in the database life cycle involved with the design of the conceptual data model (schema), schema integration, transformation to SQL tables, and table normalization; the design of a database in terms of how the data is related, but without regard to how it will be stored.

lossless decomposition—A decomposition of a SQL table into two or more smaller tables is lossless if and only if the cycle of table decomposition (normalization) and the recomposition (joining the tables back through common attributes) can be done without loss of data integrity.

mandatory existence—A connectivity between two entities that has a lower bound of one. One example is the “works-in” relationship between an employee and a department; every department has at least one employee at any given time. Note: if this is not true, then the existence is optional.

multiplicity—In UML, the multiplicity of a class is an integer that indicates how many instances of that class are allowed to exist.

multivalued dependency (MVD)—The property of a pair of rows in a SQL table such that if the two rows contain duplicate values of attribute(s) X, then there is also a pair of rows obtained by interchanging the values of Y in the original pair. This is a multivalued dependency of X->>Y. For example, if two rows have the attribute values ABC and ADE, where X=A and Y has the values C and E, then the rows ABE and ADC must also exist for an MVD to occur. A trivial MVD occurs when Y is a subset of X or X union Y is the entire set of attributes in the table.

normalization—The process of breaking up a table into smaller tables to eliminate problems with unwanted loss of data (the egregious side effects of losing data integrity) from the deletion of records and inefficiencies associated with multiple data updates.

online analytical processing (OLAP)—A query service that overlays a data warehouse by creating and maintaining a set of summary views (automatic summary tables, or ASTs) to enable quick access to summary data.

optional existence—A connectivity between two entities that has a lower bound of zero. For example, for the “occupies” relationship between an employee and an office, there may exist some offices that are not currently occupied.

package—In UML, a package is a graphical mechanism used to organize classes into groups for better readability.

physical design—The step in the database life cycle involved with the physical structure of the data; that is, how it will be stored, retrieved, and updated efficiently. In particular, it is concerned with issues of table indexing and data clustering on secondary storage devises (disk).

primary key—A key that is selected from among the candidate keys for a SQL table to be used to create an index for that table.

qualified association—In UML, an association between classes may have constraints specified in the class diagram.

record—A group of data items treated as a unit by an application; a row in a database table.

referential integrity—A constraint in a SQL database that requires, for every foreign key instance that exists in a table, that the row (and thus the primary key instance) of the parent table associated with that foreign key instance must also exist in the database.

reflexive association—In UML, a reflexive association relates a class to itself.

relationship—A real-world association among one or more entities. For example, “purchased” could be a relationship between customer and product.

requirements specification—A formal document that defines the requirements for a database in terms of the data needed, the major users and their applications, the physical platform and software system, and any special constraints on performance, security, and data integrity.

row—A group of data items treated as a unit by an application; a record; a tuple in relational database terminology.

schema—A conceptual data model that shows all the relationships among the data elements under consideration in a given context; the collection of table definitions in a relational database.

second normal form (2NF)—A table is in second normal form (2NF) if and only if each nonkey attribute (data item) is fully dependent on the primary key, that is either the left side of every functional dependency (FD) is a primary key or can be derived from a primary key.

star schema—The basic form of data organization for a data warehouse, consisting of a single large fact table and many smaller dimension tables.

subtype entity—The higher-level abstract entity in a generalization relationship.

superkey—A set of one or more attributes (data items) that, taken collectively, allows the unique identification of an entity or a record in a relational table.

supertype entity—The lower-level entity in a generalization relationship.

table—In a relational database, the collection of rows (or records) of a single type (similar to a file).

ternary relationship—A relationship that can only be defined among occurrences of three entities.

third normal form (3NF)—A table is in third normal form (3NF) if and only if for every functional dependency X->A, where X and A are either simple or composite attributes (data items), either X must be a superkey or A must be a member attribute of a candidate key in that table.

UML—Unified Modeling Language; a popular form of diagramming tools used to define data models and processing steps in a software application.

view integration—A step in the logical design part of the database life cycle that collects individual conceptual data models (views) into a single unified global schema. Techniques such as generalization are used to consolidate the individual data models.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset