6 Data Operations Management

Data Operations Management is the fourth Data Management Function in the data management framework shown in Figures 1.3 and 1.4. It is the third data management function that interacts with and is influenced by the Data Governance function. Chapter 6 defines the data operations management function and explains the concepts and activities involved in data operations management.

6.1 Introduction

Data operations management is the development, maintenance, and support of structured data to maximize the value of the data resources to the enterprise. Data operations management includes two sub-functions: database support and data technology management.

The goals of data operations management include:

  1. Protect and ensure the integrity of structured data assets.
  2. Manage the availability of data throughout its lifecycle.
  3. Optimize performance of database transactions.

The context diagram for data operations management is shown in Figure 6.1.

6.2 Concepts and Activities

Chapter 1 stated that data operations management is the function of providing support from data acquisition to data purging. Database administrators (DBAs) play a key role in this critical function. The concepts and activities related to data operations management and the roles of database administrators are presented in this section.

6.2.1 Database Support

Database support is at the heart of data management, and is provided by DBAs. The role of DBA is the most established and most widely adopted data professional role, and database administration practices are perhaps the most mature of all data management practices. DBAs play the dominant role in data operations management, as well as in Data Security Management (see Chapter 7). As discussed in Chapter 5, DBAs also play critical roles in Data Development, particularly in physical data modeling and database design, as well as support for development and test database environments.

In fact, many DBAs specialize as Development DBAs or Production DBAs. Development DBAs focus on data development activities, while Production DBAs perform data operations management activities. In some organizations, each specialized role reports to different organizations within IT. Production DBAs may be part of a production infrastructure and operations support group. Development DBAs and / or production DBAs are sometimes integrated into application development organizations.

Figure 6.1 Data Operations Management Context Diagram

Production DBAs take primary responsibility for data operations management, including:

  • Ensuring the performance and reliability of the database, including performance tuning, monitoring, and error reporting.
  • Implementing appropriate backup and recovery mechanisms to guarantee the recoverability of the data in any circumstance.
  • Implementing mechanisms for clustering and failover of the database, if continual data availability data is a requirement.
  • Implementing mechanisms for archiving data operations management.

The Production DBA is responsible for the following primary deliverables:

  1. A production database environment, including an instance of the DBMS and its supporting server, of a sufficient size and capacity to ensure adequate performance, configured for the appropriate level of security, reliability, and availability. Database System Administration is responsible for the DBMS environment.
  2. Mechanisms and processes for controlled implementation and changes to databases into the production environment.
  3. Appropriate mechanisms for ensuring the availability, integrity, and recoverability of the data in response to all possible circumstances that could result in loss or corruption of data.
  4. Appropriate mechanisms for detecting and reporting any error that occurs in the database, the DBMS, or the data server.
  5. Database availability, recovery, and performance in accordance with service level agreements.

DBAs do not perform all the activities of data operations management exclusively. Data stewards, data architects, and data analysts participate in planning for recovery, retention, and performance. Data stewards, data architects, and data analysts may also participate in obtaining and processing data from external sources

6.2.1.1 Implement and Control Database Environments

Database systems administration includes the following tasks:

  • Updating DBMS software – DBAs install new versions of the DBMS software and apply maintenance fixes supplied by the DBMS vendor in all environments, from development to production.
  • Maintaining multiple installations, including different DBMS versions – DBAs install and maintain multiple instances of the DBMS software in development, testing, and production environments, and manage migration of the DBMS software versions through environments.
  • Installing and administering related data technology, including data integration software and third party data administration tools.
  • Setting and tuning DBMS system parameters.
  • Managing database connectivity – In addition to data security issues (see Chapter 7), accessing databases across the enterprise requires technical expertise. DBAs provide technical guidance and support for IT and business users requiring database connectivity.
  • Working with system programmers and network administrators to tune operating systems, networks, and transaction processing middleware to work with the DBMS.
  • Dedicating appropriate storage for the DBMS, and enabling the DBMS to work with storage devices and storage management software. Storage management optimizes the use of different storage technology for cost-effective storage of older, less frequently referenced data. Storage management software migrates less frequently referenced data to less expensive storage devices, resulting in slower retrieval time. Some databases work with storage management software so that partitioned database tables can be migrated to slower, less expensive storage. DBAs work with storage administrators to set up and monitor effective storage management procedures.

Prepare checklists to ensure these tasks are performed at a high level of quality. These checklists lay out the steps involved. The work of one DBA should be audited by another DBA before the changes go into production.

The DBA is the custodian of all database changes. While many parties may request changes, the DBA defines the precise changes to make to the database, implements the changes, and controls the changes. DBAs should use a controlled, documented, and auditable process for moving application database changes to the Quality Assurance or Certification (QA) and Production environments, in part due to Sarbanes-Oxley and other regulatory requirements. A manager-approved service request or change request usually initiates the process. In most cases, the DBA should have a back out plan to reverse changes in case of problems.

Test all changes to the QA environment in the development / test environment, first, and test all changes to production, except for emergency changes, in the QA environment. While Development DBAs control changes to development / test environments, Production DBAs control changes to production environments, as well as usually controlling QA environments.

6.2.1.2 Obtain Externally Sourced Data

Most organizations obtain some data from external third-party sources, such as lists of potential customers purchased from an information broker, or product data provided by a supplier. The data is either licensed or provided free of charge; is provided in a number of different formats (CD, DVD, EDI, XML, RSS feeds, text files); and is a one-time-only or regularly updated via a subscription service. Some acquisitions require legal agreements.

A managed approach to data acquisition centralizes responsibility for data subscription services with data analysts. The data analyst will need to document the external data source in the logical data model and data dictionary. A developer may design and create scripts or programs to read the data and load it into a database. The DBA will be responsible for implementing the necessary processes to load the data into the database and / or make it available to the application.

6.2.1.3 Plan for Data Recovery

Data governance councils should establish service level agreements (SLAs) with IT data management services organizations for data availability and recovery. SLAs set availability expectations, allowing time for database maintenance and backup, and set recovery time expectations for different recovery scenarios, including potential disasters.

DBAs must make sure a recovery plan exists for all databases and database servers, covering all possible scenarios that could result in loss or corruption of data. This includes, but is not limited to:

  • Loss of the physical database server.
  • Loss of one or more disk storage devices.
  • Loss of a database, including the DBMS master database, temporary storage database, transaction log segment, etc.
  • Corruption of database index or data pages.
  • Loss of the database or log segment file system.
  • Loss of database or transaction log backup files.

Management and the organization’s business continuity group, if one exists, should review and approve the data recovery plan. The DBA group must have easy access to all data recovery plans.

Keep a copy of the plan, along with all necessary software, such as the software needed to install and configure the DBMS, instructions, and security codes, such as the administrator password, in a secure, off-site location in the event of a disaster. Backups of all databases should be kept in a secure, off-site location.

6.2.1.4 Backup and Recover Data

Make regular backups of databases and, for OLTP databases, the database transaction logs. The SLA for the database should include an agreement with the data owners as to how frequently to make these backups. Balance the importance of the data against the cost of protecting it. For large databases, frequent backups can consume large amounts of disk storage and server resources. At least once a day, make a complete backup of each database.

Furthermore, databases should reside on some sort of managed storage area, ideally a RAID array on a storage area network or SAN, with daily back up to tape. For OLTP databases, the frequency of transaction log backups will depend on the frequency of updating, and the amount of data involved. For frequently updated databases, more frequent log dumps will not only provide greater protection, but will also reduce the impact of the backups on server resources and applications. Backup files should be kept on a separate file system from the databases, and should be backed up to tape, or some separate storage medium, daily. Store copies of the daily backups in a secure off-site facility.

For extremely critical data, the DBA will need to implement some sort of replication scheme in which data moves to another database on a remote server. In the event of database failure, applications can then “fail over” to the remote database and continue processing. Several different replication schemes exist, including mirroring and log shipping. In mirroring, updates to the primary database are replicated immediately (relatively speaking) to the secondary database, as part of a two-phase commit process. In log shipping, a secondary server receives and loads copies of the primary database’s transaction logs at regular intervals. The choice of replication method depends on how critical the data is, and how important it is that failover to the secondary server be immediate. Mirroring is usually a more expensive option than log shipping. For one secondary server, use mirroring; use log shipping to update additional secondary servers.

Other data protection options include server clustering, in which databases on a shared disk array can failover from one physical server to another, and server virtualization, where the failover occurs between virtual server instances residing on two or more physical machines.

Most DBMSs support hot backups of the database - backups taken while applications are running. When some updates occur in transit, they will roll either forward to completion, or roll back when the backup reloads. The alternative is a cold backup taken when the database is off-line. However, this may not be a viable option if applications need to be continuously available.

The DBA will also, when necessary, recover lost or damaged databases by reloading them from the necessary database and transaction log backups to recover as much of the data as possible.

6.2.1.5 Set Database Performance Service Levels

Database performance has two facets - availability and performance. Performance cannot be measured without availability. An unavailable database has a performance measure of zero.

SLAs between data management services organizations and data owners define expectations for database performance. Typically, the agreement will identify an expected timeframe of database availability, and a select few application transactions (a mix of complex queries and updates), each with a specified maximum allowable execution time during identified availability periods. If process execution times consistently exceed the SLA, or database availability is not consistently compliant with the SLA, the data owners will ask the DBA to identify the source of the problem and take appropriate remedial action.

Availability is the percentage of time that a system or database can be used for productive work. Availability requirements are constantly increasing, raising the business risks and costs of unavailable data. Activities to ensure availability are increasingly performed in shrinking maintenance windows.

Four related factors affect availability:

  • Manageability: The ability to create and maintain an effective environment.
  • Recoverability: The ability to reestablish service after interruption, and correct errors caused by unforeseen events or component failures.
  • Reliability: The ability to deliver service at specified levels for a stated period.
  • Serviceability: The ability to determine the existence of problems, diagnose their causes, and repair / solve the problems.

Many things may cause a loss of database availability, including:

  • Planned and unplanned outages.
  • Loss of the server hardware.
  • Disk hardware failure.
  • Operating system failure.
  • DBMS software failure.
  • Application problems.
  • Network failure.
  • Data center site loss.
  • Security and authorization problems.
  • Corruption of data (due to bugs, poor design, or user error).
  • Loss of database objects.
  • Loss of data.
  • Data replication failure.
  • Severe performance problems.
  • Recovery failures.
  • Human error.

DBAs are responsible for doing everything possible to ensure databases stay online and operational, including:

  • Running database backup utilities.
  • Running database reorganization utilities.
  • Running statistics gathering utilities.
  • Running integrity checking utilities.
  • Automating the execution of these utilities.
  • Exploiting table space clustering and partitioning.
  • Replicating data across mirror databases to ensure high availability.

6.2.1.6 Monitor and Tune Database Performance

DBAs optimize database performance both proactively and reactively, by monitoring performance and by responding to problems quickly and competently. Most DBMSs provide the capability of monitoring performance, allowing DBAs to generate analysis reports. Most server operating systems have similar monitoring and reporting capabilities. DBAs should run activity and performance reports against both the DBMS and the server on a regular basis, including during periods of heavy activity. They should compare these reports to previous reports to identify any negative trends and save them to help analyze problems over time.

Data movement may occur in real time through online transactions. However, many data movement and transformation activities are performed through batch programs, which may be Extract-Transform-Load (ETL) programs or limited to one system internally. These batch jobs must complete within specified windows in the operating schedule. DBAs and data integration specialists monitor the performance of batch data jobs, noting exceptional completion times and errors, determining the root cause of errors, and resolving these issues.

When performance problems occur, the DBA should use the monitoring and administration tools of the DBMS to help identify the source of the problem. A few of the most common possible reasons for poor database performance are:

  • Memory allocation (buffer / cache for data).
  • Locking and blocking: In some cases, a process running in the database may lock up database resources, such as tables or data pages, and block another process that needs them. If the problem persists over too long an interval of time, the DBA can kill the blocking process. In some cases, two processes may “deadlock”, with each process locking resources needed by the other. Most DBMSs will automatically terminate one of these processes after a certain interval of time. These types of problems are often the result of poor coding, either in the database or in the application.
  • Failure to update database statistics: Most relational DBMSs have a built-in query optimizer, which relies on stored statistics about the data and indexes to make decisions about how to execute a given query most effectively. Update these statistics regularly and frequently, especially in databases that are very active. Failure to do so will result in poorly performing queries.
  • Poor SQL coding: Perhaps the most common cause of poor database performance is poorly coded SQL. Query coders need a basic understanding of how the SQL query optimizer works, and should code SQL in a way that takes maximum advantage of the optimizer’s capabilities. Encapsulate complex SQL in stored procedures, which can be pre-compiled and pre-optimized, rather than embed it in application code. Use views to pre-define complex table joins. In addition, avoid using complex SQL, including table joins, in database functions, which, unlike stored procedures, are opaque to the query optimizer.
  • Insufficient indexing: Code complex queries and queries involving large tables to use indexes built on the tables. Create the indexes necessary to support these queries. Be careful about creating too many indexes on heavily updated tables, as this will slow down update processing.
  • Application activity: Ideally, applications should be running on a separate server from the DBMS, so that they are not competing for resources. Configure and tune database servers for maximum performance. In addition, the new DBMSs allow application objects, such as Java and .NET classes, to be encapsulated in database objects and executed in the DBMS. Be careful about making use of this capability. It can be very useful in certain cases, but executing application code on the database server can affect the performance of database processes.
  • Increase in the number, size, or use of databases: For DBMSs that support multiple databases, and multiple applications, there may be a “breaking point” where the addition of more databases has an adverse effect on the performance of existing databases. In this case, create a new database server. In addition, relocate databases that have grown very large, or that are being used more heavily than before, to a different server. In some cases, address problems with large databases by archiving less-used data to another location, or by deleting expired or obsolete data.
  • Database volatility: In some cases, large numbers of table inserts and deletes over a short while can create inaccurate database distribution statistics. In these cases, turn off updating database statistics for these tables, as the incorrect statistics will adversely affect the query optimizer.

After the cause of the problem is identified, the DBA will take whatever action is needed to resolve the problem, including working with application developers to improve and optimize the database code, and archiving or deleting data that is no longer actively needed by application processes.

In exceptional cases, the DBA may consider working with the data modeler to de-normalize the affected portion of the database. Do this only after other measures, such as the creation of views and indexes, and the rewriting of SQL code, have been tried; and only after careful consideration of the possible consequences, such as loss of data integrity and the increase in complexity of SQL queries against de-normalized tables. This caveat applies only to OLTP databases. For read-only reporting and analytical databases, de-normalization for performance and ease of access is the rule rather than the exception, and poses no threat or risk.

6.2.1.7 Plan for Data Retention

One important part of the physical database design is the data retention plan. Discuss data retention with the data owners at design time, and reach agreement on how to treat data over its useful life. It is incorrect to assume that all data will reside forever in primary storage. Data that is not actively needed to support application processes should be archived to some sort of secondary storage on less-expensive disk, or tape, or a CD / DVD jukebox, perhaps on a separate server. Purge data that is obsolete and unnecessary, even for regulatory purposes. Some data may become a liability if kept longer than necessary. Remember that one of the principal goals of data management is that the cost of maintaining data should not exceed its value to the organization.

6.2.1.8 Archive, Retain, and Purge Data

The DBAs will work with application developers and other operations staff, including server and storage administrators, to implement the approved data retention plan. This may require creating a secondary storage area, building a secondary database server, replicating less-needed data to a separate database, partitioning existing database tables, arranging for tape or disk backups, and creating database jobs which periodically purge unneeded data.

6.2.1.9 Support Specialized Databases

Do not assume that a single type of database architecture or DBMS works for every need. Some specialized situations require specialized types of databases. Manage these specialized databases differently from traditional relational databases. For example, most Computer Assisted Design and Manufacturing (CAD / CAM) applications will require an Object database, as will most embedded real-time applications. Geospatial applications, such as MapQuest, make use of specialized geospatial databases. Other applications, such as the shopping-cart applications found on most online retail web sites, make use of XML databases to initially store the customer order data. This data is then copied into one or more traditional OLTP databases or data warehouses. In addition, many off-the-shelf vendor applications may use their own proprietary databases. At the very least, their schemas will be proprietary and mostly concealed, even if they sit on top of traditional relational DBMSs.

Administration of databases used only to support a particular application should not present any great difficulty. The DBA will mostly be responsible for ensuring regular backups of the databases and performing recovery tests. However, if data from these databases needs to be merged with other existing data, say in one or more relational databases, it may present a data integration challenge. These considerations should be discussed and resolved whenever such databases are proposed or brought into the organization.

6.2.2 Data Technology Management

DBAs and other data professionals manage the technology related to their field. Managing data technology should follow the same principles and standards for managing any technology.

The leading reference model for technology management is the Information Technology Infrastructure Library (ITIL), a technology management process model developed in the United Kingdom. ITIL principles apply to managing data technology. For more information, refer to the ITIL website, http://www.itil-officialsite.com.

6.2.2.1 Understand Data Technology Requirements

It is important to understand not only how technology works, but also how it can provide value in the context of a particular business. The DBA, along with the rest of the data services organization, should work closely with business users and managers to understand the data and information needs of the business. This will enable them to suggest the best possible applications of technology to solve business problems and take advantage of new business opportunities.

Data professionals must first understand the requirements of a data technology before determining what technical solution to choose for a particular situation. These questions are a starting point for understanding suitability of a data technology and are not all-inclusive.

  1. What problem does this data technology mean to solve?
  2. What does this data technology do that is unavailable in other data technologies?
  3. What does this data technology not do that is available in other data technologies?
  4. Are there any specific hardware requirements for this data technology?
  5. Are there any specific Operating System requirements for this data technology?
  6. Are there any specific software requirements or additional applications required for this data technology to perform as advertised?
  7. Are there any specific storage requirements for this data technology?
  8. Are there any specific network or connectivity requirements for this data technology?
  9. Does this data technology include data security functionality? If not, what other tools does this technology work with that provides for data security functionality?
  10. Are there any specific skills required to be able support this data technology? Do we have those skills in-house or must we acquire them?

6.2.2.2 Define the Data Technology Architecture

Data technology is part of the enterprise’s overall technology architecture, but it is also often considered part of its data architecture.

Data technology architecture addresses three basic questions:

  1. What technologies are standard (which are required, preferred, or acceptable)?
  2. Which technologies apply to which purposes and circumstances?
  3. In a distributed environment, which technologies exist where, and how does data move from one node to another?

Data technologies to be included in the technology architecture include:

  • Database management systems (DBMS) software.
  • Related database management utilities.
  • Data modeling and model management software.
  • Business intelligence software for reporting and analysis.
  • Extract-transform-load (ETL) and other data integration tools.
  • Data quality analysis and data cleansing tools.
  • Meta-data management software, including meta-data repositories.

Technology architecture components are sometimes referred to as “bricks”. Several categories or views representing facets of data technology bricks are:

  • Current: Products currently supported and used.
  • Deployment Period: Products to be deployed for use in the next 1-2 years.
  • Strategic Period: Products expected to be available for use in the next 2+ years.
  • Retirement: Products the organization has retired or intends to retire this year.
  • Preferred: Products preferred for use by most applications.
  • Containment: Products limited to use by certain applications.
  • Emerging: Products being researched and piloted for possible future deployment.

The technology road map for the organization consists of these reviewed, approved, and published bricks, and this helps govern future technology decisions.

It is important to understand several things about technology:

  • It is never free. Even open-source technology requires care and feeding.
  • It should always be regarded as the means to an end, rather than the end itself.
  • Most importantly, buying the same technology that everyone else is using, and using it in the same way, does not create business value or competitive advantage for the enterprise.

After the necessary discussions with the business users and managers, the data services group can summarize the data technology objectives for the business in the form of a strategic roadmap that can be used to inform and direct future data technology research and project work.

6.2.2.3 Evaluate Data Technology

Selecting appropriate data related technology, particularly the appropriate database management technology, is an important data management responsibility. Management selects data technology to meet business needs, including total cost, reliability, and integration.

Selecting data technology involves business data stewards, DBAs, data architects, data analysts, other data management professionals, and other IT professionals. Data technologies to be researched and evaluated include:

  • Database management systems (DBMS) software.
  • Database utilities, such as backup and recovery tools, and performance monitors.
  • Data modeling and model management software.
  • Database management tools, such as editors, schema generators, and database object generators.
  • Business intelligence software for reporting and analysis.
  • Extract-transfer-load (ETL) and other data integration tools.
  • Data quality analysis and data cleansing tools.
  • Data virtualization technology.
  • Meta-data management software, including meta-data repositories.

In addition, data professionals may have unique requirements for tools used in other fields, including:

  • Change management (source code library and configuration) tools.
  • Problem and issue management tools.
  • Test management tools.
  • Test data generators.

Make selection decisions using a standard technology evaluation process and applying the decision analysis concepts defined by Kepner and Tregoe in The Rational Manager. List alternatives and compare them against a defined set of weighted decision criteria, including feature requirements and functional objectives. The basic method includes the following steps:

  1. Understand user needs, objectives, and related requirements.
  2. Understand the technology in general.
  3. Identify available technology alternatives.
  4. Identify the features required.
  5. Weigh the importance of each feature.
  6. Understand each technology alternative.
  7. Evaluate and score each technology alternative’s ability to meet requirements.
  8. Calculate total scores and rank technology alternatives by score.
  9. Evaluate the results, including the weighted criteria.
  10. Present the case for selecting the highest ranking alternative.

Selecting strategic DBMS software is particularly important. DBMS software has a major impact on data integration, application performance, and DBA productivity. Some of the factors to consider when selecting DBMS software include:

  • Product architecture and complexity.
  • Application profile, such as transaction processing, business intelligence, and personal profiles.
  • Organizational appetite for technical risk.
  • Hardware platform and operating system support.
  • Availability of supporting software tools.
  • Performance benchmarks.
  • Scalability.
  • Software, memory, and storage requirements.
  • Available supply of trained technical professionals.
  • Cost of ownership, such as licensing, maintenance, and computing resources.
  • Vendor reputation.
  • Vendor support policy and release schedule.
  • Customer references.

The DBA will need to assist in evaluating technology alternatives. A number of factors come into play here:

  • The availability, stability, maturity, and cost of current products.
  • The suitability of a given product to meet the current business need / problem.
  • The extensibility of a given product to meet other business needs.
  • The product’s “fit” with the organization’s technology and architecture roadmap (see section 4.2.2.4).
  • The product’s “fit” with other products and technology used by the organization.
  • The vendor’s reputation, stability, and expected longevity – Is this a vendor that the company will want to, and be able to, do business with over an extended period?
  • The degree of support expected from the vendor – Will upgrades be made available frequently and at minimal cost? Will help from the vendor be available when needed?

The DBA will need to carefully test each candidate product to determine its strengths and weaknesses, ease of implementation and use, applicability to current and future business needs and problems, and whether it lives up to the vendor’s hype.

6.2.2.4 Install and Administer Data Technology

The DBAs face the work of deploying new technology products in development / test, QA / certification, and production environments. They will need to create and document processes and procedures for administering the product with the least amount of effort and expense. Remember that the expense of the product, including administration, licensing, and support must not exceed the product’s value to the business. Remember also that the purchase of new products, and the implementation of new technology, will probably not be accompanied by an increase in staffing, so the technology will need to be, as much as possible, self-monitoring and self-administering.

Also, remember that the cost and complexity of implementing new technology is usually under-estimated, and the features and benefits are usually over-estimated. It is a good idea to start with small pilot projects and proof-of-concept (POC) implementations, to get a good idea of the true costs and benefits before proceeding with a full-blown production implementation.

6.2.2.5 Inventory and Track Data Technology Licenses

Organizations must comply with all licensing agreements and regulatory requirements. Carefully track and conduct yearly audits of software license and annual support costs, as well as server lease agreements and other fixed costs. Being out-of-compliance with licensing agreements poses serious financial and legal risks for an organization.

This data can also determine the total cost-of-ownership (TCO) for each type of technology and technology product. Regularly evaluate technologies and products that are becoming obsolete, unsupported, less useful, or too expensive.

6.2.2.6 Support Data Technology Usage and Issues

When a business need requires new technology, the DBAs will work with business users and application developers to ensure the most effective use of the technology, to explore new applications of the technology, and to address any problems or issues that surface from its use.

DBAs and other data professionals serve as Level 2 technical support, working with help desks and technology vendor support to understand, analyze, and resolve user problems.

The key to effective understanding and use of any technology is training. Organizations should make sure they have an effective training plan and budget in place for everyone involved in implementing, supporting, and using data and database technology. Training plans should include appropriate levels of cross training to better support application development, especially Agile development. DBAs should have, and take the opportunity to learn, application development skills such as class modeling, use-case analysis, and application data access. Developers should learn some database skills, especially SQL coding!

6.3 Summary

The guiding principles for implementing data operations management into an organization, a summary table of the roles for each data operations management activity, and organization and cultural issues that may arise during data operations management are summarized below.

6.3.1 Guiding Principles

In his book Database Administration, Craig Mullins offers DBAs the following rules of thumb for data operations management:

  1. Write everything down.
  2. Keep everything.
  3. Whenever possible, automate a procedure.
  4. Focus to understand the purpose of each task, manage scope, simplify, do one thing at a time.
  5. Measure twice, cut once.
  6. Don’t panic; react calmly and rationally, because panic causes more errors.
  7. Understand the business, not just the technology.
  8. Work together to collaborate, be accessible, audit each other’s work, share your knowledge.
  9. Use all of the resources at your disposal.
  10. Keep up to date.

6.3.2 Process Summary

The process summary for the data operations management function is shown in Table 6.1. The deliverables, responsible roles, approving roles, and contributing roles are shown for each activity in the data operations management function. The Table is also shown in Appendix A9.

Activities

Deliverables

Responsible Roles

Approving Roles

Contributing Roles

4.1.1 Implement and Control Database Environments

Production database environment maintenance, managed changes to production databases, releases

DBAs

DM Executive

System programmers, data stewards, data analysts, software developers, project managers

4.1.2 Acquire Externally Sourced Data (O)

Externally sourced data

DBAs, data analysts, data stewards

Data Governance Council

Data stewards, data analysts

4.1.3 Plan for Data Recovery (P)

Data availability SLAs, data recovery plans

DBAs

DM Executive, Data Governance Council

4.1.4 Backup and Recover Data (O)

Database backups and logs,restored databases,business continuity

DBAs

DM Executive

4.1.5 Set Database Performance Service Levels (P)

Database performance SLAs

DBAs

DM Executive, Data Governance Council

4.1.6 Monitor and Tune Database Performance (O)

Database performance reporting, Database performance

DBAs

4.1.7 Plan for Data Retention (P)

Data retention plan, storage management procedures

DBAs

DM Executive

Storage management specialists

4.1.8 Archive, Retrieve and Purge Data (O)

Archived data, retrieved data, purged data

DBAs

DM Executive

4.1.9 Manage Specialized Databases (O)

Geospatial databases, CAD / CAM databases, XML databases, object databases

DBAs

DM Executive

Data stewards, Subject matter experts

4.2.1 Understand Data Technology Requirements (P)

Data technology requirements

Data architect, DBAs

DM Executive

Data stewards, other IT professionals

4.2.2 Define the Database Architecture (P) (same as 2.3)

Data technology architecture

Data architect

DM Executive, Data Governance Council

DBAs, data analysts, data stewards

4.2.3 Evaluate Data Technology (P)

Tool evaluation findings, tool selection decisions

Data analysts, DBAs

DM Executive, Data Governance Council

Data stewards, other IT professionals

4.2.4 Install and Administer Data Technology (O)

Installed technology

DBAs

DM Executive

Data analysts, other data professionals

4.2.5 Inventory and Track Data Technology Licenses (C)

License inventory

DBAs

DM Executive

Other data professionals

4.2.6 Support Data Technology Usage and Issues (O)

Identified and resolved technology issues

DBAs

DM Executive

Other data professionals

Table 6.1 Data Operations Management Process Summary

6.3.3 Organizational and Cultural Issues

Q1: What are common organizational and cultural obstacles to database administration?

A1: DBAs often do not effectively promote the value of their work to the organization. They need to recognize the legitimate concerns of data owners and data consumers, balance short-term and long-term data needs, educate others in the organization about the importance of good data management practices, and optimize data development practices to ensure maximum benefit to the organization and minimal impact on data consumers. By regarding data work as an abstract set of principles and practices, and disregarding the human elements involved, DBAs risk propagating an “us versus them” mentality, and being regarded as dogmatic, impractical, unhelpful, and obstructionist.

Many disconnects, mostly clashes in frames of reference, contribute to this problem. Organizations generally regard information technology in terms of specific applications, not data, and usually see data from an application-centric point of view. The long-term value to organizations of secure, reusable, high-quality data, such as data as a corporate resource, is not as easily recognized or appreciated.

Application development often sees data management as an impediment to application development, as something that makes development projects take longer and cost more without providing additional benefit. DBAs have been slow to adapt to changes in technology, such as XML, objects, and service-oriented architectures, and new methods of application development, such as Agile Development, XP, and Scrum. Developers, on the other hand, often fail to recognize how good data management practices can help them achieve their long-term goals of object and application reuse, and true service-oriented application architecture.

There are several things that DBAs and other data-management practitioners can do to help overcome these organizational and cultural obstacles, and promote a more helpful and collaborative approach to meeting the organization’s data and information needs:

  • Automate database development processes, developing tools and processes that shorten each development cycle, reduce errors and rework, and minimize the impact on the development team. In this way, DBAs can adapt to more iterative (agile) approaches to application development.
  • Develop, and promote the use of, abstracted and reusable data objects that free applications from being tightly coupled to database schemas; the so-called object-relational impedance mismatch. A number of mechanisms exist for doing this, including database views, triggers, functions and stored procedures, application data objects and data-access layers, XML and XSLT, ADO.NET typed datasets, and web services. The DBA should be familiar with all available means of virtualizing data and be able to recommend the best approach for any situation. The end goal is to make using the database as quick, easy, and painless as possible.
  • Promote database standards and best practices as requirements, but be flexible enough to deviate from them if given acceptable reasons for these deviations. Database standards should never be a threat to the success of a project.
  • Link database standards to various levels of support in the SLA. For example, the SLA can reflect DBA-recommended and developer-accepted methods of ensuring data integrity and data security. The SLA should reflect the transfer of responsibility from the DBAs to the development team if the development team will be coding their own database update procedures or data access layer. This prevents an “all or nothing” approach to standards.
  • Establish project needs and support requirements up-front, to reduce misunderstandings about what the project team wants, and does not want, from the data group. Make sure that everyone is clear about what work the DBAs will, and won’t, be doing - the way in which the work will be done, the standards that will, or won’t, be followed, the timeline for the project, the number of hours and resources involved, and the level of support that will be required during development and after implementation. This will help forestall unpleasant surprises midway through the development process.
  • Communicate constantly with the project team, both during development and after implementation, to detect and resolve any issues as early as possible. This includes reviewing data access code, stored procedures, views, and database functions written by the development team. This will also help surface any problems with or misunderstandings about the database design.
  • Stay business-focused. The objective is meeting the business requirements and deriving the maximum business value from the project. It does not help to win the battles and lose the war.
  • Adopt a “can do” attitude and be as helpful as possible. If you are always telling people “no”, don’t be surprised when they choose to ignore you and find another path. Recognize that people need to do whatever they need to do, and if you don’t help them succeed, they may help you fail.
  • Accept any defeats and failures encountered during a project as “lessons learned”, and apply that to future projects. You do not have to win every battle. If problems arise from having done things wrong, you can always point to them later as reasons for doing things right in the future.
  • Communicate with people on their level and in their terms. It is better to talk with business people in terms of business needs and ROI, and with developers in terms of object-orientation, loose coupling, and ease of development.
  • Concentrate on solving other people’s problems, not your own.

To sum up, we need to understand who our stakeholders are, and what their needs and concerns are. We need to develop a set of clear, concise, practical, business-focused standards for doing the best possible work in the best possible way. Moreover, we need to teach and implement those standards in a way that provides maximum value to our stakeholders, and earns their respect for us as facilitators, contributors, and solution providers.

Q2: How many DBAs does an organization need?

A2: The answer to this question varies by organization. There is no standard staffing rule of thumb. However, there may be a significant business cost to understaffing. An overworked DBA staff can make mistakes that cost much more in downtime and operational problems than might be saved in salary cost avoidance by minimizing the DBA staff. Many factors need to be considered when determining the optimal number of DBAs for the organization. These factors include:

  • The number of databases.
  • The size and complexity of the databases.
  • The number of DBMS platforms and environments.
  • The number of users.
  • The number of supported applications.
  • The type and complexity of applications.
  • Availability requirements.
  • The business risk and impact of downtime.
  • Performance requirements.
  • Service level agreements and related customer expectations.
  • The number of database change requests made.
  • DBA staff experience.
  • Software developer experience with databases.
  • End user experience.
  • The maturity of DBA tools.
  • The extent of DBA responsibilities for database logic (stored procedures, triggers, user-defined functions), integration, access interfaces, and information products.

Q3: What is an application DBA?

A3: An application DBA is responsible for one or more databases in all environments (development / test, QA, and production), as opposed to database systems administration for any of these environments. Sometimes, application DBAs report to the organizational units responsible for development and maintenance of the applications supported by their databases. There are pros and cons to staffing application DBAs. Application DBAs are viewed as integral members of an application support team, and by focusing on a specific database, they can provide better service to application developers. However, application DBAs can easily become isolated and lose sight of the organization’s overall data needs and common DBA practices. Constant collaboration between DBAs and data analysts, modelers, and architects is necessary to prevent DBA isolation and disengagement.

Q4: What is a procedural DBA?

A4: A procedural DBA specializes in development and support of procedural logic controlled and execute by the DBMS: stored procedures, triggers, and user defined functions (UDFs). The procedural DBA ensures this procedural logic is planned, implemented, tested, and shared (reused). Procedural DBAs lead the review and administration of procedural database objects.

6.4 Recommended Reading

The references listed below provide additional reading that support the material presented in Chapter 6. These recommended readings are also included in the Bibliography at the end of the Guide.

Dunham, Jeff. Database Performance Tuning Handbook. McGraw-Hill, 1998. ISBN 0-07-018244-2.

Hackathorn, Richard D. Enterprise Database Connectivity. Wiley Professional Computing, 1993. ISBN 0-4761-57802-9. 352 pages.

Hoffer, Jeffrey, Mary Prescott, and Fred McFadden. Modern Database Management, 7th Edition. Prentice Hall, 2004. ISBN 0-131-45320-3. 736 pages.

Kepner, Charles H. and Benjamin B. Tregoe. The New Rational Manager. Princeton Research Press, 1981. 224 pages.

Kroenke, D. M. Database Processing: Fundamentals, Design, and Implementation, 10th Edition. Pearson Prentice Hall, 2005. ISBN 0-131-67626-3. 696 pages.

Martin, James. Information Engineering Book II: Planning and Analysis. Prentice-Hall, Inc., 1990. Englewoood Cliffs, New Jersey.

Mattison, Rob. Understanding Database Management Systems, 2nd Edition. McGraw-Hill, 1998. ISBN 0-07-049999-3. 665 pages.

Mullins, Craig S. Database Administration: The Complete Guide to Practices and Procedures. Addison-Wesley, 2002. ISBN 0-201-74129-6. 736 pages.

Parsaye, Kamran and Mark Chignell. Intelligent Database Tools and Applications: Hyperinformation Access, Data Quality, Visualization, Automatic Discovery. John Wiley & Sons, 1993. ISBN 0-471-57066-4. 560 pages.

Pascal, Fabian, Practical Issues In Database Management: A Reference For The Thinking Practitioner. Addison-Wesley, 2000. ISBN 0-201-48555-9. 288 pages.

Piedad, Floyd, and Michael Hawkins. High Availability: Design, Techniques and Processes. Prentice Hall, 2001. ISBN 0-13-096288-0.

Rob, Peter, and Carlos Coronel. Database Systems: Design, Implementation, and Management, 7th Edition. Course Technology, 2006. ISBN 1-418-83593-5. 688 pages.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset