Chapter 9
Architectural Refactorings

Architectural refactorings are changes that improve the overall manner in which external programs interact with a database. The architectural refactorings are as follows:

• Add CRUD Methods

• Add Mirror Table

• Add Read Method

• Encapsulate Table With View

• Introduce Calculation Method

• Introduce Index

• Introduce Read-Only Table

• Migrate Method From Database

• Migrate Method To Database

• Replace Method(s) With View

• Replace View With Method(s)

• Use Official Data Source

Add CRUD Methods

Introduce four stored procedures (methods) to implement the creation, retrieval, update, and deletion (CRUD) of the data representing a business entity. After these stored procedures are in place, access to the source tables is restricted to just these procedures and administrators.

Motivation

There are several reasons to apply Add CRUD Methods:

Encapsulate data access. Stored procedures are a common way to encapsulate access to your database, albeit not as effective as persistence frameworks (see Chapter 1, “Evolutionary Database Development”).

Decouple application code from database tables. Stored procedures are an effective way to decouple applications from database tables. They enable you to change database tables without changing application code.

Implement entity-based security access control (SAC). Instead of directly accessing source tables, applications instead invoke the relevant stored procedures. CRUD stored procedures provide a way to implement an entity-level approach to SAC. Database products typically enable data SAC at the table, column, and sometimes row levels. But, when the data for complex business entities such as Customer are stored in several tables, SAC, at the entity level, can become quite complicated if you restrict yourself to data-oriented SAC. Luckily, most database products also implement method SAC, and thus you can implement entity-level SAC via CRUD methods.

Potential Tradeoffs

The primary advantage of encapsulating database access in this manner is that it makes it easier to refactor your table schema. When you implement common schema refactorings such as Rename Table (page 113), Move Column (page 103), and Split Column (page 140), you should only have to refactor the corresponding CRUD methods that access them (assuming that no other external program directly accesses the table schema).

Unfortunately, this approach to database encapsulation comes at a cost. Methods (stored procedures, stored functions, and triggers) are specific to database vendors—Oracle methods are not easily portable to Sybase, for example. Furthermore, there is no guarantee that the methods that work in one version of a database will be easy to port to a newer version, so you may be increasing your upgrade burden, too. Another issue is lack of flexibility. What happens if you want to access a portion of a business entity? Do you really want to work with all the entity’s data each time or perhaps introduce CRUD methods for that subentity? What happens if you need data that crosses entities, perhaps for a report? Do you invoke several stored procedures to read each business entity that you require and then select the data you require, or do you apply Add Read Method (page 240) to retrieve the data specifically required by the report?

Schema Update Mechanics

To update the database schema, you must do the following:

1. Identify the business data. You need to identify the business entity, such as Customer of Figure 9.1, for which you want to implement the CRUD methods. After you know what entity you are encapsulating access to, you need to identify the source data within the database for that entity.

2. Write the stored procedures. You need to write at least four stored procedures, one to create the entity, one to read it based on its primary key, one to update the entity, and one to delete it. Additional stored procedures to retrieve the entity via means other than the primary key can be added by applying the Add Read Method (page 240) database refactoring.

3. Test the stored procedures. One of the best ways to work is to take a Test-Driven Development (TDD) approach; see Chapter 1, “Evolutionary Database Development.”

Figure 9.1. Adding Customer CRUD methods.

image

Figure 9.1 depicts how to introduce the CRUD methods for the Customer entity. The code for ReadCustomer is shown here. The code for the other stored procedures is self-explanatory:

CREATE OR REPLACE PACKAGE CustomerCRUD AS
    TYPE customerType IS REF CURSOR RETURN
    Customer%ROWTYPE;
    PROCEDURE ReadCustomer
      (readCustomerId IN NUMBER,customers OUT
      customerType);
    PROCEDURE CreateCustomer(....);
    PROCEDURE UpdateCustomer(....);
    PROCEDURE DeleteCustomer(....);
    END CustomerCRUD;
    /

    CREATE OR REPLACE PACKAGE BODY CustomerCRUD AS

    PROCEDURE ReadCustomer
      (readCustomerId IN NUMBER,customerReturn OUT
      customerType) IS
    BEGIN
    OPEN refCustomer FOR
      SELECT * FROM Customer WHERE CustomerID =
      readCustomerId;
    END ReadCustomer;

    END CustomerCRUD;
/


Data-Migration Mechanics

There is no data to migrate for this database refactoring.

Access Program Update Mechanics

When you introduce the CRUD methods, you need to identify all the places in external applications where the entity is currently used and then refactor that code to invoke the new stored procedures as appropriate. You may discover that the individual programs require different subsets of the data, or another way to look at it is that some programs require just a little bit more data than others. You may discover that you need to apply Add CRUD Methods for several entities simultaneously, potentially a major change.

The first code example shows how the Customer Java class originally submitted hard-coded SQL to the database to retrieve the appropriate data. The second code example shows how it would be refactored to invoke the stored procedure:

// Before code
stmt.prepare(
  "SELECT FirstName, Surname, PhoneNumber FROM
  Customer " +
    "WHERE CustomerId=?");
stmt.setLong(1,customerId);
stmt.execute();
ResultSet rs = stmt.executeQuery();

//After code
stmt = conn.prepareCall("begin ? := ReadCustomer(?); end;");
stmt.registerOutParameter(1, OracleTypes.CURSOR);
stmt.setLong(2, customerId);
stmt.execute();
ResultSet rs = stmt.getObject(1);


Add Mirror Table

Create a mirror table, an exact duplicate of an existing table in one database, in another database.

Motivation

There are several reasons why you may want to apply Add Mirror Table:

Improve query performance. Querying a given set of tables may be slow due to the database being in a remote location; therefore, a prepopulated table on a local server may improve overall performance.

Create redundant data. Many applications query data in real time from other databases. A table containing this data in your local database reduces your dependency on these other database(s), providing a buffer for when they go down or are taken down for maintenance.

Replace redundant reads. Several external programs, or stored procedures for that matter, often implement the same retrieval query. These queries to the remote database could be replaced by a mirror table on the local database that replicates with the remote database.

Potential Tradeoffs

The primary challenge when introducing a mirror table is “stale data.” Stale data occurs when one table is updated but not the other; remember, either the source or the mirror table could potentially be updated. This problem increases as more mirrors of Customer are created in other databases. As you see in Figure 9.2, you need to implement some sort of synchronization strategy.

Figure 9.2. Adding a Customer mirror table.

image

Schema Update Mechanics

As depicted in Figure 9.2, to update the database schema, when you perform Add Mirror Table you must do the following:

1. Determine the location. You must decide where the mirrored table, Customer, will reside—in this case, we will be mirroring it in DesktopDB.

2. Introduce the mirror table. Create DesktopDB.Customer in the other database using the CREATE TABLE command of SQL.

3. Determine synchronization strategy. The real-time approach of Figure 9.2 should be taken when your end users require up-to-date information, the synchronization of data.

4. Allow updates. If you want to allow updates to DesktopDB.Customer, you must provide a way to synchronize the data from DesktopDB.Customer to the Customer. An updatable DesktopDB.Customer is known as a peer-to-peer mirror; it is also known as a master/slave mirror.

The following code depicts the DDL to introduce the DesktopDB.Customer table:

CREATE TABLE Customer (
  CustomerID  NUMBER NOT NULL,
  Name  VARCHAR(40),
  PhoneNumber VARCHAR2(40),
  CONSTRAINT PKCustomer
    PRIMARY KEY (CustomerID)
);

COMMENT ON Customer 'Mirror table of Customer on Remote Location"


Data-Migration Mechanics

You must initially copy all the relevant source data into the mirror table, and then apply your synchronization strategy (real-time update or use database replication). There are several implementation strategies you can apply for your synchronization strategy:

1. Periodic refresh. Use a scheduled job that synchronizes your Customer and DesktopDB.Customer table. The job must be able to deal with data changes on both the tables and be able to update data both ways. Periodic refreshes are usually better when you are building data warehouses and data marts.

2. Database replication. Database products provide a feature where you can set up tables to be replicated both ways called multimaster replication. The database keeps both the tables synchronized. Generally speaking, you would use this approach when you want to have the Customer and DesktopDB.Customer updatable by the users. If your database product provides this feature, it is advisable to use this feature over using custom-coded solutions.

3. Use trigger-based synchronization. Create triggers on the Customer so that source data changes are propagated to the DesktopDB.Customer and create triggers on the DesktopDB.Customer so that changes to the table are propagated to Customer. This technique enables you to custom code the data synchronization, which is desirable when you have complex data objects that need to be synchronized; however, you must write all the triggers, which could be time consuming.

The following code depicts how to synchronize data in ProductionDB.Customer table and DesktopDB.Customer using triggers:

CREATE OR REPLACE TRIGGER
  UpdateCustomerMirror
  AFTER UPDATE OR INSERT OR DELETE
  ON Customer
  REFERENCING OLD AS OLD NEW AS NEW
  FOR EACH ROW
    DECLARE
      BEGIN
        IF DELETING THEN
          DeleteCustomerMirror;
        END IF;
        IF (UPDATING OR INSERTING) THEN
          UpdateCustomerMirror;
        END IF;
      END;
   END;
   /

CREATE OR REPLACE TRIGGER
  UpdateCustomer

  AFTER UPDATE OR INSERT OR DELETE
  ON CustomerMirror
  REFERENCING OLD AS OLD NEW AS NEW
  FOR EACH ROW
  DECLARE
  BEGIN
    IF DELETING THEN
      DeleteCustomer;
    END IF;
    IF (UPDATING OR INSERTING) THEN
      UpdateCustomer;
    END IF;
  END;
END;
/


Access Program Update Mechanics

You will have already analyzed why you are introducing this DesktopDB.Customer. For example, if you are going to mirror Customer on your remote database for easy access on the local database, you must make sure that the application connects to this local database when it wants to query the Customer table instead of the production database. You can also set up alternate connection properties that enable you to use DesktopDB or ProductionDB to query your data, as you see in the following code:

// Before code
stmt = remoteDB.prepare("select
CustomerID,Name,PhoneNumber FROM Customer WHERE
CustomerID = ?");
stmt.setLong(1, customerID);
stmt.execute();
ResultSet rs = stmt.executeQuery();

//After code
stmt = localDB.prepare("select
CustomerID,Name,PhoneNumber FROM Customer WHERE
CustomerID = ?");
stmt.setLong(1, customerID);
stmt.execute();
ResultSet rs = stmt.executeQuery();


Add Read Method

Introduce a method, in this case a stored procedure, to implement the retrieval of the data representing zero or more business entities from the database.

Motivation

The primary reason to apply Add Read Method is to encapsulate the logic required to retrieve specific data from the database in accordance with defined criteria. The method, often a stored procedure, might be required to replace one or more SELECT statements implemented in your application and/or reporting code. Another common motivation is to implement a consistent search strategy for a given business entity.

Sometimes this refactoring is applied to support the Introduce Soft Delete (page 222) or Introduce Hard Delete (page 219) refactorings. When either of these refactorings is applied, the way in which data is deleted changes—either you mark a row as deleted or you physically delete the data, respectively. A change in strategy such as this may require a multitude of changes to be implemented within your applications to ensure that retrievals are performed correctly. By encapsulating the data retrieval in a stored procedure, and then invoking that stored procedure appropriately, it is much easier to implement these two refactorings because the requisite retrieval logic only needs to be reworked in one place (the stored procedure).

Potential Tradeoffs

The primary advantage of encapsulating retrieval logic in this manner, often in combination with Add CRUD Methods (page 232), is that it makes it easier to refactor your table schema. Unfortunately, this approach to database encapsulation comes at a cost. Stored procedures are specific to database vendors, reducing your potential for portability, and may decrease your overall performance if they are written poorly.

Schema Update Mechanics

To update the database schema, you must do the following:

1. Identify the data. You need to identify the data you want to retrieve, which may come from several tables.

2. Identify the criteria. How do you want to specify the subset of data to retrieve? For example, would you like to be able to retrieve information for bank accounts whose balance is over a specific amount, or which have been opened at a specific branch, or which have been accessed during a specific period of time, or for a combination thereof. Note that the criteria may or may not involve the primary key.

3. Write and test the stored procedure. We are firm believers in writing a full, 100 percent regression test suite for your stored procedures. Better yet, we recommend a Test-Driven Development (TDD) approach where you write a test before writing a little bit of code within your stored procedures (Astels 2003; Beck 2003).

Figure 9.3 shows how to introduce a read stored procedure for the Customer entity that takes as parameters a first and last name. The code for the Read-Customer stored procedure is shown below. It is written assuming that it would get parameters passed to it such as S% and Ambler and return all the people with the surname Ambler whose first name begins with the letter S.

Figure 9.3. Adding a Customer read method.

image

PROCEDURE ReadCustomer
  (
  firstNameToFind IN VARCHAR,
  lastNameToFind IN VARCHAR,
  customerRecords OUT CustomerREF
) IS
BEGIN
OPEN refCustomer FOR

  SELECT * FROM Customer WHERE
  Customer.FirstName = firstNameToFind
  AND Customer.LastName = lastNameToFind;
END ReadCustomer;
/


Data-Migration Mechanics

There is no data to migrate for this database refactoring.

Access Program Update Mechanics

When you introduce the stored procedure, you need to identify all the places in external applications where the data is read and then refactor that code to invoke the new stored procedure appropriately. You may discover that the individual programs require different subsets of the data, and therefore you may need to apply Add Read Method several times, one for each major collection of data elements.

In the following code, applications would submit SELECT statements to the database to retrieve customer information; but after the refactoring, they just invoke the stored procedure:

// Before code
stmt.prepare(
  "SELECT * FROM Customer" +
    "WHERE Customer.FirstName=? AND
    Customer.LastName=?");
stmt.setString(1,firstNameToFind);
stmt.setString(2,lastNameToFind);
stmt.execute();
ResultSet rs = stmt.executeQuery();

//After code
stmt = conn.prepareCall("begin ? := ReadCustomer(?,?,?); end;");
stmt.registerOutParameter(1, OracleTypes.CURSOR);
stmt.setString(2, firstNameToFind);
stmt.setString(3,lastNameToFind);
stmt.execute();
ResultSet rs = stmt.getObject(1);
while (rs.next()) {
  getCustomerInformation(rs);
}


Encapsulate Table With View

Wrap access to an existing table with a view.

Motivation

There are two reasons to apply Encapsulate Table With View. First, to implement a façade around an existing table that you intend to later refactor. By writing applications to access source data through views instead of directly through tables, you reduce the direct coupling to the table, making it easier to refactor. For example, you may want to encapsulate access to a table via a view first before you apply refactorings such as Rename Column (page 109) or Drop Column (page 72).

Second, you may want to put a security access control (SAC) strategy in place for your database. You can do this by encapsulating access to a table via several views, and then restricting access to the table and the various views appropriately. Few users, if any, would have direct access to the source table; users would instead be granted access rights to one or more views. The first step of this process is to encapsulate access to the original table with a view, and then you introduce new views as appropriate.

Potential Tradeoffs

This refactoring, as shown in Figure 9.4, will only work when your database supports the same level of access through views as it does to tables. For example, if Customer is updated by an external program, your database must support updatable views. If it does not, you should consider the refactoring Add CRUD Method (page 232) to encapsulate access to this table instead.

Figure 9.4. Encapsulating the TCustomer table with a view.

image

Schema Update Mechanics

As you can see in Figure 9.4, this refactoring is straightforward. The first step is to rename the existing table, in this case to TCustomer. The second step is to introduce a view with the name of the original table, in this case Customer. From the point of view of the external programs that access the table, nothing has really changed. They simply access the source data through the view, which to them appears to be the original table.

One way to rename the Customer table is to use the RENAME TO clause of the ALTER TABLE SQL command, as shown below. If your database does not support this clause, you must create TCustomer, copy the data from Customer, and then drop the Customer table. An example of this approach is provided with the Rename Table refactoring (page 113). Regardless of how you rename the table, the next step is to add the view via the CREATE VIEW command:

ALTER TABLE Customer RENAME TO TCustomer;

CREATE VIEW Customer AS
  SELECT * FROM TCustomer;


Data-Migration Mechanics

There is no data to migrate for this database refactoring.

Access Program Update Mechanics

You should not have to update any access programs because the view is identical in structure to the original table.

Introduce Calculation Method

Introduce a new method, typically a stored function, that implements a calculation that uses data stored within the database.

Figure 9.5. Introducing a calculation method for Customer.

image

Motivation

There are several reasons to apply Introduce Calculation Method:

To improve application performance. You can improve overall performance by implementing a calculation that requires significant amounts of data within the database and then responding with just the answer. This avoids shipping the required data across the network to do the calculation.

To implement a common, reusable calculation. A calculation may be implemented by several existing methods. Therefore, it makes sense to extract that calculation into a single database method that is invoked by the others.

To support Introduce Calculated Column (page 81). You can implement the logic required to calculate the value of the column in a stored function.

To replace a calculated column. You may choose to replace a calculated column with a stored procedure that you invoke instead. To remove the column, apply the Drop Column (page 72) refactoring.

Potential Tradeoffs

When too much logic is implemented within your database, it can become a performance bottleneck if the database has not been architected to handle the load. You need to either ensure that your database is scalable or limit the amount of functionality that you implement in it.

Schema Update Mechanics

To update the database schema, you must write and then test the stored procedure. If existing stored procedures implement this logic, they should be refactored to invoke this stored procedure. The following code shows how to introduce the stored function GetCustomerAccountTotal to the CustomerDB database:

CREATE OR REPLACE FUNCTION getCustomerAccountTotal
(CustomerID IN NUMBER) RETURN NUMBER IS
customerTotal NUMBER;
  BEGIN
    SELECT SUM(Amount) INTO customerTotal FROM Policy
      WHERE PolicyCustomerId=CustomerID;
        RETURN customerTotal;
      EXCEPTION WHEN no_data_found THEN
        RETURN 0;
  END;
END;
/


Data-Migration Mechanics

There is no data to migrate for this database refactoring.

Access Program Update Mechanics

When you introduce the calculation method, you need to identify all the places in external applications where this calculation is currently implemented and then refactor the code to appropriately invoke the new stored procedure. The following code shows how the calculation used to be performed as a Java operation, but afterward the operation simply invokes the appropriate stored function:

//Before code
private BigDecimal getCustomerTotalBalance() {
BigDecimal customerTotalBalance = new BigDecimal(0);

for (Iterator iterator =
customer.getPolicies().iterator(); iterator.hasNext();)
{
  Policy policy = (Policy) iterator.next();
  customerTotalBalance.add(policy.getBalance());
}
return customerTotalBalance;
}

//After code
private BigDecimal getCustomerTotalBalance() {
  BigDecimal customerTotalBalance = new BigDecimal(0);
  stmt = connection.prepareCall("{call
  getCustomerAccountTotal(?)}");
  stmt.registerOutParameter(1, Types.NUMBER);
  stmt.setString(1, customer.getId);
  stmt.execute();
  customerTotalBalance = stmt.getBigDecimal(1);
  return customerTotalBalance;
}


You may discover that the calculation is implemented in slightly different ways in various applications; perhaps the business rule has changed over time but the application was not updated, or perhaps there is a valid reason for the differences. Regardless, you need to negotiate any changes with the appropriate project stakeholders.

Introduce Index

Introduce a new index of either unique or nonunique type.

Motivation

The reason why you want to introduce an index to a table is to increase query performance on your database reads. You may also need to introduce an index to create a primary key for a table, or to support a foreign key to another table, when applying Consolidate Key Strategy (page 168).

Potential Tradeoffs

Too many indexes on a table will degrade performance when you update, insert into, or delete from the table. Many times you may want to introduce a unique index but will not be able to because the existing data contains duplicate values, forcing you to remove the duplicates before applying this refactoring.

Schema Update Mechanics

Applying Introduce Index can be complicated because of data dependencies, potentially requiring updates to both data and application code. You need to do the following:

1. Determine type of index. You have to decide whether you need a unique or nonunique index, which is usually determined by the business rules surrounding the data attributes and your usage of the data. For example, in the United States, individuals are assigned unique Social Security Numbers (SSNs). Within most companies, customers are assigned unique customer numbers. But, your telephone number may not be unique (several people may share a single number). Therefore, both SSN and CustomerNumber would potentially be valid columns to define a unique index for, but TelephoneNumber would likely require a nonunique index.

2. Add a new index. In the example of Figure 9.6, a new index based on SocialSecurityNumber needs to be introduced for Customer, using the CREATE INDEX command of SQL.

3. Provide more disk space. When you use the Introduce Index refactoring, you must plan for more disk space usage, so you may need to allocate more disk space.

Figure 9.6. Introducing an index for the Customer table.

image

Data-Migration Mechanics

There is no data to be migrated per se, although the value of Customer.SocialSecurityNumber must be checked for duplicate values if you are introducing a unique index. Duplicate values must be addressed by either updating them or by deciding to use a nonunique index. The following code determines whether we have duplicate values in the Customer.SocialSecurityNumber column:

––Find any duplicate values in customer.socialsecuritynumber.
SELECT SocialSecurityNumber,COUNT(SocialSecurityNumber)
  FROM Customer
  GROUP BY SocialSecurityNumber
  HAVING COUNT(SocialSecurityNumber)>1;


If you find any duplicate values, you will have to change them before you can apply Introduce Index. The following code shows a way to do so: First, you use the Customer.CustomerID as an anchor to find duplicate rows, and then you replace the duplicate values by apply the Update Data refactoring (page 310):

––Create a temp table of the duplicates
CREATE TABLE temp_customer
AS
  SELECT * FROM Customer parent
  WHERE CustomerID != (SELECT MAX(CustomerID)
    FROM Customer child
    WHERE parent.SocialSecurityNumber =
    child.SocialSecurityNumber);

––Now create the Unique Index
CREATE UNIQUE INDEX Customer_Name_UNQ
  ON Customer(SocialSecurityNumber)
;


Access Program Update Mechanics

To update access programs, you need to first analyze the dependencies to determine which external programs access data in Customer. First, these external programs must be able to manage database exceptions that are thrown by the database in case they generate duplicate values. Second, you should change your queries to make use of this new index to get better performance of data retrieval. Some database products enable you to specify what index to use while retrieving rows from the database through the use of hints, an example of which follows:

SELECT /*+ INDEX(Customer_Name_UNQ) */
CustomerID,
SocialSecurityNumber
FROM Customer
WHERE
  SocialSecurityNumber = 'XXX-XXX-XXX';


Introduce Read-Only Table

Create a read-only data store based on existing tables in the database. There are two ways to implement this refactoring: as a table populated in real time or as a table populated in batch.

Motivation

There are several reasons why you may want to apply Introduce Read-Only Table:

Improve query performance. Querying a given set of tables may be very slow because of the requisite joins; therefore, a prepopulated table may improve overall performance.

Summarize data for reporting. Many reports require summary data, which can be prepopulated into a read-only table and then used many times over.

Create redundant data. Many applications query data in real time from other databases. A read-only table containing this data in your local database reduces your dependency on these other database(s), providing a buffer for when they go down or are taken down for maintenance.

Replace redundant reads. Several external programs, or stored procedures for that matter, often implement the same retrieval query. These queries can be replaced by a common read-only table or a new view; see Introduce View (page 306).

Data security. A read-only table enables end users to query the data but not update it.

Improve database readability. If you have a highly normalized database, it is usually difficult for users to navigate through all the tables to get to the required information. By introducing read-only tables that capture common, denormalized data structures, you make your database schema easier to understand because people can start by focusing just on the denormalized tables.

Potential Tradeoffs

The primary challenge with introducing a read-only table is “stale data,” data that does not represent the current state of the source where it was populated from. For example, you could pull data from a remote system to populate a local table, and immediately after doing so, the source data gets updated. The users of the read-only table need to understand both the timeliness of the copied data as well as the volatility of the source data to determine whether the read-only table is acceptable to them.

Schema Update Mechanics

As depicted in Figure 9.7, to update the database schema when you perform Introduce Read-Only Table, you must first introduce the new table. You must decide on the structure of the read-only table that you want to introduce and then can create it using the CREATE TABLE command. Next, determine a population strategy. A real-time approach should be taken when your end users require up-to-date information; the join to create the read-only table is relatively simple, and the data can therefore be safely updated; and the resulting table is small, thus allowing for adequate performance. A batch approach should be taken when the real-time approach is not appropriate.

Figure 9.7. Introducing the read-only table CustomerPortfolio.

image

Figure 9.7 shows an example where the CustomerPortfolio table is a read-only table based on the Customer, Account, and Insurance tables that summarizes the business that we do with each customer. This provides an easier way for end users to do ad-hoc queries that analyze the customer information. The following code depicts the DDL to introduce the CustomerPortfolio table:

CREATE TABLE CustomerPortfolio (
  CustomerID NUMBER NOT NULL,
  Name VARCHAR(40),
  PhoneNumber VARCHAR2(40),
  AccountsTotalBalance NUMBER,
  InsuranceTotalPayment NUMBER,
  InsuranceTotalValue NUMBER,
  CONSTRAINT PKCustomerPortfolio
    PRIMARY KEY (CustomerID)
  );


There are several implementation strategies. First, you could trust developers to do the right thing and simply mark the table as read-only via a comment. Second, you could implement triggers which throw an error on insertion, update, or deletion. Third, you could restrict access to the table via your security access control (SAC) strategy. Fourth, apply Encapsulate Table With View (page 243) and make the view read-only.

Data-Migration Mechanics

You must initially copy all the relevant source data into the read-only table, and then apply your population strategy (real-time update or periodic batch update). There are several implementation strategies you can apply for your population strategy:

1. Periodic refresh. Use a scheduled job that refreshes your read-only table. The job may refresh all the data in the read-only table or it may just update the changes since the last refresh. Note that the amount of time taken to refresh the data should be less than the scheduled interval time of the refresh. This technique is particularly suited for data warehouse kind of environments, where data is generally summarized and used the next day. Hence, stale data can be tolerated; also, this approach provides you with an easier way to synchronize the data.

2. Materialized views. Some database products provide a feature where a view is no longer just a query; instead, it is actually a table based on a query. The database keeps this materialized view current based on the options you choose when you create it. This technique enables you to use the database’s built-in features to refresh the data in the materialized view, with the major downside being the complexity of the view SQL. When the view SQL gets more complicated, the database products tend not to support automated synchronization of the view.

3. Use trigger-based synchronization. Create triggers on the source tables so that source data changes are propagated to the read-only table. This technique enables you to custom code the data synchronization, which is desirable when you have complex data objects that need to be synchronized; however, you must write all of the triggers, which could be time consuming.

4. Use real-time application updates. You can change your application so that it updates the read-only table, making the data current. This can only work when you know all the applications that are writing data to your source database tables. This technique allows for the application to update the read-only table, and hence its always kept current, and you can make sure that the data is not used by the application. The downside of the technique is you must write your information twice, first to the original table and second to the denormalized read-only table; this could lead to duplication and hence bugs.

The following code depicts how to populate data in CustomerPortfolio table for the first time:

INSERT INTO CustomerPortfolio
  SELECT Customer.CustomerID,
    Customer.Name,
    Customer.PhoneNumber,
    SUM(Account.Balance),
    SUM(Insurance.Payment),
    SUM(Insurance.Value)
  FROM
    Customer,Account,Insurance
  WHERE
    Customer.CustomerID=Account.CustomerID
    AND Customer.CustomerID=Insurance.CustomerID
  GROUP BY
    Customer.CustomerID,
    Customer.Name,
    Customer.PhoneNumber;


The following code depicts how to synchronize data from the source tables of Customer, Account, and Insurance. Because the source tables are updatable but CustomerPortfolio is not, we only need one-way updates:

CREATE OR REPLACE TRIGGER
UpdateCustomerPortfolioCustomer
  AFTER UPDATE OR INSERT OR DELETE
  ON Customer
  REFERENCING OLD AS OLD NEW AS NEW
  FOR EACH ROW
  DECLARE
  BEGIN
    IF DELETING THEN
      DeleteCustomerPortfolio;
    END IF;
    IF (UPDATING OR INSERTING) THEN
      UpdateCustomerPortfolio;
    END IF;
  END;
END;
/

CREATE OR REPLACE TRIGGER
UpdateCustomerPortfolioAccount
  AFTER UPDATE OR INSERT OR DELETE
  ON Account
  REFERENCING OLD AS OLD NEW AS NEW
  FOR EACH ROW
  DECLARE
  BEGIN
    UpdateCustomerPortfolioForAccount;
  END;
END;
/

CREATE OR REPLACE TRIGGER
UpdateCustomerPortfolioInsurance
  AFTER UPDATE OR INSERT OR DELETE
  ON Insurance
  REFERENCING OLD AS OLD NEW AS NEW
  FOR EACH ROW
  DECLARE
  BEGIN
    UpdateCustomerPortfolioForInsurance;
  END;
END;
/


Access Program Update Mechanics

You will have already analyzed why you are introducing this read-only table. For example, in Figure 9.7, if you are going to use CustomerPortfolio inside the application, you must make sure that the application uses this for read-only purposes. If you are providing CustomerPortfolio as a summarized table for integration to some other application, you must keep the data updated in CustomerPortfolio. When you decide to use data from CustomerPortfolio, you must change all the places where you currently access the source tables and rework them to use CustomerPortfolio instead.

You may also want to show your end users how recent the data is by displaying the value of the CustomerPortfolio.LastUpdatedTime column. Furthermore, if batch reports require up-to-date data, the batch job will need to be reworked to first refresh CustomerPortfolio and then run the reports:

// Before code
stmt.prepare(
"SELECT Customer.CustomerId, " +
"  Customer.Name, " +
"  Customer.PhoneNumber, " +
"  SUM(Account.Balance) AccountsTotalBalance, " +
"  SUM(Insurance.Payment) InsuranceTotalPayment, " +
"  SUM(Insurance.Value) InsuranceTotalPayment " +
"FROM Customer, Account, Insurance " +
"WHERE " +
"  Customer.CustomerId = Account.CustomerId " +
"  AND Customer.CustomerId = Insurance.CustomerId " +
"  AND Customer.CustomerId = ?");
stmt.setLong(1,customer.getCustomerID);
stmt.execute();
ResultSet rs = stmt.executeQuery();

// After code
stmt.prepare(
"SELECT CustomerId, " +
"  Name, " +
"  PhoneNumber, " +
"  AccountsTotalBalance, " +
"  InsuranceTotalPayment, "  +
"  InsuranceTotalPayment " +
"FROM CustomerPortfolio " +
"WHERE CustomerId = ?");
stmt.setLong(1,customer.getCustomerID);
stmt.execute();
ResultSet rs = stmt.executeQuery();


Migrate Method From Database

Rehost an existing database method (a stored procedure, stored function, or trigger) in the application(s) that currently invoke it.

Motivation

There are four reasons why you may want to apply Migrate Method From Database:

To support variability. When the method was originally implemented in the database, the business logic was consistent, or at least was thought to be consistent, between applications. Over time, the application requirements evolved, and now different logic is required in the individual applications.

To increase portability. Database method languages are specific to individual database vendors. By implementing logic outside of the database, perhaps in a shared component or Web service, you will make it easy to port to a new database.

To increase scalability. Although it is possible to scale databases through the use of grid technology, the fact still remains that you can also scale via other means such through the addition of additional application servers. Your enterprise architectural strategy may be to scale your systems via nondatabase means; and if the method is proving to be a bottleneck, you will want to migrate it out of the database.

To increase maintainability. The development tools for leading programming languages such as Java, C#, and Visual Basic are often much more sophisticated than the tools for database programming languages. Better tools—and, of course, better development practices—will make your code easier and therefore less expensive to maintain. Also, by programming in a single language, you reduce the programming skill requirements for project team members—it is easier to find someone with C# experience than it is to find someone with C# and Oracle PLSQL experience.

Potential Tradeoffs

There are several trade-offs associated with this refactoring:

Reduced reusability. Relational databases are a lowest common denominator technology. Virtually any application technology can access them, and therefore implementing shared logic within a relational database offers the greatest chance for reusability.

Performance degradation. There is the potential for decreased performance, particularly if the method accesses significant amounts of data, which would need to get transmitted to the method before it could get processed.

Schema Update Mechanics

As depicted in Figure 9.8, the change to the database schema is straightforward—you just mark the method to be removed, and then after the transition period has ended, you physically remove the method from the database. In this example, the GetValidPolicies stored procedure was migrated from CustomerDB to the policies administration application. This was done for several reasons: Only one application required that business logic, the network traffic remained the same regardless of where the functionality was performed, and the project team wanted to improve the maintainability of their application. It is interesting to note that the name of the method changed slightly to reflect the naming conventions of the programming language.

Figure 9.8. Migrating policy retrieval code into the policy administration application.

image

Data-Migration Mechanics

There is no data to migrate with this database refactoring.

Access Program Update Mechanics

The method must be developed and tested within the appropriate application(s). You need to search your program code for invocations of the existing method, and then replace that code with the corresponding function call. The following code example shows how application code will be developed to replace functionality provided by the method:

//Before code
stmt.prepareCall("begin ? := getValidPolicies(); end;");
stmt.registerOutParameter(1, OracleTypes.CURSOR);
stmt.execute();
ResultSet rs = stmt.getObject(1);
List validPolicies = new ArrayList();
Policy policy = new Policy();
while (rs.next()) {
  policy.setPolicyId(rs.getLong(1));
  policy.setCustomerID(rs.getLong(2));
  policy.setActivationDate(rs.getDate(3));
  policy.setExpirationDate(rs.getDate(4));
  validPolicies.add(policy);
}
return validPolicies;

//After code
stmt.prepare(
  "SELECT PolicyId, CustomerId, "+
  "ActivationDate, ExpirationDate " +
  "FROM Policy" +
  "WHERE ActivationDate < TRUNC(SYSDATE) AND "+
  "ExpirationDate IS NOT NULL AND "+
  "ExpirationDate > TRUNC(SYSDATE)");
ResultSet rs = stmt.execute();
List validPolicies = new ArrayList();
Policy policy = new Policy();
while (rs.next()) {

  policy.setPolicyId(rs.getLong("PolicyId"));
  policy.setCustomerID(rs.getLong("CustomerId"));
  policy.setActivationDate(rs.getDate
  ("ActivationDate"));
  policy.setExpirationDate(rs.getDate
  ("ExpirationDate"));
  validPolicies.add(policy);
}
return validPolicies;


Migrate Method To Database

Rehost existing application logic in the database.

Motivation

There are three reasons why you may want to apply Migrate Method To Database:

To support reuse. Virtually any application technology can access relational databases, and therefore implementing shared logic as database methods (stored procedures, functions, or triggers) within a relational database offers the greatest chance for reusability.

To increase scalability. It is possible to scale databases through the use of grid technology; therefore, your enterprise architecture may direct you to host all data-oriented logic within your database.

To improve performance. There is the potential for improved performance, particularly if the stored procedure will process significant amounts of data, because the processing will happen closer to the database, which results in a reduced result set being transmitted across the network.

Potential Tradeoffs

The primary drawback to this refactoring is the decreased portability of database methods, code that is specific to the individual database vendor. This problem is often overblown by purists—it is not common for organizations to switch database vendors, perhaps because the vendors have their existing client base locked in or perhaps because it simply is not worth the expense any more.

Schema Update Mechanics

As depicted in Figure 9.9, the change to the database schema is straightforward—you just develop and test the stored procedure. In this example, the logic is to identify a list of defaulted customers, those who are behind on their payments. This logic is implemented in two places, the Policy Administration and the Customer Credit applications. Each application named the operation differently, IdentifyDefaultedCustomers() and ListCustomersThatDefaulted(), respectively, although both are replaced by CustomerDB.GetDefaultedCustomers().

Figure 9.9. Migrating application logic to the database.

image

Data-Migration Mechanics

There is no data to migrate with this database refactoring.

Access Program Update Mechanics

The method logic must be removed from any application implementing it, and the stored procedure invoked instead. The easiest way to accomplish this is to simply invoke the stored procedure from the existing application method(s). You will likely discover that individual applications have named this method differently, or they may have even designed it differently. For example, one application may implement the business logic as a single large function, whereas another one may implement it as several, smaller functions.

Similarly, you may discover that different applications implement the method in different ways, either because the code no longer reflects the actual requirements (if it ever did) or because the individual applications had good reason to implement the logic differently. For example, the DetermineVacationDays(Year) operation provides a list of dates for which employees get paid vacations. This operation is implemented in several applications, but upon examination of the code, it is implemented in different manners. The various versions were written at different times, the rules have changed since some of the older versions were written, and the rules vary by country, state, and sometimes even city. At this point, you would either need to decide to leave the various versions alone (for example, live with the differences), fix the existing application methods, or write a stored procedure(s) in the database that implements the correct version of the logic.

The following code sample shows how the application logic was moved to the database and how the application now uses the stored procedure to get a list of defaulted customers. The example does not show the code for the creation of the stored procedure:

//Before code
stmt.prepare(
  "SELECT CustomerId, " +
  "PaymentAmount" +
  "FROM Transactions" +
  "WHERE LastPaymentdate < TRUNC(SYSDATE-90) AND " +
  "PaymentAmount > 30 ");
ResultSet rs = stmt.execute();
List defaulters = new ArrayList();
DefaultedCustomer defaulted  = new DefaultedCustomer();
while (rs.next()) {
  defaulted.setCustomerID(rs.getLong("CustomerId"));
  defaulted.setAmount(rs.getBigDecimal
  ("PaymentAmount"));
  defaulters.add(defaulted);
}
return defaulters;

//After code
stmt.prepareCall("begin ? := getDefaultedCustomers(); end;");
stmt.registerOutParameter(1, OracleTypes.CURSOR);

stmt.execute();
ResultSet rs = stmt.getObject(1);
List defaulters = new ArrayList();
DefaultedCustomer defaulted  = new DefaultedCustomer();
while (rs.next()) {
  defaulted.setCustomerID(rs.getLong(1));
  defaulted.setAmount(rs.getBigDecimal(2));
  defaulters.add(defaulted);
}
return defaulters;


Replace Method(s) With View

Create a view based on one or more existing database methods (stored procedures, stored functions, or triggers) within the database.

Motivation

There are three basic reasons why you may want to apply Replace Method(s) With View:

Ease of use. You may have adopted new tools, in particular reporting tools, that work with views much easier than with methods.

Reduce maintenance. Many people find view definitions easier to maintain than methods.

Increase portability. You can increase the portability of your database schema. Database method languages are specific to individual database vendors, whereas view definitions can be written so that they are SQL standards compliant.

Potential Tradeoffs

This refactoring can typically be applied only to relatively simple methods that implement logic that could also be implemented by a view definition. The implication is that this refactoring limits your architectural flexibility. Furthermore, if your database does not support updatable views, you are limited to replacing only retrieval-oriented methods. Performance and scalability are rarely impacted because all the work still occurs within the database.

Schema Update Mechanics

To update the database schema, you must first introduce the view via the Introduce View refactoring (page 306). You must then mark the method for removal, and then eventually remove it after the transition period expires via the DROP PROCEDURE command. Figure 9.10 depicts an example where the GetCustomerAccountList stored procedure is replaced with the CustomerAccountList view. The following code depicts the DDL to do so:

CREATE VIEW CustomerAccountList (
  CustomerID     NUMBER NOT NULL,
  CustomerName   VARCHAR(40),
  CustomerPhone  VARCHAR2(40),
  AccountNumber  VARCHAR(14),
  AccountBalance NUMBER,
) AS
SELECT
  Customer.CustomerID,
  Customer.Name,
  Customer.PhoneNumber,
  Account.AccountNumber,
  Account.Balance
FROM Customer,Account
WHERE Customer.CustomerID=Account.CustomerID
;

-- Run this code after June 14 2007
DROP PROCEDURE GetCustomerAccountList


Figure 9.10. Introducing the CustomerAccountList view.

image

Data-Migration Mechanics

There is no data to migrate with this database refactoring.

Access Program Update Mechanics

You need to refactor the access programs to work through the view instead of invoking the stored procedure. The error codes thrown by the database will change when using a view, so you may need to rework your error handling code. The following code shows how an application would be changed to call the view rather than the stored procedure:

//Before code
stmt.prepareCall("begin ? := getCustomerAccountList(?); end;");
stmt.registerOutParameter(1, OracleTypes.CURSOR);
stmt.setInt(1,customerId);
stmt.execute();
ResultSet rs = stmt.getObject(1);
List customerAccounts = new ArrayList();
while (rs.next()) {
  customerAccounts.add(populateAccount(rs));
}
return customerAccounts;

//After code
stmt.prepare(
  "SELECT CustomerID, CustomerName, "+
  "CustomerPhone, AccountNumber, AccountBalance " +
  "FROM CustomerAccountList " +
  "WHERE CustomerId = ? ");
stmt.setLong(1,customerId);
ResultSet rs = stmt.executeQuery();
List customerAccounts = new ArrayList();
while (rs.next()) {
  customerAccounts.add(populateAccount(rs));
}
return customerAccounts;


Replace View With Method(s)

Replace an existing view with one or more existing methods (stored procedures, stored functions, or triggers) within the database.

Motivation

There are two fundamental reasons why you may want to apply Replace View With Method(s). First, it is possible to implement more complex logic with methods than with views, so this refactoring would be the first step in that direction. Second, methods can update data tables. Some databases do not support updatable views, or if they do, it is often limited to a single table. By moving from a view-based encapsulation strategy to a method-based one, your database architecture becomes more flexible.

Potential Tradeoffs

There are several potential problems with this refactoring. First, reporting tools usually do not work well with methods, but they do with views. Second, there is a potential decrease in portability because database method languages are specific to individual database vendors. Third, maintainability may decrease because many people prefer to work with views rather than methods (and vice versa). Luckily, performance and scalability are rarely impacted because all the work still occurs within the database.

Schema Update Mechanics

To update the database schema, you must first introduce the appropriate method(s) required to replace the view. You must then mark the view for removal, and then eventually remove it after the transition period expires using the Drop View (page 79) refactoring.

Figure 9.11 depicts an example where the CustomerTransactionsHistory view is replaced with the GetCustomerTransactions stored procedure. The following code depicts the code to introduce the method and to drop the view:

CREATE OR REPLACE PROCEDURE GetCustomerTransactions
{
  P_CustomerID IN NUMBER
  P_Start IN DATE
  P_End IN DATE
}
IS
BEGIN
  SELECT *
  FROM Transaction
  WHERE Transaction.CustomerID = P_CustomerID
  AND Transaction.PostingDate BETWEEN P_Start AND P_End;
END;

––On March 14 2007
DROP VIEW CustomerTransactionsHistory;


Figure 9.11. Introducing the GetCustomerTransactions stored procedure.

image

Data-Migration Mechanics

There is no data to migrate with this database refactoring.

Access Program Update Mechanics

You need to refactor the access programs to work through the method(s) rather than the view. The error codes thrown by the database will change when using methods, so you may need to rework your error handling code.

When you replace the CustomerTransactionsHistory view with the GetCustomerTransactions stored procedure view, your code needs to change as shown here:

//Before code
stmt.prepare(
  "SELECT * " +
  "FROM CustomerTransactionsHistory " +
  "WHERE CustomerId = ? "+
  "AND TransactionDate BETWEEN ? AND ? ");
stmt.setLong(1,customerId);
stmt.setDate(2,startDate);
stmt.setDate(3,endDate);
ResultSet rs = stmt.executeQuery();
List customerTransactions = new ArrayList();
while (rs.next()) {
  customerTransactions.add(populateTransactions(rs));
}
return customerTransactions;

//After code
stmt.prepareCall("begin ? :=
getCustomerTransactions(?,?,?); end;");
stmt.registerOutParameter(1, OracleTypes.CURSOR);
stmt.setInt(1,customerId);
stmt.setDate(2,startDate);
stmt.setDate(3,endDate);
stmt.execute();
ResultSet rs = stmt.getObject(1);
List customerTransactions = new ArrayList();
while (rs.next()) {
  customerTransactions.add(populateAccount(rs));
}
return customerTransactions;


Use Official Data Source

Use the official data source for a given entity, instead of the current one which you are using.

Motivation

The primary reason to apply Use Official Data Source is to use the correct version of the data for a given table or tables. When the same data is stored in several places, you run the risk of inconsistency and lack of availability. For example, assume that you have multiple databases in your enterprise, one of which, the CRM database, is the official source of customer information.

If your application is using its own Customer table, you may not be working with all the customer data available within your organization. Worse yet, your application could record a new customer but their information is not made available to the CRM database, and therefore it is not available to the other applications within your organization.

Potential Tradeoffs

The primary tradeoff is the cost of refactoring all the references to the table in your local database to now use the official database tables. If the key strategy of the official data source is different from what your table is using, you may want to first apply Consolidate Key Strategy (page 168) so that the identifiers used in your database reflect that of the official data source. Furthermore, the semantics and timeliness of the official source data may not be identical to that of the data you are currently using. You may need to apply refactorings such as Apply Standard Codes (page 157), Apply Standard Type (page 162), and Introduce Common Format (page 183) to convert your existing data over to something that conforms to the official data source.

Schema Update Mechanics

To update the schema, you must do the following:

1. Identify the official data source. You have to identify the official data source, which will often be in an external database that is outside the scope of your control. Worse yet, there could be several “official sources,” which you either need to pick from or consolidate appropriately for your situation. You need to negotiate agreement with both your project stakeholder(s) and the owners of the data source that your application should be accessing.

2. Choose an implementation strategy. You have two choices: to either rework your application(s) to directly access the official data source (the strategy depicted in Figure 9.12) or to replicate the source data with your existing table (the strategy depicted in Figure 9.13). When the official data source is in another database, the replication strategy scales better than the direct access strategy because your application will potentially be only coupled to a single database.

3. Direct access strategy. You change your application such that the application now directly accesses CustomerDatabase. Note that this may require a different database connection than what is currently used, or the invocation of a web service or transaction. Following this strategy, you must be able to handle transactions across different databases.

4. Replication strategy. You can set up replication between YourDatabase and CustomerDatabase so that you replicate all the tables that you require. If you want to have updatable tables in YourDatabase, you must use multimaster replication. Following this strategy, you should not have to change your application code as long as both the schema and the semantics of your source data remains the same. If changes are required to your schema, such as renaming tables or, worse yet, the semantics of the official data are different from that of your current data, you will likely require change to your external programs.

5. Remove tables that are not used anymore. When you choose to directly access the official source data, you no longer have any use for the tables in the YourDatabase. You should use Drop Table (page 77) refactoring to remove those tables.

Figure 9.12. Directly accessing the official customer data from the Customer database.

image

Figure 9.13. Using official customer data via a replication strategy.

image

Data-Migration Mechanics

There is no data to migrate for this refactoring if the data semantics of the official YourDatabase and the local YourDatabase are the same. If they are not the same, you need to either consider backing out of this refactoring, following a replication strategy where you convert the different values back and forth, or refactoring your application(s) to accept the data semantics of the official source. Unless the data is semantically similar, or you get very lucky and find a way to safely convert the data back and forth, it is unlikely that a replication strategy will work for you.

Access Program Update Mechanics

The way that you change external programs varies depending on your implementation strategy. If you use the direct access strategy, the programs must now connect to and work with the official data source, CustomerDatabase. With the replication strategy, you must create replication scripts between the appropriate tables in YourDatabase and CustomerDatabase. When the table(s) in YourDatabase match the table(s) in CustomerDatabase, the programs should not need to be changed. Otherwise, you either have to adopt the schema and semantics of the official data source, and then refactor your application(s) appropriately, or you need to figure out a way to implement the replication in such a way that the schema in YourDatabase is not changed. One way to do that is to implement scripts that convert the data back and forth, which is often difficult to accomplish in practice because the mappings are rarely one to one. Another way, equally difficult, is to replace the source table(s) in YourDatabase with the appropriate ones from CustomerDatabase, and then apply the Encapsulate Table With View refactoring (page 243) so that the existing applications can still work with “the old schema.” These new view(s) would need to implement the same conversion logic as the previously mentioned scripts.

The following code shows how to change your application to work with the official data source by using the crmDB connection instead:

// Before code
stmt = DB.prepare("select CustomerID,Name,PhoneNumber "+
  "FROM Customer WHERE CustomerID = ?");
stmt.setLong(1, customerID);
stmt.execute();
ResultSet rs = stmt.executeQuery();

// After code
stmt = crmDB.prepare("select CustomerID,Name,PhoneNumber "+
  "FROM Customer WHERE CustomerID = ?");
stmt.setLong(1, customerID);
stmt.execute();
ResultSet rs = stmt.executeQuery();


..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset