This chapter summarizes the code refactorings that we have found applicable to stored procedures, stored functions, and triggers. For the sake of simplicity, we refer to these three types of functionality simply as methods, the term that Martin Fowler used in Refactoring (Fowler 1999). As the name suggests, a method refactoring is a change to a stored procedure that improves its quality. Our goal is to present an overview of these refactorings. You can find detailed descriptions in Refactoring, the seminal work on the subject.
We differentiate between two categories of method refactorings, those that change the interface of the database offered to external programs and those that modify the internals of a method. For example, the refactoring Add Parameter (page 278) would change the interface of a method, whereas Consolidate Conditional Expression (page 283) does not. (It replaces a complex comparison with an invocation to a new method that implements just the comparison.) We distinguish between the two categories because refactorings that change the interface will require external programs to be refactored, whereas those that only modify the internals of a method do not.
Each of these refactorings includes a transition period to provide development teams sufficient time to update external programs that invoke the methods. Part of the refactoring is to rework the original version of the method(s) to invoke the new version(s) appropriately, and then after the transition period, the old version is invoked accordingly.
As the name suggests, this refactoring adds a new parameter to an existing method. In Figure 10.1, you see an example where MiddleName was added as a third parameter to ReadCustomer. The safest way to add a parameter is to add it to the end of parameter list; that way, when you refactor the invocations of the method, you do not run the risk of reordering the parameters improperly. If you later decide that the parameters appear in the wrong order, you can apply the Reorder Parameters refactoring (page 281).
Sometimes two or more methods do basically the same thing. For example, in Figure 10.2, the GetAmericanCustomers, GetCanadianCustomers, and GetBrazilianCustomers stored procedures produce a list of American, Canadian, and Brazilian customers, respectively. Although these separate procedures have the advantage of being specific, thereby ensuring that the right queries are being invoked on the database, they are more difficult to maintain than a single stored procedure. These stored procedures are replaced with the GetCustomerByCountry stored procedure, which takes a country identifier as a parameter. There is now less code to maintain as a result, and it is easy to add support for new countries just by adding new country codes.
Sometimes a method includes an extra parameter, perhaps because it was originally included in the belief that it would be needed in the future, or perhaps because the business changed and it is no longer required, or perhaps the parameter value can be obtained via another manner, such as reading it from a lookup table. In Figure 10.3, you see that GetAccountList has an extraneous AsOfDate parameter that is removed to reflect the actual usage of the stored procedure.
Sometimes a method is poorly named, or it simply does not follow your corporate naming conventions. For example, in Figure 10.4, the GetAccountList stored procedure is renamed to GetAccountsForCustomer to reflect your standard naming conventions.
It is common to discover that the ordering of the parameters of a method does not make sense, perhaps because new parameters have been added in the wrong order via Add Parameter (page 278) or simply because the ordering does not reflect the current business needs. Regardless, you can make a stored procedure easier to understand by ordering its parameters appropriately (see Figure 10.5).
As you see in Figure 10.6, the parameters of GetCustomers were reordered to reflect common business ordering conventions. Note that this is a highly risky example—you cannot have a transition period in this case because the types of all three parameters are identical, and therefore you cannot simply override the method. The implication is that you could reorder the parameters but forget to reorder them in external code that invokes the method—the method would still run, but with the wrong values for the parameters, a tricky defect to find without good test cases. Had the parameters been of different types, external applications that had not been refactored to pass the parameters in the new order would break, and it would be very easy to find the affected code as a result.
Sometimes you will have a single method doing one of several things, based on the value of a parameter being passed to it. For example, in Figure 10.7, you see that GetAccountValue, which is basically a generic getter that returns the value of a column within a row, is replaced with the more specific GetAccountBalance, GetAccountCustomerID, and GetAccountOpeningDate stored procedures.
Internal refactorings improve the quality of the implementation of a method without changing its interface.
Within the logic of a method, you often have a series of conditional tests that produce the same result. You should refactor your code to combine those conditionals to make your code clearer, as you see in the following example, in which the three conditionals are combined into one. This refactoring also sets you up to apply Extract Method (page 285) to create a method that implements the conditional:
-- Before code
CREATE OR REPLACE FUNCTION GetAccountAverageBalance
( inAccountID IN NUMBER)
RETURN NUMBER;
AS
averageBalance := 0;
BEGIN
IF inAccountID = 10000 THEN
RETURN 0;
END IF;
IF inAccountID = 123456 THEN
RETURN 0;
END IF;
IF inAccountID = 987654 THEN
RETURN 0;
END IF;
-- Code to calculate the average balance
RETURN averageBalance;
END;
-- After code
CREATE OR REPLACE FUNCTION GetAccountAverageBalance
( inAccountID IN NUMBER)
RETURN NUMBER;
AS
averageBalance := 0;
BEGIN
IF inAccountID = 10000 || inAccountID = 123456 || inAccountID = 987654 THEN
RETURN 0;
END IF;
-- Code to calculate the average balance
RETURN averageBalance;
END;
You have a conditional IF-THEN-ELSE expression that is complex and therefore difficult to understand. Replace the condition, then part, and else part with invocations to methods that implement the logic of the expression. In the first version of the following code, the logic to determine which interest rate to apply is determined based on whether the value of inBalance is below, within, or over a certain threshold. In the second version, the code is simplified by the addition of calls to BalanceIsSufficient, CalculateLowInterest, and CalculateHighInterest. (These functions are not shown.)
-- Before code
CREATE OR REPLACE FUNCTION CalculateInterest
( inBalance IN NUMBER )
RETURN NUMBER;
AS
lowBalance NUMBER;
highBalance NUMBER;
lowInterestRate NUMBER;
highInterestRate NUMBER;
BEGIN
lowBalance := GetLowBalance();
highBalance := GetHighBalance();
lowInterestRate := GetLowInterestRate();
highInterestRate := GetHighInterestRate();
IF inBalance < lowBalance THEN
RETURN 0;
END IF
IF inBalance >= lowBalance && inBalance <= highBalance THEN
RETURN inBalance * lowInterestRate;
ELSE
RETURN inBalance * highInterestRate;
END IF;
END;
-- After code
CREATE OR REPLACE FUNCTION CalculateInterest
( inBalance IN NUMBER )
RETURN NUMBER;
AS
BEGIN
IF BalanceIsInsufficient( inBalance ) THEN
RETURN 0;
END IF
IF IsLowInterestBalance( inBalance ) THEN
RETURN CalculateLowInterest( inBalance );
ELSE
RETURN CalculateHighInterest( inBalance );
END IF;
END;
Within an existing method, you have a code fragment that could be grouped together and extracted into its own method whose name explains what it does. The code fragment is usually duplicated elsewhere and/or or implements meaningful functionality; therefore, you can apply this refactoring to simplify your code. Short, well-named operations are easier to understand and therefore to maintain, and they are more likely to be reused. As you can see in the following code, the first version is simplified by replacing the logic to determine the starting daily balance with an invocation of the GetDailyBalance stored function. In the final version of the code, we have applied this refactoring several times more to further simplify the code. A side effect of applying this refactoring is that your high-level, public stored procedures will read like well-commented, highly literate code:
-- Initial version of the code
CREATE OR REPLACE FUNCTION CalculateAccountInterest
( inAccountID IN NUMBER,
inStart IN DATE,
inEnd IN DATE )
RETURN NUMBER;
AS
medianBalance NUMBER;
startBalance NUMBER;
endBalance NUMBER;
interest := 0;
BEGIN
BEGIN
-- Determine the starting balance
SELECT Balance INTO startBalance
FROM DailyEndBalance
WHERE AccountID = inAccountID && PostingDate = inStart;
EXCEPTION WHEN NO_DATA_FOUND THEN
startBalance := 0;
-- Determine the ending balance
SELECT Balance INTO endBalance
FROM DailyEndBalance
WHERE AccountID = inAccountID && PostingDate = inEnd;
EXCEPTION WHEN NO_DATA_FOUND THEN
endBalance := 0;
END;
medianBalance := ( startBalance + endBalance ) / 2;
IF medianBalance < 0 THEN
medianBalance := 0;
END IF;
IF medianBalance >= 500 THEN
interest := medianBalance * 0.01;
END IF;
RETURN interest;
END;
-- Intermediate version of the code
CREATE OR REPLACE Function CalculateAccountInterest
( inAccountID IN NUMBER,
inStart IN DATE,
inEnd IN DATE )
RETURN NUMBER;
AS
medianBalance NUMBER;
startBalance NUMBER;
endBalance NUMBER;
interest := 0;
BEGIN
startBalance := GetDailyEndBalance ( inAccountID, inStart );
BEGIN
-- Determine the ending balance
SELECT Balance INTO endBalance
FROM DailyEndBalance
WHERE AccountID = inAccountID && PostingDate = inEnd;
EXCEPTION WHEN NO_DATA_FOUND THEN
endBalance := 0;
END;
medianBalance := ( startBalance + endBalance ) / 2;
IF medianBalance < 0 THEN
medianBalance := 0;
END IF;
IF medianBalance >= 500 THEN
interest := medianBalance * 0.01;
END IF;
RETURN interest;
END;
CREATE OR REPLACE Function GetDailyBalance
( inAccountID IN NUMBER,
inDate IN DATE )
RETURN NUMBER;
AS
endbalance NUMBER;
BEGIN
BEGIN
SELECT Balance INTO endBalance
FROM DailyEndBalance
WHERE AccountID = inAccountID < PostingDate = inDate;
EXCEPTION WHEN NO_DATA_FOUND THEN
endBalance := 0;
END;
RETURN endBalance;
END;
-- Final version of the code
CREATE OR REPLACE FUNCTION CalculateAccountInterest
( inAccountID IN NUMBER,
inStart IN DATE,
inEnd IN DATE )
RETURN NUMBER;
AS
medianBalance NUMBER;
startBalance NUMBER;
endBalance NUMBER;
BEGIN
startBalance := GetDailyEndBalance ( inAccountID, inStart );
endBalance:= GetDailyEndBalance ( inAccountID, inEnd );
medianBalance := CalculateMedianBalance ( startBalance, endBalance );
RETURN CalculateInterest ( medianBalance );
END;
-- Code for GetDailyEndBalance, CalculateMedianBalance, & CalculateInterest goes here
Your method code contains a complicated expression that is difficult to read. You can introduce well-named variables in your code that are used to implement portions of the expression and then are composed to form the original expression, thereby making the method easier to understand. This refactoring is the equivalent of Martin Fowler’s Introduce Explaining Variable (Fowler 1999). After speaking with him, he suggested that we use the simpler name; in hindsight, he realized that he should have originally called this refactoring Introduce Variable, and most tool vendors also use the simpler name:
-- Before code
CREATE OR REPLACE FUNCTION DetermineAccountStatus
( inAccountID IN NUMBER,
inStart IN DATE,
inEnd IN DATE )
RETURN VARCHAR;
AS
lastAccessedDate DATE;
BEGIN
-- Some code to calculate lastAccessDate
IF ( inDate < lastAccessDate && outdate > lastAccessDate )
&& ( inAccountID > 10000 )
&& ( inAccountID != 123456 && inAcountID != 987654) THEN
-- do something
END IF;
-- do another thing
END;
-- After code
CREATE OR REPLACE FUNCTION DetermineAccountStatus
( inAccountID IN NUMBER,
inStart IN DATE,
inEnd IN DATE )
RETURN VARCHAR;
AS
lastAccessedDate DATE;
isBetweenDates BOOLEAN;
isValidAccountID BOOLEAN;
isNotTestAccount BOOLEAN
BEGIN
-- Some code to calculate lastAccessDate
isBetweenDates := inDate < lastAccessDate && outdate > lastAccessDate;
isValidAccountID := inAccountID > 100000;
isNotTestAccount := inAccountID != 123456 && inAcountID != 987654;
IF isBetweenDates && isValidAccountID &&
isNotTestAccount THEN
-- do something
END IF;
-- do another thing
END;
You have a variable acting as a control flag to break out of a control construct such as a loop. You can simplify your code by replacing the use of the control flag by examining the expression within the control construct. As you see in the following example, when you remove the controlFlag logic, the code is simplified:
-- Before code
DECLARE
controlFlag := 0;
anotherVariable := 0;
BEGIN
WHILE controlFlag = 0 LOOP
-- Do something
IF anotherVariable > 20 THEN
controlFlag = 1;
ELSE
-- Do something else
END IF;
END LOOP;
END;
-- After code
DECLARE
anotherVariable := 0;
BEGIN
WHILE anotherVariable <= 20 LOOP
-- Do something
-- Do something else
END LOOP;
END;
Sometimes a method just acts as a pass through, or middle man, for other methods. This can happen when a stored procedure is renamed and then a stored procedure is introduced with the original name, which merely calls the stored procedure with the new name. It can also happen when you realize that you have two stored procedures that do the same thing; so, as you can see, one of them was rewritten to simply invoke the other. Regardless, when you see code such as the following example, you should refactor whatever invokes AProcedure to now invoke AnotherProcedure (thus allowing you to drop AProcedure):
CREATE OR REPLACE PROCEDURE AProcedure
parameter1 IN NUMBER;
...
parameterN IN VARCHAR;
AS
BEGIN
EXECUTE AnotherProcedure ( parameter1, ..., parameterN );
END;
It is common to discover that an existing parameter name is difficult to understand—either the parameter name made sense at the time and its usage has changed over time or it was simply misnamed to begin with. This is a rather simple refactoring to implement; you merely change the parameter name in the original source code.
You have hard-coded a literal number that has a particular meaning in the code of a method, making it difficult to maintain. A better approach is to store the value in a table, and then retrieve it as needed from the table. (This table could be cached to improve performance.) In the following code, the minimum balance of $500 is retrieved in the new version of the code from a single row table called CorporateBusinessConstants used to store such values. (We could have also implemented a multirow table with a limited number of columns, one for each type, in which to store the data.) The code is then further improved by applying this refactoring again to obtain the value for the interest rate. We also apply Extract Method (page 285) to consolidate the logic of obtaining the values from the lookup table. This refactoring is the database version of the Replace Magic Number With Symbolic Constant (Fowler 1999) code refactoring:
-- Initial version of the code
CREATE OR REPLACE FUNCTION CalculateInterest
( inBalance IN NUMBER )
RETURN NUMBER;
AS
interest := 0;
BEGIN
IF inBalance >= 500 THEN
interest := medianBalance * 0.01;
END IF;
RETURN interest;
END;
-- Intermediate version of the code
CREATE OR REPLACE FUNCTION CalculateInterest
( inBalance IN NUMBER )
RETURN NUMBER;
AS
interest := 0;
minimumBalance NUMBER;
BEGIN
BEGIN
SELECT MinimumBalanceForInterest INTO minimumBalance
FROM CorporateBusinessConstants
WHERE RowNumber = 1;
EXCEPTION WHEN NO_DATA_FOUND THEN
minimumBalance := 0;
END;
IF inBalance >= minimumBalance THEN
interest := medianBalance * 0.01;
END IF;
RETURN interest;
END;
-- Final version of the code
CREATE OR REPLACE FUNCTION CalculateInterest
( inBalance IN NUMBER )
RETURN NUMBER;
AS
interest := 0;
minimumBalance NUMBER;
interestRate NUMBER;
BEGIN
minimumBalance := GetMinimumBalance();
interestRate := GetInterestRate();
IF inBalance >= minimumBalance THEN
interest := medianBalance * interestRate;
END IF;
RETURN interest;
END;
Nested conditional statements can be difficult to understand. In the following example, the nested IF statements were replaced with a series of separate IF statements to improve the readability of the code:
-- Before code
BEGIN
IF condition1 THEN
-- do something 1
ELSE
IF condition2 THEN
-- do something 2
ELSE
IF condition3 THEN
-- do something 3
END IF;
END IF;
END IF;
END;
-- After code
BEGIN
IF condition1 THEN
-- do something 1
RETURN;
END IF;
IF condition2 THEN
-- do something 2
RETURN;
END IF;
IF condition3 THEN
-- do something 3
RETURN;
END IF;
END;
You have a temporary variable being used for one or more purposes within your method. This has likely prevented you from giving it a meaningful name, or perhaps it has a meaningful name for one purpose but not the other. The solution is to introduce a temporary variable for each purpose, as you see in the following code, which uses aTemporaryVariable as a repository for the converted imperial values into metric values:
-- Before code
DECLARE
aTemporaryVariable := 0;
farenheitTemperature := 0;
lengthInInches := 0;
BEGIN
-- retrieve farenheitTemperature
aTemporaryVariable := (farenheitTemperature—32 ) * 5 / 9;
-- do something
-- retrieve lengthInInches
aTemporaryVariable := lengthInInches * 2.54;
-- do something
END;
-- After code
DECLARE
celciusTemperature := 0;
farenheitTemperature := 0;
lenghtInCentimeters := 0;
lengthInInches := 0;
BEGIN
-- retrieve farenheitTemperature
celciusTemperature := (farenheitTemperature—32 ) * 5 / 9;
-- do something
-- retrieve lengthInInches
lenghtInCentimeters := lengthInInches * 2.54;
-- do something
END;
If you find a clearer way to write the logic of an algorithm implemented as a method, you should do it. If you need to change an existing algorithm, it is often easier to simplify it first and then update it. Because it is hard to rewrite a complicated algorithm, you should first simplify it via other refactorings, such as Extract Method (page 285) before applying Substitute Algorithm.