Chapters 2 and 3 examined the basic types of structures that appear in formulations of linear programs—allocation, covering, blending, and network models. Not every application of linear programming can be classified as one of those four types, but most applications resemble one or more of them. In this chapter, we look at a type of linear program that has a distinctive application and a special kind of interpretation. This type is associated with Data Envelopment Analysis, or DEA. For classification purposes, the DEA model is essentially an allocation model, but its unique applications make it an important type of model to study in its own right.
As the examples in Chapters 2 and 3 indicated, linear programming is typically used as an ex ante tool in planning, that is, as an aid in choosing among alternative possible courses of action. In DEA, linear programming is used as an ex post tool to evaluate performance that has already been observed. Compared to other linear programming applications, DEA is a relative newcomer. The first articles on the methodology began appearing in the mid-1970s, and researchers have been elaborating the theory ever since. As recognition of DEA has spread, the technique has been applied in a variety of settings, such as public schools, courts of law, hospitals, oil and gas production, vehicle maintenance, and banking.
The primary elements in a DEA study are a set of decision-making units (DMUs), along with their measured inputs and outputs. The DMUs may be different branches of the same large bank or different hospitals in the same region or different offices of the same insurance company, but they should be reasonably homogeneous and separately managed. In the ideal case, the DMUs have a well-defined set of common inputs and outputs.
The purpose of DEA is to determine which of the DMUs make efficient use of their inputs and which do not. For the inefficient units, the analysis can actually quantify what levels of improved performance should be attainable. In addition, the analysis indicates where an inefficient DMU might look for benchmarking help as it searches for ways to improve.
DEA produces a single, comprehensive measure of performance for each of the DMUs. If the situation were simple, and there were just one input and one output, then we would define performance as the ratio of output to input, and we would likely refer to this ratio as “productivity” or “efficiency.” The best ratio among all the DMUs would identify the most efficient DMU, and every other DMU would be rated by comparing its ratio to the best one. As an example, suppose that we have been hired as consultants to the White River Dairy Cooperative.
In this example, productivity is calculated as the ratio of milk produced to cows owned, that is, output divided by input. The efficiency rating in the table is just a normalized measure of the same thing. In other words, the value of 1.00 is assigned to the maximum productivity in the set (for Farm 4), and the remaining values are calculated as the ratio of each farm’s productivity to the maximum productivity in the set.
Without access to detailed knowledge about the operation of each farm, we might infer that Farm 4 has achieved its maximum efficiency rating because of factors such as the following:
Something about these categories of factors is probably lacking at the other farms. For instance, if Farm 1 could employ the same technology, procedures, and management as Farm 4, then we would expect that with 15 cows, it should be able to achieve a milk output of 75. (This target figure is the productivity of the best farm multiplied by the number of cows at Farm 1. This output target can also be computed as the actual output for Farm 1 divided by the efficiency of Farm 1.) Alternatively, we would expect that the same milk output of 60 should be achievable with only 12 cows (obtained by multiplying the input by the efficiency). In any event, the comparative analysis provides two kinds of information for Farm 1: First, its productivity could be as much as 25% higher than it actually was, and second, it could probably learn a lot by studying—and imitating—the operation of Farm 4.
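To make this arithmetic concrete, here is a short Python sketch of the Farm 1 calculations quoted above. Only Farm 1’s figures and the best-farm productivity are taken from the text; no other farm data are assumed.

```python
# Single-input, single-output efficiency arithmetic for Farm 1,
# using the figures quoted in the text: 15 cows, milk output of 60,
# and a best-in-set productivity of 5 (achieved by Farm 4).
cows, milk = 15, 60
best_productivity = 5.0

productivity = milk / cows                     # 60 / 15 = 4.0
efficiency = productivity / best_productivity  # 4.0 / 5.0 = 0.8

# Two equivalent improvement targets for Farm 1:
target_output = best_productivity * cows       # output achievable with 15 cows: 75
target_input = cows * efficiency               # cows needed for an output of 60: 12
```

Either target expresses the same 25% productivity gap, viewed from the output side or the input side.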
In more practical cases, a DMU is characterized by multiple outputs and inputs. Productivity is still a ratio, usually a weighted sum of the outputs divided by a weighted sum of the inputs. When more than one output exists, we need weights in order to value a combination of outputs and quantify them in a single number. The same holds for inputs. When we can quantify the value of outputs and inputs in a single number, then we can take their ratio and compute a productivity measure. We can also normalize that value by comparing it to the productivities of other DMUs and scale the results so that the best value is 1. Because it relies on this ratio measure of efficiency, DEA is useful when no single output metric captures performance comprehensively and when some measure of outputs relative to inputs seems appropriate. This makes DEA a valuable tool for situations in which several dimensions of performance are important.
DEA has often been applied in nonprofit industries, characterized by multiple outputs of interest and some ambiguity about the relative importance of those outputs. For example, in comparing the performance of mental health clinics, it might be difficult to place relative values on services for domestic abuse and drug addiction. DEA is well suited to this type of situation because it does not require importance weights for the various outputs (or inputs) to be established beforehand. Instead, as we shall see, it determines the weights in the analysis and allows each DMU to be evaluated in its best possible light.
Even in for-profit industries, a total profit figure may not be adequate for evaluating productivity. In the case of branch banks, which we use for the purposes of illustration, suppose that profit is entirely determined by loan and deposit balances. In the short run, fluctuations in the profit margins for loans or deposits may influence a branch’s profits, but short-run profits may not indicate how productive the branch has been at developing and managing loans and deposits. In addition, short-run profits at a particular time may not indicate how well the branch will perform when the market shifts and margins change. Therefore, a gross profit figure may not be the best measure of branch productivity. Instead, DEA combines the loan and deposit balances into a single output measure, considering every possible ratio of profit margins, and chooses the margins that are most favorable to the branch being evaluated. Then, having chosen a favorable set of loan and deposit margins for each branch, the DEA program rates the efficiency of each branch on a scale of 0 to 1.
To illustrate the use of weighted averages in DEA, we move from the one-input, one-output case of dairy farms to a simplified one-input, two-output case involving branch banks. This time, we illustrate the analysis with a graphical approach.
As shown in the table, Branches 1, 3, and 5 have the highest efficiency rating of 1; therefore, they are classified as efficient. An efficiency rating of 1 means that we can find a pair of weights on loans and deposits for which the branch would be the most productive branch in the system. For instance, suppose the weights are 3 for loans and 33 for deposits. Then the weighted values of outputs for the branches are as follows.
DMU | Loans | Deposits | Value |
Branch 1 | 10 | 31 | 1053 |
Branch 2 | 15 | 25 | 870 |
Branch 3 | 20 | 30 | 1050 |
Branch 4 | 23 | 23 | 828 |
Branch 5 | 30 | 20 | 750 |
In this comparison, Branch 1 has the highest value. For the pair of weights (3, 33), Branch 1 is the most productive DMU on the list. On the other hand, if the weights were (12, 10), then Branch 5 would be the most productive. As long as we can find at least one set of weights for which Branch 1 achieves the highest value, then Branch 1 is classified as efficient. Later, we impose some restrictions on the weights chosen.
For Branches 2 and 4, the story is different: No possible weights exist on loans and deposits that would make these branches the most productive. For Branch 2, this is easy to see because it is “dominated” by Branch 3—that is, Branch 3 performs better on both dimensions than Branch 2. Since the input expenses are the same, whatever weights we choose for loans and deposits, Branch 3 will show a higher total value than Branch 2 and therefore greater productivity. The case of Branch 4, however, is less clear. No other branch dominates Branch 4, yet it is still inefficient because no pair of weights can give Branch 4 the highest output value.
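The weighted-value comparison in the table can be reproduced in a few lines of Python, using the output data from the table above:

```python
# Output data for the five branches: branch number -> (loans, deposits).
outputs = {1: (10, 31), 2: (15, 25), 3: (20, 30), 4: (23, 23), 5: (30, 20)}

def best_branch(w_loans, w_deposits):
    """Return the most productive branch and all weighted output values
    for a given pair of weights (all branches have identical inputs)."""
    values = {k: w_loans * loans + w_deposits * deposits
              for k, (loans, deposits) in outputs.items()}
    return max(values, key=values.get), values

winner, values = best_branch(3, 33)    # the weights used in the table
# With weights (3, 33), Branch 1 has the highest value, 1053;
# with weights (12, 10), Branch 5 comes out on top instead.
```

Trying other weight pairs confirms the point made in the text: no choice of weights makes Branch 2 or Branch 4 the most productive.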
Figure 5.1 displays the output for each branch as a point on a two-dimensional graph. Thus, Branch 1 corresponds to the point (10, 31) in the figure. The points are labeled by branch number.
For any inefficient branch, such as Branch 4, the DEA procedure creates a hypothetical comparison unit (HCU) that is built from the features of efficient units. These efficient DMUs are referred to as the reference set for the inefficient branch. In the case of Branch 4, the reference set is made up of Branches 3 and 5, and the comparison unit corresponds to the point (25, 25) in Figure 5.2. We can form the comparison unit by adding 0.5 times the profile (inputs and outputs) of Branch 3 and 0.5 times the profile of Branch 5:
Thus, we obtain a hypothetical branch with an input of 100 and with outputs of 25 (loans) and 25 (deposits). Graphically, the point (25, 25), labeled 4′, lies on the straight line connecting points 3 and 5, as shown in Figure 5.2. Among all the points on the line (i.e., all linear combinations of 3 and 5), 4′ is the only one that has the same ratio of loans to deposits as Branch 4. Thus, we can think of the comparison unit as producing the same product mix as Branch 4 but producing more of it. Consequently, the comparison unit dominates Branch 4, even though no actual branch does.
For Branch 2, the HCU corresponds to the point (18.1, 30.2), labeled 2′ in Figure 5.2. Although we noted that Branch 2 is dominated by Branch 3, this comparison unit does not correspond to point 3 because the output mix for Branch 3 is different from the mix for Branch 2. The hypothetical point 2′, however, has a deposits-to-loans ratio of 25:15, which matches the ratio for Branch 2 but with greater output. Furthermore, the point 2′ lies on the line connecting points 1 and 3, so that Branches 1 and 3 form the reference set for Branch 2.
In the DEA approach, we presume that an inefficient branch can improve its performance by emulating one or more of the efficient branches in its reference set. In the case of Branch 2, that would mean emulating aspects of Branches 1 and 3, the components of its HCU. In the case of Branch 4, that would mean emulating aspects of Branches 3 and 5.
The efficiency measure also has a geometric interpretation in Figure 5.2. The distance from the origin to the point representing Branch 4 is 92% of the distance from the origin to 4′. This percentage matches the efficiency of Branch 4. Similarly, the point representing Branch 2 is located 82.8% of the way from the origin to 2′.
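A quick numerical check of this geometric reading, using the coordinates quoted above (the HCU coordinates for Branch 2 are the rounded values from the text):

```python
import math

# The HCU for Branch 4 is the 50/50 combination of Branches 3 and 5.
hcu4 = tuple(0.5 * a + 0.5 * b for a, b in zip((20, 30), (30, 20)))  # (25.0, 25.0)

def radial_efficiency(point, hcu):
    """Fraction of the distance from the origin to the HCU covered by the DMU."""
    return math.hypot(*point) / math.hypot(*hcu)

# Branch 4 at (23, 23) versus its HCU 4' at (25, 25): 23/25 = 0.92
eff4 = radial_efficiency((23, 23), hcu4)
# Branch 2 at (15, 25) versus its rounded HCU 2' at (18.1, 30.2): about 0.828
eff2 = radial_efficiency((15, 25), (18.1, 30.2))
```

Because the DMU and its HCU lie on the same ray through the origin, the distance ratio reduces to the ratio of any matching coordinate pair, which is why it reproduces the efficiency score.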
Under our definition of efficiency, Branch 1 is efficient although it has far less loan activity than Branch 3 and only minimally larger deposits. Similarly, a branch with $1 more in deposits than Branch 1 and no loans at all would also be efficient. That is because we can conceive of a set of weights for loans and deposits that would make such a branch the most productive of all the branches. In particular, if deposits were very profitable for the bank, and loans were not very profitable, then the branch with $1 more in deposits would have a more valuable total output in terms of profitability. Such a profitability relationship might be very unlikely, but it is still possible. Thus, when we use DEA, we deal with a theoretical notion of efficiency—based on what is conceivable, not what is likely.
Spotting dominance in our example does not require DEA. We can simply scan the data if we want to detect dominance. However, as the number of outputs and inputs increases, a dominance relationship like the one between Branches 2 and 3 becomes less likely. Consequently, direct comparisons do not reveal many inefficient DMUs, and DEA becomes more valuable. Once we proceed beyond two outputs, geometric analysis is difficult or impossible, and we lose the intuition it provides. For larger problems, we need an algebraic approach. In fact, even with two outputs, the graphical approach is limited. Our branch bank example was simplified because the inputs were identical for all branches. If there were differences in the inputs, then the graphical display of outputs would not convey the full comparison. In general, DEA relies on an algebraic approach and, as we shall see, on linear programming.
In order to describe a generic DEA model in algebraic terms, we let
The x- and y-values represent given information. In our branch bank example, for Branch 1 (or k = 1), we have
Next, we define the weights, which play the role of decisions in the model
If the scenario contained one output and one input, as in the case of milk and cows, we could measure productivity as yk/xk and then normalize this measure to compute efficiency. We would have no need for weights at all. When two outputs exist, as in the example of branch banks, we need weights to calculate an aggregate value for the outputs. In the case of loans and deposits, it’s conceivable that market prices exist and that we could use actual profit margins for weights; but in other settings, market prices may not exist for all outputs. For that reason, we refer to weights rather than to prices as a means of valuing inputs and outputs. As we shall see, the weights are obtained from the data, that is, they are determined intrinsically.
When more than one output exists, we use Yk to denote the weighted value of outputs. That is, we let
Suppose that in the branch bank example, the weights selected are u1 = 0.2 and u2 = 0.3. Then the output value for Branch 1 is 0.2(10) + 0.3(31) = 11.3. Similarly, we use Xk to denote the weighted value of the inputs, where
Suppose that in the branch bank example, we take v1 = 0.1. Then the input value for Branch 1 is 0.1(100) = 10. For an arbitrary set of nonnegative weights, we can compute productivity as the ratio of the weighted value of outputs to the weighted value of inputs. With the values of u1, u2, and v1 just mentioned, the productivity measure for Branch 1 would be 11.3/10 = 1.13. With those weights, the best productivity among the five branches is 1.30 for Branch 3. Therefore, the efficiency for Branch 1 would be calculated as 1.13/1.3 = 0.869. Later, we will constrain the weights so that they normalize the measure of productivity. This means that the highest productivity measure in the comparison is 1. With normalizing weights, we define efficiency as Ek = Yk/Xk, where the capital letters are shorthand for weighted sums.
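These calculations can be sketched in Python with the branch data from the example:

```python
# Trial weights from the text: u1 (loans), u2 (deposits), v1 (input expense).
u1, u2, v1 = 0.2, 0.3, 0.1

data = {  # branch: (input, loans, deposits); every branch has input 100
    1: (100, 10, 31), 2: (100, 15, 25), 3: (100, 20, 30),
    4: (100, 23, 23), 5: (100, 30, 20),
}

# Productivity = weighted outputs / weighted inputs, for each branch.
productivity = {k: (u1 * loans + u2 * dep) / (v1 * x)
                for k, (x, loans, dep) in data.items()}

best = max(productivity.values())            # 1.30, achieved by Branch 3
efficiency = {k: p / best for k, p in productivity.items()}
# Branch 1: productivity 11.3/10 = 1.13, efficiency 1.13/1.30, about 0.869
```

Note that these particular weights are just a trial; the linear program developed below chooses the weights most favorable to the branch being evaluated.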
The performance of a particular DMU is considered efficient if the performance of other DMUs does not provide evidence that one of its inputs or outputs could be improved without worsening some of its other inputs or outputs. In other words, the performance is efficient if it is impossible to construct an HCU that does better.
In our notation, the subscript k refers to the kth DMU. Our approach will be to choose a particular DMU for evaluation and to denote it with the subscript k = 0. But this same DMU will still be included among the values of k > 0.
Now, we can give an outline of the steps in DEA:
In the next section, we implement these steps using a spreadsheet model.
We can use a standard linear programming format to implement a spreadsheet model for DEA. We begin by entering a table containing the data in the problem. Usually, this table will have columns corresponding to inputs and outputs and rows corresponding to DMUs. Figure 5.3 shows this layout for the branch bank example. The decision variables are the weights for inputs and outputs. The decision variable cells appear below the table containing the data, in the highlighted cells C12:E12.
First, we fix the weighted value of the inputs, arbitrarily, at X0 = 1. In the spreadsheet, this equation is enforced by requiring that cell F16 must equal H16. Moreover, because of the form of X0, it is easily expressed as a SUMPRODUCT of input weights (vi) and input values (xi0). This equality constraint is just a scaling step; we could set the input value equal to any number we like. Having fixed the weighted input value X0 in the denominator of Y0/X0, it follows that maximizing the ratio E0 amounts to maximizing Y0, the weighted value of the outputs. Now, Y0 can be expressed as a SUMPRODUCT of output weights (uj) and output values (yj0). The value of Y0, which plays the role of an objective function, is located in cell F13. Therefore, we have a maximization problem in which the objective function is the weighted output value and the weighted input value is constrained to equal 1.
Next, we adopt the convention that the efficiency of a DMU cannot exceed 1. This convention reflects the sense that “perfect” efficiency is 100%, and we saw this convention used earlier in Examples 5.1 and 5.2. This requirement is just a way of ensuring that the value of the output can never be greater than the value of the input. In symbols, we write Yk/Xk ≤ 1, for every k representing a DMU, or equivalently, Yk − Xk ≤ 0. These normalizing conditions become the remaining constraints of the model.
These steps lead to a relatively simple linear programming model. The form of the model can be expressed as follows.
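Collecting the pieces described above, the model can be sketched in standard DEA multiplier notation, with subscripts j indexing outputs, i indexing inputs, and k indexing DMUs (the layout here is a reconstruction from the surrounding description):

```latex
\begin{aligned}
\text{maximize}\quad   & Y_0 = \sum_{j} u_j\, y_{j0} \\
\text{subject to}\quad & \sum_{i} v_i\, x_{i0} = 1 \\
                       & \sum_{j} u_j\, y_{jk} - \sum_{i} v_i\, x_{ik} \le 0
                         \quad \text{for each DMU } k \\
                       & u_j \ge 0, \quad v_i \ge 0
\end{aligned}
```

The equality constraint is the scaling step that fixes the weighted input value at 1, and the inequalities are the normalizing conditions that cap every DMU’s efficiency at 1.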
Figure 5.3 shows the spreadsheet for the analysis using the standard format for an allocation model. The objective function in this model corresponds to the output value of Branch 1, computed by a SUMPRODUCT formula in cell F13. The equation that fixes the value of inputs appears in row 16, and the normalizing constraints (requiring that output values never exceed input values) can be found in rows 17–21.
The model specification is as follows:
When we run Solver, we obtain an objective function of 1.00, as shown in the figure, along with the following weights:
With these weights for Branch 1, the input value is X0 = 1.0 and the output value is Y0 = 1.0 corresponding to an efficiency of 100%. In the solution, cells F18:F21 all show negative values. This means that the output value is strictly less than the input value for each of the other branches. Since the input values are identical in this example, it follows that none of the other branches can achieve the productivity of Branch 1 at its most favorable weights (0, 0.032258).
Figure 5.4 shows the analysis for Branch 2. The format is the same as the format for Branch 1, and only two changes occur. First, the objective function now contains data for Branch 2 in row 13. Second, the coefficients for the constraint on input value contain data for Branch 2 in row 16. (In this example, that change does not actually alter row 16, but in other examples, it could.) Otherwise, the parameters of the linear program remain unchanged from the analysis for Branch 1. When we run Solver on this model, we obtain an objective function of 0.828, as shown in the figure, along with the following weights:
With these weights for Branch 2, the input value is X0 = 1.0, and the output value is Y0 = 0.828, for an efficiency of 82.8%. In this solution, cells F17 and F19 are zero. This means that the normalizing constraint is binding for Branches 1 and 3. In other words, Branches 1 and 3 have efficiencies of 100%, even at the most favorable weights (0.003125, 0.03125) for Branch 2.
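For readers who want to verify these numbers outside the spreadsheet, the Branch 2 model can be sketched as a linear program in Python. Here scipy’s linprog stands in for Solver; the data come from the example.

```python
import numpy as np
from scipy.optimize import linprog

# DEA multiplier model for Branch 2. Decision variables are (u1, u2, v1),
# the weights on loans, deposits, and input expense. linprog minimizes,
# so the objective Y0 = 15*u1 + 25*u2 is negated.
loans    = np.array([10, 15, 20, 23, 30])
deposits = np.array([31, 25, 30, 23, 20])
expense  = np.full(5, 100)

c = [-15, -25, 0]                     # maximize Branch 2's weighted output
A_eq = [[0, 0, 100]]                  # scaling constraint: X0 = 100*v1 = 1
b_eq = [1]
A_ub = np.column_stack([loans, deposits, -expense])  # Yk - Xk <= 0 for all k
b_ub = np.zeros(5)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, None), method="highs")
efficiency = -res.fun                 # about 0.828, matching Figure 5.4
```

The optimal weights returned by the solver correspond to the values (0.003125, 0.03125) quoted above.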
We could construct similar spreadsheet models for the analyses of Branches 3, 4, and 5 following the same format. However, much of the content on those worksheets would be identical, so a more efficient approach is possible. In Figure 5.5, we show a single spreadsheet model that handles the analysis for all five branches. As before, the array in rows 4–9 contains the problem data. Cell F11 contains the branch number for the DMU under analysis. Based on this choice, two adjustments occur in the linear programming model. First, the outputs for the branch being analyzed must be selected for use in the objective function, in cells D13:E13. Second, the inputs for the branch being analyzed must be selected for use in the EQ constraint, in cell C16. These selections are highlighted in bold in Figure 5.5. The INDEX function (Box 5.1) uses the branch number in cell F11 to draw the objective function coefficients from the data array and reproduce them in cells D13:E13. It also draws the input value from the data array and reproduces it in cell C16. The three cells in bold format change when a different selection appears in cell F11.
We want to solve the model in Figure 5.5 several times, once for each DMU. To do so, we vary the contents of cell F11 from 1 to 5. Since each solution requires a reuse of the worksheet, we save the essential results in some other place before switching to a new DMU. In particular, we save the weights and the value of the objective function. Figure 5.6 shows a worksheet containing a summary of the five optimizations for the five-branch example (one from each choice of cell F11 in Fig. 5.5). The original data are reproduced in rows 4–9, and the optimal decision variables and efficiencies appear in rows 12–17. This summary can be generated automatically by using the Solver Sensitivity add-in, as described in Chapter 4.
As we can see in Figure 5.6, there are three efficient branches in our example: Branches 1, 3, and 5. Branches 2 and 4 are inefficient, with efficiencies of 82.8 and 92%, respectively. The numerical results agree with the graphical model. Thus, we have developed a spreadsheet prototype that implements the DEA approach. Later, we build on this set of results and use the spreadsheet model to compute additional information pertinent to the analysis.
Our branch bank example is a special case. It has only one input dimension, and all the DMUs have identical input levels. The example has only two output dimensions, but by working with identical inputs and two dimensions of output, we can depict the solution graphically, as in Figures 5.1 and 5.2. As mentioned earlier, if the inputs were different for all the DMUs, we would not have been able to convey the analysis that way. However, the spreadsheet model, in the same form as Figures 5.5 and 5.6, accommodates more general problems without difficulty.
Quantitative measures of performance are not always absolute figures. Often, it is more meaningful and more convenient to measure performance in relative terms. To create indexed data, we assign the best value on a single input or output dimension an index of 100, and each remaining value is assigned an index equal to 100 times the ratio of its value to the best value. In effect, all performance values are expressed as percentages, relative to the best performance observed.
The use of indexed data does not present difficulties for DEA. In fact, the DEA calculations are, perhaps, more intuitive when based on indexed data because the result tends to be optimal weights of approximately the same order of magnitude, which may not be the case without indexing.
To illustrate how indexing works, we return to Example 5.2. When we scan the loan values for the various branches, the highest output in the comparison comes from Branch 5, with an output of 30. If we treat a level of 30 as the base, we can express the loan values for each of the other branches as a percentage of Branch 5 output. Table 5.1 summarizes the scaled values that result, for both loans and deposits.
Table 5.1 Scaled Values from Example 5.2
DMU | Loans | Index | Deposits | Index |
Branch 1 | 10 | 33.3 | 31 | 100.0 |
Branch 2 | 15 | 50.0 | 25 | 80.6 |
Branch 3 | 20 | 66.7 | 30 | 96.8 |
Branch 4 | 23 | 76.7 | 23 | 74.2 |
Branch 5 | 30 | 100.0 | 20 | 64.5 |
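The indexing rule behind Table 5.1 can be expressed in a few lines of Python:

```python
# Raw output data from Example 5.2: branch number -> value.
loans    = {1: 10, 2: 15, 3: 20, 4: 23, 5: 30}
deposits = {1: 31, 2: 25, 3: 30, 4: 23, 5: 20}

def index_values(values):
    """Within one dimension, scale the best value to 100 and the rest
    proportionally (rounded to one decimal, as in Table 5.1)."""
    best = max(values.values())
    return {k: round(100 * v / best, 1) for k, v in values.items()}

loan_index    = index_values(loans)     # Branch 1: 33.3, ..., Branch 5: 100.0
deposit_index = index_values(deposits)  # Branch 1: 100.0, ..., Branch 5: 64.5
```

Each dimension is indexed against its own best value, which is why Branch 5 anchors the loan index while Branch 1 anchors the deposit index.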
Suppose we perform the linear programming analysis using the indexed values instead of the original, raw data. How does the analysis change? Figure 5.7, which shows the analysis of Branch 4 using indexed values, conveys the main point. The value of the objective function remains unchanged (at 92% in this case), even though the values of the decision variables are different from those in the original model (compare Fig. 5.5). This example shows that the efficiency calculation is robust in the sense that it depends only on the relative magnitudes of the output levels, and these can be scaled for convenience without altering the efficiency values produced by the analysis.
Within each of the output dimensions being evaluated, only relative values matter, so it is always possible to use raw data even when the dimensions are quite different. In Example 5.2, the sizes of loans and deposits are of roughly the same magnitude—tens of millions of dollars. Suppose we had used another dimension of performance, calculated as the nondefault rate on commercial and residential mortgages. For this measure, the given data might be proportions no larger than one, but it will not be a problem to mix such data with numbers in the tens of millions because it is only relative levels, within a performance dimension, that really matter in DEA. As a result, we do not have to worry about scaling the data. Nevertheless, it is sometimes advantageous to use indexing because it leads to some comparability in the weights selected by the optimization model.
We identify an efficient DMU by solving the linear program and finding a value of 1 for the objective function. By contrast, an optimal value less than 1 signifies that the DMU is inefficient. The first main result in DEA is classifying the various DMUs as either efficient or inefficient. For the efficient DMUs, there may not be much more to say. As we shall see later, advanced variations of the analysis can discriminate among the efficient DMUs. Initially, however, they are not analyzed further. Instead, attention focuses on the inefficient DMUs. If we solve a version of the linear program and discover that a DMU is inefficient, the analysis proceeds by identifying the corresponding reference set and describing the associated HCU. In order to carry out this part of the analysis, we can draw on the shadow price information in Solver’s Sensitivity Report.
To illustrate how the analysis proceeds, we move next to an example with multiple inputs and multiple outputs. The simplest such case would be a two-input, two-output structure, as in the example of evaluating a chain of nursing homes.
A spreadsheet model for the analysis of the six DMUs is shown in Figure 5.8, which displays the specific analysis for Facility 5. In a full set of six optimization runs for this model, we find that the first four units are all efficient, while Facilities 5 and 6 are inefficient. The efficiencies are summarized in cells G6:G11.
Next, we illustrate the further analysis of Facility 5. The first step is to rerun Solver for Facility 5 and obtain the Sensitivity Report, which is shown with some reformatting in Figure 5.9. The information we need can be found in the Constraints section of the Sensitivity Report, in the rows corresponding to the normalizing constraints of the original model, which have right-hand-side constants of zero. The specific values we seek are the shadow prices corresponding to the six normalizing constraints, as highlighted in Figure 5.9.
To proceed, we copy the shadow prices for the normalizing constraints and paste them into column J of the spreadsheet, so they match up with the corresponding constraint rows, as shown in Figure 5.8. The next step is to identify which shadow prices are positive; the DMUs corresponding to those make up the reference set. In Figure 5.8, we can observe that the shadow prices are positive in normalizing constraints corresponding to Facilities 1, 2, and 4. This means that Facilities 1, 2, and 4 form the reference set for Facility 5. The results are summarized as follows.
DMU | Shadow Price | Reference Set |
Facility 1 | 0.2000 | Yes |
Facility 2 | 0.0805 | Yes |
Facility 3 | 0.0000 | |
Facility 4 | 0.5383 | Yes |
Facility 5 | 0.0000 | |
Facility 6 | 0.0000 |
Having identified the reference set for Facility 5, we next construct an HCU. In cells C26:F26, we lay out a row resembling the original row of data for Facility 5. The entry in cell C26 is calculated as the SUMPRODUCT of the shadow prices and the six values of the first input (staff hours) from the array of input data. This calculation yields the value 342.125, as shown below. This number represents the staff hours of the HCU.
DMU | Staff Hours per Day | Shadow Price | |
Facility 1 | 150 | 0.2000 | |
Facility 2 | 400 | 0.0805 | |
Facility 3 | 320 | 0.0000 | |
Facility 4 | 520 | 0.5383 | |
Facility 5 | 350 | 0.0000 | |
Facility 6 | 320 | 0.0000 | SUMPRODUCT = 342.125 |
The specific formula in cell C26 is =SUMPRODUCT($J$19:$J$24,C6:C11). Next, this calculation is copied to cells D26:F26 using absolute addresses for the shadow prices in column J, as shown in Figure 5.8. The resulting numbers provide the description of an HCU for Facility 5:
In particular, the outputs of the comparison unit, which are (19, 25), match the outputs of Facility 5 precisely. However, the inputs (342.125, 1.173) are slightly smaller than the inputs of Facility 5. In other words, the comparison unit achieves the same outputs as Facility 5 but with lower input levels. By its construction, the comparison unit has inputs and outputs that are weighted averages of those for the facilities in the reference set. Thus, a weighted combination of Facilities 1, 2, and 4 provides a target for Facility 5 to emulate.
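The SUMPRODUCT can be reproduced in Python with the values from the table above. With the rounded shadow prices shown, the result agrees with the reported 342.125 to within rounding error; the remaining HCU entries would come from the analogous columns, which are not reproduced here.

```python
# Shadow prices for the six normalizing constraints (Facilities 1-6)
# and the staff-hours column from the data array.
shadow_prices = [0.2000, 0.0805, 0.0000, 0.5383, 0.0000, 0.0000]
staff_hours   = [150, 400, 320, 520, 350, 320]

# Weighted combination of the reference-set facilities (zero prices drop out).
hcu_staff_hours = sum(p * x for p, x in zip(shadow_prices, staff_hours))
# Applying the same weights column by column to the other input and to the
# outputs yields the full HCU profile (342.125, 1.173, 19, 25).
```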
The analysis of Example 5.3 shows how the shadow prices can be used as weighting factors to construct the HCU. In general, the comparison unit has outputs that are at least as large as the outputs of the inefficient unit being analyzed and inputs that are no larger than the inputs of the unit being analyzed. In this case, the actual inputs for Facility 5 are staff hours of 350 and a supply level of 1.2. The analysis suggests that efficient performance, as exemplified by Facilities 1, 2, and 4, would enable Facility 5 to produce the same outputs with inputs of only 342.125 staff hours and a supplies level of 1.173.
How could Facility 5 achieve these efficiencies? DEA does not tell us. It merely suggests that Facilities 1, 2, and 4 would be reasonable benchmarking targets for Facility 5. Then, by studying differences in technology, procedures, and management, Facility 5 might be able to identify and implement changes that could lead it to improved performance.
In Example 5.3, Facilities 1–4 are all efficient, but only Facilities 1, 2, and 4 form the reference set for Facility 5. An exploration of the analysis for Facility 6 leads to a similar conclusion: Its reference set also consists of Facilities 1, 2, and 4. Although Facility 3 is efficient, it does not appear in any reference sets. Evidently, it is not a facility that Facility 5 or 6 should try to emulate. We might guess that this is the case because its output mix is quite different.
Although we have relied on the term “efficient,” it would be more appropriate to use the term relatively efficient—that is, the productivity of a DMU is evaluated relative to the other units in the set being analyzed. DEA identifies what we might call “best practice” within a given population. However, that does not necessarily mean that the efficient units compete well with DMUs outside the population. Thus, we have to resist the temptation to make inferences beyond the population under study.
As mentioned earlier, DEA works well when there is some ambiguity about the relative value of outputs. No a priori price or other judgment about the relative value of the outputs is needed. Because prices are not given, it should not be obvious what output mix would be best. (This applies to inputs as well.) DEA performs its evaluation by finding weights that are as favorable as possible to the DMU being evaluated. However, DEA may not be very useful in a situation where a distinct hierarchy of strategic goals exists, especially if one goal strongly influences overall performance.
Some applications of DEA have run into complaints that the output measures may be influenced by factors that managers cannot control. In response, variations of the DEA model have been developed that can accommodate uncontrollable factors. Such a factor can be added by simply including it in the model; there is no need to specify any of its structural relationships or parameters. Thus, a factor that is neither an economic resource nor a product, but is instead an attribute of the environment, can easily be included as an input. An example might be the convenience of a location for a branch bank.
One of the technical criticisms often raised about DEA relates to its use of completely flexible weights. In particular, the basic DEA model allows a weight of zero on any of the inputs or outputs. (Refer to Fig. 5.3 as an illustration.) A zero-valued weight in the optimal solution means that the corresponding input or output has effectively been discarded in the evaluation. In other words, the analysis can completely ignore a dimension on which the DMU happens to be relatively unproductive. This may seem unfair, especially since the inputs and outputs are usually selected for their strategic importance, but it is consistent with the goal of finding weights that place the DMU in the best possible light.

In response, some analysts suggest imposing a lower bound on each of the weights, ensuring that each output dimension receives at least some weight in the overall evaluation. Choosing a suitable lower bound is difficult, however, because of the flexibility available in scaling performance data. (Recall the earlier discussion of indexed values.) A more uniform approach is to impose a lower bound on the product of performance measure and weight. For any input dimension, the product of its input value and its weight is sometimes called the virtual input on that dimension; similarly, for any output dimension, the product of its output value and its weight is sometimes called the virtual output. The virtual outputs are the components of the efficiency measure, and we can easily require that each component account for at least some minimal portion of the efficiency, such as 10%. In the analysis of Branch 1 (see Fig. 5.10), we compute the virtual outputs for each performance dimension in cells D14 and E14 and then add constraints forcing these values to be at least 10% (as specified in cell F14). With these lower bounds added, it is no longer possible to place all the weight on just one dimension, as was done previously.
As shown in Figure 5.10, the imposition of a 10% lower bound for the contribution from each dimension reduces the efficiency rating for Branch 1 to 92.7% when the model is optimized. As the example illustrates, when we impose additional requirements, we may turn efficient DMUs into inefficient ones.
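The virtual-output lower bound translates into extra linear constraints: for each output r, require the virtual output u_r · y_rk to be at least 10% of the total weighted output. A sketch, again using the restaurant data of Table 5.3 and `scipy.optimize.linprog`; the 10% floor mirrors the kind of bound described for cell F14, but these numbers are not those of the Branch 1 example.

```python
import numpy as np
from scipy.optimize import linprog

# Restaurant data from Table 5.3 (inputs: hours, staff, supplies;
# outputs: profit, share, growth).
X = np.array([[96, 16, 850], [110, 22, 1400], [100, 18, 1200],
              [125, 25, 1500], [120, 24, 1600]], dtype=float)
Y = np.array([[3800, 25, 8.0], [4600, 32, 8.5], [4400, 35, 8.0],
              [6500, 30, 10.0], [6000, 28, 9.0]], dtype=float)

def dea_with_virtual_bounds(k, X, Y, floor=0.10):
    """Efficiency of DMU k when every virtual output u_r * y_rk must
    contribute at least `floor` of the total weighted output."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.concatenate([-Y[k], np.zeros(m)])        # maximize u.y_k
    A_eq = [np.concatenate([np.zeros(s), X[k]])]    # v.x_k = 1
    rows = [np.concatenate([Y[j], -X[j]]) for j in range(n)]
    for r in range(s):
        # floor * (u.y_k) - u_r * y_rk <= 0, i.e. virtual output r
        # must be at least `floor` of the efficiency score.
        row = np.concatenate([floor * Y[k], np.zeros(m)])
        row[r] -= Y[k, r]
        rows.append(row)
    res = linprog(c, A_ub=np.array(rows), b_ub=np.zeros(len(rows)),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + m))
    return -res.fun
```

Because the floor is stated as a fraction of the weighted-output total rather than as an absolute bound on the weights themselves, it is unaffected by how the performance data happen to be scaled, which is exactly the difficulty with bounding the raw weights.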
A related criticism is that the weight on an output dimension known to be strategically important may be positive but still quite small. In this situation, it is possible to add a constraint to the model forcing the virtual output of the important measure to be greater than the virtual outputs of the other measures. These kinds of additional constraints may improve the logic of the model, although they sacrifice some of its transparency.
Another technical criticism relates to the fact that a DEA evaluation often produces several efficient DMUs, and it would be satisfying to have a tie-breaking mechanism for distinguishing among the efficient units. One way to break ties is to drop, when the kth DMU is the subject of evaluation, the constraint that limits its own efficiency ratio to 1.0. When we do so, we tend to get some efficiencies above 1.0, and ties in the performance metric become much less likely. Another response is more complicated but perhaps more equitable. The evaluation of the kth DMU produces a set of optimal weights that are, presumably, as favorable as possible to unit k; call these weights "price set k." To evaluate DMU k, we compute the value of its outputs under each of the price sets (price set 1, price set 2, and so on) and then average those values. The DMUs are ranked on this average, which is called the cross-efficiency. This method makes ties less likely but involves more computation.
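The cross-efficiency calculation can be sketched as follows: solve one DEA model per DMU to obtain its price set, then score every DMU under every price set and average. This again uses the Table 5.3 restaurant data and `scipy.optimize.linprog`, as an illustration rather than a reproduction of any figure in the chapter.

```python
import numpy as np
from scipy.optimize import linprog

# Restaurant data from Table 5.3.
X = np.array([[96, 16, 850], [110, 22, 1400], [100, 18, 1200],
              [125, 25, 1500], [120, 24, 1600]], dtype=float)
Y = np.array([[3800, 25, 8.0], [4600, 32, 8.5], [4400, 35, 8.0],
              [6500, 30, 10.0], [6000, 28, 9.0]], dtype=float)
n, m = X.shape
s = Y.shape[1]

def price_set(j):
    """Optimal (u, v) weights from evaluating DMU j: 'price set j'."""
    c = np.concatenate([-Y[j], np.zeros(m)])
    A_eq = [np.concatenate([np.zeros(s), X[j]])]
    res = linprog(c, A_ub=np.hstack([Y, -X]), b_ub=np.zeros(n),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + m))
    return res.x[:s], res.x[s:]

prices = [price_set(j) for j in range(n)]
# Cross-efficiency of DMU k: its weighted-output-to-weighted-input
# ratio, averaged over every DMU's price set.
cross = [float(np.mean([(u @ Y[k]) / (v @ X[k]) for u, v in prices]))
         for k in range(n)]
print(cross)
```

One caveat worth noting: when a DEA model has multiple optimal weight vectors, the price set returned by the solver is not unique, so cross-efficiency scores can depend slightly on the solver's choice among alternative optima.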
Although we started with a small one-input/one-output example and then moved on to larger examples, it does not follow that a DEA model should be built with as many inputs and outputs as possible. In fact, there is good reason to limit their number: a large number of inputs and outputs tends to make nearly every unit appear efficient. A judicious choice of inputs and outputs therefore retains the power of DEA while limiting the number of DMUs that attain the maximum efficiency rating. The literature recommends that the number of DMUs be at least two or three times the total number of inputs and outputs. In practice, it makes sense to limit consideration to those inputs and outputs that are broadly considered to be of strategic importance to the units being evaluated.
The DEA model accommodates multiple inputs and outputs and makes no assumption about the functional form relating outputs to inputs. In other words, any type of production function is permissible. However, the efficiency measure itself and the construction of an HCU both involve some assumptions. First, the comparison unit is defined by assuming that weighted averages of efficient units are feasible operating possibilities. In other words, there are no major “lumpy” relationships in the production function. Second, the comparison unit is interpreted as the output potential that could be achieved if the unit under consideration were to maintain its mix of inputs and outputs. Here, DEA assumes constant returns to scale. More advanced variations of the DEA model allow for alternative assumptions about returns to scale.
The DEA model represents a fifth type of linear programming model, along with allocation, covering, blending, and network models covered in Chapters 2 and 3. In a strict sense, the DEA model is a variation on the allocation type, but because its use is so specialized, we have given it separate treatment here.
For the purposes of spreadsheet implementation, the DEA model should be built with the kind of flexibility exemplified by Figure 5.5. That is, the analysis of every DMU can be done in the same worksheet, simply by updating a single cell. A documented analysis is likely to need a separate location to keep a summary of the linear programming results, as illustrated in the worksheet of Figure 5.6. In addition, to identify and analyze the properties of an HCU, we also need to obtain the Sensitivity Report, making use of its shadow price information as shown in Figure 5.8.
The DEA model was introduced in the 1970s, and for many years, it was a topic known mainly to a small group of researchers. Their work extended the theory underlying DEA, made progress enhancing the computational aspects of the analysis, and reported on selected applications. Over a period of many years, corporations and consultants have slowly discovered DEA and have begun to use it more frequently. As application catches up with theory, the DEA model promises to find more significant use in the future.
Firm | Acres | Gallons
Firm 1 | 100 | 10
Firm 2 | 110 | 15
Firm 3 | 122 | 20
Firm 4 | 115 | 23
Firm 5 | 96 | 30
Firm | Acres | Juice | Sauce
Firm 1 | 100 | 10 | 31
Firm 2 | 110 | 15 | 25
Firm 3 | 122 | 20 | 30
Firm 4 | 115 | 23 | 23
Firm 5 | 96 | 30 | 20
Table 5.2 Inputs and Outputs for Seven Hospitals
Input Measures
Hospital | Full-Time Equivalent Nonphysicians | Supply Expense ($1000s) | Bed-Days Available (1000s)
A | 310.0 | 134.60 | 116.00
B | 278.5 | 114.30 | 106.80
C | 165.6 | 131.30 | 65.52
D | 250.0 | 316.00 | 94.40
E | 206.4 | 151.20 | 102.10
F | 384.0 | 217.00 | 153.70
G | 530.1 | 770.80 | 215.00
Output Measures
Hospital | Patient-Days (65 or Older) (1000s) | Patient-Days (Under 65) (1000s) | Nurses Trained | Interns Trained
A | 55.31 | 49.52 | 291 | 47 |
B | 37.64 | 55.63 | 156 | 3 |
C | 32.91 | 25.77 | 141 | 26 |
D | 33.53 | 41.99 | 160 | 21 |
E | 32.48 | 55.30 | 157 | 82 |
F | 48.78 | 81.92 | 285 | 92 |
G | 58.41 | 119.70 | 111 | 89 |
Table 5.3 Inputs and Outputs for Five Restaurants
Input Measures
Restaurant | Hours of Operation | FTE Staff | Supplies ($)
Jacksonville | 96 | 16 | 850 |
Daytona | 110 | 22 | 1400 |
Gainesville | 100 | 18 | 1200 |
Ocala | 125 | 25 | 1500 |
Orlando | 120 | 24 | 1600 |
Output Measures
Restaurant | Weekly Profit ($) | Market Share (%) | Growth Rate (%)
Jacksonville | 3800 | 25 | 8.0 |
Daytona | 4600 | 32 | 8.5 |
Gainesville | 4400 | 35 | 8.0 |
Ocala | 6500 | 30 | 10.0 |
Orlando | 6000 | 28 | 9.0 |
Table 5.4 Inputs and Outputs for 17 Branch Banks
(Inputs: Labor, Expenses, Space. Outputs: Deposits, Credit, Foreign.)
Branch Code | Labor | Expenses | Space | Deposits | Credit | Foreign
1 | 34,515 | 6,543 | 591 | 268,836 | 9,052 | 11,242 |
2 | 49,960 | 11,830 | 550 | 475,144 | 15,697 | 15,967 |
3 | 20,652 | 3,464 | 427 | 133,020 | 3,696 | 6,937 |
4 | 49,024 | 7,603 | 478 | 355,909 | 12,918 | 16,594 |
5 | 36,923 | 8,723 | 830 | 240,679 | 4,759 | 8,087 |
6 | 28,967 | 4,606 | 474 | 211,183 | 3,188 | 5,621 |
7 | 28,452 | 7,425 | 182 | 147,364 | 5,302 | 40,618 |
8 | 45,911 | 8,013 | 790 | 130,161 | 12,070 | 115,022 |
9 | 26,890 | 14,662 | 447 | 156,828 | 15,102 | 1,336 |
10 | 47,376 | 7,576 | 764 | 297,925 | 16,797 | 12,030 |
11 | 57,913 | 12,035 | 875 | 462,603 | 2,698 | 13,232 |
12 | 43,477 | 7,255 | 1109 | 300,976 | 12,299 | 24,368 |
13 | 49,786 | 10,909 | 405 | 233,178 | 6,248 | 4,701 |
14 | 30,045 | 4,264 | 479 | 110,976 | 8,675 | 19,796 |
15 | 56,579 | 8,895 | 840 | 363,048 | 6,370 | 10,788 |
16 | 43,824 | 12,690 | 801 | 130,219 | 20,417 | 28,133 |
17 | 33,823 | 4,143 | 381 | 146,804 | 47,508 | 21,856 |
Analyze Exercise 5.4 using this approach, calculating the revised efficiency rating for each of the DMUs.
Analyze Exercise 5.5 using this approach, calculating the revised efficiency rating for each of the DMUs.