Advanced Formula Concepts
The previous chapter’s formula examples may have appeared complicated at first, but you should be able to use them with time, practice, and patience. If you followed the advice at the start of Chapter 9, which was to work through formulas with techniques like Excel’s Evaluate Formula feature, you should find them easier to understand.
In this chapter, you will investigate how these formulas are applied. Specifically, I will cover the following:
Filtering and Highlighting
Following what you learned about ones and zeros in Chapter 9, you can use formulas for filtering results. In Chapter 9, you employed a mechanism to filter even and odd values using Booleans. Highlighting, as it turns out, isn’t much different than filtering. Let’s take a look.
Figure 10-1 shows the tables I’ve set up for the example (download Chapter10Example1.xlsx from the project files to follow along). On the left is the raw data. In the middle is the criteria that you want to filter, and on the right are some conditional tables to help know which items fit the criteria you would like to display. The information in the middle is linked to the front screen, which I’ll get to in a moment.
Figure 10-1. An example table to demonstrate applied formula concepts
Let’s take a better look at the table on the right. For the NPV column, let’s set up a conditional to compare whether the selected NPV is greater than the item in the current row of the Raw Data Table (Figure 10-2).
Figure 10-2. The Raw Data Table
Then do the same comparisons for Portfolio Risk and Project Lead. See Figures 10-3 and 10-4.
Figure 10-3. You’re testing for what level of Portfolio Risk is selected
Figure 10-4. You’re testing for which Project Lead has been selected
In the last two columns, you identify which projects you want to be highlighted. Since you’re looking for projects whose values come at the intersection of your criteria, you’ll test if each condition is met, and you’ll use AND for that (Figure 10-5).
Figure 10-5. Testing when all three conditions are met
Finally, for extra help, you’ll include the Project’s index in column P. This isn’t itself necessary to complete your work, but sometimes an extra column of information can help, provided you have enough room for it.
All of this work goes to help the highlighting mechanism developed on the front screen. Click the Dashboard (incomplete) tab in example file to see what I’m talking about (shown in Figure 10-6).
Figure 10-6. The Dashboard (incomplete) tab
Now take a look at Column A. Column A tests whether the current index in Column B is the same as the index returned from the Project List tab. Essentially, the result is the same as the Show on Front field in Column O on the Project List tab (Figure 10-7).
Figure 10-7. TRUE/FALSE on the dashboard corresponds to backend calculations
Conditional Highlighting Using Formulas
In this section, I’ll talk about how to add condition highlighting to the spreadsheet. Let’s do the following steps.
Figure 10-8. Selecting cells C8:C27
Figure 10-9. The Edit Formatting Rule dialog box
If you’ve performed these steps correctly, you should see several items highlighted in your list (see Figure 10-10). To bring more emphasis to these items—and to deemphasize the items outside your selection—highlight the table range again, C8:C27, and set the font to a gray color that is lighter than black but still readable. I chose the darkest gray at the bottom of the first color column. Finally, you’ll want to get rid of those conditional formulas in Column A. The easiest way to do this is to hide the entire column by right-clicking Column A and selecting Hide. Alternatively, I’ve simply set the font of the condition formulas to white. Personally, I like having the extra margin of white space on the left side of the screen.
Figure 10-10. A list of highlighted items
One last thought before moving on: I could have created another conditional format formula testing if A8=FALSE and then colored everything gray based on that. To me, that’s extra work. Conditional formats are volatile actions. Consider this: no instruction is executed to set the table items that are FALSE to be grayed out if you’ve already set them to gray by default. Remember to always be on the lookout for shortcuts.
Selecting
Selection is the process of returning only certain information (thinking of selecting from a group). Selecting is similar to filtering and highlighting, except that selecting only returns the information you’re interested in. Filtering, for example, simply hides the information you’re not interested in. Highlighting does the same as filtering through emphasizing and deemphasizing certain items. Selection, on the other hand, always contains only the complete set of information you’re interested in. Nothing more or less.
Open example file Chapter10Example2.xlsx. In this example, you’re going to create a range that can grow and shrink dynamically based on what you want to return. In this way, you’ll be creating the mechanism that selects the portion to return. Go to the Project List tab, and note the column of zeros you’ve created, as shown in Figure 10-11.
Figure 10-11. The Project List tab
If you recall from the previous chapter, the zeros indicate projects you don’t want to return. Alternatively, the numbers indicate projects you DO want to return. So, what you need to do now is count those projects. I’ve already laid out a spot for this count in cell R3. So go ahead and put this formula into R3:
=COUNTIF($P$3:$P$22,">0")
In the columns next to the box labeled Count-non 0s, set up the column headers as I have in Figure 10-12.
Figure 10-12. Column headers that you will use in the process of developing a selecting mechansim
Now, follow these steps.
=LARGE($P$3:$P$22,I3)
Figure 10-13. Using the LARGE function in the Index location
Note what what’s happening here. You’re using the index you created in column I to pull out the nth largest value from within the range indices that aren’t zero. When you drag down, you’ll have grouped all the indices you’re interested in at the top of the range (Figure 10-14). You should find there are six non-zero items at the top—exactly as the formula predicted.
Figure 10-14. The LARGE function returns the indices of the items you’re interested in at the top of the range
Figure 10-15. Adding the INDEX formula to the Project Name column
In case you’re wondering why you need to do this, remember that INDEX allows you to return one or more cells from within an array; all you must supply are the row(s) or columns(s) you’d like to grab. Because you returned more than a single cell, you had to use Ctrl+Shift+Enter.
Note >Remember, any time you return more than a single cell, you have an array formula. When you have an array formula, you must use Ctrl+Shift+Enter.
In a cell off to the side (I’ve chosen X3), type =OFFSET($V$3,0,0,$R$3), as shown in Figure 10-16.
Figure 10-16. Using OFFSET to create a dynamic side function
Remember how OFFSET works. That fourth argument specifies the height of the offset range to be returned. Here, you don’t actually want the returned range to be moved from cell V3 (which is why you supply a zero in the first two arguments); you simply want V3 to be the starting point and to have the range “grow” (or expand) downward from there.
So, click on X3 and copy the formula now that you know it’s working. Go to Name Manager from the Formulas tab. Click on New. Give it a name like “ProjectList.ReturnSelection” and paste the formula you copied into the Refers To box. Press OK until you’re back at the spreadsheet screen.
Figure 10-17. The Select Data Source dialog box
Figure 10-18. The SERIES function that appears when you click on a chart
Viola! If it worked correctly, you should see a series of columns like in Figure 10-19.
Figure 10-19. A dynamic chart that is automatically linked to your data selections
To see what I’m talking about, right-click the chart again and go to Select Data. Press the Edit button under the Horizontal category. Select the entire range of projects in column U from the Project List worksheet and press OK until you reach the spreadsheet screen (Figure 10-20).
Figure 10-20. The Axis Labels selection box
Now the labels are automatically assigned! Go ahead and mess with the dropdown boxes to see it work in action.
Okay, one last piece before moving on. Go ahead and click one of those columns again in the chart and look at the formula bar. You should see that the range you’ve entered for your labels is now the second argument in the formula box. Just like for the series values, you could have simply entered the label range directly in the formula box. In case you’re interested, here’s how the series formula breaks down:
=SERIES(series_title , series_label_range , series_value_range , series_index)
If you’d like to supply this chart a title directly, go ahead and type a string into that series_title parameter. That last parameter, series_index, holds the current index of the series. If you have multiple series in your chart, setting the series_index will change the series order by inserting the series you’re currently editing at the index you give.
Aggregating
In this section, I’ll talk about aggregation, particularly the formulas you can use for aggregation. I’ll also take a detour into some algebra, but nothing terrible. I promise.
Using SUMPRODUCT for Aggregation
Aggregation is the process of grouping similar items and presenting them as a whole. Excel has several aggregation formulas that you might already use every day including SUM, AVERAGE, and COUNT. If you want to get even more complicated—as if life isn’t already complicated enough!—you could use the SUMIF/SUMIFS functions or COUNTIF/COUNTIFS functions to find the sum and count of multiple ranges of the same length satisfying certain criteria.
Let’s say for the information in Figure 10-21, you were interested in all projects by Larry or Barry in which NPV is greater than 11,000,000 or portfolio risk is low.
Figure 10-21. The Raw Data table containing projects, NPV, portfolio risk, and the project’s lead
To do that, you could use this formula, which isn’t very pretty:
=COUNTIFS(ProjectLead,"Larry",NPV,">11000000")+COUNTIFS(ProjectLead,"Larry",PortfolioRisk,"Low")+COUNTIFS(ProjectLead,"Barry",NPV,">11000000")+COUNTIFS(ProjectLead,"Barry",PortfolioRisk,"Low")
This is because SUMIFS and COUNTIFS test for the intersection of data by themselves. There’s no room for an OR condition in these formulas. But you have alternatives. For example, you could use the SUMPRODUCT formula for this problem, which would look like this:
=SUMPRODUCT(((ProjectLead="Larry")+(ProjectLead="Barry"))*((NPV>11000000)+(PortfolioRisk="Low")))
I know you’re scratching your head, so let’s dig deeper. SUMPRODUCT by its name suggests it was designed for matrix algebra operations. To wit, Microsoft’s definition of SUMPRODUCT is pretty mathematical. Specifically, SUMPRODUCT “multiplies corresponding components in the given arrays, and returns the sum of those products” (my emphasis). But this exactly what’s so great about SUMPRODUCT.
When you write something like (ProjectLead="Barry") you’re turning the range given by ProjectLead into array of TRUE/FALSE based on the supplied condition. That’s from Chapter 9. So something like (ProjectLead="Larry")*(NPV>11000000) is calculated as shown in Figure 10-22.
Figure 10-22. A visual represetation of what’s happening when you use SUMPRODUCT
In a certain sense, you’re performing a query on the data. If you know SQL, the arrangement above could also be written as
SELECT COUNT(ProjectLead)
WHERE ProjectLead = "Larry" AND NPV > 11000000
You’re About To Be FOILed!
OK, I know what you’re thinking, how the heck am I ever going to remember how to write one of those fancy SUMPRODUCT formulas? Well, it all comes down to FOILing, which you might recall from your early days of learning algebra.
At first glance, the series of COUNTIFS functions appears easier to write and understand, even if the formula ends up being much longer. But I’m here to tell you that if you can write a series of COUNTIFS functions, you’re already writing the same formula. No, seriously: I can prove this to you with some simple algebra. So let’s talk FOILing (First, Outside, Inside, Last) from your algebra class. Let’s do it on an expression inside the SUMPRODUCT formula.
So
((ProjectLead="Larry")+(ProjectLead="Barry"))*((NPV>11000000)+(PortfolioRisk="Low"))
=
(ProjectLead="Larry")*(NPV>11000000)
+ (ProjectLead="Larry")*(PortfolioRisk="Low")
+ (ProjectLead="Barry")*(NPV>11000000)
+ (ProjectLead="Barry")*(PortfolioRisk="Low")
Now compare that FOILed expression to series of COUNTIFS functions.
=
COUNTIFS(ProjectLead,"Larry",NPV,">11000000")
+ COUNTIFS(ProjectLead,"Larry",PortfolioRisk,"Low")
+ COUNTIFS(ProjectLead,"Barry",NPV,">11000000")
+ COUNTIFS(ProjectLead,"Barry",PortfolioRisk,"Low")
Here’s the kicker: the plus symbol (+) acts as your OR condition and the multiplication symbol acts as your AND condition. If you think you’ll have trouble remember the plus’s + and multiplication’s *, remember that these symbols aren’t arbitrary, they represent algebraic operations.
Note Remember, for SUMPRODUCT queries, + = OR, * = AND.
If you open Chapter10Example3.xlsx, I’ve placed a summary table on the front page that employs SUMPRODUCT (Figure 10-23).
Figure 10-23. A demonstration of SUMPRODUCT on your dashboard
In this section, I’ll take a few moments to go through a concept I call reusable components. Take a look at the outlined components in Figure 10-24.
Figure 10-24. An example of reusuable componants
Admittedly, these components were not placed with any specific care. I did this on purpose to demonstrate how easily these components can be moved around, as shown in Figure 10-25.
Figure 10-25. A demonstration of how componants can be easily moved around
There was some reformatting required, of course. But if I select the entire region of a table, I’m able to move it somewhere else on the screen without having to update any code or other formulas that refer that area. In addition, if I want to create another table similar to the one above, I can copy and paste the table into another free area on the spreadsheet and update the formulas that make it refer to another desired location. This is what is meant by reusability. And developing reusable components really helps down the road.
The Last Word
In this chapter, you build upon the formulas presented in the previous chapter. You applied what you learned to create the processes of filtering, highlighting, selecting, and aggregation. Finally, you learned about the usefulness of reusable components.