IN THIS CHAPTER
Introduction to Data Visualizations
VBA Methods and Properties for Data Visualizations
Adding Color Scales to a Range
Using Other Conditional Formatting Methods
The data visualization tools were introduced in Excel 2007, and Microsoft made further improvements to them in Excel 2010. Data visualizations appear on a drawing layer that can hold icon sets, data bars, color scales, and now sparklines. In Excel 2010, you have new icon sets and new options for data bars. Unlike with SmartArt graphics, Microsoft exposed the entire object model for the data visualization tools, so you can use VBA to add data visualizations to your reports.
→ See Chapter 17, “Dashboarding with Sparklines in Excel 2010,” for more information about sparklines.
Excel 2010 provides a variety of data visualizations. A description of each appears here, with an example shown in Figure 15.1:
• Data bars—The data bar adds an in-cell bar chart to each cell in a range. The largest numbers have the largest bars, and the smallest numbers have the smallest bars. You can control the bar color as well as the values that should receive the smallest and largest bar. New in Excel 2010, bars can be solid or a gradient. The gradient bars can have a border. In addition, negative bars can appear for the first time.
• Color scales—Excel applies a color to each cell from among a two- or three-color gradient. The two-color gradients are best for reports that are presented in monochrome. The three-color gradients require a presentation in color, but can represent a report in a traditional traffic light color combination of red-yellow-green. You can control the points along the continuum where each color begins, and you can control the two or three colors.
• Icon sets—Excel assigns an icon to each number. Icon sets can contain three icons such as the red, yellow, green traffic lights; four icons; or five icons such as the cell phone power bars. Excel 2010 adds the 3-stars, 3-triangles, and 5-boxes icon sets. With icon sets, you can control the numeric limits for each icon, reverse the order of the icons, or choose to show only the icons.
• Above/below average—Found under the top/bottom rules fly-out menu, these rules make it easy to highlight all of the cells that are above average. You can choose the formatting to apply to the cells. Note in Column G of Figure 15.1 only 30 percent of the cells are above average. Contrast with the top 50 percent in Column I.
• Top/bottom rules—Excel highlights the top or bottom n percent of cells or highlights the top or bottom n cells in a range.
• Duplicate values—Excel highlights any values that are repeated within a dataset. Because the Remove Duplicates command on the Data tab of the Ribbon is so destructive, you might prefer to highlight the duplicates and then intelligently decide which records to delete.
• Highlight cells—The legacy conditional formatting rules such as greater than, less than, between, and text that contains are still available in Excel 2010. The powerful Formula conditions are also available, although you might have to use these less frequently with the addition of the average and top/bottom rules.
All the data visualization settings are managed in VBA with the FormatConditions collection. Conditional formatting has been in Excel since Excel 97. In Excel 2010, Microsoft expanded the FormatConditions object to handle the new visualizations. Whereas legacy versions of Excel would use the FormatConditions.Add method, Excel 2010 offers additional methods such as AddDataBar, AddIconSetCondition, AddColorScale, AddTop10, AddAboveAverage, and AddUniqueValues.
It is possible to apply several different conditional formatting conditions to the same range. For example, you can apply a two-color color scale, an icon set, and a data bar to the same range. Excel includes a Priority property to specify which conditions should be calculated first. Methods such as SetFirstPriority and SetLastPriority ensure that a new format condition is executed before or after all others.
The StopIfTrue property works in conjunction with the Priority property. In the “Using Visualization Tricks” section, later in this chapter, you see how to use the StopIfTrue property on a dummy condition to make other formatting apply only to certain subsets of a range.
Beginning with Excel 2007, the Type property was expanded dramatically. This property was formerly a toggle between xlCellValue and xlExpression, but 13 new types were added in Excel 2007. Table 15.1 shows the valid values for the Type property. Items 3 through 17 were added in Excel 2007.
The Data Bar command adds an in-cell bar chart to each cell in a range. Many charting experts complained to Microsoft about problems in the Excel 2007 data bars. For this reason, Microsoft changed the data bars in Excel 2010 to address these problems.
In Figure 15.2, Cell C37 shows behavior that is new in Excel 2010. Notice that this cell, which has a value of 0, has no data bar at all. In Excel 2007, the smallest value receives a 4-pixel data bar, even if that smallest value is 0. In addition, in Excel 2010 the largest bar in the dataset typically takes up the entire width of the cell.
In Excel 2007, the data bars would end in a gradient that made it difficult to tell where the bar ended. Excel 2010 offers a border around the bar. You can choose to change the color of the border or even to remove the border as shown in Column K of the figure.
Excel 2010 also offers support for negative data bars, as shown in Column G, and for data bars that run right to left, as shown in Cells C43:C45 of Figure 15.2. These allow comparative histograms.
Although all of these are fine improvements, they add complexity to the VBA that is required to create data bars. In addition, you run the risk that your code will use new properties that will be incompatible with Excel 2007.
To add a data bar, you apply the .FormatConditions.AddDataBar method to a range containing your numbers. This method requires no arguments, and it returns an object of the DataBar type.
Once you add the data bar, you will most likely need to change some of its properties. One method of referring to the data bar is to assume that the recently added data bar is the last item in the collection of format conditions. This code would add a data bar, identify the data bar by counting the conditions, and then change the color:
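A minimal sketch of the counting approach described above. The range A1:A10 and the bar color are assumptions chosen for illustration:

```vba
Sub AddDataBarByCount()
    ' Assumption: the numbers are in A1:A10 of the active sheet
    Dim rng As Range
    Dim n As Long

    Set rng = ActiveSheet.Range("A1:A10")

    ' Add the data bar; it is assumed to land last in the collection
    rng.FormatConditions.AddDatabar

    ' Identify the new data bar by counting the conditions
    n = rng.FormatConditions.Count

    ' Change the bar color (illustrative color)
    rng.FormatConditions(n).BarColor.Color = RGB(192, 80, 77)
End Sub
```

This approach relies on no other code reordering the conditions between the two statements.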
A safer way to go is to define an object variable of type DataBar. You can then assign the newly created data bar to the variable:
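The object-variable approach might look like the following sketch. The variable name DB matches the snippets used later in this section; the range, color, and tint value are assumptions:

```vba
Sub AddDataBarWithVariable()
    Dim DB As Databar

    ' AddDatabar returns the new data bar, so capture it directly
    ' (A1:A10 is an assumed range)
    Set DB = Range("A1:A10").FormatConditions.AddDatabar

    ' Work with the data bar through the variable
    DB.BarColor.Color = RGB(0, 112, 192)   ' illustrative color
    DB.BarColor.TintAndShade = -0.15       ' slightly darker; valid values run from -1 to 1
End Sub
```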
When specifying colors for the data bar or the border, you should use the RGB function to assign a color. You can modify the color by making it darker or lighter using the TintAndShade property. Valid values are from -1 to 1. A value of 0 means no modification. Positive values make the color lighter. Negative values make the color darker.
By default, Excel assigns the shortest data bar to the minimum value and the longest data bar to the maximum value. If you want to override the defaults, use the Modify method of either the MinPoint or MaxPoint property, and specify one of the types shown in Table 15.2. Types 0, 3, 4, and 5 require a value.
Use the following code to have the smallest bar assigned to values of 0 and below:
DB.MinPoint.Modify _
    Newtype:=xlConditionValueNumber, NewValue:=0
To have every value in the top 20 percent receive the largest bar, use this code:
DB.MaxPoint.Modify _
    Newtype:=xlConditionValuePercent, NewValue:=80
An interesting alternative is to show only the data bars and not the value. To do this, use this code:
DB.ShowValue = False
To show negative data bars in Excel 2010, use this line:
DB.AxisPosition = xlDataBarAxisAutomatic
Once you allow negative data bars, you can specify an axis color, a negative bar color, and a negative bar border color. The following code, which creates the data bars shown in Column C of Figure 15.3, shows how to change the various colors.
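A sketch of how the axis and negative-bar colors might be set. The range C2:C11 and every color are assumptions, so the output will not necessarily match the figure:

```vba
Sub DataBarsWithNegatives()
    Dim DB As Databar

    ' Assumed range; clear old rules before adding the data bar
    With Range("C2:C11")
        .FormatConditions.Delete
        Set DB = .FormatConditions.AddDatabar
    End With

    With DB
        ' Positive bars: gradient fill with a solid border
        .BarColor.Color = RGB(0, 176, 80)
        .BarFillType = xlDataBarFillGradient
        .BarBorder.Type = xlDataBarBorderSolid
        .BarBorder.Color.Color = RGB(0, 97, 0)

        ' Allow negative bars and color the axis
        .AxisPosition = xlDataBarAxisAutomatic
        .AxisColor.Color = RGB(0, 0, 0)

        ' Negative bars: separate fill and border colors
        With .NegativeBarFormat
            .ColorType = xlDataBarColor
            .Color.Color = RGB(255, 0, 0)
            .BorderColorType = xlDataBarColor
            .BorderColor.Color = RGB(128, 0, 0)
        End With
    End With
End Sub
```

Because these properties are new in Excel 2010, this macro is not expected to run in Excel 2007.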
In Excel 2010, you have a choice of showing a gradient or a solid bar. To show a solid bar, use the following:
DB.BarFillType = xlDataBarFillSolid
The following code sample produces the solid bars shown in Column E of Figure 15.3:
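A minimal sketch of a solid-bar macro. The range E2:E11, the color, and the choice to remove the border are assumptions:

```vba
Sub SolidDataBars()
    Dim DB As Databar

    ' Assumed range; clear old rules before adding the data bar
    With Range("E2:E11")
        .FormatConditions.Delete
        Set DB = .FormatConditions.AddDatabar
    End With

    With DB
        .BarColor.Color = RGB(0, 0, 255)        ' illustrative color
        .BarFillType = xlDataBarFillSolid       ' solid instead of gradient
        .BarBorder.Type = xlDataBarBorderNone   ' no border around the bar
    End With
End Sub
```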
To allow the bars to go right to left, use this code:
DB.Direction = xlRTL ' Right to Left
Color scales can be added in either two-color or three-color scale varieties. Figure 15.4 shows the available settings in the Excel user interface for a color scale using three colors.
Like the data bar, a color scale is applied to a range object using the AddColorScale method. You should specify a ColorScaleType of either 2 or 3 as the only argument of the AddColorScale method.
Next, you can indicate a color and tint for both or all three of the color scale criteria. You can also specify if the shade is applied to the lowest value, highest value, a particular value, a percentage, or at a percentile using the values shown previously in Table 15.2.
The following code generates a three-color color scale in Range A1:A10:
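One way such a macro could look. The red-yellow-green colors, the tint, and the choice of the 50th percentile as the midpoint are assumptions:

```vba
Sub Add3ColorScale()
    Dim CS As ColorScale

    With Range("A1:A10")
        .FormatConditions.Delete
        ' 3 requests a three-color scale
        Set CS = .FormatConditions.AddColorScale(ColorScaleType:=3)
    End With

    ' Lowest value: red, lightened slightly
    With CS.ColorScaleCriteria(1)
        .Type = xlConditionValueLowestValue
        .FormatColor.Color = RGB(255, 0, 0)
        .FormatColor.TintAndShade = 0.25
    End With

    ' Midpoint at the 50th percentile: yellow
    With CS.ColorScaleCriteria(2)
        .Type = xlConditionValuePercentile
        .Value = 50
        .FormatColor.Color = RGB(255, 255, 0)
    End With

    ' Highest value: green
    With CS.ColorScaleCriteria(3)
        .Type = xlConditionValueHighestValue
        .FormatColor.Color = RGB(0, 255, 0)
    End With
End Sub
```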
Icon sets in Excel come with three, four, or five different icons in the set. Figure 15.5 shows the settings for an icon set with five different icons.
To add an icon set to a range, use the AddIconSetCondition method. No arguments are required. You can then adjust three properties that apply to the icon set. You then use several additional lines of code to specify the icon set in use and the limits for each icon.
After adding the icon set, you can control whether the icon order is reversed, whether Excel shows only the icons, and then specify one of the 20 built-in icon sets:
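A minimal sketch of those three settings. The range A1:C10 and the choice of the xl5Arrows icon set are assumptions:

```vba
Sub AddIconSetBasics()
    Dim ICS As IconSetCondition

    ' Assumed range; clear old rules before adding the icon set
    With Range("A1:C10")
        .FormatConditions.Delete
        Set ICS = .FormatConditions.AddIconSetCondition
    End With

    With ICS
        .ReverseOrder = False    ' keep the default icon order
        .ShowIconOnly = False    ' show the value next to the icon
        ' Pick one of the built-in sets from the workbook's IconSets collection
        .IconSet = ActiveWorkbook.IconSets(xl5Arrows)
    End With
End Sub
```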
It is somewhat curious that the IconSets collection is a property of the active workbook. This seems to indicate that in future versions of Excel, new icon sets might be available.
Table 15.3 shows the complete list of icon sets.
After specifying the type of icon set, you can then specify ranges for each icon within the set. By default, the first icon starts at the lowest value. You can adjust the settings for each of the additional icons in the set:
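The limits might be set as in the following sketch, which assumes a five-icon set on A1:C10 and evenly spaced numeric breakpoints of 20, 40, 60, and 80 (all illustrative):

```vba
Sub SetIconLimits()
    Dim ICS As IconSetCondition
    Dim i As Long

    With Range("A1:C10")
        .FormatConditions.Delete
        Set ICS = .FormatConditions.AddIconSetCondition
    End With
    ICS.IconSet = ActiveWorkbook.IconSets(xl5Arrows)

    ' Icon 1 starts at the lowest value by default,
    ' so only icons 2 through 5 need a lower limit
    For i = 2 To 5
        With ICS.IconCriteria(i)
            .Type = xlConditionValueNumber
            .Value = (i - 1) * 20          ' 20, 40, 60, 80
            .Operator = xlGreaterEqual
        End With
    Next i
End Sub
```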
Valid values for the Operator property are xlGreater or xlGreaterEqual.
With VBA, it is easy to create overlapping ranges such as icon 1 from 0 to 50 and icon 2 from 30 to 90. Even though the Edit Formatting Rule dialog box will prevent overlapping ranges, VBA allows them. However, keep in mind that your icon set will display unpredictably if you create invalid ranges.
If you use an icon set or a color scale, Excel applies a color to all cells in the dataset. Two tricks in this section enable you to apply an icon set to only a subset of the cells or to apply two different color data bars to the same range. The first trick is available in the user interface, but the second trick is only available in VBA.
Sometimes, you might want to apply only a red X to the bad cells in a range. This is tricky to do in the user interface.
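In the user interface, follow these steps to apply a red X to values greater than 80:
1. Add a three-symbol icon set to the range, and edit the rule so that the red X applies to values greater than 80. You now have a mix of all three icons, as shown in Figure 15.6.
2. Add a second conditional formatting rule that matches cells with values less than or equal to 80. Because you don’t want any icons for these values, do not specify any special formatting for the cells that match this rule.
3. Make sure the new rule is evaluated first, and set its Stop If True option to true. This prevents Excel from getting to the icon set rule for any cell with a value of 80 or less. The result is that only cells greater than 80 appear with a red X, as shown in Figure 15.7.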
The code to create this effect in VBA is straightforward. A great deal of the code is spent making sure that the icon set has the red X symbols on the cells greater than 80.
You will use the FormatConditions.Add method to add the second condition. However, you need to make sure this condition is executed first. For this reason, you need to use the SetFirstPriority method to move the new condition to the top of the list. The final step is to turn on the StopIfTrue property.
The code to highlight values greater than 80 with a red X is shown here:
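A sketch of the technique, assuming the data is in A1:C10 and using the xl3Symbols2 icon set. The 66 limit for the middle icon is arbitrary, because that icon is never displayed once the dummy rule is in place:

```vba
Sub RedXOnlyAbove80()
    Dim ICS As IconSetCondition

    With Range("A1:C10")
        .FormatConditions.Delete

        ' Step 1: icon set, reversed so the red X lands on the highest band
        Set ICS = .FormatConditions.AddIconSetCondition
        With ICS
            .ReverseOrder = True
            .ShowIconOnly = False
            .IconSet = ActiveWorkbook.IconSets(xl3Symbols2)
        End With
        With ICS.IconCriteria(2)
            .Type = xlConditionValueNumber
            .Value = 66
            .Operator = xlGreaterEqual
        End With
        With ICS.IconCriteria(3)
            .Type = xlConditionValueNumber
            .Value = 80
            .Operator = xlGreater
        End With

        ' Step 2: dummy rule for values of 80 or less, with no formatting
        .FormatConditions.Add(Type:=xlCellValue, _
            Operator:=xlLessEqual, Formula1:="=80").SetFirstPriority

        ' Step 3: the dummy rule is now item 1; stop there when it matches
        .FormatConditions(1).StopIfTrue = True
    End With
End Sub
```

The dummy rule is referenced as item 1 after the call to SetFirstPriority rather than through a variable, as a precaution against the reference shifting when the rules are reordered.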
This trick is particularly cool because it can only be achieved with VBA. Say that values above 90 are acceptable and below 90 indicate trouble. You would like acceptable values to have a green bar and others to have a red bar.
Using VBA, you first add the green data bars. Then, without deleting the format condition, you add red data bars.
In VBA, every format condition has a Formula property that defines whether the condition is displayed for a given cell. Therefore, the trick is to write a formula that defines when the green bars are displayed. When the formula is not True, the red bars are allowed to show through.
In Figure 15.8, the effect is being applied to Range A1:D10. You need to write the formula in A1 style, as if it applies to the top-left corner of the selection. The formula needs to evaluate to True or False. Excel automatically copies the formula to all the cells in the range. The formula for this condition is =IF(A1>90,True,False).
The formula is evaluated relative to the current cell pointer location. Even though it is not usually necessary to select cells before adding a FormatCondition, in this case, selecting the range ensures that the formula will work.
The following code creates the two-color data bars:
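A sketch of the two-bar technique on Range A1:D10, using the formula given above. The exact colors are assumptions:

```vba
Sub TwoColorDataBars()
    Dim DBGreen As Databar
    Dim DBRed As Databar

    With Range("A1:D10")
        ' Select the range so the relative formula is anchored at A1
        .Select
        .FormatConditions.Delete

        ' First condition: green bars
        Set DBGreen = .FormatConditions.AddDatabar
        DBGreen.BarColor.Color = RGB(0, 255, 0)

        ' Second condition, added without deleting the first: red bars
        Set DBRed = .FormatConditions.AddDatabar
        DBRed.BarColor.Color = RGB(255, 0, 0)

        ' Show the green bars only where the value is above 90;
        ' elsewhere the red bars show through
        DBGreen.Formula = "=IF(A1>90,True,False)"
    End With
End Sub
```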
The Formula property works for all the conditional formats, which means you could potentially create some obnoxious combinations of data visualizations. In Figure 15.9, five different icon sets are combined in a single range. No one will be able to figure out whether a red flag is worse than a gray down arrow. Even so, this ability opens interesting combinations for those with a little creativity.
Although the icon sets, data bars, and color scales get most of the attention, there are still plenty of other uses for conditional formatting.
The remaining examples in this chapter show some of the prior conditional formatting rules and some of the new methods available.
Use the AddAboveAverage method to format cells that are above or below average. After adding the conditional format, specify whether the AboveBelow property is xlAboveAverage or xlBelowAverage.
The following two macros highlight cells above and below average:
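The two macros might be sketched as follows. Both act on the current selection, and the fill colors are assumptions:

```vba
Sub FormatAboveAverage()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.AddAboveAverage
            .AboveBelow = xlAboveAverage
            .Interior.Color = RGB(255, 0, 0)   ' illustrative fill
        End With
    End With
End Sub

Sub FormatBelowAverage()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.AddAboveAverage
            .AboveBelow = xlBelowAverage
            .Interior.Color = RGB(255, 255, 0) ' illustrative fill
        End With
    End With
End Sub
```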
Four of the choices on the Top/Bottom Rules fly-out menu are controlled with the AddTop10 method. After you add the format condition, you need to set three properties that control how the condition is calculated:
• TopBottom—Set this to either xlTop10Top or xlTop10Bottom.
• Rank—Set this to 5 for the top 5, 6 for the top 6, and so on.
• Percent—Set this to False if you want the top 10 items. Set this to True if you want the top 10 percent of the items.
The following code highlights top or bottom cells:
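A sketch of both variations, acting on the current selection. The count is set through the Rank property of the object that AddTop10 returns; the counts and colors here are assumptions:

```vba
Sub FormatTop10Items()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.AddTop10
            .TopBottom = xlTop10Top
            .Rank = 10            ' the top 10 ...
            .Percent = False      ' ... items, not percent
            .Interior.Color = RGB(0, 255, 0)
        End With
    End With
End Sub

Sub FormatBottom5Percent()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.AddTop10
            .TopBottom = xlTop10Bottom
            .Rank = 5             ' the bottom 5 ...
            .Percent = True       ' ... percent of the cells
            .Interior.Color = RGB(255, 0, 0)
        End With
    End With
End Sub
```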
The Remove Duplicates command on the Data tab of the Ribbon is a destructive command. You might want to mark the duplicates without removing them. If so, the AddUniqueValues method marks the duplicate or unique cells.
After calling the method, set the DupeUnique property to either xlUnique or xlDuplicate.
As I have ranted about in Excel 2010 In Depth (Que, ISBN 9780789743084), I do not really like either of these options. Choosing duplicate values marks both cells that contain the duplicate, as shown in Column A of Figure 15.10. For example, both A2 and A8 are marked, when A8 is really the only duplicate value.
Choosing unique values marks only the cells that do not have a duplicate, as shown in Column B of Figure 15.10. This leaves several cells unmarked. For example, none of the cells containing 17 is marked.
As any data analyst knows, the truly useful option would have been to mark the first unique value. In this wishful state, Excel would mark one instance of each unique value. In this case, the 17 in E2 would be marked, but any subsequent cells that contain 17 such as E8, would remain unmarked.
The code to mark duplicates or unique values is shown here:
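A minimal sketch of both options, applied to the current selection with assumed fill colors:

```vba
Sub FormatDuplicates()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.AddUniqueValues
            .DupeUnique = xlDuplicate   ' marks every cell whose value repeats
            .Interior.Color = RGB(255, 0, 0)
        End With
    End With
End Sub

Sub FormatUniques()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.AddUniqueValues
            .DupeUnique = xlUnique      ' marks only values that occur once
            .Interior.Color = RGB(255, 255, 0)
        End With
    End With
End Sub
```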
To see a demo of marking duplicates, search for Excel VBA 15 at YouTube.
The value conditional formats have been around for several versions of Excel. Use the Add method with the following arguments:
• Type—In this section, the type will be xlCellValue.
• Operator—Can be xlBetween, xlEqual, xlGreater, xlGreaterEqual, xlLess, xlLessEqual, xlNotBetween, xlNotEqual.
• Formula1—Formula1 is used with each of the operators specified to provide a numeric value.
• Formula2—This is used for xlBetween and xlNotBetween.
The following code sample highlights cells based on their values:
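A sketch showing one operator that needs only Formula1 and one that needs both formulas. The limits of 50, 100, and 200 and the colors are assumptions:

```vba
Sub FormatBetween50And100()
    With Selection
        .FormatConditions.Delete
        ' xlBetween uses both Formula1 and Formula2
        With .FormatConditions.Add(Type:=xlCellValue, _
                Operator:=xlBetween, Formula1:="=50", Formula2:="=100")
            .Interior.Color = RGB(255, 0, 0)
        End With
    End With
End Sub

Sub FormatGreaterThan200()
    With Selection
        .FormatConditions.Delete
        ' Single-limit operators use Formula1 only
        With .FormatConditions.Add(Type:=xlCellValue, _
                Operator:=xlGreater, Formula1:="=200")
            .Interior.Color = RGB(0, 255, 0)
        End With
    End With
End Sub
```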
When you are trying to highlight cells that contain a certain bit of text, you will use the Add method, the xlTextString type, and an operator of xlBeginsWith, xlContains, xlDoesNotContain, or xlEndsWith.
The following code highlights all cells that contain a capital letter A:
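A minimal sketch, applied to the current selection. The text to find is passed in the String argument and the operator in the TextOperator argument of the Add method; the fill color is an assumption:

```vba
Sub FormatContainsA()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.Add(Type:=xlTextString, _
                String:="A", TextOperator:=xlContains)
            .Interior.Color = RGB(255, 0, 0)
        End With
    End With
End Sub
```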
The date conditional formats were new in Excel 2007. The list of available date operators is a subset of the date operators available in the new pivot table filters. Use the Add method, the xlTimePeriod type, and one of these DateOperator values: xlYesterday, xlToday, xlTomorrow, xlLastWeek, xlLast7Days, xlThisWeek, xlNextWeek, xlLastMonth, xlThisMonth, xlNextMonth.
The following code highlights all dates in the past week:
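A minimal sketch, applied to the current selection. It assumes that "the past week" means the previous calendar week (xlLastWeek); substitute xlLast7Days for a rolling seven-day window:

```vba
Sub FormatDatesLastWeek()
    With Selection
        .FormatConditions.Delete
        With .FormatConditions.Add(Type:=xlTimePeriod, _
                DateOperator:=xlLastWeek)
            .Interior.Color = RGB(255, 0, 0)   ' illustrative fill
        End With
    End With
End Sub
```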
Buried deep within the Excel interface are options to format cells that contain blanks, contain errors, do not contain blanks, or do not contain errors. If you use the macro recorder, Excel uses the complicated xlExpression version of conditional formatting. For example, to look for a blank, Excel tests whether =LEN(TRIM(A1))=0 is true. Instead, you can use any of these four self-explanatory types. You are not required to use any other arguments with these new types:
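The four types are xlBlanksCondition, xlErrorsCondition, xlNoBlanksCondition, and xlNoErrorsCondition. A sketch that uses two of them on the current selection, with assumed colors:

```vba
Sub HighlightBlanksAndErrors()
    With Selection
        .FormatConditions.Delete

        ' Blank cells: no Operator or Formula arguments are needed
        .FormatConditions.Add(Type:=xlBlanksCondition) _
            .Interior.Color = RGB(255, 255, 0)

        ' Cells containing an error value
        .FormatConditions.Add(Type:=xlErrorsCondition) _
            .Interior.Color = RGB(255, 0, 0)

        ' xlNoBlanksCondition and xlNoErrorsCondition work the same way
    End With
End Sub
```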
The most powerful conditional format is still the xlExpression type. In this type, you provide a formula for the active cell that evaluates to True or False. Make sure to write the formula with relative or absolute references so that the formula will be correct when Excel copies the formula to the remaining cells in the selection.
An infinite number of conditions can be identified with a formula. Two popular conditions are shown here.
In Column A of Figure 15.11, you would like to highlight the first occurrence of each value in the column. The highlighted cells will then contain a complete list of the unique numbers found in the column.
The macro should select Cells A1:A15. The formula should be written to return a True or False value for Cell A1. Because Excel logically copies this formula to the entire range, a careful combination of relative and absolute references should be used.
The formula can use the COUNTIF function. Check to see how many times the range from A$1 to A1 contains the value A1. If the result is equal to 1, the condition is True, and the cell is highlighted. The first formula is =COUNTIF(A$1:A1,A1)=1. As the formula is copied down to, say, A12, the formula changes to =COUNTIF(A$1:A12,A12)=1.
The following macro creates the formatting shown in Column A of Figure 15.11:
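A sketch of such a macro, using the range and formula given above. The fill color is an assumption:

```vba
Sub HighlightFirstUnique()
    With Range("A1:A15")
        ' Select first so the relative references are anchored at A1
        .Select
        .FormatConditions.Delete
        With .FormatConditions.Add(Type:=xlExpression, _
                Formula1:="=COUNTIF(A$1:A1,A1)=1")
            .Interior.Color = RGB(255, 0, 0)
        End With
    End With
End Sub
```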
Another example of a formula-based condition is when you want to highlight the entire row of a dataset in response to a value in one column. Consider the dataset in Cells D2:F15 of Figure 15.11. If you want to highlight the entire row that contains the largest sale, you select Cells D2:F15 and write a formula that works for Cell D2: =$F2=MAX($F$2:$F$15). The code required to format the row with the largest sales value is as follows:
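A sketch using the range and formula given above, with an assumed fill color. The mixed reference $F2 keeps every cell in a row looking at column F of its own row:

```vba
Sub HighlightRowWithLargestSale()
    With Range("D2:F15")
        ' Select first so the formula is written relative to D2
        .Select
        .FormatConditions.Delete
        With .FormatConditions.Add(Type:=xlExpression, _
                Formula1:="=$F2=MAX($F$2:$F$15)")
            .Interior.Color = RGB(0, 255, 0)
        End With
    End With
End Sub
```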
NumberFormat Property
In legacy versions of Excel, a cell that matched a conditional format could have a particular font, font color, border, or fill pattern. Starting in Excel 2007, you can also specify a number format. This can prove useful for selectively changing the number format used to display the values.
For example, you might want to display numbers above 999 in thousands, numbers above 999,999 in hundred thousands, and numbers above 9 million in millions.
If you turn on the macro recorder and attempt to record setting the conditional format to a custom number format, the Excel 2007 VBA macro recorder actually records the action of executing an XL4 macro! Skip the recorded code and use the NumberFormat property as shown here:
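A sketch of one way to do this on the current selection. The exact thresholds and format strings are assumptions, so the output might differ from the figure. The rule with the largest threshold is added first so that it has the highest priority:

```vba
Sub NumberFormatByMagnitude()
    With Selection
        .FormatConditions.Delete

        ' Largest numbers: whole millions, such as 12M
        .FormatConditions.Add(Type:=xlCellValue, Operator:=xlGreater, _
            Formula1:="=9999999").NumberFormat = "#,##0,,""M"""

        ' Hundred thousands and up: millions with one decimal, such as 1.2M
        .FormatConditions.Add(Type:=xlCellValue, Operator:=xlGreater, _
            Formula1:="=999999").NumberFormat = "#,##0.0,,""M"""

        ' Above 999: thousands, such as 12K
        .FormatConditions.Add(Type:=xlCellValue, Operator:=xlGreater, _
            Formula1:="=999").NumberFormat = "#,##0,""K"""
    End With
End Sub
```

Each trailing comma in a format string scales the displayed value by a factor of 1,000.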
Figure 15.12 shows the original numbers in Columns A:C. The results of running the macro are shown in Columns E:G. The dialog box shows the resulting conditional format rules.
In Chapter 16, “Reading from and Writing to the Web,” you learn how to use web queries to import data from the Internet to your Excel applications automatically.