Preparing Data with PowerPivot

With the Data Mining Add-In installed, and the SSAS server configured and connected to Excel, the next task is to prepare your data for data mining. PowerPivot for Excel is well suited for the assembly of the data.

Within the On Time Performance.xlsx file already used in the section titled “Adding the SSAS Connection Within the Data Mining Add-In,” create a new flattened PivotTable from the PowerPivot ribbon menu. Use a new worksheet as the destination for the flattened PivotTable. Use Month, DayofWeek, Carrier, Origin, Dest, and DistanceGroup as row labels. Drag WeatherDelay to the Values area of the PowerPivot Field List. When finished, your PivotTable should resemble Figure 10-27.

images

Figure 10-27. Flattened PivotTable

The reason for the flattened version of a PivotTable is the Data Mining Add-In's inability to consume data from a PivotTable. Because the add-in was created for use in Excel 2007, it understands only data tables. We can effectively work around this limitation by converting our flattened PivotTable into an ordinary Excel data table.

The first step in converting the PivotTable to an ordinary Excel data table is to remove any subtotaling that may have been introduced by PowerPivot. From the PivotTable Tools ribbon, select Design and then Subtotals. From the Subtotals pulldown, select Do Not Show Subtotals, as illustrated in Figure 10-28.

images

Figure 10-28. Removing PivotTable subtotals

To ensure an illustrative example, filter out all flights except those occurring in the month of February. Clicking the pulldown arrow to the right of the Month column will show a dialog for sorting and filtering the data, as illustrated in Figure 10-29.

images

Figure 10-29. Filtering the month

After removing the sub-totaling rows, we need to convert the flattened PivotTable into a data table usable by the Data Mining Add-In. Begin the conversion by selecting any cell within the flattened PivotTable. Then select the OLAP Tools from the Options sub-menu of the PivotTable Tools ribbon, as illustrated in Figure 10-30. Finally, from the OLAP Tools pulldown, select the Convert to Formulas option.

images

Figure 10-30. Convert to Formulas selection

After the conversion is complete, the PivotTable Tools ribbon will no longer be visible. Select the entire data table, using the Ctrl+A keyboard shortcut, and copy and “paste special” to convert the OLAP formulas to values. Finally, format the entire dataset as a table. After successful table formatting, the Table Tools ribbon will be visible. Now we can then begin to make some predictions with the data from PowerPivot.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset