PowerPivot As a Data Source for PowerPivot

An existing PowerPivot for SharePoint solution can act as a data source from which related PowerPivot solutions can be created. Figure 8-1 shows an example scenario.

Because a PowerPivot for Excel solution, once deployed to PowerPivot for SharePoint, creates a SQL Server Analysis Services database, the database can be accessed by other PowerPivot solutions. This ability to serve as a data source is limited to PowerPivot for SharePoint, however, as connections to the SSAS instance behind PowerPivot for Excel are not supported in the same manner as PowerPivot for SharePoint.

Consider the following scenario: An enterprise has created two separate PowerPivot for SharePoint solutions, one for analysis of sales and another for analysis of production. Audiences in the sales management and production operations management areas use each PowerPivot solution separately, updating the PowerPivot database with new, fresh data on independent schedules. However, summary-level information from each of the departments, sales and production operations, is of interest to a single analyst reporting directly to the chief operating officer (COO). The COO's analyst could create from scratch a solution to report key sales and operations metrics, or that analyst could leverage the existing solutions as data sources for a third, summary-level PowerPivot solution. As the underlying data in the PowerPivot for SharePoint solutions used by sales and production is updated, the new data would be available to the COO-level solution.

One of the advantages to using one PowerPivot solution as a data source for another is that the technique allows your solutions to steer away from the “spreadmart” chaos of the past. Spreadsheets stored on file shares are replaced with orderly, indexed, and searchable storage in PowerPivot Galleries. Furthermore, PowerPivot's robust, built-in ETL replaces relatively brittle and inelegant cell and range linking within spreadsheet files.

The other, more obvious advantage is the reuse of existing analytical work to create a new perspective. Very similar to the concept of a mash-up application, PowerPivot for SharePoint solutions can be pieced together to create new insight, as illustrated in Figure 8-1.

images

Figure 8-1. Combining PowerPivot for SharePoint solutions

To maintain clarity as we discuss the scenario in Figure 8-1, the Sales and Production Operations solutions will be referred to as primary solutions. In other words, those solutions connect directly to the primary data sources the transactional systems for sales, manufacturing operations, etc. The summary solution uses each of the primary solutions as the data sources, instead of connecting firsthand to the operational data sources.

Verifying the Solution Prerequisites

There are pre-conditions to understand and verify before using the solution in Figure 8-1. First, both the computer on which the PowerPivot for Excel solution will be developed and the PowerPivot for SharePoint instance should be on the same domain. It is possible to upload both the Sales and Production Operations solutions from a computer not on the domain of the SharePoint server. However, in order to author the PowerPivot solution that uses these (Sales and Production) primary solutions as a data source, both primary solutions must be on the same domain.

Second, it may be necessary to open firewall ports on the PowerPivot for SharePoint server, in order to allow incoming connections from the PowerPivot for Excel development computer. Microsoft Books Online (BOL) describes this as not being the case, as the connection should be from the PowerPivot for Excel computer via port 80 for http (or 443 for https), a port already established for inbound connections.

Finally, the PowerPivot for Excel user developing the derived solution must use a domain account that has farm administrator privileges on the PowerPivot for SharePoint farm. This level of security is generally limited to a small group within an enterprise SharePoint implementation. You may find that your organization is unwilling to give you such access.

Nonetheless, utilizing PowerPivot as a data source is an important technique to understand as there are environments where the solution is useful. Moreover, as PowerPivot for SharePoint matures, the security privilege limitation may be eliminated. Finally, the technical foundation of PowerPivot is the creation of SQL Server Analysis Services databases, in order to facilitate rapid, ad hoc business intelligence. Understanding how the solution in Figure 8-1 is assembled will give you familiarity with accessing multi-dimensional databases (such as non-PowerPivot SSAS databases).

Assembling the Pieces

I have created two primary solutions, named Sales.xlsx and Operations.xlsx, containing sales measurements and operations measurements respectively. To avoid being bogged down in the details of actual data, each solution has two years of data, for two imaginary products: Widgets and Sprockets. Both solutions have been deployed to a PowerPivot Gallery on a SharePoint server, also a member of my demonstration domain. The idea is to illustrate the combination possibilities of PowerPivot for SharePoint, using the underlying data SSAS databases, not to create an exhaustive sales and production operations database.

Begin the summary solution similarly to any PowerPivot for Excel development. On our development computer, we open a new Excel worksheet and use the PowerPivot ribbon command to open a corresponding PowerPivot window. From the PowerPivot window, we can add a new data source, similar to any other source database. From the Get External Data set of the PowerPivot ribbon, choose From Database and then the From Analysis Services or PowerPivot option. This will launch the Table Import Wizard, as illustrated in Figure 8-2.

images

Figure 8-2. Table Import Wizard for PowerPivot data source

In the Table Import Wizard, enter information similar to that in Figure 8-2. In the illustration, we are using the Internet Protocol (IP) address 192.168.1.120 in order to access the PowerPivot for SharePoint server on this development domain. Your address or server name will be different. One thing that will be the same is the “PowerPivot” immediately following the server name or server IP address. The text “PowerPivot” modifies the connection string for a specific, named instance of SSAS on the target server. If we attempted to connect without this suffix, the connection would be made (or attempted to be made) to the default instance of SSAS on the target server. Recall from Chapter 6 that one of the requirements of the initial release of PowerPivot for SharePoint is the use of a specific, named SSAS instance (“PowerPivot”) on the SharePoint server.

After completing the Server Name of the Table Import Wizard, ensure the Use Windows Authentication radio button is checked. Pressing the Test Connection button near the bottom of the dialog will verify the ability to connect from the PowerPivot for Excel development computer to the PowerPivot for SharePoint database.

After establishing the ability to connect to PowerPivot for SharePoint, the next step is creating the queries to fetch data into the derived solution. From the Table Import Wizard, as illustrated in Figure 8-2, use the Database Name pulldown list to select a PowerPivot database. You will notice the names of the solutions in the Table Import Wizard are different from the actual solution (.xlsx file names). For example, a unique ID has been added to all of the PowerPivot for SharePoint solution names. Additionally there are two versions of the NCAA Dashboard solution. This is because of SharePoint document library revision control. There are two revisions of the NCAA Dashboard published on my PowerPivot for SharePoint server; therefore, there are two potential SSAS databases for the Table Import Wizard. The multiple revisions of a given file will reinforce the next point. Use the file name URL in the connection instead of the server+instance (serverPowerPivot) and a database name.

The reason for using a file instead of the server and database name is when a connection request is made to the underlying database, the server+database name connection bypasses PowerPivot for SharePoint web services that will ensure the PowerPivot file is loaded on a server, even if it is currently unloaded. Second, the PowerPivot for SharePoint web service will load the latest, published version of the file, eliminating the need to manage the workbook name and GUID in the database connection strings.

In our example, we will begin by creating the query for the data from the Sales primary solution. The Table Import Wizard defaults to a Mulitdimensional Expressions (MDX) dialog, where an MDX string can be used if you are familiar with both MDX and the source SSAS database. Choose instead the Query Designer option, and you will be presented with a dialog similar to Figure 8-3. This tool will allow creation of a dataset from our PowerPivot for SharePoint SSAS database with no understanding of MDX.

images

Figure 8-3. Table Import Wizard Query Designer

The left-hand explorer-like window allows for navigation of the PowerPivot for SharePoint data structure. Each element from the explorer can be dropped onto the query dataset canvas. If you are using the Sales.xlsx solution provided in the book's example download, drag the Product and Sales Dollars columns onto the canvas. The query designer will respond with two rows in the dataset, one for each of the two products, Widgets and Sprockets. Additionally, the Sales Dollars column has been renamed to Sum of Sales Dollars. This is by design, as the default aggregation method of the Sales Dollars measure. Complete the query design by adding the Year and SortMonth columns, as illustrated in Figure 8-4.

images

Figure 8-4. Sales query design complete

Pressing the OK button at this point will close the Query Designer and return to the original MDX text dialog with the completed query for the Sales dataset. The completed MDX query will look similar to that in Figure 8-5.

images

Figure 8-5. MDX query

Pressing the OK button from the Specify MDX Query dialog will execute the import of the dataset as reflected earlier in the Query Designer. You should see a familiar Table Import Wizard completion dialog, similar to Figure 8-6. In our case, with two years of monthly data, for two products, we expected to see 48 (2*12*2) rows imported.

images

Figure 8-6. Sales table import complete

Notice in Figure 8-6 that PowerPivot has used the friendly query name “Sales” from the designer as the PowerPivot table name. However, the column names have not been named with an easily recognized naming convention. We do have the primary measure, sales in dollars, and keys for a time period and product.

Establishing the data connection for the production operations primary solution is similar. In the Table Import Wizard, the Production solution is specified in the database name pulldown list, as illustrated in Figure 8-7.

images

Figure 8-7. Connection to Production data

Additionally, in the Table Import Wizard Query Designer, we will select similar dimension key columns, as well as the two measures available in the Production primary solution. Similar to the illustration in Figure 8-8, the Sum of Production Cost and Sum of Production Units will be dropped onto the query design surface. Adding to the query design surface the columns for Year, SortMonth, and Product will generate rows in the dataset that detail the Units and Cost measures by all of the combinations of Year, SortMonth, and Product.

images

Figure 8-8. Query Designer for Production data

Pressing the OK button from the Query Designer will preview the MDX generated for the import of the Production data. Figure 8-9 illustrates the completion of the Table Import Wizard, and the new PowerPivot table created with the Production Operations data.

images

Figure 8-9. Production Operations PowerPivot table

A potential issue with using PowerPivot for SharePoint as a data source exists in how SSAS in SharePoint integrated mode unloads databases based on query usage by PowerPivot for SharePoint. If you recall from the PowerPivot Management Dashboard, in addition to loading databases into memory, PowerPivot will also free memory by unloading a database that is being under-utilized. The issue will be manifested by the desired database not being available in the Table Import Wizard pulldown. As an example, my development environment has unloaded the Production solution used as a primary source in this chapter. Illustrated in Figure 8-10, the pulldown list of databases is missing the Production solution and instead is showing both versions of the NCAA Dashboard and the PowerPivot Management Data solution. In order for the Production solution to reappear in the Database name pulldown, PowerPivot for SharePoint must reload the database into the SSAS instance running in SharePoint integrated mode.

images

Figure 8-10. Missing Production database name

Loading a solution into SSAS is as simple as executing a query that requires the SSAS database for fulfillment. Opening the solution from within PowerPivot for SharePoint will not necessarily cause a database query operation. However, opening the solution and interacting with one of the slicers or expanding a drill-down will generate a query. PowerPivot for SharePoint will react to the slicer value change by dispatching a request to load the database for the current PowerPivot into the SSAS instance (PowerPivot) servicing PowerPivot queries. The database will be loaded and the query results returned to PowerPivot for SharePoint, where the Excel Services user interface will be updated. As a practical example, changing the selected slicer values for Product in the Production solution causes the database to load. The database name would then be available in the Table Import Wizard dialog, as pictured in Figure 8-11.

images

Figure 8-11. Database name available in Table Import Wizard

With data from both primary solutions, Sales and Production, available in our derived solution, PowerPivot reporting works the same as any other data source. There is a nuance to using PowerPivot as a data source that may not be readily apparent, however. The measures from the primary solutions (Sales and Production) are given a data type of text by the Table Import Wizard. This is in spite of the fact that each of the primary solutions defines the Sales Dollars, Production Units, and Production Cost measures as whole numbers. Figure 8-12 illustrates the default Table Import Wizard treatment of the Sales Dollars measurement as a text field.

images

Figure 8-12. Default data type for imported measure

The measure's data type can be changed to a whole number by doing the following. First, from the PowerPivot window, select the measure column by clicking in a cell within the PowerPivot table display. From the Formatting section of the ribbon, use the Data Type pulldown as illustrated in Figure 8-13 to change the column data type from Text to Whole Number. Unless there are actual text values in the data that will not convert to a number, PowerPivot will respond by changing the data type and format of the column. Aggregation operations such as Sum, Min, Max, and Average will now be possible from within a PivotTable or PivotChart via the PowerPivot Field List.

images

Figure 8-13. Changing a column data type

One final dataset needs to be incorporated into the summary solution, in order to have meaningful slicers. Recall from Figure 8-1 that we are combining two separate yet related fact sets to create a third, derived solution. Because each of the fact sets contains a Product identifier, we may be tempted to relate the Sales and ProductionSummary tables via the Product column. This approach would not be desirable based on the limitations, described in Chapter 4, that this places on our ability to report on both fact sets. However, because each primary solution contains many rows for each Product, relating the two tables (Sales and ProductionSummary) via Product is not possible. The many-to-many relationship this would attempt to create is not available in the initial release of PowerPivot.

Instead, we need to create a PowerPivot table containing the distinct master list of products. The “master” product list can become an additional PowerPivot table and a relationship with both measure sets (Sales and ProductionSummary) established. A slicer can then be based on the new Product table, and reporting for both Sales and ProductionSummary data can be modified using a consistent domain of products.

The trick in creating the master product list is dealing with limitations in the Table Import Wizard Query Designer. This feature was designed to produce datasets consisting of measures, described by columns containing dimension values, identical to how the Table Import Wizard was used to create the Sales and ProductionSummary PowerPivot tables. However, when used to create a dimension table, as we need for the master product list, the Query Designer has a few problems. Because we are using an SSAS data source (our PowerPivot for SharePoint primary solutions), there is a prescribed syntax for MDX to which the master product list query must conform. One of the syntax limitations is the inability to create a result set consisting of a single dimensional value column, without measures. We could, in straightforward MDX, create a set of columns for each distinct product, but the result would not be well suited for PowerPivot. Instead, the most expeditious means to create our master product list is by creating a query of a single measure, grouped by distinct values of Product. As illustrated in Figure 8-14, we have created from the ProductionSummary primary solution a query of the sum of Production Cost, by Product. This produces the distinct list of Products for the eventual product dimension table.

images

Figure 8-14. Product Dimension Query Designer

It is important for me to say here that the only reason for including the Sum of Production Cost column is to create a dataset with one row per Product. As you apply this technique in your own solution designs, ensure that you choose as a dimension source the dataset representing the more complete list of dimension members. Completing the Table Import Wizard for the new Product table will result in a table similar to Figure 8-15. The first column in the table has been renamed to “Product” for clarity.

images

Figure 8-15. Product table

With the creation of the Product table, relationships from each of the Sales and ProductionSummary tables can be established. Figure 8-16 illustrates the final result of the example summary solution. A Product slicer has been added, sourced from the Product column of the Product table. Each of the PivotCharts detailing Sales or ProductionSummary measures are influenced by changes in the Product slicer. This allows for interactive use of the report by the consumer.

images

Figure 8-16. Final summary solution

One of the benefits of the technique just illustrated in this section is ensuring the data in the summary solution is in agreement with the source solutions. Another benefit is eliminating some of the details and complexity that exist in the primary solutions that may be of little or no interest to the summary solution audience.

Refreshing the Summary

What is the process by which the summary solution is refreshed? That's a logical question to ask at this point.

At a minimum, there are two steps to the data refresh process. First, new data has to be made available to the primary solutions. In the case of this example, a linked table contains the production summary data. Adding rows to the linked table as illustrated in Figure 8-17 will update the primary solution, completing the first step.

images

Figure 8-17. Data for January, 2011 added to ProductionSummary

The second step, configuring data refresh for the summary solution, is a little more involved. However, once data refresh is configured, the summary solution will reflect new data from the component solutions almost as soon as it is available. To configure data refresh for the COO Summary PowerPivot for SharePoint solution, begin by clicking the calendar icon for the solution in either of the visual Gallery views (Gallery, Theater, or Carousel). The Gallery version of this is illustrated in the upper right-hand corner of Figure 8-18. In the All Documents view, use the pulldown context menu for the document to access the Manage PowerPivot Data Refresh option.

images

Figure 8-18. Manage Data Refresh

Regardless of the exact access method, each will result in the one of two dialogs being displayed. For a PowerPivot solution with no data refresh schedule currently configured, a dialog such as Figure 8-19 will be displayed. This Manage Data Refresh dialog will allow for activation of data refresh features, establish a schedule, set e-mail notification addresses, and configure the security credentials for the data refresh process.

images

Figure 8-19. Manage Data Refresh configuration

However, if the workbook for which data refresh features are being accessed is currently configured to use data refresh, a different dialog is presented. As illustrated in Figure 8-20, a summary of the refresh operations and a detailed history of the success or failure of each data refresh are listed for the selected workbook. Clicking the “Configure Schedule…” link will present the detailed settings for this workbook, as detailed in Figure 8-19.

images

Figure 8-20. Data refresh history

Configuring data refresh is for the most part self-evident. Working from top to bottom from the Manage Data Refresh dialog (Figure 8-19), make the following decisions. First, checking the Enable check box actually makes the schedule active, making it possible to configure and pause or resume data refresh. Establish the schedule as Daily, Weekly, Monthly, or Once (one-time refresh). Each radio button will reveal relevant schedule options based on the time frame. The earliest start time is only a “suggested” time for the refresh to begin. Other system load and activities may cause the refresh to occur later.

The only other configuration option commonly questioned is the Data Source Credentials section. There are three options: use the data refresh account configured by the administrator, connect using a specified Windows account, or connect using Secure Store Service credentials. For our demo purposes, we can use the data refresh account configured by the administrator, assuming you have followed directions detailed in Chapter 6 to establish a development and training PowerPivot for SharePoint environment.

Finally, each data source that is part of a solution can have a separate schedule. By clicking the down arrow to the right of each of the solution's data sources, a separate schedule and credential setting can be established for each source. If we know Sales.xlsx, for example, will only be updated monthly after the second Monday of the month, we can set the data source refresh to occur immediately after the update, as illustrated in Figure 8-21.

images

Figure 8-21. Data Source Schedule

While the PowerPivot data refresh schedule options are robust, the intent is to provide a facility to ensure the correct, fresh data is available in PowerPivot solutions. For example, in this initial release, features to attempt a failed refresh are not immediately available. Instead an e-mail notification of a failure is sent, based on the schedule configuration.

It is also possible to create an Excel report from a PowerPivot for SharePoint solution, using Office Data Connection (.odc) files. Strictly speaking, this technique does use PowerPivot as a data source, but does not leverage PowerPivot for display and user interaction as we have just detailed. Simply clicking the new report icon, as illustrated in Figure 8-22, to open an Excel workbook using the selected PowerPivot for SharePoint solution will prompt for file save or open dialog and result in an Excel spreadsheet PivotTable based on the PowerPivot source. None of the features of PowerPivot will be available in the user interface using this solution technique, as PowerPivot will only be a data source.

images

Figure 8-22. Open in new Excel workbook

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset