Chapter 12

Extending Power Query with Custom Functions

In This Chapter

arrow Making a custom function

arrow Using custom functions in other queries

arrow Creating a parameter query

In Chapter 8, you see that Power Query records all actions using its own formula language (known as the M language). When you connect to a data source and apply transformations to that data, Power Query diligently saves your actions as M code behind the scenes in query steps. The transformation steps can then be repeated when you refresh the data in your query.

That backstage coding is relatively transparent, and can, for the most part, be ignored for most data processing activities. In this chapter, I show you how to leverage the M language to extend the capabilities of Power Query to create your own custom functions and perform truly heroic data processing.

Creating and Using a Basic Custom Function

When building a custom function for Power Query, you’re essentially doing nothing more than creating a query and manipulating its M code to return a desired result. That result can be an array, a data table, or a single value.

To help you gain a sense of the general steps taken to create a custom function, I show you now how to build a basic mathematical function that calculates profit. This function should be able to take a revenue amount and a cost amount and output a profit amount using this basic mathematical operation:

Revenue – Cost = Profit

For basic functions such as this one, you can start with a blank query and simply enter the needed M code from scratch. Follow these steps:

  1. Click the Data tab in Excel and select New Query ⇒ From Other Data Sources ⇒ Blank Query.

    This step activates the Query Editor window.

  2. On the Query Editor Ribbon, click on the View tab and select the Advanced Editor command.
  3. When the Advanced Editor window opens, delete the starter syntax you see in the code input box.
  4. Enter the following code into the code input box:

    let Profit = (Revenue, Cost)=>

    Revenue-Cost

    in Profit

    • Line 1 of the code tells Power Query that this is a function called Profit, requiring two parameters. For clarity, the two parameters are named Revenue and Cost, though Power Query doesn’t care what you name them as long as the names start with a letter and have no spaces.
    • Line 2 in the code essentially tells Power Query to subtract the Cost parameter from the Revenue parameter.
    • Line 3 of the code tells Power Query to return the result.

    Figure 12-1 illustrates what the code looks like in the Advanced Editor window.

  5. Click the Done button to close the Advanced Editor window.
  6. In the Query Settings pane, change the name of the query in the Name input box. The goal here is to give your function a reasonably descriptive name, as opposed to Query1. In this case, enter FunctionProfit in the Name input box.
  7. At this point, you can select the Home tab of the Query Editor and click the Close & Load button.

    As you can see in Figure 12-2, Power Query adds the query to the Workbook Queries pane as a connection-only query. Queries recognized as functions are automatically saved as connection-only.

image

Figure 12-1: Enter your custom code in the Advanced Editor window.

image

Figure 12-2: Your function is ready to use.

You can now use this function in other queries that contain revenue and cost fields. For example, Figure 12-3 illustrates the contents of the Chapter 12 Sample text file, which you can find in the download files for this book.

image

Figure 12-3: A text file containing Invoice details.

This text file contains a table of invoices with the fields Qty, UnitCost, and UnitPrice. Your newly created function can be used to calculate profit using these fields.

To create a new query from this text file, follow these steps:

  1. Click the Data tab in Excel and select New Query ⇒ From File ⇒ From Text.

    This step opens the Import Data dialog box.

  2. Browse for, and select, the Chapter_12 Sample text file.

    Power Query opens the text file and opens the Query Editor.

  3. While in the Query Editor, click the Add Column tab and then click the Add Custom Column button, as shown in Figure 12-4.

    The Add Custom Column dialog box opens, and you can call the custom function and pass it the needed parameters.

  4. In this case, enter the following line:

    = FunctionProfit([UnitPrice], [UnitCost])*[Quantity]

    This syntax calls the FunctionProfit custom function and passes the UnitPrice and UnitCost fields as the required parameters. The results are then multiplied by the Quantity field. The Add Custom Column dialog box should look similar to the one shown in Figure 12-5.

  5. Click the OK button to apply the custom column.

    When you confirm the changes, Power Query triggers the function for each row in the data table.

image

Figure 12-4: Add a custom column.

image

Figure 12-5: Use the Add Custom Column action to invoke your function.

Although this example is quite basic, it demonstrates that you can define a function that requires parameters and then use the function in other queries. This simple technique is the foundation for creating more useful functions.

remember Power Query functions are stored in the workbook in which they reside. Unfortunately, there’s no easy way to share functions between workbooks. If you start a new workbook, you need to re-create your functions in that new workbook.

Creating a Function to Merge Data from Multiple Excel Files

When building a basic function, such as the profit function you create in the earlier section “Creating and Using a Basic Custom Function,” it’s no big deal to start from a blank query and enter all the code from scratch. But for more complex functions, it’s generally smarter to build a starter query via Query Editor and then manipulate the M code to accomplish what you need.

Imagine that you have a set of Excel files in a folder (see Figure 12-6). These files all contain a worksheet named MySheet that holds a table of data. The tables in each file have the same structure, but need to be combined into one file. This is a common task/nightmare that most Excel analysts have faced at one time or another. If you don’t have a solid knowledge of Excel VBA programming, this task typically entails opening each file, copying the data on the MySheet tab, and then pasting the data into a single workbook.

image

Figure 12-6: You need to merge into one table the data in all the Excel files in this folder.

Power Query has the ability to make short work of this task, but it requires a bit of direction via a custom function. Now, it would be difficult for most anyone to start from a blank query and type out the M code for the relatively complex function needed for this endeavor. Instead, you could build a starter query via Query Editor and then wrap the query in a function.

To help you understand this concept, I present the following steps:

  1. On the Excel Data tab, select New Query ⇒ From File ⇒ From Workbook.
  2. Browse to the folder that contains all the Excel files, and choose only one of them.
  3. In the Navigator pane (shown in Figure 12-7), choose the sheet that holds the data that needs to be consolidated, and then click the Edit button to open the Query Editor.
  4. Use the Query Editor to apply some basic transformation actions.

    For example, in the Applied Steps shown in Figure 12-8, you see that First Row step was promoted to column headers and a few unneeded columns were removed.

  5. When all the needed transformations are complete, open the Advanced Editor window by clicking the View tab and selecting the Advanced Editor command.

    Figure 12-9 demonstrates that as you build out the starter template, Power Query diligently creates the bulk of the code for your function. Note in the portion of the code that’s highlighted in gray (for illustration), Power Query has hard-coded the file path and filename of the Excel file that was originally selected. The idea is to wrap this starter code in a function that passes a dynamic file path and filename.

  6. Wrap the entire block of code with function tags, specifying that this function requires two parameters: FilePath and FileName. Also replace the hard-coded file path and filename with each respective parameter.

    Here’s the syntax shown in Figure 12-10:

    let GetMyFiles=(FilePath, FileName) =>

    let

    Source = Excel.Workbook(File.Contents(FilePath&FileName)),

    MySheet1 = Source{[Name="MySheet"]}[Data],

    #"Promoted Headers" = Table.PromoteHeaders(MySheet1),

    #"Removed Columns" =

    Table.RemoveColumns(#"Promoted Headers",{"Branch_Number", "Effective_Date"})

    in

    #"Removed Columns"

    in GetMyFiles"

  7. Close the Advanced Editor.
  8. In the Query Settings pane, change the name of the query in the Name input box. Give the function a reasonably descriptive name, such as (in this scenario) fnGetMyFiles.
  9. Click the Home tab of the Query Editor and click the Close & Load button.

    At this point, the custom function is ready to be used on all files in the target folder.

  10. Click the Data tab in Excel and select New Query ⇒ From File ⇒ From Folder to start a connection to the folder that contains all the Excel files.
  11. In the From Folder dialog box, provide Power Query with the file path of the target folder.

    The Query Editor window activates to show you a table similar to the one shown in Figure 12-11. This table contains a record for each file in the chosen folder. The columns you’re interested in are Folder Path and Name, which provide the function with the needed FilePath and FileName parameters.

  12. Click the Add Column tab, and then click the Add Custom Column command.

    The Add Custom Column dialog box opens.

  13. Invoke the function and pass the Folder Path and Name fields as parameters separated by commas (see Figure 12-12).

    When you confirm your changes, Power Query triggers the function for each row in the data table. The function itself grabs the data from each file and returns a table array. Figure 12-13 shows the newly created custom column with a returned table array for each file (specified by a green Table hyperlink).

  14. Click the Expand icon for your new custom column.

    You see a list of fields included in each table array, as shown in Figure 12-14.

  15. Choose which fields in the table array to show, click the Expand radio button, and then click the OK button.

    With each table array expanded, Power Query exposes the columns pulled from each Excel file and adds the detailed records to the data preview. Figure 12-15 illustrates the data preview for the final combined table.

  16. At this point, you can remove unneeded columns and then click the Close & Load command to output the combined table.
image

Figure 12-7: Connect to one of the Excel files in the target folder, and navigate to the sheet holding the data that needs to be consolidated.

image

Figure 12-8: Use the Query Editor to apply any necessary transformation actions.

image

Figure 12-9: Open the Advanced Editor to see the starter code.

image

Figure 12-10: Wrapping the starter code with function tags and replacing the hard-coded names with your dynamic parameters.

image

Figure 12-11: Create a new query using the From Folder connection type to retrieve a table of all files in the target folder.

image

Figure 12-12: Use the Add Custom Column action to invoke the function.

image

Figure 12-13: Power Query triggers the function and returns a table array for each file in the folder.

image

Figure 12-14: Click the Custom column header to expand the table arrays.

image

Figure 12-15: Power Query exposes the columns pulled from each Excel file and adds the detailed records to create the final combined view.

As you look at the final combined view, don’t lose track of the fact that this relatively complex task was facilitated by a simple custom function. For all the steps required to accomplish this task, you expend very little effort on creating the code for the function. Power Query writes the code for the core functionality, and you simply wrap that code into a function.

The takeaway here is that you don’t have to be an expert on Power Query’s M language to pull together effective and useful custom functions. You can leverage the Query Editor to create some base code and then adjust from there.

Creating Parameter Queries

A parameter query is a kind of query that relies on one or more parameters to run. Although that sounds suspiciously like the custom functions covered earlier in this chapter (after all, they ran on parameters), there is a subtle difference.

A parameter query is one where you provide the parameters. So rather than have the parameters come from a predefined query, you enter the parameters. This comes in handy when creating interactive reporting for others to consume.

In this section, I walk you through creating your first parameter query.

Preparing for a parameter query

To create a proper parameter query, you first have to understand the parameters necessary to make your reporting interactive. The best way to gain this understanding is to explore the target data source.

In this scenario, I’ll tell you how to build an interactive view of the top-grossing films for any given year and month. To accomplish this task, leverage the Box Office Mojo website. Box Office Mojo provides an array of box office reporting tools, including a monthly index of top-grossing films.

The URL for the monthly index includes a yr parameter and a month parameter. Enter this URL into any browser and you see a list of the top-grossing films of December 2015:

www.boxofficemojo.com/monthly/?yr=2015&month=12

A look at the website (shown in Figure 12-16) confirms that the URL opens a web page that contains the table you would expect to see: an index of movies for December 2015 box office. The parameters in the URL are working as expected.

image

Figure 12-16: Confirming that the parameters in the URL actually work.

Now that you know the year and month number are the parameters, you can get started.

Creating the base query

The best place to start is to create the base query. The base query is essentially the one that will pull the data you’re working toward. In this scenario, you create a query that pulls the table shown in Figure 12-16 from the Box Office Mojo website.

Follow these steps:

  1. Open a new Excel workbook, and then select Data ⇒ New Query ⇒ From Other Sources ⇒ From Web.
  2. Enter a starting URL and then click OK. You can use the following URL:

    www.boxofficemojo.com/monthly/?yr=2015&month=12

  3. Use the Navigator pane to select the correct web table, and then click the Edit button to open the Query Editor.
  4. Use the Query Editor to rename columns and apply any transformations that are needed to clean the web data.

    Figure 12-17 illustrates a clean table that makes up the base query.

  5. After all needed transformations are complete, open the Advanced Editor window by clicking the View tab and selecting the Advanced Editor command.
  6. Wrap the entire block of code with function tags, specifying that this function requires two parameters: YearNum and MonthNum. Also replace the hard-coded year and month in the URL with each respective parameter.

    Here’s the final syntax shown in Figure 12-18:

    let TopMovies=(YearNum, MonthNum) =>

    let

    Source = Web.Page(Web.Contents(

    "http://www.boxofficemojo.com/monthly/?yr=" & Number.ToText(YearNum) & "&month=" & Number.ToText(MonthNum))),

    Data0 = Source{0}[Data],

    #"Renamed Columns" =

    Table.RenameColumns(Data0,{

    {"Movie Title (click to view)", "Movie Title"},

    {"Total Gross /Theaters", "Total Gross"},

    {"Total Gross /Theaters2", "Theaters"}}),

    #"Removed Columns" =

    Table.RemoveColumns(#"Renamed Columns",

    {"Opening /Theaters",

    "Opening /Theaters2", "Open", "Close"})

    in

    #"Removed Columns"

    in

    TopMovies

  7. Close the Advanced Editor.
  8. In the Query Settings pane, change the name of the query in the Name input box. In this scenario, it’s fnGetTopMovies.
  9. Click the Home tab of the Query Editor and click the Close & Load button.
image

Figure 12-17: The clean base query.

image

Figure 12-18: Wrapping the starter code with function tags and specifying a YearNum parameter and a MonthNum parameter.

You now have a fnGetTopMovies function, which can be used to pull web data from a custom function, and it’s ready to be used on all files in the target folder.

Creating the parameter query

The final step is to create the parameter query. To do so, you need a simple table that will serve as the feeder for your dynamic parameters.

Staying in the same workbook where you created fnGetTopMovies, create a table similar to the one shown in Figure 12-19.

image

Figure 12-19: Create a simple parameter table.

From here, follow these steps:

  1. Place the cursor in the parameter table, and then select Data ⇒ New Query ⇒ From Table.

    The Create Table dialog box opens.

  2. Click OK to continue.

    The Query Editor opens with the parameter table.

  3. Click the Add Column tab, and then click the Add Custom Column command.
  4. In the Add Custom Column dialog box, invoke the fnGetTopMovies function, passing the year and month fields as parameters (see Figure 12-20).

    Because you’re mixing data from the web with data from Excel (though the parameter table can hardly be considered data), Power Query initiates a few data-privacy precautionary measures.

  5. Click Continue.

    The Privacy Levels dialog box opens, as shown in Figure 12-21.

  6. Select Public for both the Current Workbook option and the website. Click the Save button to confirm and save the privacy levels.

    Power Query, at this point, imports data from the website based on the year and month in the parameter table.

  7. The data imports as a table array, so click the green Table hyperlink.

    Alternatively, you can click the Expand icon.

    Now that you’re basically done, it’s time to think about where the query should be loaded. If you simply click the Close & Load button, Power Query outputs the final parameter query in its own worksheet. However, it would be more practical to have the parameter table and query results on the same worksheet. This way, you can edit the parameters and see the results without having to flip between worksheets.

  8. Rather than click the Close & Load command button, click the drop-down arrow beneath the button (as shown in Figure 12-22) and select the Close & Load To option.
  9. In the Load To dialog box, choose the Existing Worksheet option, ensuring that you select a cell beneath the parameter table. (See Figure 12-23.)
  10. Click the Load button to finalize the query (see Figure 12-23).
image

Figure 12-20: Use the Add Custom Column action to invoke the function.

image

Figure 12-21: The combining of Excel and Web data triggers Power Query to ask about data privacy.

image

Figure 12-22: Selecting the Close & Load To option.

image

Figure 12-23: Choose to load the final query results under the parameters table.

Figure 12-24 illustrates the final parameter query. Take a moment to think about what’s happening here. With this parameter query, you enter a year and a month and click Refresh (or press Ctrl+Alt+F5). Power Query then dynamically imports data back from the Internet based on the parameters you entered — all without your having to enter more than three lines of M language syntax. Truly amazing.

image

Figure 12-24: The final parameter query provides an interactive mechanism to flexibly pull data based on dynamic parameters, all with virtually no coding.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset