Writing an Excel file with a dynamic number of sheets

When you generate an Excel file, you usually generate it with a single sheet. You can however generate a file with more sheets. With PDI, you can generate an Excel file with several sheets even if you don't know in advance how many sheets you will generate, or the name of those sheets.

In this recipe, you will create such an Excel file. Your file will have book title information separated in different sheets depending of the genre of the books.

Getting ready

You will need a database containing books and authors with the structure described in Appendix, Data Structures.

How to do it...

Carry out the following steps:

  1. Create a new job.
  2. From the File Management category, drop a Delete file job entry into the work area.
  3. In the File name textbox, type the path and name of the Excel file you will create, in order to remove the file if it exists.
  4. Then you have to add two Transformation entries: one for selecting the book's categories (Transf_Categories) and another to write the specific sheet for each category (Trans_BookByCategory). The job should look like the following:
    How to do it...
  5. Create the transformation named Transf_Categories.
  6. In this transformation, drop a Table input step, in order to obtain the different book's categories. The SQL statement, should be similar to the following:
    SELECT DISTINCT genre FROM Books ORDER BY genre
    
  7. Add a Copy rows to result from the Job folder and create a hop from the Excel output step towards this one.
  8. Create the second transformation called Trans_BookByCategory.
  9. In the Transformation settings (CTRL-T), go to the Parameters tab, and add a new parameter named GENRE without default.
  10. Drop a Table input step into the canvas. In the SQL frame, type the following statement, in order to select the books depending on the GENRE parameter:
    SELECT * FROM Books WHERE genre='${GENRE}'
    
  11. In this step, check the prompt Replace variables in script?
  12. Add an Excel output step.
  13. In the Filename textbox under the File tab, type the destination path and file. In the Extension textbox, leave the default value xls.
  14. Under the Content tab, be sure to check Append.
  15. Also here, in the Sheet name textbox, type ${GENRE}.
  16. Under the Field tab click on the Get Fields button.
  17. Come back to the job; edit the job entry details for the transformation Trans_BookByCategory. Go to the Advanced tab and check Copy previous result to parameters? and Execute for every input row? checkboxes.
  18. Under the Parameters tab, add a new value: type GENRE in the Parameter column, and genre for the Stream column name.
  19. When you run the job, the Excel file created should have a different sheet for each category, for example:
How to do it...

How it works...

When you have to execute the same task over and over again, the solution is to create a loop that executes a single transformation or job, as many times as needed. In this case, the goal was to create a new Excel sheet for each book category. So, the solution was:

  • Creating a transformation (Transf_Categories) that builds the list of categories.
  • Creating another transformation (Trans_BookByCategory) that appends a single sheet to the Excel file.
  • Calling the second transformation once for each category in the list by copying the rows to result in the first transformation, and checking the execute for every input row checkbox in the Job entry belonging to the second transformation.

The main task was in the second transformation. In order to know which book categories to write each time, in that transformation you defined a parameter named GENRE. Then you used the GENRE parameter for filtering in the SQL statement and also for naming the Excel file sheet. The parameter is sent to the transformation because in the job, you set the Copy previous result to parameters? checkbox, and configured the Parameters tab properly.

Note

Note that in the Excel Output step, you checked the Append option, so that every time the transformation is executed, it creates a new sheet without loosing the sheets previously generated. Also note that you deleted the file at the beginning for cleaning purposes.

See also

  • The section named Writing an Excel file with several sheets in this chapter. See this recipe for an example of generating an Excel file with a small fixed number of sheets.
  • The section named Executing a transformation once for every row in the dataset in Chapter 7, Executing and Reusing Jobs and Transformations. See it in order to understand how loops work.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset