Generating a custom log file

When you run a transformation or a job, everything that happens in the process is shown in the Execution Results window. Its Logging tab lets you follow the execution of your transformation step by step. By default, the logging level is Basic, but you can change it to show more or less detail.

Under the Logging tab, you can see information about how each step is performing, for example, the number of rows coming from previous steps, the number of rows read, the number of rows written, errors in execution, and so on. The steps provide all this data automatically, but what if you want to write your own messages to the log? For this purpose, there is a step and a job entry named Write to log, both in the Utility folder.

To put this into practice, let's take a simple transformation that reads a text file with a list of new book titles and splits them into two Excel files depending on their price. The objective is to write custom messages to the Logging window showing the number of incoming books and how many of them are cheap or expensive.

Getting ready

For this recipe, you will need a text file with information about new book titles. For example:

title;author_id;price;title_id;genre
Bag of Bones;A00002;51,99;123-353;Fiction
Basket Case;A00003;31,00;123-506;Fiction
Carrie;A00002;41,00;123-346;Fiction
Cashflow Quadrant;A00007;55,00;323-604;Business
Harry Potter and the Half-Blood Prince;A00008;61,00;423-005;Childrens
Harry Potter and the Prisoner of Azkaban;A00008;29,00;423-003;Childrens
Power to the People;A00005;33,00;223-302;Non-fiction
Who Took My Money;A00007;21,00;323-603;Business

You can download the sample file from the book's website.

How to do it...

Carry out the following steps:

  1. Create a new transformation.
  2. Drop a Text file input step into the canvas. Set the file to read under the File tab, and type ; as the Separator character under the Content tab. Finally, use the Get Fields button under the Fields tab to populate the grid automatically.

Previewing this step, you will obtain the data of the books from the text file. Now, let's add the steps for counting the books and writing the information.

  1. Add a Group by step from the Statistics folder. Create a hop from the text file to this step. In the Aggregates grid at the bottom, add a new field named qty, choose a field (for example, title_id) in the Subject column, and select the Number of Values (N) option in the Type column.
  2. Add a User Defined Java Expression (UDJE for short) from the Scripting category and link it to the previous step. Create a new field named line of String type with the following Java expression:

    "Book news = " + java.lang.Long.toString(qty)

  3. From the Utility folder, add a Write to log step, name it Write books counting, and create a hop from the previous step towards it. Add the line field to the Fields grid. Choose Basic logging in the Log level listbox.

Run the transformation using Basic logging and check the Logging tab for the results. Among the basic logging information, you should see the following line:

2011/01/25 10:40:40 - Write books counting.0 - Book news = 8
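The counting and message-building logic of the Group by and UDJE steps can be approximated in plain Java. The following is only a sketch of that logic, not Kettle code: the class and method names are made up for illustration, and the rows are copied from the sample file shown earlier.

```java
import java.util.List;

public class BookCountSketch {
    // Data rows from the sample text file (header line excluded).
    static final List<String> ROWS = List.of(
        "Bag of Bones;A00002;51,99;123-353;Fiction",
        "Basket Case;A00003;31,00;123-506;Fiction",
        "Carrie;A00002;41,00;123-346;Fiction",
        "Cashflow Quadrant;A00007;55,00;323-604;Business",
        "Harry Potter and the Half-Blood Prince;A00008;61,00;423-005;Childrens",
        "Harry Potter and the Prisoner of Azkaban;A00008;29,00;423-003;Childrens",
        "Power to the People;A00005;33,00;223-302;Non-fiction",
        "Who Took My Money;A00007;21,00;323-603;Business");

    // Rough equivalent of the Group by aggregate "Number of Values (N)":
    // counts the rows that carry a non-empty value in the chosen field.
    static long countValues(List<String> rows, int fieldIndex) {
        return rows.stream()
                   .map(row -> row.split(";"))
                   .filter(f -> f.length > fieldIndex && !f[fieldIndex].isEmpty())
                   .count();
    }

    // Same expression as the one typed into the UDJE step.
    static String buildLine(long qty) {
        return "Book news = " + java.lang.Long.toString(qty);
    }

    public static void main(String[] args) {
        long qty = countValues(ROWS, 3); // index 3 is title_id
        System.out.println(buildLine(qty)); // prints "Book news = 8"
    }
}
```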

Now, you will generate two Excel files and write the information about cheap and expensive books to the log.

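  1. Drop one Filter rows, two Excel output, two Group by, two UDJE, two Block this step until steps finish (from the Flow category), and two Write to log steps into the canvas. Link the steps so that each of the two streams leaving the Filter rows step runs through an Excel output, a Group by, a UDJE, a Block this step until steps finish, and a Write to log step.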
  2. Create a hop from the Text file input step to the Filter rows step. Here, set a condition that compares price against 50 (for example, price <= 50), so that the rows that meet it go to the cheap stream. We will use an arbitrary price of $50 to determine if a book is cheap or expensive.
  3. In one of the Excel output steps, set the filename to cheapBooks.xls, and use the Get Fields button to populate the grid under the Fields tab.
  4. In the Group by step, add a new field named qty in the Aggregates grid, choose the field title_id in the Subject column, and select the Number of Values (N) option in the Type column.
  5. Add a field named line of String type in the UDJE step with the following Java expression:

    "Cheap books = " + java.lang.Long.toString(qty)

  6. In the Block this step until steps finish step, select the step named Write books counting in the step name column of the grid.
  7. Finally, in the Write to log step, add the line field to the Fields grid and choose Basic logging in the Log level listbox.
  8. Now, repeat the last five steps in order to configure the lower stream. This time, use the Excel Output step (named Excel Output 2) to generate the expensiveBooks.xls file, and replace the text Cheap with Expensive in the other UDJE step.
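As a rough illustration of what the two streams compute, the following Java sketch splits the sample rows at the threshold of 50 and builds both messages. It is not Kettle code, and the class and method names are hypothetical. It assumes the comparison price <= 50; since no sample price is exactly 50, a strict < would give the same counts. Note that the prices in the file use a decimal comma.

```java
import java.math.BigDecimal;
import java.util.List;

public class PriceSplitSketch {
    // Data rows from the sample text file (header line excluded).
    static final List<String> ROWS = List.of(
        "Bag of Bones;A00002;51,99;123-353;Fiction",
        "Basket Case;A00003;31,00;123-506;Fiction",
        "Carrie;A00002;41,00;123-346;Fiction",
        "Cashflow Quadrant;A00007;55,00;323-604;Business",
        "Harry Potter and the Half-Blood Prince;A00008;61,00;423-005;Childrens",
        "Harry Potter and the Prisoner of Azkaban;A00008;29,00;423-003;Childrens",
        "Power to the People;A00005;33,00;223-302;Non-fiction",
        "Who Took My Money;A00007;21,00;323-603;Business");

    static final BigDecimal LIMIT = new BigDecimal("50");

    // The price is the third field and uses a decimal comma ("51,99").
    static BigDecimal price(String row) {
        return new BigDecimal(row.split(";")[2].replace(',', '.'));
    }

    // Assumed filter condition: price <= 50.
    static boolean isCheap(String row) {
        return price(row).compareTo(LIMIT) <= 0;
    }

    static long countCheap(List<String> rows) {
        return rows.stream().filter(PriceSplitSketch::isCheap).count();
    }

    // Same shape as the UDJE expressions, with the label made a parameter.
    static String buildLine(String label, long qty) {
        return label + " books = " + java.lang.Long.toString(qty);
    }

    public static void main(String[] args) {
        long cheap = countCheap(ROWS);
        long expensive = ROWS.size() - cheap;
        System.out.println(buildLine("Cheap", cheap));         // Cheap books = 5
        System.out.println(buildLine("Expensive", expensive)); // Expensive books = 3
    }
}
```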

Running the transformation using Basic logging, you can verify that your custom messages have been written to the log under the Logging tab.

How it works...

The main objective of this recipe is to explain how you can write personalized messages to the Logging window. The task of the transformation is simple: it reads a text file at the beginning, uses a Filter rows step to split the list of books into cheap and expensive ones, and then writes two Excel spreadsheets with these details.

Now, let's analyze the task of customizing the log. In the first part of the recipe you wrote into the log the number of books:

  • The Group by step does the counting
  • The UDJE step creates the personalized string in a new field
  • Finally, the Write to log step writes this string to the log

After each of the Excel output steps, there is the same sequence of steps (Group by, UDJE, and Write to log), in order to write a message with the number of cheap books and the number of expensive books into the log.

There is also a Block this step until steps finish step in this sequence, before the Write to log step. It is there because you want to be sure that the total number of books is written first.
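The ordering guarantee can be pictured with ordinary Java threads. The sketch below is only an analogy for the blocking behavior and does not use any Kettle API: a CountDownLatch stands in for the Block this step until steps finish step, the names are hypothetical, and the cheap and expensive counts are worked out by hand from the sample file with the threshold of 50.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class BlockUntilFinishedSketch {
    // Writes a line only after the gate has been opened by the total writer.
    static void blockedWrite(CountDownLatch gate, List<String> log, String line) {
        try {
            gate.await();
            log.add(line);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch totalWritten = new CountDownLatch(1);

        // Stands in for the "Write books counting" step.
        Thread total = new Thread(() -> {
            log.add("Book news = 8");
            totalWritten.countDown();
        });
        // Stand in for the two blocked Write to log steps.
        Thread cheap = new Thread(() -> blockedWrite(totalWritten, log, "Cheap books = 5"));
        Thread expensive = new Thread(() -> blockedWrite(totalWritten, log, "Expensive books = 3"));

        // Start the blocked writers first to show that they still wait.
        cheap.start();
        expensive.start();
        total.start();
        try {
            total.join();
            cheap.join();
            expensive.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The total always comes first in the resulting list, while the relative order of the cheap and expensive lines is not fixed, because nothing in the sketch orders those two threads against each other.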

There's more...

This recipe showed you the classic way of adding messages to the log. The following subsections show you some variants or alternative approaches.

Filtering the log file

Instead of adding text to the log, you may want to filter text in the existing log.

If you select the Logging tab in the Execution Results window, then you will see a toolbar. Under that toolbar, there is a button named Log settings that allows you to apply a filter. If you type some text into the Select filter textbox, then the log of the subsequent executions of the transformation will only show the lines that contain that text. For example, you could use the same text prefix in all of your custom messages, and then apply a filter using this fixed text to see only those messages.

This also works if you run the transformation from a job, and even if you restart Spoon, because the filter is saved in a Kettle configuration file.

This is valid only in Spoon. If you intend to run a job or a transformation with Pan or Kitchen and you need to filter a log, for example, by keeping only the lines that contain certain words, then you have to take another approach. One way to do that is to save the job or transformation log in a file and then run another transformation that parses and filters that log file.
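The filtering itself is simple: keep only the lines that contain a given text. The recipe suggests doing it with another transformation; the same idea, written as a small standalone Java program, could look like the following sketch. The class name and the command-line arguments are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

public class LogFilterSketch {
    // Keeps only the lines that contain the given text, in their original order.
    static List<String> keepLinesContaining(List<String> lines, String text) {
        return lines.stream()
                    .filter(line -> line.contains(text))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: LogFilterSketch <logfile> <text>");
            return;
        }
        // args[0]: path of the saved log file; args[1]: text to look for.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        keepLinesContaining(lines, args[1]).forEach(System.out::println);
    }
}
```

If all of your custom messages share a fixed prefix, passing that prefix as the second argument leaves only those messages.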

Creating a clean log file

The Logging window shows not only your messages, but also all of the information written by the steps. Sometimes you need a clean log file showing only your custom messages and discarding the rest.

There is another, related problem: when you configure the Write to log step, you need to specify a level for your log (in the recipe, you used Basic logging). If you run your transformation using a less detailed log level, such as Minimal logging, then you will not see any of your personalized messages.
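The interplay between the level of a message and the level used for the run can be sketched as a simple comparison. The enum below is hypothetical and only mirrors the level names offered in Spoon, ordered from least to most detailed. It assumes that a message is shown when the run level is at least as detailed as the message level, and it is not the Kettle implementation.

```java
public class LogLevelSketch {
    // Hypothetical enum for illustration, ordered from least to most detailed.
    enum Level { NOTHING, ERROR, MINIMAL, BASIC, DETAILED, DEBUG, ROWLEVEL }

    // Assumption: a message is shown when the run level is at least
    // as detailed as the level chosen for the message.
    static boolean isShown(Level messageLevel, Level runLevel) {
        return messageLevel.compareTo(runLevel) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(isShown(Level.BASIC, Level.BASIC));    // true
        System.out.println(isShown(Level.BASIC, Level.DETAILED)); // true
        System.out.println(isShown(Level.BASIC, Level.MINIMAL));  // false
    }
}
```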

One alternative would be using a Text file output instead of the Write to log step. With this, you will produce a new text file with only your desired messages. Be sure to point all of the Text file output steps to the same Filename under the File tab, and use the Append checkbox under the Content tab, in order to avoid overwriting the file with each run.
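The effect of the Append option can be illustrated outside Kettle with the standard Java file API. In this sketch, the class name, the method name, and the file name are assumptions. Each call adds one line to the end of the file and creates the file if it does not exist, so messages from several writers and several runs accumulate instead of overwriting each other.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class CleanLogSketch {
    // Appends one message as a new line; creates the file on first use.
    static void appendMessage(Path file, String message) {
        try {
            Files.write(file, List.of(message), StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path cleanLog = Paths.get("custom_messages.log"); // hypothetical file name
        appendMessage(cleanLog, "Book news = 8");
        appendMessage(cleanLog, "Cheap books = 5");
        appendMessage(cleanLog, "Expensive books = 3");
    }
}
```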

Isolating log files for different jobs or transformations

You may want different log levels for different jobs or transformations, or you may simply want to isolate the log of a particular job or transformation. This is a simple task to accomplish. In the main job, right-click on the Job or Transformation entry of interest; under the Logging settings tab, check the Specify logfile? option, and you will be able to specify a name for the log file as well as the desired log level. In this way, you can create different log files with different log levels for each of the jobs and transformations that are part of your main job.

See also

  • The recipe named Sending e-mails with attached files in this chapter. In this recipe, you will learn how to send logs through e-mail.
  • The recipe named Programming custom functionality in this chapter. This recipe shows you how to send custom messages to the Logging window from the User Defined Java Class and Modified Java Script Values steps.