Chapter 7. Executing and Reusing Jobs and Transformations

In this chapter, we will cover:

  • Executing a job or a transformation by setting static arguments and parameters
  • Executing a job or a transformation from a job by setting arguments and parameters dynamically
  • Executing a job or a transformation whose name is determined at runtime
  • Executing part of a job once for every row in the dataset
  • Executing part of a job several times until a condition is true
  • Creating a process flow
  • Moving part of a transformation to a subtransformation

Introduction

A transformation by itself rarely meets all the requirements of a real-world problem. It's common to face some of the following situations:

  • You need to execute the same transformation over and over again
  • You need to execute a transformation more than once, but with different parameters each time
  • You decide at runtime which job to run from a group of jobs
  • You have to reuse part of a transformation in a different scenario

Kettle is versatile enough to handle all of these situations. However, some of them can be confusing to implement without guidance.

This chapter contains quick recipes just meant to teach you the basics. The transformations and jobs used are simple enough to serve as templates for you to modify for your own needs.

Before starting on the recipes, let's take a look at the following subsections:

  • Sample transformations: As the name suggests, this section explains the sample transformations that will be used throughout the chapter.
  • Launching jobs and transformations: This section quickly introduces Kitchen and Pan, the tools for launching jobs and transformations from the command line.

Sample transformations

The recipes in this chapter show you different ways of running Kettle transformations and jobs. In order to focus on the specific purposes of the recipes rather than on developing transformations, we've created some sample transformations that will be used throughout the chapter.

The transformations are described in the following subsections. You can download them from the book's website.

Note

These transformations generate files in a directory pointed to by a variable named ${OUTPUT_FOLDER}. In order to run the transformations, this variable must be predefined.

Tip

Remember that you have several ways of defining variables: as a named parameter, in the Kettle properties file, in a previous job or transformation (if this transformation is going to be called from a job) or in the Variables section of the Execute a transformation window (the window that shows up when you run the transformation from Spoon).
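
As an illustration of the Kettle properties file option, the following shell sketch adds a definition of OUTPUT_FOLDER to a properties file. The helper name set_kettle_variable and the folder /tmp/kettle_output are hypothetical, and the file location shown in the example call ($HOME/.kettle/kettle.properties) assumes a default setup; adjust both to your environment.

```shell
#!/bin/sh
# Hypothetical helper: add NAME=VALUE to a Kettle properties file,
# unless a definition for NAME is already present.
set_kettle_variable() {
    props_file=$1   # path to kettle.properties (assumed location)
    name=$2
    value=$3
    mkdir -p "$(dirname "$props_file")"
    touch "$props_file"
    if grep -q "^${name}=" "$props_file"; then
        echo "${name} is already defined in ${props_file}"
    else
        printf '%s=%s\n' "$name" "$value" >> "$props_file"
    fi
}

# Example call (the folder is an assumption; adjust to your system):
# set_kettle_variable "$HOME/.kettle/kettle.properties" OUTPUT_FOLDER /tmp/kettle_output
```

The helper leaves an existing definition untouched, so it never silently overwrites a value you set earlier.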

Sample transformation: Hello

This transformation receives the name of a person as the first command-line argument and generates a file saying hello to that person.

The transformation looks like the one shown in the following diagram:

[Diagram: Sample transformation: Hello]

A sample output file is as follows:

Hello, Eva! It's January 09, 09:37.

Sample transformation: Random list

This transformation generates a file with a list of random integers. The quantity generated is defined as a named parameter called QUANTITY, with a default value of 10.

The transformation looks like the one depicted in the following diagram:

[Diagram: Sample transformation: Random list]

A sample output file is as follows:

-982437245
1169516784
318652071
-576481306
1815968887

Sample transformation: Sequence

This transformation generates a file with a list of numbers. The transformation receives two command-line arguments representing FROM and TO values. It also has a named parameter called INCREMENT with a default of 1. The transformation generates a list of numbers between FROM and TO, with increments of INCREMENT.

The transformation looks like the one shown in the following diagram:

[Diagram: Sample transformation: Sequence]

A sample output file using from=0, to=6, increment=2 is as follows:

0
2
4
6
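
The FROM, TO, and INCREMENT semantics can be mimicked outside Kettle. The shell function below is only an illustration, assuming a positive increment and inclusive bounds as in the sample output; it does not reflect how the transformation itself is built, and the name generate_sequence is hypothetical.

```shell
#!/bin/sh
# Illustration only: print the numbers from FROM to TO (inclusive)
# in steps of INCREMENT, which defaults to 1 like the named parameter.
generate_sequence() {
    from=$1
    to=$2
    increment=${3:-1}
    if [ "$increment" -le 0 ]; then
        echo "increment must be positive" >&2
        return 1
    fi
    n=$from
    while [ "$n" -le "$to" ]; do
        echo "$n"
        n=$((n + increment))
    done
}

# Same inputs as the sample output file: from=0, to=6, increment=2
generate_sequence 0 6 2
```

Called with 0, 6, and 2, the function prints the same four lines as the sample output file.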

Sample transformation: File list

This transformation generates a file containing the names of the files in the current directory.

The transformation looks like the one depicted in the following diagram:

[Diagram: Sample transformation: File list]

A sample output file is as follows:

gen_random.ktr
gen_sequence.ktr
get_file_names.ktr
hello.ktr

Launching jobs and transformations

As mentioned earlier, the recipes in this chapter focus on different ways of running Kettle transformations and jobs. Ultimately, you will end up with a main job. To test your job with different inputs or parameters, you can use Spoon as usual, but it may be useful, or even simpler, to use Kitchen, a command-line program for launching Kettle jobs. If you're not familiar with Kitchen, this section gives you a quick review.

In order to run a job with Kitchen:

  1. Open a terminal window.
  2. Go to the Kettle installation directory.
  3. Run kitchen.bat /file:<kjb file name> (Windows system) or kitchen.sh /file:<kjb file name> (Unix-based system), where <kjb file name> is the name of your job, including the complete path. If the name contains spaces, you must surround it with double quotes.

If you want to provide command-line arguments, just type them in order as part of the command.

If you want to provide a named parameter, use the following syntax:

/param:<parameter name>=<parameter value>

For example, /param:INCREMENT=5

Additionally, you can specify the logging level by adding the following option:

/level:<logging level>

The logging level can be one of the following: Nothing, Error, Minimal, Basic (this is the default level), Detailed, Debug, or Rowlevel.
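
Putting these options together, a complete call might look like the sketch below. It only prints the command line rather than running it, so it works without a Kettle installation. The helper name kitchen_command and the job path /home/user/jobs/main_job.kjb are hypothetical, and the sketch assumes you are in the Kettle installation directory on a Unix-based system.

```shell
#!/bin/sh
# Hypothetical helper: print a Kitchen command line for a job file,
# a named parameter, a logging level, and optional positional arguments.
kitchen_command() {
    job_file=$1
    param=$2        # e.g. INCREMENT=5
    level=$3        # e.g. Detailed
    shift 3
    # Quote the /file option so that paths containing spaces survive.
    printf './kitchen.sh "/file:%s" /param:%s /level:%s' "$job_file" "$param" "$level"
    # Positional (command-line) arguments follow, in order.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Hypothetical job that takes two arguments and the INCREMENT parameter.
kitchen_command "/home/user/jobs/main_job.kjb" INCREMENT=5 Detailed 0 6
```

Printing the command first is a cheap way to check quoting and argument order before launching the job for real; to run it, type the printed line in the terminal.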

If you intend to run a transformation instead of a job, use Pan. Just replace kitchen.bat/kitchen.sh with pan.bat/pan.sh and provide the name of the .ktr file.
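
For instance, the sample transformations could be launched with Pan as sketched here. The commands are only printed, not executed. The helper pan_command and the directory /home/user/samples are hypothetical, and the file names are taken from the sample File list output, on the assumption that they correspond to the Hello, Random list, and Sequence transformations.

```shell
#!/bin/sh
# Directory holding the sample .ktr files (hypothetical path).
SAMPLES_DIR=${SAMPLES_DIR:-/home/user/samples}

# Print (not run) a Pan command line: the .ktr file first, then any
# named parameters or positional arguments exactly as given.
pan_command() {
    ktr_file=$1
    shift
    printf './pan.sh "/file:%s/%s"' "$SAMPLES_DIR" "$ktr_file"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

pan_command hello.ktr Eva                              # name as first argument
pan_command gen_random.ktr /param:QUANTITY=5           # named parameter
pan_command gen_sequence.ktr /param:INCREMENT=2 0 6    # FROM and TO arguments
```

Remember that the sample transformations expect ${OUTPUT_FOLDER} to be defined before they run.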

While you use Spoon for developing, debugging and testing transformations and jobs, Kitchen and Pan are most commonly used for running jobs and transformations in production environments. For a complete list of available options and more information on these commands, visit the Pan documentation at the following URL:

http://wiki.pentaho.com/display/EAI/Pan+User+Documentation

For Kitchen documentation, visit the following URL:

http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation
