Executing a job or a transformation from a job by setting arguments and parameters dynamically

Suppose that you developed a transformation that reads command-line arguments or defines named parameters. Now you want to call that transformation from a job, but you don't know the values for the arguments or the parameters in advance; you have to take them from some other source, for example, a file or a table in a database. This recipe shows you how to get those values and pass them to the transformation at runtime.

For this recipe, suppose that you want to create a file with a sequence of numbers, and you already have a transformation that does it. The problem is that the limits (from and to) and the increment value are stored in a properties file, so you cannot simply call the transformation directly; however, Kettle lets you solve this in a very simple way.

Getting ready

You need a sample transformation that generates a file with a sequence as described in the introduction. Make sure you have defined the variable ${OUTPUT_FOLDER} with the name of the destination folder. Also, make sure that the folder exists.
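
If you haven't defined ${OUTPUT_FOLDER} yet, one common place to do it is the kettle.properties file, located in the .kettle directory under your home directory. The following is a minimal sketch; the path shown is just a sample, so point it to a folder that actually exists on your machine:

# kettle.properties (sample entry; use a folder that exists)
OUTPUT_FOLDER=/home/pdi/output

Variables defined in kettle.properties become available to all your jobs and transformations the next time you start Spoon.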

You also need a file named sequence.properties with the following content:

from=0
to=90
increment=30

With those values, your transformation should generate the values 0, 30, 60, 90.

How to do it...

Carry out the following steps:

  1. Create a transformation.
  2. From the Input category, drag a Property Input step into the canvas, and use it to read the properties file. Under the File tab, enter the name and location of the file. Under the Fields tab, click on Get Fields to fill the grid with the fields: Key and Value.
  3. From the Transform category, add a Row denormalizer step and create a hop from the input step to this new step.
  4. Double-click on the step. For the key field, select Key. Fill the Target fields: grid with the following three rows:

     Target fieldname    Value fieldname    Key value
     from_value          Value              from
     to_value            Value              to
     increment_value     Value              increment
  5. After that step, add a Copy rows to result step. You will find it under the Job category.
  6. Do a preview on the last step. You should see a single row with the three values read from the properties file:

     from_value    to_value    increment_value
     0             90          30
  7. Save the transformation and create a job.
  8. Drag a Start job entry and two Transformation job entries into the canvas. Link the entries, one after the other.
  9. Double-click on the first Transformation entry and for Transformation filename, select the transformation you just created.
  10. Close the window and double-click on the second Transformation entry.
  11. For Transformation filename, select the sample transformation gen_sequence.ktr.
  12. Select the Advanced tab and check the first three options: Copy previous results to args?, Copy previous results to parameters?, and Execute for every input row?
  13. Select the Parameters tab. For the first row in the grid, type INCREMENT under Parameter and increment_value under Stream column name.
  14. Close the window.
  15. Save and run the job.
  16. As a result, you will have a new file named sequence_0_90_30.txt in the folder pointed to by the ${OUTPUT_FOLDER} variable. The file will contain the sequence of numbers 0, 30, 60, 90, just as expected.
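
Besides running the job from Spoon, you can also launch it from a terminal with Kitchen. The following is a minimal sketch, assuming a Unix-like system and that you saved the job as dynamic_sequence.kjb (a filename chosen just for this example; adjust the path to wherever you saved it):

# run the job that reads sequence.properties and calls the transformation
sh kitchen.sh -file=/path/to/dynamic_sequence.kjb -level=Basic

Note that you don't type any arguments or parameters in the command: the job picks them up from sequence.properties at runtime, which is exactly the point of this recipe.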

How it works...

The transformation you ran in the recipe expects two command-line arguments: from and to. It also has a named parameter: INCREMENT. There are several ways to pass those values to the transformation:

  • Typing them in the command line when you run the transformation with Pan, or Kitchen if the transformation is called from a job (see the Pan sketch after this list).
  • Typing them in the transformation or job settings window when you run it with Spoon.
  • Statically, by providing fixed values in the Transformation entry settings window, as in the previous recipe.
  • Dynamically, by taking the values from another source, as you did in this recipe.
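
As a point of comparison, the first of those options, typing the values by hand, would look like the following with Pan. This is a minimal sketch assuming a Unix-like shell and that gen_sequence.ktr is in the current directory:

# arguments 0 and 90 are positional; INCREMENT is a named parameter
sh pan.sh -file=gen_sequence.ktr -param:INCREMENT=30 0 90

Here, 0 and 90 are the command-line arguments 1 and 2 (the from and to limits), while the -param: option sets the named parameter INCREMENT. The recipe builds the equivalent of this call dynamically: the job supplies those same values at runtime instead of you typing them.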

If the values for the arguments or parameters are stored in some other medium, for example, a table, an Excel sheet, or a properties file, you can easily read them and pass them to the transformation. First, you call a transformation that creates a dataset with a single row containing all the required values. Then, you pass those values to the main transformation by configuring the Advanced tab of its Transformation entry properly. Let's see how that worked in this recipe.

In the recipe, you created a transformation that generates a single row with the three needed values: from_value, to_value, and increment_value. By adding a Copy rows to result step, that row became available for use later.

In the main job, you did the trick: by checking the Copy previous results to args? and Execute for every input row? options, you took that row and passed it to the transformation as if the fields were command-line arguments. That is, the values of the fields from_value, to_value, and increment_value (namely 0, 90, and 30) are seen by the transformation as the command-line arguments 1, 2, and 3 respectively. Note that in this case the transformation only read the first two of those arguments; the third one was ignored as an argument, because the increment was passed as a named parameter instead.

With regard to the named parameter, INCREMENT, you passed it to the transformation by checking the Copy previous results to parameters? option and adding a row in the Parameters tab grid. There you entered the mapping between the named parameter INCREMENT and the incoming stream field increment_value.

There's more...

All that was said for the Transformation job entry is also valid for Job entries. That is, you can set the Advanced tab in a Job entry to copy the previous results as arguments or as parameters to the job that is going to be executed.

See also

  • The recipe named Executing a job or a transformation by setting static arguments and parameters in this chapter. This is useful if you know the values for the arguments and parameters beforehand.
  • The recipe named Executing a job or a transformation once for every row in a dataset in this chapter. With this recipe, you can go further and learn how to run a transformation several times, each time with a different set of arguments or parameters.