The Pentaho User Console (PUC) is a web application included with the Pentaho Server that lets you generate reports, browse cubes, explore dashboards, and more. Among the tasks you can perform from it is running Kettle jobs. As said in the previous recipe, everything in the Pentaho platform is made up of action sequences. Therefore, if you intend to run a job from the PUC, you have to create an action sequence that does it.
For this recipe, you will use a job that simply deletes all files with the .tmp extension found in a given folder. The objective is to run the job from the PUC through an action sequence.
In order to follow this recipe, you will need a basic understanding of action sequences and at least some experience with the Pentaho BI Server and Pentaho Design Studio, the action sequences editor.
Before proceeding, make sure you have a Pentaho BI Server running. You will also need Pentaho Design Studio; you can download the latest version from the following URL:
http://sourceforge.net/projects/pentaho/files/Design%20Studio/
You will also need a job like the one described in the introduction of the recipe. The job should have a named parameter called TMP_FOLDER and simply delete all files with the .tmp extension found in that folder.
You can develop the job before proceeding (call it delete_files.kjb), or download it from the book's site.
Finally, pick a directory on your computer (or create one) that contains some .tmp files to delete.
This recipe is split into two parts: first, you will create the action sequence, and then you will test it from the PUC.
1. Create a new action sequence and save it as delete_files.xaction.
2. Define an input named folder. As Default Value, type the name of the folder with the .tmp files, for example, c:\myfolder.
3. Add a Pentaho Data Integration Job process action.
4. As the job file, type solution:delete_files.kjb.
5. As the job input, add the input you defined: folder.
6. In the XML source of the action sequence, search for the <action-definition> tag that contains the following line:
    <component-name>KettleComponent</component-name>
7. You should find something like this:
    <action-definition>
      <component-name>KettleComponent</component-name>
      <action-type>Pentaho Data Integration Job</action-type>
      <action-inputs>
        <folder type="string"/>
      </action-inputs>
      <action-resources>
        <job-file type="resource"/>
      </action-resources>
      <action-outputs/>
      <component-definition/>
    </action-definition>
8. Replace the <component-definition/> tag with the following:
    <component-definition>
      <set-parameter>
        <name>TMP_FOLDER</name>
        <mapping>folder</mapping>
      </set-parameter>
    </component-definition>
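Assuming the generated definition matches the one shown above, the complete block after the replacement should look roughly like the following. The comments are added here only for illustration and are not generated by Design Studio:

```xml
<action-definition>
  <component-name>KettleComponent</component-name>
  <action-type>Pentaho Data Integration Job</action-type>
  <action-inputs>
    <!-- the action sequence input that carries the folder name -->
    <folder type="string"/>
  </action-inputs>
  <action-resources>
    <!-- refers to the job file, solution:delete_files.kjb in this recipe -->
    <job-file type="resource"/>
  </action-resources>
  <action-outputs/>
  <component-definition>
    <!-- passes the folder input to the job's TMP_FOLDER named parameter -->
    <set-parameter>
      <name>TMP_FOLDER</name>
      <mapping>folder</mapping>
    </set-parameter>
  </component-definition>
</action-definition>
```

If your job had more named parameters, you would presumably add one <set-parameter> element per parameter, each mapped to its own action sequence input.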
Now, it is time to test the action sequence that you just created.
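1. In the PUC, browse the solution folders and look for the delete_files action you just created. Double-click on it.
2. Take a look at the folder you typed as the default value of the input. It should no longer contain any .tmp files.

You can run Kettle jobs as part of an action sequence by using the Pentaho Data Integration Job process action located within the Execute category of process actions.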
The main task of a PDI Job process action is to run a Kettle job. To do that, it offers a series of checkboxes and text boxes where you specify everything you need to run the job, and everything you want to receive back after it has run.
The most important setting in the PDI process action is the name and location of the job to be executed. In this example, you had a .kjb file in the same location as the action sequence, so you simply typed solution: followed by the name of the file.
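As a rough sketch of what this setting becomes in the file, the job location ends up in the resources section of the .xaction. The fragment below assumes the usual layout for a solution-relative resource; the mime-type value in particular is an assumption, so compare it with what you see in your own XML source:

```xml
<resources>
  <!-- resource name referenced by <job-file type="resource"/> in the action definition -->
  <job-file>
    <solution-file>
      <!-- path relative to the folder that holds the action sequence -->
      <location>delete_files.kjb</location>
      <!-- assumed value; Design Studio may write a different one -->
      <mime-type>text/xml</mime-type>
    </solution-file>
  </job-file>
</resources>
```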
You can specify the Kettle log level, but it is not mandatory. In this case, you left the log level empty. The log level you select here (or Basic, by default) determines how much detail Kettle writes to the Pentaho console when the job runs.
Besides the name and location of the job, you had to provide the name of the folder needed by the job. To do that, you created an input named folder and then passed it to the job. You did it in the XML code by adding a <set-parameter> element that puts the name of the job's named parameter (TMP_FOLDER) in the <name> tag and the name of the input (folder) in the <mapping> tag.
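For reference, the folder input itself is declared in the inputs section of the .xaction file. The following is a minimal sketch, assuming the input can be overridden from the request and falls back to the default value used in this recipe; the exact elements that Design Studio generates may differ:

```xml
<inputs>
  <folder type="string">
    <sources>
      <!-- assumption: the value may also arrive as a request parameter named folder -->
      <request>folder</request>
    </sources>
    <!-- the default value typed when the input was defined -->
    <default-value><![CDATA[c:\myfolder]]></default-value>
  </folder>
</inputs>
```

The name of this element (folder) is the one that appears both under <action-inputs> and in the <mapping> tag of the process action.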
When you ran the action sequence, the job was executed, deleting all .tmp files in the given folder.
The main reason for embedding a job in an action sequence is to schedule its execution with the Pentaho scheduling services. This is an alternative to using a system utility such as cron in Unix-based operating systems or the Task Scheduler in Windows.