Throughout this chapter, we will work further with CTools to build a more interactive dashboard. We will use the nyse_stocks data (please refer to Appendix B, Hadoop Setup, for more details), but we need to change its structure. The data source for the dashboard will be a PDI transformation.
Execute the following steps:

1. Open the nyse_stock_transfer.ktr file from the chapter's code folder.
2. Make sure the NYSE-2000-2001.tsv.gz file is within the same folder as the transformation file.
3. Run the transformation. It produces the NYSE-2000-2001-convert.tsv.gz file.
4. Open http://192.168.1.122:8000.
5. Upload the NYSE-2000-2001-convert.tsv.gz file and wait until the uploading process finishes.
6. Delete the existing nyse_stocks table.
7. Recreate the nyse_stocks table, importing data from the NYSE-2000-2001-convert.tsv.gz file. If the file does not exist, make sure you navigate to the right user or name path.
8. Browse nyse_stocks. Make sure the month and year columns are now available.

In Chapter 2, Setting Up the Ground, we learned that an Action Sequence can execute any step in a PDI script. However, since it needs to list the step's metadata using the getMetaData method of the PreparedStatement class, this becomes problematic for a Hive connection, because Hive JDBC does not implement the getMetaData method. Therefore, we need to work out another way: Java code that uses the Statement class instead of PreparedStatement in PDI's User Defined Java Class step.
Execute the following steps:

1. Open hive_java_query.ktr from the chapter's code folder. This transformation acts as our data source.
2. Inspect the processRow() method in the User Defined Java Class step. The code is actually a query to the Hive database using JDBC objects. What makes it different is the following code:

    ResultSet res = stmt.executeQuery(sql);
    while (res.next()) {
      get(Fields.Out, "period").setValue(rowd, res.getString(3) + "-" + res.getString(4));
      get(Fields.Out, "stock_price_close").setValue(rowd, res.getDouble(1));
      putRow(data.outputRowMeta, rowd);
    }

The code executes a SQL query statement against Hive. The result is iterated and filled into PDI's output rows. Column 1 of the result is reproduced as stock_price_close. The concatenation of columns 3 and 4 of the result becomes period.
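The same pattern can be sketched as plain JDBC outside of PDI. The connection URL and helper names below are assumptions for illustration; the row-shaping logic is factored into a small method so it can be checked without a Hive server:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch only: assumes a Hive JDBC connection is obtained elsewhere,
// e.g. DriverManager.getConnection("jdbc:hive://192.168.1.122:10000/default").
class HiveStatementSketch {

    // Mirrors the UDJC snippet: period = column 3 + "-" + column 4.
    static String period(String col3, String col4) {
        return col3 + "-" + col4;
    }

    static void printStockPrices(Connection conn, String sql) throws SQLException {
        // Statement instead of PreparedStatement: Hive's JDBC driver (at the
        // time of writing) does not implement PreparedStatement.getMetaData().
        try (Statement stmt = conn.createStatement();
             ResultSet res = stmt.executeQuery(sql)) {
            while (res.next()) {
                String p = period(res.getString(3), res.getString(4));
                double close = res.getDouble(1); // column 1: stock_price_close
                System.out.println(p + " -> " + close);
            }
        }
    }

    public static void main(String[] args) {
        // Only the pure helper runs without a Hive server.
        System.out.println(period("2001", "5"));
    }
}
```

Keeping the concatenation in its own method makes the step's output format easy to verify in isolation, while all Hive-specific work stays behind the plain Statement call.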
To consume data through CTools, use Community Data Access (CDA), as it is the standard data access layer. CDA is able to connect to several sources, including a Pentaho Data Integration transformation.
The following steps will help you create CDA data sources that consume the PDI transformation:
Copy the Chapter 5 folder from your book's code bundle folder into [BISERVER]/pentaho-solutions.
[Table: data source settings (Name) for the line_chart_data and pie_chart_data data sources]
The variables and parameters in the data sources will be used later to interact with the dashboard's filter. The Variables textbox allows more than one pair, and Variables(1) indicates the first indexed Arg and Value pair. The same explanation goes for Parameters(1).
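As an illustrative sketch of what such a definition looks like on disk (the IDs, step name, and file path below are assumptions, not the book's actual file), a CDA file wiring a PDI transformation to a data access with one Variables/Parameters pair is roughly:

```xml
<CDADescriptor>
  <DataSources>
    <Connection id="kettle_conn" type="kettle.TransFromFile">
      <KtrFile>hive_java_query.ktr</KtrFile>
      <!-- Variables(1): the first Arg/Value pair passed to the transformation -->
      <variables datarow-name="stock_param_data"/>
    </Connection>
  </DataSources>
  <DataAccess id="line_chart_data" connection="kettle_conn"
              type="kettle" access="public">
    <Name>Line chart data</Name>
    <!-- For a kettle data access, the query is the name of the output step -->
    <Query>OUTPUT</Query>
    <!-- Parameters(1): the first parameter exposed to the dashboard -->
    <Parameters>
      <Parameter name="stock_param_data" type="String" default="ALLSTOCKS"/>
    </Parameters>
  </DataAccess>
</CDADescriptor>
```

The Arg/Value pairs you enter in the editor end up as the variables and Parameter elements shown here, which is why the dashboard filter can later override the ALLSTOCKS default.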
Open stock_price_dashboard_hive.cda inside the Chapter 5 folder to open a CDA data browser. The listbox contains the data source names that we created before; choose DataAccess ID: line_trend_data to preview its data. It will show a table with three columns (stock_symbol, period, and stock_price_close) and one parameter, stock_param_data, with a default value of ALLSTOCKS. Explore all the other data sources to gain a better understanding when working with the next examples.