JavaScript Object Notation (Json) is a lightweight language-independent data interchange format. It uses conventions similar to the C or JavaScript languages with some rules for the representation of structured data. The object is represented as a collection of name_of_field:value_of_field
pairs and you can have an array of these elements using the []
characters.
PDI allows reading and writing these kind of files using the Json input and Json output steps from the Input category.
Let's see an example of reading a Json file. Let's assume that you have a file named museums.js
that you want to read for further processing. The file has the following information:
{"data": { "museum": [ { "country": "Italy", "city": "Venice", "id_museum": "109" "name": "Palazzo Ducale"}, { "country": "Mexico", "city": "Mexico City", "id_museum": "36" "name": "Museo de Arte Contemporaneo de Monterrey"}, { "country": "Italy", "city": "Florence", "id_museum": "47" "name": "Museo di San Marco"} ] } }
In addition, you want to read it for further processing.
To run this recipe, you need the museums.js
file with the museum information shown earlier. You can also download the file from the book's site.
Carry out the following steps:
${Internal.Transformation.Filename.Directory}museums.js
. Click on the Add button. museums.js
Json source file.The Json input step reads and interprets the museum.js
Json data using the Path column under the Files tab. Here, you must use a JsonPath expression, in a similar way the XPath expressions are used in XML files.
A basic overview of the JsonPath syntax elements is shown in the following table:
JsonPath |
Description |
---|---|
|
Root object |
|
Child operator |
|
Array operator |
|
Union operator in XPath |
The child operator is used to access to different levels inside the Json structure, for example, $...city
means "the city
element inside the museum
element inside the data
element from the root".
If you want to access to a particular element, then you should use $.data.museum[1].city
, that means "the city
of the second museum
element inside the data
element from the root".
You can find more information about Json language at the following URL:
In the following subsections, you will find some considerations about reading and writing Json files:
In this recipe, you used the Json input step to read the museum.js
Json file, but you can also read the Json data from a field by checking the Source is defined in a field? checkbox and selecting the field in the Get source from field listbox.
Another option is when the name of the Json file is in a field. In this case, instead of typing the name of the file, you must check both the Source is defined in a field? and Source is a filename? checkboxes and select the field that contains the name of the Json file in the Get source from field listbox.
If you need to create a file or a field in Json format, then you can use the Json output step from the Output category. Under the General tab of this step, there is a listbox named Operation where you can choose either a file or field destination.
If you choose a file destination, you need to fill the Output File section with information about the file to be generated. If you choose Output value operation, you must type the name of the new field in the Output Value textbox.
Then, under the Fields tab, you need to populate the grid with the source fields coming from the datasource, and the element name for the Json structure.
For example, assume that you are using a datasource with authors' information like the following:
"lastname","firstname","country","birthyear" "Larsson","Stieg","Swedish",1954 "King","Stephen","American",1947 "Hiaasen","Carl ","American",1953 "Handler","Chelsea ","American",1975 "Ingraham","Laura ","American",1964 "Ramsey","Dave ","American",1960 "Kiyosaki","Robert ","American",1947,"A00007" "Rowling","Joanne ","English",1965 "Riordan","Rick ","American",1964
If you add a Json output step and an Output value operation, then you will obtain a new field with each row in Json format, with a value as follows:
{"data":[{"lastname":"Larsson"},{"firstname":"Stieg"},{"country":"Swedish"},{"birthyear":1954}]}
Previewing the step, the result should be similar to the following: