Working with Json files

JavaScript Object Notation (Json) is a lightweight language-independent data interchange format. It uses conventions similar to the C or JavaScript languages with some rules for the representation of structured data. The object is represented as a collection of name_of_field:value_of_field pairs and you can have an array of these elements using the [] characters.

PDI allows reading and writing these kind of files using the Json input and Json output steps from the Input category.

Let's see an example of reading a Json file. Let's assume that you have a file named museums.js that you want to read for further processing. The file has the following information:

{"data": {
"museum": [
{
"country": "Italy",
"city": "Venice",
"id_museum": "109"
"name": "Palazzo Ducale"},
{
"country": "Mexico",
"city": "Mexico City",
"id_museum": "36"
"name": "Museo de Arte Contemporaneo de Monterrey"},
{
"country": "Italy",
"city": "Florence",
"id_museum": "47"
"name": "Museo di San Marco"}
]
}
}

In addition, you want to read it for further processing.

Getting ready

To run this recipe, you need the museums.js file with the museum information shown earlier. You can also download the file from the book's site.

How to do it...

Carry out the following steps:

  1. Create a new transformation.
  2. Drop a Json input step from the Input category.
  3. Type the name of the file with its complete path in the File or directory textbox located under the File tab. For example, ${Internal.Transformation.Filename.Directory}museums.js. Click on the Add button.
  4. Complete the Fields tab as shown in the following screenshot:
    How to do it...
  5. Previewing the transformation, you will obtain a dataset with the museum information from the museums.js Json source file.

How it works...

The Json input step reads and interprets the museum.js Json data using the Path column under the Files tab. Here, you must use a JsonPath expression, in a similar way the XPath expressions are used in XML files.

A basic overview of the JsonPath syntax elements is shown in the following table:

JsonPath

Description

$

Root object

.

Child operator

[]

Array operator

[,]

Union operator in XPath

The child operator is used to access to different levels inside the Json structure, for example, $...city means "the city element inside the museum element inside the data element from the root".

If you want to access to a particular element, then you should use $.data.museum[1].city, that means "the city of the second museum element inside the data element from the root".

Note

Note that in Json, the lists are zero-based.

There's more...

You can find more information about Json language at the following URL:

http://www.json.org/

In the following subsections, you will find some considerations about reading and writing Json files:

Reading Json files dynamically

In this recipe, you used the Json input step to read the museum.js Json file, but you can also read the Json data from a field by checking the Source is defined in a field? checkbox and selecting the field in the Get source from field listbox.

Another option is when the name of the Json file is in a field. In this case, instead of typing the name of the file, you must check both the Source is defined in a field? and Source is a filename? checkboxes and select the field that contains the name of the Json file in the Get source from field listbox.

Writing Json files

If you need to create a file or a field in Json format, then you can use the Json output step from the Output category. Under the General tab of this step, there is a listbox named Operation where you can choose either a file or field destination.

If you choose a file destination, you need to fill the Output File section with information about the file to be generated. If you choose Output value operation, you must type the name of the new field in the Output Value textbox.

Then, under the Fields tab, you need to populate the grid with the source fields coming from the datasource, and the element name for the Json structure.

For example, assume that you are using a datasource with authors' information like the following:

"lastname","firstname","country","birthyear"
"Larsson","Stieg","Swedish",1954
"King","Stephen","American",1947
"Hiaasen","Carl ","American",1953
"Handler","Chelsea ","American",1975
"Ingraham","Laura ","American",1964
"Ramsey","Dave ","American",1960
"Kiyosaki","Robert ","American",1947,"A00007"
"Rowling","Joanne ","English",1965
"Riordan","Rick ","American",1964

If you add a Json output step and an Output value operation, then you will obtain a new field with each row in Json format, with a value as follows:

{"data":[{"lastname":"Larsson"},{"firstname":"Stieg"},{"country":"Swedish"},{"birthyear":1954}]}

Previewing the step, the result should be similar to the following:

Writing Json files
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset