How it works...

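By default, spark.read.json reads its input through TextInputFormat, which processes one line at a time, so each JSON record must fit on a single line. A record spread across multiple lines is still valid JSON, but Spark will not parse it in this mode: depending on the Spark version and parse mode, the read either throws an exception or produces corrupt-record rows instead of the expected columns.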
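To see why a line-oriented reader cannot handle a record that spans several lines, consider the following minimal sketch. It is plain Python using only the standard json module, not Spark's own code; the parse_lines helper is a hypothetical name introduced for illustration, and it simply parses every line independently, the way a line-at-a-time reader would:

```python
import json


def parse_lines(text):
    """Parse each line on its own, as a line-oriented reader would.

    Hypothetical helper for illustration only; this is not Spark code.
    """
    records, bad_lines = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            # A fragment of a multi-line record is not valid JSON by itself.
            bad_lines.append(line)
    return records, bad_lines


single_line = '{"firstName":"Barack", "lastName":"Obama"}'
multi_line = '{\n  "firstName":"Barack",\n  "lastName":"Obama"\n}'

# The single-line record parses cleanly.
print(parse_lines(single_line))

# The multi-line record is valid JSON as a whole, but none of its
# individual lines is, so every line is rejected.
print(parse_lines(multi_line))
```

The whole multi_line string is accepted by json.loads, yet no single line of it is, which is the situation Spark is in when it sees only one line at a time.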

A single line may, however, contain more than one object. For example, you can put the details of two people on one line as a JSON array, as follows:

[{"firstName":"Barack", "lastName":"Obama"},{"firstName":"Bill", "lastName":"Clinton"}]
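The line above is one JSON value that holds two records. As a rough model of how a reader can turn such a line into two rows, here is a short plain-Python sketch. It uses only the standard json module, and records_from_line is a hypothetical helper name, so treat it as an illustration of the idea rather than Spark's implementation:

```python
import json


def records_from_line(line):
    """Return the records found on one line of JSON text.

    A top-level array contributes one record per element; any other
    value is treated as a single record. Hypothetical helper, not Spark code.
    """
    value = json.loads(line)
    return value if isinstance(value, list) else [value]


line = ('[{"firstName":"Barack", "lastName":"Obama"},'
        '{"firstName":"Bill", "lastName":"Clinton"}]')

people = records_from_line(line)
for person in people:
    print(person["firstName"], person["lastName"])
```

Because the array sits entirely on one line, the one-line-at-a-time restriction is still satisfied.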

This recipe concludes the saving and loading of data in the JSON format using Spark.
