Once we have rowRDD and the header, the next task is to construct the rows of our Schema DataFrame from the variants using the header and rowRDD:
val sqlContext = spark.sqlContext
val schemaDF = sqlContext.createDataFrame(rowRDD, header)
schemaDF.printSchema()
schemaDF.show(10)
>>>
Figure 15: A snapshot of the training dataset containing features and the label (that is, Region) columns
In the preceding DataFrame, only a few columns, including the label, are shown so that it fits on the page.