Cube

Cube is a multi-dimensional aggregation used to perform hierarchical or nested calculations, just like rollup, with the difference that cube performs the same operation across all dimensions: it computes subtotals for every combination of the grouping columns, not only along the hierarchy. For example, if we want to show the number of records for each State and Year group, for each State (aggregating over all years to give a total for each State irrespective of the Year), and for each Year (aggregating over all States), we can use cube as follows:

scala> statesPopulationDF.cube("State", "Year").count.show(5)
+------------+----+-----+
|       State|Year|count|
+------------+----+-----+
|South Dakota|2010|    1|
|    New York|2012|    1|
|        null|2014|   50|
|     Wyoming|2014|    1|
|      Hawaii|null|    7|
+------------+----+-----+
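
In this output, a null in a grouping column marks a subtotal row: null in State is the count for that Year across all States, and null in Year is the count for that State across all Years. To make the difference from rollup concrete, the following is a minimal sketch rather than the book's own code. It assumes a local SparkSession and a small made-up DataFrame (sampleDF, with two States and two Years), and the object and method names (CubeSketch, cubeCounts, rollupCounts) are hypothetical. It uses grouping_id to label each subtotal level, which helps when the data itself can contain nulls.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{count, grouping_id}

object CubeSketch {
  // cube: counts for (State, Year), (State), (Year), and the grand total.
  // gid (grouping_id) tells the subtotal levels apart:
  //   0 = State and Year, 1 = State only, 2 = Year only, 3 = grand total
  def cubeCounts(df: DataFrame): DataFrame =
    df.cube("State", "Year")
      .agg(count("*").as("count"), grouping_id().as("gid"))

  // rollup: only the hierarchy (State, Year), (State), and the grand total.
  // There are no per-Year rows across all States.
  def rollupCounts(df: DataFrame): DataFrame =
    df.rollup("State", "Year")
      .agg(count("*").as("count"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("CubeSketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data (not the book's dataset): one record per State and Year
    val sampleDF = Seq(
      ("Hawaii", 2010), ("Hawaii", 2011),
      ("Wyoming", 2010), ("Wyoming", 2011)
    ).toDF("State", "Year")

    // 4 (State, Year) rows + 2 State rows + 2 Year rows + 1 grand total = 9 rows
    cubeCounts(sampleDF).orderBy("gid", "State", "Year").show()

    // 4 (State, Year) rows + 2 State rows + 1 grand total = 7 rows
    rollupCounts(sampleDF).orderBy("State", "Year").show()

    spark.stop()
  }
}
```

With two grouping columns, cube produces four grouping sets where rollup produces three; the extra set is the per-Year subtotal, which corresponds to rows such as null|2014|50 in the output above.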