Left anti join

A left anti join returns only those rows of statesPopulationDF that have no matching row in statesTaxRatesDF. The result contains columns from statesPopulationDF only.

Join the two datasets via the State column as follows:

val joinDF = statesPopulationDF.join(statesTaxRatesDF,
  statesPopulationDF("State") === statesTaxRatesDF("State"), "leftanti")
%sql
val joinDF = spark.sql("""SELECT * FROM statesPopulationDF LEFT ANTI JOIN
  statesTaxRatesDF ON statesPopulationDF.State = statesTaxRatesDF.State""")
scala> joinDF.count
res22: Long = 28
scala> joinDF.show(5)
+--------+----+----------+
|   State|Year|Population|
+--------+----+----------+
|  Alaska|2010|    714031|
|Delaware|2010|    899816|
| Montana|2010|    990641|
|  Oregon|2010|   3838048|
|  Alaska|2011|    722713|
+--------+----+----------+
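To make the semantics concrete, here is a minimal sketch of a left anti join written with plain Scala collections rather than Spark. The case classes, the leftAnti helper, the Alabama row, and the tax rate value are all hypothetical and exist only for illustration. The two Alaska rows reuse the values shown in the output above.

```scala
// Hypothetical row types; the real DataFrames may carry different columns.
case class Population(state: String, year: Int, population: Long)
case class TaxRate(state: String, taxRate: Double)

object LeftAntiJoinSketch {
  // Keep each left row whose key has no match on the right side.
  // Only left-side fields survive, as in Spark's "leftanti" join type.
  def leftAnti(left: Seq[Population], right: Seq[TaxRate]): Seq[Population] = {
    val rightKeys = right.map(_.state).toSet
    left.filterNot(row => rightKeys.contains(row.state))
  }

  // Illustrative data: the Alabama row and its tax rate are made up.
  val population: Seq[Population] = Seq(
    Population("Alaska", 2010, 714031L),
    Population("Alabama", 2010, 1000L),
    Population("Alaska", 2011, 722713L))
  val taxRates: Seq[TaxRate] = Seq(TaxRate("Alabama", 4.0))

  val result: Seq[Population] = leftAnti(population, taxRates)

  def main(args: Array[String]): Unit = result.foreach(println)
}
```

In this sketch Alabama is dropped because it has a tax-rate entry, while both Alaska rows are kept. That matches the Spark output above, where a state without a match can appear once per year.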