How to do it...

  1. Start the Spark shell:
$ spark-shell
  1. Import the vector and linear regression classes:
scala> import org.apache.spark.ml.linalg.Vectors
scala> import org.apache.spark.ml.regression.LinearRegression
  1. Create a DataFrame with the house price as the label:
scala> val points = spark.createDataFrame(Seq(
(1620000,Vectors.dense(2100)),
(1690000,Vectors.dense(2300)),
(1400000,Vectors.dense(2046)),
(2000000,Vectors.dense(4314)),
(1060000,Vectors.dense(1244)),
(3830000,Vectors.dense(4608)),
(1230000,Vectors.dense(2173)),
(2400000,Vectors.dense(2750)),
(3380000,Vectors.dense(4010)),
(1480000,Vectors.dense(1959))
)).toDF("label","features")
  1. Initialize linear regression:
scala> val lr = new LinearRegression()
  1. Train a model using this data:
scala> val model = lr.fit(points)
  1. Create some test data:
scala> val test = spark.createDataFrame(Seq(Vectors.dense(2100)).map(Tuple1.apply)).toDF("features")
  1. Make predictions for the test data:
scala> val predictions = model.transform(test)
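
To make it clearer what fitting a line to these ten points amounts to, here is a minimal sketch in plain Scala (no Spark dependency) of closed-form ordinary least squares for a single feature. The object `OlsSketch` and the helper `fitLine` are hypothetical names introduced only for this illustration; they are not part of the Spark API. With its default, unregularized settings, `LinearRegression` should produce approximately the same line, but its solver works differently, so treat any agreement as approximate.

```scala
// Plain-Scala sketch of single-feature ordinary least squares.
// OlsSketch and fitLine are hypothetical names, not part of Spark.
object OlsSketch {

  // Takes (feature, label) pairs and returns (slope, intercept) minimizing
  // the squared error of label ≈ slope * feature + intercept.
  def fitLine(data: Seq[(Double, Double)]): (Double, Double) = {
    require(data.size >= 2, "need at least two points")
    val n = data.size.toDouble
    val meanX = data.map(_._1).sum / n
    val meanY = data.map(_._2).sum / n
    // Sum of cross-deviations and sum of squared feature deviations.
    val sxy = data.map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val sxx = data.map { case (x, _) => (x - meanX) * (x - meanX) }.sum
    require(sxx != 0.0, "feature values must not all be equal")
    val slope = sxy / sxx
    (slope, meanY - slope * meanX)
  }

  // The same rows as the DataFrame in the recipe, as (feature, house price).
  val houses: Seq[(Double, Double)] = Seq(
    (2100.0, 1620000.0), (2300.0, 1690000.0), (2046.0, 1400000.0),
    (4314.0, 2000000.0), (1244.0, 1060000.0), (4608.0, 3830000.0),
    (2173.0, 1230000.0), (2750.0, 2400000.0), (4010.0, 3380000.0),
    (1959.0, 1480000.0)
  )

  def main(args: Array[String]): Unit = {
    val (slope, intercept) = fitLine(houses)
    // Predict the price for the test feature value used in the recipe.
    val predicted = slope * 2100.0 + intercept
    println(f"slope=$slope%.2f intercept=$intercept%.2f predicted=$predicted%.2f")
  }
}
```

In the Spark shell, the fitted parameters of the trained model are available as `model.coefficients` and `model.intercept`, and `predictions.show()` prints the predicted values, so the two results can be compared side by side.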