How to do it...

  1. Start the Spark shell:
        $ spark-shell
  1. Perform the required imports:
        scala> import org.apache.spark.mllib.tree.DecisionTree
        scala> import org.apache.spark.mllib.regression.LabeledPoint
        scala> import org.apache.spark.mllib.linalg.Vectors
        scala> import org.apache.spark.mllib.tree.configuration.Algo._
        scala> import org.apache.spark.mllib.tree.impurity.Entropy
  1. Load the file:
        scala> val data = sc.textFile("tennis.csv") 
  1. Parse the data and load it into LabeledPoint:
        scala> val parsedData = data.map { line =>
                 val parts = line.split(',').map(_.toDouble)
                 LabeledPoint(parts(0), Vectors.dense(parts.tail))
               }
  1. Train the algorithm with this data:
        scala> val model = DecisionTree.train(parsedData, Classification, Entropy, 3)
  1. Create a vector for no rain, high wind, and cool temperature:
        scala> val v = Vectors.dense(0.0, 1.0, 0.0)
  1. Predict whether tennis should be played:
        scala> model.predict(v)
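
The parsing and training steps above rely on two ideas that are easier to see without Spark: each CSV line is split into a label (first column) and a feature array (remaining columns), and the `Entropy` impurity measures how mixed a set of labels is. The following is a minimal plain-Scala sketch of both, not the MLlib implementation. The helper names `parseLine` and `entropy` are hypothetical, the sample row is invented for illustration and is not taken from `tennis.csv`, and the feature order (rain, wind, temperature) is an assumption based on the vector built in the prediction step. It can be pasted into `spark-shell` or a plain Scala REPL.

```scala
// Mirrors the parsing step: the first column is the label,
// the remaining columns are the features.
def parseLine(line: String): (Double, Array[Double]) = {
  val parts = line.split(',').map(_.trim.toDouble)
  (parts.head, parts.tail)
}

// Shannon entropy (base 2) of a collection of class labels.
// A set with a single class has entropy 0; an even two-class split has entropy 1.
def entropy(labels: Seq[Double]): Double = {
  val n = labels.size.toDouble
  labels.groupBy(identity).values.map { group =>
    val p = group.size / n
    -p * math.log(p) / math.log(2)
  }.sum
}

// Hypothetical row (assumed layout: play, rain, wind, temperature).
val (label, features) = parseLine("1.0,0.0,1.0,0.0")

val mixed = entropy(Seq(1.0, 1.0, 0.0, 0.0)) // evenly split labels
val pure  = entropy(Seq(1.0, 1.0, 1.0))      // only one class present
```

A decision tree trained with an entropy impurity prefers splits that move the child nodes from the `mixed` case toward the `pure` case. The last argument in the training step (`3`) is the maximum tree depth, which limits how many such splits can be chained.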