- Start the Spark shell:
$ spark-shell
- Perform the required imports:
scala> import org.apache.spark.mllib.tree.DecisionTree
scala> import org.apache.spark.mllib.regression.LabeledPoint
scala> import org.apache.spark.mllib.linalg.Vectors
scala> import org.apache.spark.mllib.tree.configuration.Algo._
scala> import org.apache.spark.mllib.tree.impurity.Entropy
- Load the file:
scala> val data = sc.textFile("tennis.csv")
- Parse each line into a LabeledPoint, using the first column as the label and the remaining columns as the features:
scala> val parsedData = data.map { line =>
         val parts = line.split(',').map(_.toDouble)
         LabeledPoint(parts(0), Vectors.dense(parts.tail))
       }
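To see what the parsing step does to a single line, here is a minimal plain-Scala sketch that needs no Spark. The object name `ParseSketch`, the `parse` helper, and the sample row are all made up for illustration; the row is not taken from tennis.csv.

```scala
// Plain-Scala sketch of the parsing step; no Spark required.
// ParseSketch, parse, and the sample row are hypothetical.
object ParseSketch {
  // Split a CSV line into (label, features):
  // the first column is the label, the rest are the features.
  def parse(line: String): (Double, Array[Double]) = {
    val parts = line.split(',').map(_.trim.toDouble)
    (parts.head, parts.tail)
  }

  def main(args: Array[String]): Unit = {
    val (label, features) = parse("1.0,0.0,1.0,0.0")
    println(s"label=$label features=${features.mkString("[", ", ", "]")}")
  }
}
```

In the recipe itself, the same pair is wrapped as `LabeledPoint(label, Vectors.dense(features))`. Note that a header row or a non-numeric cell would make `toDouble` throw, so the file is assumed to contain only numeric values.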
- Train a classification tree on this data, using entropy as the impurity measure and a maximum depth of 3:
scala> val model = DecisionTree.train(parsedData,
Classification, Entropy, 3)
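The Entropy argument selects the impurity measure the tree uses when choosing splits. As a rough illustration of what that measure captures, the following plain-Scala sketch computes the base-2 entropy of a set of class labels. `EntropySketch` and its sample labels are hypothetical, and this is not MLlib's own implementation.

```scala
// Plain-Scala sketch of entropy as an impurity measure; no Spark required.
// EntropySketch and the sample labels are hypothetical.
object EntropySketch {
  // Shannon entropy (base 2) of a collection of class labels.
  def entropy(labels: Seq[Double]): Double = {
    val n = labels.size.toDouble
    labels.groupBy(identity).values.map { group =>
      val p = group.size / n              // fraction of rows in this class
      -p * (math.log(p) / math.log(2))    // contribution of this class
    }.sum
  }

  def main(args: Array[String]): Unit = {
    println(entropy(Seq(1.0, 1.0, 0.0, 0.0))) // evenly mixed labels
    println(entropy(Seq(1.0, 1.0, 1.0, 1.0))) // all labels identical
  }
}
```

An evenly mixed set of two classes has the highest entropy, and a set with a single class has none. A split is attractive when the subsets it produces have lower entropy than the set it started from.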
- Create a vector for no rain, high wind, and cool temperature:
scala> val v = Vectors.dense(0.0, 1.0, 0.0)
- Predict whether tennis should be played:
scala> model.predict(v)
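Conceptually, a prediction walks the tree from the root to a leaf, comparing one feature at each split. The plain-Scala sketch below shows that traversal on a hand-written tree. Everything in it is an assumption for illustration: the `TreeSketch` types, the thresholds, and the leaf values are invented, and the tree is not the one MLlib would learn from tennis.csv. The feature order (rain, wind, temperature) follows the vector built in the recipe.

```scala
// Plain-Scala sketch of decision tree prediction; no Spark required.
// The tree below is hypothetical, not learned from tennis.csv.
object TreeSketch {
  sealed trait Node
  case class Leaf(prediction: Double) extends Node
  // Go left when features(featureIndex) <= threshold, otherwise go right.
  case class Split(featureIndex: Int, threshold: Double,
                   left: Node, right: Node) extends Node

  // Follow splits from the root until a leaf is reached.
  @annotation.tailrec
  def predict(node: Node, features: Array[Double]): Double = node match {
    case Leaf(p) => p
    case Split(i, t, l, r) =>
      if (features(i) <= t) predict(l, features) else predict(r, features)
  }

  // Assumed feature order: index 0 = rain, 1 = wind, 2 = temperature.
  // Made-up rule: rain or high wind leads to 0.0, otherwise 1.0.
  val tree: Node =
    Split(0, 0.5,
      Split(1, 0.5, Leaf(1.0), Leaf(0.0)),
      Leaf(0.0))

  def main(args: Array[String]): Unit = {
    // Same encoding as the recipe: no rain, high wind, cool temperature.
    println(predict(tree, Array(0.0, 1.0, 0.0)))
  }
}
```

The value returned by the real `model.predict(v)` is a Double class label in the same encoding as the first column of the training file, so its meaning depends on how that column was coded.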