- Load pres.csv to HDFS:
$ hdfs dfs -put pres.csv
- Start the Spark shell:
$ spark-shell
- Import the required linear algebra classes:
scala> import org.apache.spark.mllib.linalg.Vectors
scala> import org.apache.spark.mllib.linalg.distributed.RowMatrix
- Load pres.csv as an RDD:
scala> val data = sc.textFile("pres.csv")
- Transform data into an RDD of dense vectors:
scala> val parsedData = data.map( line =>
Vectors.dense(line.split(',').map(_.toDouble)))
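The parsing step above just splits each comma-separated line and converts the fields to doubles. A minimal plain-Python sketch of the same transformation (the sample rows are hypothetical; the real pres.csv rows are assumed to be comma-separated numbers):

```python
# Hypothetical stand-in for one partition of the text RDD.
lines = ["1.0,2.0,3.0", "4.0,5.0,6.0"]

def parse_line(line):
    # Same logic as line.split(',').map(_.toDouble) in the Scala snippet.
    return [float(x) for x in line.split(",")]

parsed = [parse_line(l) for l in lines]
print(parsed)  # → [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```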
- Create RowMatrix from parsedData:
scala> val mat = new RowMatrix(parsedData)
- Compute the SVD, keeping the top two singular values and also computing the U factor:
scala> val svd = mat.computeSVD(2, computeU = true)
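To see what `computeSVD(2, computeU = true)` produces, here is an illustrative NumPy sketch (not Spark) of a truncated SVD on a small hypothetical matrix: the full decomposition is taken and only the top-two factors are kept, which is what asking for `k = 2` gives you.

```python
import numpy as np

# Hypothetical small document-term matrix standing in for pres.csv.
A = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 3.0],
              [2.0, 2.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-2 singular values and vectors, like computeSVD(2, true).
k = 2
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k].T

print(s_k)  # the two largest singular values, in descending order
```

The full factors always reconstruct the original matrix exactly (`U @ diag(s) @ Vt == A` up to floating point); the truncated rank-2 factors give the closest rank-2 approximation.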
- Calculate the U factor (the left singular vectors):
scala> val U = svd.U
- Calculate s, the vector of singular values:
scala> val s = svd.s
- Calculate the V factor (the right singular vectors):
scala> val V = svd.V
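How the three factors are read in latent semantic analysis can be sketched as follows (assumption, not from the source: each row of the input matrix is one document and each column one term; the counts below are made up):

```python
import numpy as np

# Hypothetical document-term counts: one row per document.
docs = np.array([[2.0, 0.0, 1.0],
                 [0.0, 3.0, 1.0],
                 [2.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(docs, full_matrices=False)

# Row i of U: document i's weight on each latent concept.
# s: strength of each concept.  Rows of Vt: term loadings per concept.
doc_scores = U[:, :2] * s[:2]  # project documents onto the top-2 concepts

print(doc_scores.shape)  # → (3, 2)
```

Comparing rows of `doc_scores` is what lets you say one document scores higher than another on a latent concept.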
If you look at s, you will see that a much higher score was assigned to the NPR article than to the Fox article.