Visualizing K-means results

This recipe explains how you can visualize the results of a K-means run.

Getting ready

This recipe assumes that you have followed the previous recipe, have run K-means, and have access to its output. If you have not already done so, follow the previous recipe to run K-means first.

How to do it...

This section demonstrates how to convert the output of the K-means execution to GraphML and visualize it.

  1. Run the following command to write the results in GraphML format, which is a standard representation of graphs. Replace <k-means-output-dir> with the output directory of the K-means execution.
    >bin/mahout clusterdump --seqFileDir <k-means-output-dir>/clusters-10-final/ --pointsDir <k-means-output-dir>/clusteredPoints --outputFormat GRAPH_ML -o clusters.graphml
    
  2. Download and install the Gephi graph visualization toolkit from http://gephi.org/.
  3. Open the MAHOUT_HOME/clusters.graphml file using the File->Open menu in Gephi.
  4. From the Layout window at the lower-left corner of the screen, select Yifan Hu Multilevel as the layout method, and click on Run.
  5. Gephi will show a visualization of the clustered graph.
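
If you want to script step 1 rather than type the command by hand, the following is a minimal Python sketch that assembles the same clusterdump command from the K-means output directory. The function name `build_clusterdump_command` and the example path `/data/kmeans-output` are hypothetical. The flags are only the ones used in step 1, and the `clusters-10-final` directory name is taken from this recipe and may differ for your run.

```python
import shlex
from pathlib import PurePosixPath


def build_clusterdump_command(kmeans_output_dir,
                              final_clusters="clusters-10-final",
                              graphml_path="clusters.graphml",
                              mahout="bin/mahout"):
    """Build the clusterdump argument list used in step 1.

    Assumption: final_clusters matches the final cluster directory that
    your K-means run actually produced (this recipe uses clusters-10-final).
    """
    base = PurePosixPath(kmeans_output_dir)
    return [
        mahout, "clusterdump",
        "--seqFileDir", str(base / final_clusters),
        "--pointsDir", str(base / "clusteredPoints"),
        "--outputFormat", "GRAPH_ML",
        "-o", graphml_path,
    ]


if __name__ == "__main__":
    # Hypothetical output directory, for illustration only.
    cmd = build_clusterdump_command("/data/kmeans-output")
    # Print the command so that it can be reviewed before it is run,
    # for example with subprocess.run(cmd, check=True) from MAHOUT_HOME.
    print(shlex.join(cmd))
```

Building the command as a list, rather than as one string, keeps each path a single argument even if it contains spaces.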

How it works...

K-means output is written as a sequence file. We used the clusterdump command of Mahout to write it as a GraphML file, which is a standard representation of graphs. Then, we used the Gephi graph visualization software to visualize the resulting GraphML file.
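
Because GraphML is plain XML, you can sanity-check the generated file before opening it in Gephi. The following is a minimal sketch using only the Python standard library. The function name `summarize_graphml` and the sample document are hypothetical, and the sample is not actual Mahout output. The sketch only assumes that the file uses the standard GraphML `node` and `edge` elements.

```python
import io
import xml.etree.ElementTree as ET


def summarize_graphml(source):
    """Return (node_count, edge_count) for a GraphML path or file object."""
    counts = {"node": 0, "edge": 0}
    for element in ET.parse(source).iter():
        # Drop any "{namespace}" prefix so that the check works whether or
        # not the file declares the GraphML namespace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in counts:
            counts[local_name] += 1
    return counts["node"], counts["edge"]


# Hand-written sample for illustration only, not real clusterdump output.
SAMPLE = b"""<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="c0"/>
    <node id="p1"/>
    <node id="p2"/>
    <edge source="c0" target="p1"/>
    <edge source="c0" target="p2"/>
  </graph>
</graphml>"""

if __name__ == "__main__":
    print(summarize_graphml(io.BytesIO(SAMPLE)))  # prints (3, 2)
```

To check your own output, pass the file path instead, for example `summarize_graphml("clusters.graphml")`. A count of zero nodes suggests that the clusterdump step did not produce the expected graph.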
