Getting ready

Let's build a small dataset of followers.

Our goal is to find out how many followers each node has. We will load this data in the form of two files: nodes.csv and edges.csv.

The following is the content of nodes.csv (one node per line, as ID,name):

1,Barack
2,John
3,Pat
4,Gary
5,Mitt
6,Chris
7,Rob

The following is the content of edges.csv (one edge per line, as source ID,destination ID,relationship):

2,1,follows
3,1,follows
4,1,follows
6,5,follows
7,5,follows
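
Before moving the data to the cluster, it can help to know what answer to expect. The following is a minimal local sketch, not part of the recipe itself: it writes the two files into a scratch directory and tallies followers with awk. It assumes that an edge such as 2,1,follows means node 2 follows node 1, so a node's follower count is the number of edges that name it in the second column. The function name follower_counts is a hypothetical helper introduced only for this check.

```shell
# Work in a scratch directory so existing files are not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# nodes.csv: one node per line, as id,name
cat > nodes.csv <<'EOF'
1,Barack
2,John
3,Pat
4,Gary
5,Mitt
6,Chris
7,Rob
EOF

# edges.csv: one edge per line, as source id,destination id,relationship
cat > edges.csv <<'EOF'
2,1,follows
3,1,follows
4,1,follows
6,5,follows
7,5,follows
EOF

# Hypothetical helper: print id,name,followers for every node.
# Assumption: "a,b,follows" means a follows b, so we count column 2.
follower_counts() {
  awk -F, '
    NR == FNR { name[$1] = $2; next }      # first file: map id -> name
    $3 == "follows" { count[$2]++ }        # second file: tally the followed id
    END { for (id in name) print id "," name[id] "," (count[id] + 0) }
  ' nodes.csv edges.csv | sort -t, -k1,1n
}

follower_counts
```

Under that assumption, the output lists Barack with 3 followers, Mitt with 2, and every other node with 0, which gives a reference point for whatever the distributed computation later returns.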

You can load the files into HDFS using the following commands:

$ hdfs dfs -mkdir -p data/na
$ hdfs dfs -put nodes.csv data/na/nodes.csv
$ hdfs dfs -put edges.csv data/na/edges.csv
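
To confirm that the upload worked, one option is to stream each file back out of HDFS and compare it with the local copy. The sketch below assumes the hdfs command is on the PATH and that the relative path data/na resolves to your HDFS home directory. check_upload is a hypothetical helper name, not part of the hdfs CLI.

```shell
# Hypothetical helper: compare a local file with its copy in HDFS, byte for byte.
# Usage: check_upload <local file> <HDFS path>
check_upload() {
  local_file=$1
  remote_path=$2
  # hdfs dfs -cat writes the remote file to stdout; cmp reads it as "-".
  if hdfs dfs -cat "$remote_path" | cmp -s - "$local_file"; then
    echo "OK: $remote_path matches $local_file"
  else
    echo "MISMATCH: $remote_path differs from $local_file" >&2
    return 1
  fi
}

# Once the -put commands have finished, run:
#   check_upload nodes.csv data/na/nodes.csv
#   check_upload edges.csv data/na/edges.csv
```

A plain hdfs dfs -ls data/na is a lighter alternative if you only want to see that both files exist.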
