This recipe creates test data for some of the recipes in this chapter and for later chapters in this book. We will demonstrate how to load a CSV file into a MongoDB database using the mongoimport utility. This is a basic recipe; readers who are already familiar with the data import utility can download the CSV file (pincodes.csv) from the Packt website, load it into the collection themselves, and skip the rest of the recipe. We will use the default database, test, and the collection will be named postalCodes.
The data used here is for postcodes in India. Download the pincodes.csv file from the Packt website. The file contains 39,732 records, so a successful import should create 39,732 documents. We need to have the MongoDB server up and running. Refer to the Installing single node MongoDB recipe from Chapter 1, Installing and Starting the Server, for instructions on how to start the server. The server should be listening for connections on the default port, 27017.
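Before importing, it can be useful to confirm that the downloaded file really contains the expected number of records. The following is a minimal sketch, not part of the original recipe: the helper name count_csv_records is hypothetical, and it assumes one record per line with no quoted fields that span multiple lines.

```shell
# Hypothetical helper: count the data records in a CSV file,
# excluding the header line and any blank lines.
# Assumption: each record occupies exactly one line.
count_csv_records() {
  # NR > 1 skips the header; NF skips empty lines; n + 0 prints 0 for no rows
  awk 'NR > 1 && NF { n++ } END { print n + 0 }' "$1"
}
```

Running count_csv_records pincodes.csv in the download directory should print a number that matches the record count stated above, provided the assumption about line structure holds.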
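Execute the following command with the pincodes.csv file in the current directory:
$ mongoimport --type csv -d test -c postalCodes --headerline --drop pincodes.csv
Start the mongo shell by typing mongo on the command prompt.
In the shell, execute the following command:
> db.postalCodes.count()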
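This recipe assumes that the server is up and running and that the CSV file has been downloaded to the local directory from which we execute the import utility. Let's look at the options given to the mongoimport utility and their meanings:
--type csv: The input file is in CSV format. The default input format is JSON.
-d test: The database into which the data is imported, in this case test.
-c postalCodes: The collection into which the data is imported, in this case postalCodes.
--headerline: The first line of the file is treated as a header, and its values are used as the field names of the imported documents. This option applies only to CSV and TSV files.
--drop: The collection is dropped before the data is imported.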
The final value on the command prompt after all the options are given is the name of the file, pincodes.csv.
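The same import can also be written with the long forms of the options, which some readers find easier to read in scripts. The sketch below wraps the command in a function whose name, import_postal_codes, is hypothetical. The host and port values are assumptions that simply restate the defaults (localhost and 27017) used in this recipe.

```shell
# Hypothetical wrapper around the import command from this recipe,
# using long option names. The first argument is the CSV file to import
# and defaults to pincodes.csv in the current directory.
import_postal_codes() {
  # --host/--port restate the defaults assumed in this recipe;
  # --db and --collection are the long forms of -d and -c;
  # --file names the input file explicitly instead of passing it last.
  mongoimport --host localhost --port 27017 \
    --db test --collection postalCodes \
    --type csv --headerline --drop \
    --file "${1:-pincodes.csv}"
}
```

Calling import_postal_codes with no arguments is intended to behave like the short-form command shown earlier.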
If the import goes through successfully, you should see something similar to the following printed to the console:
2015-05-19T06:51:54.131+0000 connected to: localhost
2015-05-19T06:51:54.132+0000 dropping: test.postalCodes
2015-05-19T06:51:54.810+0000 imported 39732 documents
Finally, we start the mongo shell and find the count of the documents in the collection; it should indeed be 39,732 as seen in the preceding import log.
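The count check can also be run without opening an interactive shell. The following is a rough sketch under a few assumptions: it uses the mongo shell's --eval and --quiet options, the function name verify_import_count is hypothetical, and newer MongoDB installations that ship mongosh instead of mongo would need the command name adjusted.

```shell
# Hypothetical helper: compare the number of documents in
# test.postalCodes with an expected count passed as the first argument.
verify_import_count() {
  expected="$1"
  # --quiet suppresses the shell banner so only the count is printed
  actual=$(mongo test --quiet --eval 'db.postalCodes.count()')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $actual documents"
  else
    echo "Mismatch: expected $expected, found $actual" >&2
    return 1
  fi
}
```

For this recipe, the call would be verify_import_count 39732, which returns a non-zero status if the collection holds a different number of documents.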
The postal code data has been taken from https://github.com/kishorek/India-Codes/. This data is not taken from an official source and might not be accurate as it is being compiled manually for free public use.