In this recipe, we will look at some basic backup and restore operations using the mongodump and mongorestore utilities to back up and restore files.
We will start a single instance of mongod. Refer to the recipe Installing single node MongoDB in Chapter 1, Installing and Starting the Server, to start a mongod instance and connect to it from a mongo shell. We will need some data to back up. If you already have some data in your test database, that will be fine. If not, create some from the countries.geo.json file available in the code bundle using the following command:
$ mongoimport -c countries -d test --drop countries.geo.json
To export the test database, execute the following (assuming we want to export the data to a local directory called dump in the current directory):
$ mongodump -o dump --oplog -h localhost --port 27017
Verify that there is data in the dump directory. All the files will be .bson files, one per collection, in the respective database folder created.
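To see what the export actually produced, we can walk the dump directory. This is a small sketch; the list_dump helper is our own name and not part of the MongoDB tools, which do however ship a bsondump utility for rendering any of these files as JSON.

```shell
# List the per-collection .bson files inside a mongodump output directory.
# list_dump is our own helper name, not a MongoDB tool.
list_dump() {
  find "$1" -name '*.bson' -type f
}

# Example: list_dump dump
# To inspect a single collection file as JSON, the bundled bsondump tool
# can be used, for instance:
#   bsondump dump/test/countries.bson
```

Each database gets its own folder under the output directory, so the countries collection of the test database ends up in dump/test/countries.bson.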
To restore the exported data, execute the following from a directory that has the directory dump in it, with the required .bson files present:
$ mongorestore --drop -h localhost --port 27017 --oplogReplay dump
Just a couple of steps to export and restore the data. Let's now see what exactly these utilities do and what their command-line options are. The mongodump utility is used to export the database into .bson files, which can later be used to restore the data in the database. The export utility creates one folder per database, except for the local database, and each folder contains one .bson file per collection. In our case, we used the --oplog option to export a part of the oplog too; that data is exported to the oplog.bson file. Similarly, we import the data back into the database using the mongorestore utility. We explicitly ask for the existing data to be dropped before the import by providing the --drop option, and for the contents of the oplog, if any, to be replayed by providing the --oplogReplay option.
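When these two commands are reused in scripts or cron jobs, it helps to build them from variables. A minimal sketch; HOST, PORT, DUMP_DIR, and RESTORE_CMD are our own names, not options of the tools:

```shell
# Build the mongorestore invocation from this recipe out of variables,
# so the same script can target different hosts or dump directories.
HOST=localhost
PORT=27017
DUMP_DIR=dump

RESTORE_CMD="mongorestore --drop -h $HOST --port $PORT --oplogReplay $DUMP_DIR"
echo "$RESTORE_CMD"
# Run it with: eval "$RESTORE_CMD"
```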
The mongodump utility simply queries the collections and exports their contents to the files. The bigger the collection, the longer the dump (and any subsequent restore) will take. It is thus advisable to prevent write operations while the dump is being taken. In the case of sharded environments, the balancer should be turned off. If the dump is taken while the system is running, export with the --oplog option to export the contents of the oplog as well. This oplog can then be used to restore the data to a point in time. The following table shows some of the important options available for the mongodump and mongorestore utilities, first for mongodump:
Similarly, for the mongorestore utility, here are the options. The meaning of the options --help, -h or --host, --port, -u or --username, -p or --password, --authenticationDatabase, -d or --db, and -c or --collection is the same as for the mongodump utility.
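Since the authentication options listed above are shared by both tools, a script can build them once and reuse them. A sketch; AUTH_OPTS is our own variable, and the backupUser credentials are hypothetical:

```shell
# Shared authentication flags for mongodump and mongorestore.
# The backupUser/secret credentials here are hypothetical placeholders.
AUTH_OPTS="-u backupUser -p secret --authenticationDatabase admin"

# The same flags slot into either tool's command line:
echo "mongodump -h localhost --port 27017 $AUTH_OPTS -d test -c countries -o dump"
echo "mongorestore -h localhost --port 27017 $AUTH_OPTS --drop dump"
```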
You might think, "Why not just copy the files to take a backup?" That works well, but there are a few problems associated with it. First, you cannot get a point-in-time backup unless write operations are disabled. Second, the space used for backups is very high, as the copy would also include the zero-padded preallocated files of the database, whereas mongodump exports just the data.
Having said that, filesystem snapshotting is a commonly used practice for backups. One thing to remember is that, while taking the snapshot, the journal files and the data files need to be captured in the same snapshot for consistency.
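On Linux, such a snapshot is typically taken with LVM while writes are blocked. The following is only a sketch of that sequence, assuming the data files and journal live on a single hypothetical logical volume /dev/vg0/mongodb; the snapshot_backup name is ours, and running it requires root privileges and the mongo shell:

```shell
# Sketch of a consistent filesystem snapshot of a running mongod.
# /dev/vg0/mongodb is a hypothetical LVM volume holding BOTH the data
# files and the journal, as the recipe requires.
snapshot_backup() {
  mongo --eval "db.fsyncLock()"    # flush pending writes and block new ones
  lvcreate --size 1G --snapshot --name mdb-snap /dev/vg0/mongodb
  mongo --eval "db.fsyncUnlock()"  # resume writes once the snapshot exists
}
```

The fsyncLock/fsyncUnlock pair is what guarantees the snapshot captures the journal and data files in a consistent state.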
If you are using Amazon Web Services (AWS), it is highly recommended that you upload your database backups to AWS S3. As you may be aware, S3 offers extremely high data redundancy at a very low storage cost.
Download the script generic_mongodb_backup.sh from the Packt Publishing website and use it to automate your backup creation and upload to AWS S3.
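For a rough idea of what such automation involves, here is a hedged sketch of a dump-archive-upload pipeline, not the Packt script itself; the bucket name my-backup-bucket is hypothetical, and the aws CLI is assumed to be installed and configured:

```shell
# Sketch: dump the database, archive it with a timestamped name, and
# upload the archive to S3. The bucket name is a placeholder.
BACKUP_NAME="mongo-backup-$(date +%Y%m%d%H%M%S).tar.gz"

if command -v mongodump >/dev/null 2>&1 && command -v aws >/dev/null 2>&1; then
  mongodump -o dump --oplog &&
    tar -czf "$BACKUP_NAME" dump &&
    aws s3 cp "$BACKUP_NAME" "s3://my-backup-bucket/$BACKUP_NAME" ||
    echo "backup pipeline failed (is mongod running?)"
else
  echo "mongodump or the aws CLI is not on the PATH; skipping"
fi
```

Running such a script from cron gives regularly timestamped archives in S3, with S3's redundancy standing in for off-site storage.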