One of the nice and interesting features in Mongo is automatically expiring data in the collection after a predetermined amount of time. This is a very useful tool when we want to purge data older than a particular timeframe. For a relational database, it is not uncommon for folks to set up a batch job that runs every night to perform this operation.
With the Time To Live (TTL) feature of Mongo, we need not worry about this as the database takes care of it out-of-the-box. Let's see how we can achieve this.
Let's create some data in Mongo that we want to play with using the TTL indexes. We will create a collection called ttlTest for this purpose. We will require a server to be up and running. Refer to the Single node installation of MongoDB recipe in Chapter 1, Installing and Starting the MongoDB Server, to learn how to start the server. Also, start the shell with the TTLData.js script loaded. This script is available on the book's website for download. To know how to start the shell with a script preloaded, refer to the Connecting to a single node from the Mongo shell with a preloaded JavaScript recipe in Chapter 1, Installing and Starting the MongoDB Server.
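With the server started and the script loaded in the shell, invoke the following method to create the test data:
> addTTLTestData()
Create a TTL index on the createDate field:
> db.ttlTest.ensureIndex({createDate:1}, {expireAfterSeconds:300})
Now query the collection:
> db.ttlTest.find()
Execute the find query repeatedly, approximately every 30 to 40 seconds, to see the three documents getting deleted until the entire collection has zero documents left in it.
Let's start by opening the TTLData.js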
file and see what is going on in it. The code is pretty simple: it gets the current date using new Date(), and then creates three documents whose createDate values are 4, 3, and 2 minutes behind the current time. So, on executing the addTTLTestData() method in this script, we will have three documents in the ttlTest collection, with creation times 1 minute apart from each other.
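The script itself is not reproduced here, so the following is only a minimal sketch of what addTTLTestData() might look like, based on the description above. The helper buildTTLTestDocs is a hypothetical name introduced for illustration, and the real TTLData.js may well be written differently.

```javascript
// Hypothetical reconstruction; the actual TTLData.js may differ.
var MINUTE_MS = 60 * 1000;

// Build three documents whose createDate lies 4, 3, and 2 minutes
// before the supplied reference time.
function buildTTLTestDocs(now) {
  return [4, 3, 2].map(function (minutesAgo) {
    return { createDate: new Date(now.getTime() - minutesAgo * MINUTE_MS) };
  });
}

// Intended to be called from the mongo shell, where db is the
// current database handle. It is only defined here, not invoked.
function addTTLTestData() {
  buildTTLTestDocs(new Date()).forEach(function (doc) {
    db.ttlTest.insert(doc);
  });
}
```

Keeping the document construction separate from the insert makes it easy to check the timestamps without a running server.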
The next step is the core of the TTL feature: the creation of the TTL index. It is similar to the creation of any other index using the ensureIndex method, except that it also accepts a second parameter, a JSON object. Let's see what these two parameters are:
The first parameter is {createDate:1}; this tells Mongo to create an index on the createDate field, and the order of the index is ascending as the value is 1 (-1 would have been descending).
The second parameter, {expireAfterSeconds:300}, is what makes this index a TTL index; it tells Mongo to automatically expire the documents after 300 seconds (5 minutes).
OK, but 5 minutes since when? Since the time they were inserted in the collection, or is it some other timestamp? In this case, it considers the createDate field as the base, as this was the field on which we created the index.
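To make the arithmetic concrete, here is a small sketch in plain JavaScript. It does not talk to the server; it only models the rule that a document becomes eligible for deletion once createDate plus expireAfterSeconds has passed. The function names expiryTime and isExpired are assumptions made for illustration. Keep in mind that the server removes expired documents with a background task that runs periodically (every 60 seconds by default), so the actual deletion can lag the computed time a little.

```javascript
// Illustrative model of the TTL rule; not the server's implementation.
function expiryTime(createDate, expireAfterSeconds) {
  return new Date(createDate.getTime() + expireAfterSeconds * 1000);
}

function isExpired(createDate, expireAfterSeconds, now) {
  return now.getTime() >= expiryTime(createDate, expireAfterSeconds).getTime();
}

var now = new Date();
var fourMinutesAgo = new Date(now.getTime() - 4 * 60 * 1000);

// A document created 4 minutes ago with a 300-second TTL
// still has 60000 ms (1 minute) left before it is eligible.
var remainingMs = expiryTime(fourMinutesAgo, 300).getTime() - now.getTime();
```

This is why the oldest of the three test documents disappears first, roughly a minute after the index is created, with the other two following at about 1-minute intervals.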
This now raises a question: if a field is being used as the base for the computation of time, there has to be some restriction on its type. It just doesn't make sense to create a TTL index, as we created earlier, on a string field that holds, say, the name of a person.
As we guessed, the field must hold a BSON date or an array of dates. What will happen in the case where an array has multiple dates? Which one will be considered?
It turns out that Mongo uses the earliest (minimum) of the dates available in the array. Try out this scenario as an exercise.
Put two dates separated by about 5 minutes from each other in a document, in an array field named updateField, and then create a TTL index on this field, as you did earlier, to expire the document after 10 minutes (600 seconds). Query the collection and see when the document gets deleted from the collection. It should get deleted after roughly 10 minutes have elapsed since the minimum time value present in the updateField array.
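The rule for arrays can be sketched in plain JavaScript as follows. The function name ttlBaseDate is hypothetical, and the code only models the behavior described above under the assumption that non-date values are ignored; it is not how the server is implemented.

```javascript
// Returns the date the TTL computation starts from, or null if the
// value holds no date at all (such a document would never expire).
function ttlBaseDate(value) {
  var candidates = Array.isArray(value) ? value : [value];
  var dates = candidates.filter(function (v) {
    return v instanceof Date;
  });
  if (dates.length === 0) {
    return null;
  }
  // Pick the earliest date in the array.
  return dates.reduce(function (min, d) {
    return d.getTime() < min.getTime() ? d : min;
  });
}

var t0 = new Date(2024, 0, 1, 12, 0, 0);
var t1 = new Date(t0.getTime() + 5 * 60 * 1000);       // 5 minutes after t0
var base = ttlBaseDate([t1, t0]);                      // the earlier date, t0
var expiresAt = new Date(base.getTime() + 600 * 1000); // 10 minutes after t0
```

For the exercise, this means the document should become eligible for deletion 10 minutes after the earlier of the two dates, not the later one.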
Apart from the constraint on the type of the field, there are a few more constraints.
Since the _id field of the collection already has an index by default, it effectively means you cannot create a TTL index on the _id field.