Storing data to GridFS from a Python client

In the Storing large data in MongoDB using GridFS recipe, we saw what GridFS is and how it can be used to store large files in MongoDB. In the previous recipe, we saw how to use the GridFS API from a Java client. In this recipe, we will see how to store image data in MongoDB using GridFS from a Python program.

Getting ready

Refer to the Connecting to a single node from a Java client recipe from Chapter 1, Installing and Starting the MongoDB Server, for all the necessary setup for this recipe. If you are interested in more details on Python drivers, refer to the following recipes in Chapter 3, Programming Language Drivers:

  • Installing PyMongo
  • Executing query and insert operations using PyMongo
  • Executing update and delete operations using PyMongo

Download and save the glimpse_of_universe-wide.jpg image file from the downloadable code bundle, available on the book's website, to the local filesystem, as we did in the previous recipe.

How to do it…

  1. Open a Python interpreter by typing in the following command in the operating system shell (note that the current directory is the same as the directory where the image file glimpse_of_universe-wide.jpg is placed):
    $ python
    
  2. Import the required packages as follows:
    >>> import pymongo
    >>> import gridfs
    
  3. Create a MongoClient and get a database object for the test database as follows:
    >>> client = pymongo.MongoClient('mongodb://localhost:27017')
    >>> db = client.test
    
  4. To start clean, drop the GridFS-related collections as follows. Do this only if nothing important is present in them:
    >>> db.fs.files.drop()
    >>> db.fs.chunks.drop()
    
  5. Create the instance of GridFS as follows:
    >>> fs = gridfs.GridFS(db)
    
  6. Now, we will read the file and upload its contents to GridFS. First, create the file object as follows:
    >>> file = open('glimpse_of_universe-wide.jpg', 'rb')
    
  7. Now put the file into GridFS as follows:
    >>> fs.put(file, filename='universe.jpg')
    
  8. On successful execution of the preceding put command, the shell prints the ObjectId of the uploaded file. This is the same as the _id field of this file's document in the fs.files collection.
  9. Execute the following query from the Python shell. It should print a dict object with the details of the upload. Verify its contents:
    >>> db.fs.files.find_one()
    
  10. Now, we will get the uploaded content and write it to a file on the local filesystem. Let us get the GridOut instance representing the object, to read the data out of GridFS as follows:
    >>> gout = fs.get_last_version('universe.jpg')
    
  11. With this instance available, let us write the data to a file on the local filesystem. First, open a handle to the file to write to, as follows:
    >>> fout = open('universe.jpg', 'wb')
    
  12. We will then write contents to it as follows:
    >>> fout.write(gout.read())
    >>> fout.close()
    >>> gout.close()
    
  13. Now verify the file in the current directory on the local filesystem. A new file called universe.jpg will have been created, containing the same number of bytes as the source file. Verify it by opening it in an image viewer.
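
The interactive steps above can also be collected into a small script. The following is a minimal sketch, not the recipe's own code: the helper names store_file, fetch_file, and main are hypothetical, and main assumes that PyMongo is installed and that a MongoDB server is listening on localhost:27017. The with blocks close the local file handles even if an error occurs.

```python
def store_file(fs, path, name):
    """Upload the local file at `path` to GridFS under `name`; return its _id."""
    with open(path, 'rb') as src:
        return fs.put(src, filename=name)


def fetch_file(fs, name, path):
    """Write the newest GridFS file called `name` to `path`; return the byte count."""
    gout = fs.get_last_version(name)
    try:
        data = gout.read()
    finally:
        gout.close()
    with open(path, 'wb') as dest:
        dest.write(data)
    return len(data)


def main():
    # Imported here so that the helpers above stay usable without the driver.
    import pymongo
    import gridfs

    # Assumption: a local server on the default port, as in the recipe.
    client = pymongo.MongoClient('mongodb://localhost:27017')
    fs = gridfs.GridFS(client.test)
    file_id = store_file(fs, 'glimpse_of_universe-wide.jpg', 'universe.jpg')
    size = fetch_file(fs, 'universe.jpg', 'universe.jpg')
    print(file_id, size)
```

Calling main() from the directory that holds the image should print the new ObjectId and the number of bytes written back to universe.jpg.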

How it works…

Let us look in detail at the steps we executed. In the Python shell, we import two packages, pymongo and gridfs, and create instances of pymongo.MongoClient and gridfs.GridFS. The constructor of the gridfs.GridFS class takes one required argument, which is an instance of pymongo.database.Database.

We open a file in binary mode using the open function and pass the file object to the GridFS's put method. There is an additional argument passed, called filename, which is the name of the file put into GridFS. The first parameter, in fact, need not be a file object, but any object with a read method defined.
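
To illustrate that put needs only an object with a read method, the following sketch wraps an in-memory payload in io.BytesIO instead of opening a file on disk. The helper name put_from_memory is hypothetical, and fs is assumed to be a gridfs.GridFS instance created as in the recipe.

```python
import io


def put_from_memory(fs, payload, name):
    """Store a bytes payload in GridFS through a file-like wrapper."""
    buffer = io.BytesIO(payload)  # exposes read(), just like an open file
    return fs.put(buffer, filename=name)
```

With the fs object from the recipe, put_from_memory(fs, b'hello', 'greeting.txt') would return the _id of the new file, exactly as put does for a file opened from disk.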

Once the put operation succeeds, the return value is the ObjectId of the uploaded file's document in the fs.files collection. A query on fs.files can confirm that the file is uploaded. Verify that the size of the data uploaded, stored in the length field of that document, matches the size of the file.
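
The size check can be sketched as follows. The helper name uploaded_size_matches is hypothetical; it assumes db is the database object from the recipe, file_id is the value returned by put, and GridFS uses the default fs prefix, so that the file's document lives in fs.files with its size in the length field.

```python
import os


def uploaded_size_matches(db, file_id, path):
    """Return True if the length stored in fs.files equals the local file size."""
    doc = db.fs.files.find_one({'_id': file_id})
    if doc is None:
        raise LookupError('no fs.files document with _id %r' % (file_id,))
    return doc['length'] == os.path.getsize(path)
```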

Our next objective is to get the file from GridFS onto the local filesystem. Intuitively, one would imagine that if the method to put a file in GridFS is put, then the method to get the file would be get. The method is indeed get. However, it retrieves a file only by the ObjectId that was returned by the put method. So if you are fine with retrieving by ObjectId, the method for you is get. If you want to retrieve by filename, the method to use is get_last_version, which accepts the name of the file that we uploaded and returns an instance of gridfs.grid_file.GridOut. This class has a read method, which reads out all the bytes of the file stored in GridFS. We open a file called universe.jpg for writing in binary mode and write to it all the bytes read from the GridOut object.
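
The two retrieval paths can be placed side by side in a short sketch. The helper names read_by_id and read_latest_by_name are hypothetical, and fs is assumed to be a gridfs.GridFS instance. GridFS allows several files to share a filename, in which case get_last_version returns the most recently uploaded one, whereas get always identifies exactly one file.

```python
def read_by_id(fs, file_id):
    """Read a file's bytes using the _id that put returned."""
    gout = fs.get(file_id)
    try:
        return gout.read()
    finally:
        gout.close()


def read_latest_by_name(fs, name):
    """Read the bytes of the most recently uploaded file called `name`."""
    gout = fs.get_last_version(name)
    try:
        return gout.read()
    finally:
        gout.close()
```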

See also

  • The Storing binary data in MongoDB recipe
  • The Storing data to GridFS from a Java client recipe