Advantages and drawbacks

Given the focus on high availability and concurrent users behind most NoSQL options, it should come as no great surprise that they are better suited than their RDBMS counterparts for applications where availability and the ability to scale are important. Those properties are even more important in big data applications and in applications that live in the cloud, as evidenced by the fact that the major cloud providers all have their own offerings in that space, as well as providing starting points for some well-known NoSQL options:

  • Amazon (AWS):
    • DynamoDB
  • Google:
    • Bigtable (for big data needs)
    • Datastore
  • Microsoft (Azure):
    • Cosmos DB (formerly DocumentDB)
    • Azure Table Storage

The ability to define data structures more or less arbitrarily can also be a significant advantage during development, since it eliminates the need to define database schemas and tables up front. The potential trade-off is that, since data structures can change just as arbitrarily, code that uses them has to be written to tolerate those changes, or a deliberate effort has to be planned to apply the changes to existing data items without disrupting systems and their usage.
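
Both approaches can be sketched in a few lines. This is a minimal illustration only: plain dictionaries stand in for stored documents, and the field names and defaults (email, is_active) are hypothetical, not taken from any particular system.

```python
from typing import Any

# Hypothetical defaults for fields added after some documents were already stored.
FIELD_DEFAULTS: dict[str, Any] = {"email": None, "is_active": True}


def read_tolerantly(document: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the document with missing fields filled from the defaults."""
    # Values already present in the document take precedence over the defaults.
    return {**FIELD_DEFAULTS, **document}


def backfill(documents: list[dict[str, Any]]) -> int:
    """Add missing fields to stored documents in place; return how many changed."""
    changed = 0
    for document in documents:
        missing = FIELD_DEFAULTS.keys() - document.keys()
        for name in missing:
            document[name] = FIELD_DEFAULTS[name]
        if missing:
            changed += 1
    return changed
```

read_tolerantly represents the "tolerant code" route, while backfill represents the planned, one-time update of existing items. A real system might need only one of the two.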

Consider, as an example, the User class mentioned earlier. If a password_hash property needs to be added to the class in order to provide authentication/authorization support, the instantiation code will likely have to account for it, and existing user-object records won't have the field. On the code side, that may not be a big deal: making password_hash an optional argument during initialization would allow the objects to be created, and storing it as a null value when it hasn't been set would take care of the data storage side. Some sort of mechanism would still need to be planned, designed, and implemented to prompt users to supply a password so that a real value can be stored. The same sort of process would have to occur if a similar change were made in an RDBMS-backed system, but the odds are good that there would be established processes for making changes to database schemas, and those would probably include both altering the schema and ensuring that all records have a known starting value.
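
A rough sketch of the code side of that change might look like the following. The User class here is a simplified stand-in, not the class defined earlier: the name attribute, the from_record and to_record methods, and the needs_password property are all assumptions made for illustration.

```python
from typing import Any, Optional


class User:
    """Simplified stand-in for the User class; only the fields needed here."""

    def __init__(self, name: str, password_hash: Optional[str] = None):
        self.name = name
        # None means that no password has been set yet.
        self.password_hash = password_hash

    @property
    def needs_password(self) -> bool:
        """True when the user should be prompted to supply a password."""
        return self.password_hash is None

    @classmethod
    def from_record(cls, record: dict[str, Any]) -> "User":
        # Records stored before the change have no password_hash key,
        # so .get() is used to fall back to None instead of raising KeyError.
        return cls(record["name"], record.get("password_hash"))

    def to_record(self) -> dict[str, Any]:
        # An unset password_hash is stored as a null value.
        return {"name": self.name, "password_hash": self.password_hash}
```

With this shape, a record written before the change still loads, and needs_password gives the rest of the system a hook for deciding when to prompt the user.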

Given the number of options available, it should also not be surprising that there are differences (sometimes significant ones) between them in how similar tasks are performed. For example, retrieving a record given nothing more than a unique identifier for the item (id_value) uses different libraries and syntax/structure depending on the engine behind the data store:

  • In MongoDB (using a collection object obtained through the connection):
    • collection.find_one({'unique_id': id_value})
  • In Redis (using a redis connection):
    • connection.get(id_value)
  • In Cassandra (using a query value and a criteria list, executing against a Cassandra session object):
    • session.execute(query, criteria)

It's quite likely that each engine will have its own distinct methods for performing the same tasks, though some common names may emerge; there are only so many function or method names, like get or find, that make sense, after all. If a system needs to work with several different data store backend engines, those tasks are good candidates for a common (probably abstract) data store adapter.
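
One possible shape for such an adapter is sketched below. All of the names (DataStoreAdapter, InMemoryAdapter, load_name) are hypothetical, and the in-memory implementation is only a stand-in. An adapter for MongoDB, Redis, or Cassandra would wrap that engine's own client calls behind the same two methods.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional

Record = dict[str, Any]


class DataStoreAdapter(ABC):
    """Hypothetical common interface; each engine gets its own subclass."""

    @abstractmethod
    def get(self, id_value: str) -> Optional[Record]:
        """Return the record stored under id_value, or None if absent."""

    @abstractmethod
    def put(self, id_value: str, record: Record) -> None:
        """Store record under id_value, replacing any existing one."""


class InMemoryAdapter(DataStoreAdapter):
    """Dictionary-backed stand-in for a real engine, useful in tests."""

    def __init__(self) -> None:
        self._items: dict[str, Record] = {}

    def get(self, id_value: str) -> Optional[Record]:
        record = self._items.get(id_value)
        # Return a copy so callers cannot mutate stored data by accident.
        return dict(record) if record is not None else None

    def put(self, id_value: str, record: Record) -> None:
        self._items[id_value] = dict(record)


def load_name(store: DataStoreAdapter, id_value: str) -> Optional[str]:
    """Calling code depends only on the adapter, not on a specific engine."""
    record = store.get(id_value)
    return record.get("name") if record else None
```

Because load_name only knows about DataStoreAdapter, swapping engines means writing another subclass rather than touching the calling code.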

Relational and transactional support also varies from one engine to another, and that inconsistency can be a drawback of a NoSQL-based data store as well, though there are at least some options that can be pursued if that support is lacking.
