Azure Cosmos DB

Azure Cosmos DB is Microsoft's globally distributed, multi-model database. Azure Cosmos DB enables you to elastically and independently scale throughput and storage across any number of Azure's geographic regions. Azure Cosmos DB guarantees single-digit-millisecond latencies at the 99th percentile anywhere in the world, offers multiple well-defined consistency models to fine-tune performance, and guarantees high availability with multi-homing capabilities (more details are available here: https://azure.microsoft.com/en-us/services/cosmos-db/).

Azure Cosmos DB was launched in May 2017, but it is not really a brand-new offering. It inherited much of its functionality from its predecessor, Azure DocumentDB, which launched a couple of years earlier and focused more on the NoSQL architectural pattern. However, Azure Cosmos DB is not just a reincarnation of Azure DocumentDB under a new name; it introduced several capabilities that were not there before. The three important differentiators that Cosmos DB brings are:

  • Globally distributed deployment model: Azure has multiple regions, and if you want to deploy an application that spans many of those regions with a shared database, Cosmos DB allows that with a global deployment model. You can select the regions where your database instance is replicated either when you create the database or later, while the database is active. You can also define whether each region serves reads, writes, or both, as per your requirements. In addition, you can define a failover priority for each region to handle large-scale regional events and issues.

For some use cases, you may want to restrict the data to a specific location or region (for example, for data sovereignty, privacy, or in-country data residency regulations). To meet those needs, you can use policies that are controlled through the metadata of your Azure subscription.
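The region roles and failover priorities described above can be pictured with a small model. This is only an illustrative sketch, not the Azure SDK or portal API: the `Region` class, the `pick_write_region` helper, and the region names are hypothetical, and the sketch assumes that a lower priority number means a more preferred region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical description of one replicated region."""
    name: str
    failover_priority: int  # assumption: 0 is most preferred, larger values fail over later


def pick_write_region(regions, unavailable):
    """Return the healthy region with the lowest failover priority number."""
    healthy = [r for r in regions if r.name not in unavailable]
    if not healthy:
        raise RuntimeError("no healthy region available")
    return min(healthy, key=lambda r: r.failover_priority)


# Hypothetical three-region deployment.
regions = [
    Region("West US", 0),
    Region("North Europe", 1),
    Region("Southeast Asia", 2),
]

# Normal operation: the priority-0 region takes writes.
primary = pick_write_region(regions, unavailable=set())

# Regional outage: writes move to the next region in priority order.
fallback = pick_write_region(regions, unavailable={"West US"})
```

In this model, an outage in the preferred region simply promotes the next healthy region in the priority list, which is the behavior the failover priorities are meant to express.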

  • Multi-model APIs: This is one of the biggest differences from the earlier Azure DocumentDB, as Cosmos DB now supports several data models:
    • Graph database - This is the model in which data has multiple vertices and edges. Each vertex defines a unique object in the data (such as a person or device), and it can be connected to any number of other vertices using edges, which define the relationships between them. To query Azure Cosmos DB, you can use the Apache TinkerPop graph traversal language, Gremlin, or other TinkerPop-compatible graph systems, such as Apache Spark GraphX.
    • Table - This is based on Azure Table storage, which is a NoSQL key-value approach to storing massive amounts of semi-structured data. Cosmos DB adds global distribution capabilities on top of the Azure Table storage APIs.
    • JSON documents - Cosmos DB also provides the option to store JSON documents in your database, which can either be accessed using existing DocumentDB APIs or newer MongoDB APIs.
    • SQL - Cosmos DB also supports some basic SQL functions using the existing DocumentDB API. However, this is not as full-fledged as Azure SQL Database, so if advanced SQL operations are required, Cosmos DB might not be a fit.
  • Consistency models: Other cloud databases normally provide limited options for how data is replicated across different partitioned nodes and geographies. The most common options in distributed computing are as follows:
    • Strong consistency, where the response is returned only after the commit has succeeded across the replicas, so a read after a write is guaranteed to return the latest value.
    • Eventual consistency, where the response is returned immediately after a commit and the nodes synchronize the data later, so a read immediately after a write may return a stale value.

However, Cosmos DB goes further. It introduces three additional consistency models (bounded staleness, session, and consistent prefix) that lie between strong and eventual consistency.
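The difference between the two ends of this spectrum, strong and eventual consistency, can be seen in a toy simulation. This is a minimal sketch under stated assumptions, not how Cosmos DB is implemented: the `ReplicatedRegister` class and its methods are hypothetical, replication is modeled as an explicit `sync()` call, and the intermediate models (bounded staleness, session, and consistent prefix) are not modeled.

```python
class ReplicatedRegister:
    """Toy single-value store with one primary and several secondary replicas."""

    def __init__(self, replica_count=3):
        self.replicas = [None] * replica_count  # index 0 is the primary
        self.pending = []  # writes acknowledged but not yet copied to secondaries

    def write_strong(self, value):
        # Strong: acknowledge only after every replica holds the new value.
        for i in range(len(self.replicas)):
            self.replicas[i] = value
        self.pending.clear()  # older queued writes are now superseded

    def write_eventual(self, value):
        # Eventual: acknowledge once the primary has the value; replicate later.
        self.replicas[0] = value
        self.pending.append(value)

    def sync(self):
        # Stand-in for background replication: apply queued writes in order.
        for value in self.pending:
            for i in range(1, len(self.replicas)):
                self.replicas[i] = value
        self.pending.clear()

    def read(self, replica_index):
        return self.replicas[replica_index]


store = ReplicatedRegister()

store.write_strong("v1")
after_strong = store.read(2)      # a secondary already has "v1"

store.write_eventual("v2")
before_sync = store.read(2)       # the secondary still returns the stale "v1"

store.sync()
after_sync = store.read(2)        # the secondary has caught up to "v2"
```

The stale read between `write_eventual` and `sync` is the behavior that the intermediate consistency models constrain in different ways.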

The following is a screenshot from the Azure Cosmos DB portal, which shows these consistency model settings:

As a result of all of these differentiating capabilities, Azure Cosmos DB is a great choice for multiple cloud-native scenarios, such as shared databases for globally distributed application architectures, telemetry and state information stores for connected platforms/IoT, and backend persistence for serverless applications.
