Cosmos DB Cheat Sheet

This page is an ongoing effort to maintain a list of questions and answers, along with useful links, for Azure Cosmos DB, Microsoft’s globally distributed NoSQL database in the cloud. Please check back from time to time, and send me any questions you would like me to include and answer.

Useful Links

Below is a list of useful resources to get you started with Cosmos DB.

Emulators

Query Playground is an easy way to play around with Cosmos SQL API queries in the cloud without having to provision your own resources. It is free to use.
Cosmos DB emulator is a downloadable executable that emulates the Cosmos DB engine enabling you to develop against it for zero cost.

Getting started

Cosmos DB cheat sheets provide a useful set of Cosmos API query cheat sheets to download as PDF.
Getting Started documentation provides a very nice instructional overview and code examples using the Azure Cosmos DB .NET SDK – it also points to the getting started repo.
Cosmos DB Team Blog is an excellent place to stop by for tips and tricks from the people that develop the product!

Training

Learning Azure Cosmos DB Pluralsight course is an excellent offering from Lenni Lobel and requires a Pluralsight subscription or trial subscription.
Developing Planet-Scale Applications in Azure Cosmos DB is an edX course provided in association with Microsoft. The material is slightly out of date in areas but has some good content and is free to take unless you want a certificate of completion.
Work with NoSQL data in Azure Cosmos DB is a Microsoft Learn offering and an excellent guided learning experience with Cosmos DB.

Costing

Cosmos DB Capacity Calculator is a nice page to estimate expected costs and utilization of your Cosmos DB environment.

Troubleshooting

HTTP Status Codes for Azure Cosmos DB provides a simple list of error codes to help troubleshoot your application connectivity.
Diagnose and troubleshoot issues when using Azure Cosmos DB .NET SDK offers introductory (101-level) advice for troubleshooting common problems when developing against Cosmos DB.

Availability and Scale

SLA for Azure Cosmos DB is a useful Microsoft-maintained page detailing all of the explicit service level agreements for Cosmos DB.


Q&A

Q. Does Cosmos DB impose a schema?

A. Cosmos DB is a true NoSQL solution and as such does not impose any schema on the data. However, as with all other NoSQL solutions, a schema is implicitly defined by your application's data requirements. For example, if you query a container for a set of documents returning c.name, c.age, c.height, then you have implied a schema (through expectation) on the data you are reading. The Cosmos engine itself does not impose any such constraints.
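
To illustrate the idea of an implied schema, here is a minimal Python sketch. It does not talk to Cosmos DB at all: the `container` list, the `project` helper and the `missing_fields` helper are hypothetical stand-ins invented for this example. The projection is meant to mimic, as I understand it, how a query such as SELECT c.name, c.age, c.height simply omits properties that a document does not have.

```python
# Hypothetical in-memory stand-in for a container: plain dicts, no schema enforced.
container = [
    {"id": "1", "name": "Ada", "age": 36, "height": 170},
    {"id": "2", "name": "Grace", "age": 45},        # no "height" property
    {"id": "3", "title": "not a person at all"},    # a completely different shape
]

# The schema that the *application* implies by what it asks for.
EXPECTED_FIELDS = ("name", "age", "height")


def project(doc, fields=EXPECTED_FIELDS):
    """Return only the requested properties; absent ones are left out."""
    return {f: doc[f] for f in fields if f in doc}


def missing_fields(doc, fields=EXPECTED_FIELDS):
    """List the expected properties that this document does not carry."""
    return [f for f in fields if f not in doc]


results = [project(d) for d in container]

for doc, row in zip(container, results):
    print(doc["id"], row, "missing:", missing_fields(doc))
```

All three documents are stored happily side by side; it is only the application's expectation of name, age and height that turns the second and third documents into a problem to handle.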

Q. How do I back up and restore a Cosmos DB database?

A. Currently, backups are taken automatically every 4 hours, but only the last 2 backups are retained for active accounts. When an account is deleted, those backups are retained for 30 days. Improvements to the backup and restore capabilities are expected. Microsoft states that “When data corruption occurs and if the documents within a container are modified or deleted, delete the container as soon as possible. By deleting the container, you can avoid Azure Cosmos DB from overwriting the backups.” Ultimately, you need to file a support ticket or call Azure support to restore the data from the automatic online backups (this is only possible if you are on a non-basic Azure plan). For more information see Restore data from a backup in Azure Cosmos DB.

Q. What are Request Units or RUs?

A. A single 1 KB read of a single item equals 1 Request Unit (RU). Similarly, a single 1 KB write of a single item equals 5 RUs. You can estimate capacity usage through the Cosmos DB Capacity Calculator, and you can even upload a sample document. RU requirements are affected by item size, number of reads per second, number of writes per second, indexing policy, consistency model, and number of queries per second. Provisioned RUs are paid for hourly, which means that if you scale up or down you are still charged for that full hour!
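
As a rough back-of-the-envelope illustration of the figures above, the following Python sketch estimates RU/s from a read and write rate. The function name and the constants are made up for this example, and the assumption that cost scales linearly with item size (rounded up to whole KB) is a simplification of mine. Indexing policy, consistency model and query load are ignored entirely, so treat the Capacity Calculator as the real source of truth.

```python
import math

# Figures taken from the text above: 1 RU per 1 KB read, 5 RUs per 1 KB write.
READ_RU_PER_KB = 1
WRITE_RU_PER_KB = 5


def estimate_ru_per_second(item_size_kb, reads_per_sec, writes_per_sec):
    """Very rough RU/s estimate.

    Assumption (not from the source): cost grows linearly with item size,
    rounded up to the next whole KB.
    """
    size_units = max(1, math.ceil(item_size_kb))
    read_rus = reads_per_sec * size_units * READ_RU_PER_KB
    write_rus = writes_per_sec * size_units * WRITE_RU_PER_KB
    return read_rus + write_rus


# 500 reads/s and 100 writes/s of 1 KB items: 500 * 1 + 100 * 5
estimate = estimate_ru_per_second(item_size_kb=1, reads_per_sec=500, writes_per_sec=100)
print(estimate)  # 1000
```

Note how the writes dominate the estimate quickly: in this sketch 100 writes per second cost as much as 500 reads per second.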

Q. What are the Cosmos APIs?

A. Cosmos DB provides multiple APIs which are programming models that you can use to develop against the underlying storage engine using a familiar language and environment of choice. Currently, the models that are supported include:

  • etcd API: etcd (according to Rancher) is a distributed and reliable key-value store, first developed by CoreOS and now open source and managed by the Cloud Native Computing Foundation, that serves as a reliable way of storing data across a cluster of servers. In the context of Cosmos DB, the etcd API allows you to use Cosmos DB as the backend store for AKS. This lets developers scale Kubernetes state management on a fully managed, cloud native PaaS service. It is currently in private preview and (unlike the other APIs) cannot be deployed through the portal, CLI, or SDKs. It can only be provisioned via ARM templates. See Introduction to the Azure Cosmos DB etcd API.
  • SQL API is Microsoft’s Document API providing JSON document storage and was formerly known as Azure DocumentDB. It provides a SQL-like query syntax and a JavaScript programming model.
  • Table API is Microsoft’s key/value store and a future replacement for Azure Table Storage. Since the Cosmos API provides improved scalability benefits, Azure Table Storage is deprecated and accounts will be migrated across by Microsoft over time.
  • Mongo DB API – TBC
  • Cassandra API – TBC
  • Graph API – TBC

Q. What Programming Languages can I develop against the Cosmos APIs?
A. TBC

Q. What are Cosmos DB Multi-masters?
A. Cosmos DB provides the ability to geo-replicate your database at the click of a button. Multi-master replicas provide write scale, since every replica is writeable, and this can reduce the latency of writing to Cosmos DB from the respective geographic region. Without multi-master, writes would otherwise need to be re-routed to the master region.

Q. How is Cosmos DB data physically stored in each API?
A. Regardless of the API chosen, Cosmos DB stores all data in ARS (Atom Record Sequence) format. This presents some interesting possibilities for supporting multiple Cosmos APIs and interoperability scenarios. For the time being, your chosen Cosmos DB API dictates the way in which you query the underlying ARS data.