Making Bulk Updates in Azure Cosmos DB

Here is a quick guide on how to perform bulk updates in Azure Cosmos DB using the .NET SDK.

The version 3 Cosmos DB .NET SDK (Microsoft.Azure.Cosmos) provides built-in support for bulk operations. With bulk mode enabled, the SDK groups concurrent point operations into fewer service calls, which can significantly improve throughput when you need to update many items. Below is an example of how to perform bulk updates using the SDK.

using Microsoft.Azure.Cosmos;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Linq;

var cosmosClient = new CosmosClient("your-cosmos-db-connection-string", new CosmosClientOptions
{
    AllowBulkExecution = true
});
var database = cosmosClient.GetDatabase("your-database-id");
var container = database.GetContainer("your-container-id");

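var tasks = new List<Task>(1000); // Initial capacity only; the list grows if there are more items
var cost = 0D;
var costLock = new object(); // Continuations run on thread-pool threads, so guard the shared total
foreach (var item in itemsToUpdate) // itemsToUpdate is a collection of items you want to update
{
    // ReplaceItemAsync overwrites an existing item (item.Id is assumed to be a string);
    // use UpsertItemAsync instead if the item may not exist yet
    var task = container.ReplaceItemAsync(item, item.Id, new PartitionKey(item.PartitionKeyValue));
    // Optionally, you can add continuation tasks to handle success or failure
    tasks.Add(task.ContinueWith(t =>
    {
        if (t.Status == TaskStatus.RanToCompletion)
        {
            lock (costLock)
            {
                cost += t.Result.RequestCharge;
            }
            Console.WriteLine($"Item with id {item.Id} updated successfully.");
        }
        else
        {
            // Unwrap the AggregateException to report the underlying error
            Console.WriteLine($"Failed to update item with id {item.Id}: {t.Exception?.GetBaseException().Message}");
        }
    }));
}
await Task.WhenAll(tasks); // Wait for every operation and its continuation to finish
Console.WriteLine($"Total Request Charge for bulk update: {cost:0.##} RUs");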

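In this example, we build a list of tasks to perform asynchronous updates on items in a Cosmos DB container. We iterate over the collection of items to update, create a task for each operation, and add it to the list. Each operation starts as soon as its task is created; Task.WhenAll then waits for all of them to complete. Because AllowBulkExecution is set, the SDK groups the in-flight operations into batches behind the scenes.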

Each task is configured with a continuation callback that checks whether the operation succeeded or failed and logs the result accordingly.
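
The continuation callbacks can also be written with async/await, which keeps the error handling in an ordinary try/catch and avoids sharing a mutable total across threads. The following is a minimal sketch rather than a pattern prescribed by the SDK: BulkRunner, RunAllAsync, OperationResult, and the operation delegate are hypothetical names, and the delegate stands in for the container call from the example above, returning its request charge. It assumes C# 9 or later for the record type.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BulkRunner
{
    // Outcome of one operation: the RUs it consumed, or the error that stopped it.
    public record OperationResult(string Id, double RequestCharge, Exception? Error)
    {
        public bool Succeeded => Error is null;
    }

    // Starts one operation per item, waits for all of them, and returns per-item results.
    // 'operation' is a hypothetical delegate wrapping the real SDK call.
    public static async Task<IReadOnlyList<OperationResult>> RunAllAsync<T>(
        IEnumerable<T> items,
        Func<T, string> getId,
        Func<T, Task<double>> operation)
    {
        async Task<OperationResult> RunOneAsync(T item)
        {
            try
            {
                double charge = await operation(item);
                return new OperationResult(getId(item), charge, null);
            }
            catch (Exception ex)
            {
                // A failure is recorded instead of thrown, so one bad item
                // does not hide the results of the others.
                return new OperationResult(getId(item), 0, ex);
            }
        }

        // ToList() forces the query, so every operation is started before the await.
        List<Task<OperationResult>> tasks = items.Select(item => RunOneAsync(item)).ToList();
        return await Task.WhenAll(tasks);
    }
}
```

Because each result carries its own charge, the total can be computed after the await with results.Where(r => r.Succeeded).Sum(r => r.RequestCharge), with no locking required.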

Benefits of this approach:

  • Efficiency: The SDK groups concurrent operations into batches, so fewer network round-trips are needed, which can significantly improve throughput.
  • Concurrency: The SDK manages how the operations are dispatched, so you can issue many updates in parallel without coordinating them yourself.
  • Backend Requests: The SDK reduces the number of calls made to the service. Each operation still consumes its own request units (RUs), so the gain is mainly in throughput rather than in RU cost.

Important Considerations

  • Throughput: Ensure that your Cosmos DB container has sufficient throughput (RU/s) to handle the bulk operations. If bulk updates consume a high number of RUs, consider using autoscale or manually increasing the provisioned RU/s for the duration of the operation.

  • Error handling and retries: Bulk operations may encounter transient failures or rate limiting (429 Too Many Requests). The SDK retries rate-limited requests a limited number of times on its own; for operations that still fail, implement retry logic with exponential backoff so that no update is silently lost.

  • Idempotency: Ensure your update operations are idempotent or handle duplicates carefully, since retries or partial failures might cause an operation to be executed multiple times.

  • Request size limits: In bulk mode the SDK assembles the batches itself, so the limit you control directly is item size. Keep each item within Cosmos DB’s 2 MB maximum to avoid request rejections.

  • Indexing: Every update also updates the index, so the indexing policy affects the RU cost of each write. Review the policy and exclude paths you never query to reduce update costs.
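
The retry advice in the list above can be sketched as a small helper. This is an illustrative sketch under stated assumptions, not part of the Cosmos DB SDK: RetryHelper, BackoffDelay, ExecuteAsync, and the isTransient predicate are hypothetical names, the default delays are arbitrary, and Random.Shared requires .NET 6 or later.

```csharp
using System;
using System.Threading.Tasks;

public static class RetryHelper
{
    // Delay before retry number 'attempt' (0-based): baseDelay * 2^attempt, capped at maxDelay.
    public static TimeSpan BackoffDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        double ms = baseDelay.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxDelay.TotalMilliseconds));
    }

    // Runs 'operation' and retries it while 'isTransient' accepts the thrown exception.
    // After 'maxRetries' retries, or on a non-transient error, the exception propagates.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxRetries = 5,
        TimeSpan? baseDelay = null,
        TimeSpan? maxDelay = null)
    {
        TimeSpan first = baseDelay ?? TimeSpan.FromMilliseconds(200); // assumed default
        TimeSpan cap = maxDelay ?? TimeSpan.FromSeconds(10);          // assumed default

        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxRetries && isTransient(ex))
            {
                // Jitter (50-100% of the computed delay) keeps many failed
                // operations from retrying at the same instant.
                TimeSpan delay = BackoffDelay(attempt, first, cap);
                double jitter = 0.5 + Random.Shared.NextDouble() / 2;
                await Task.Delay(TimeSpan.FromMilliseconds(delay.TotalMilliseconds * jitter));
            }
        }
    }
}
```

With the Cosmos DB SDK, the predicate could test for a CosmosException whose StatusCode is HttpStatusCode.TooManyRequests, and the wrapped operation should be idempotent, since it may run more than once.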

Paired with effective throughput planning, these measures ensure that bulk updates in Cosmos DB remain efficient, reliable and cost-effective.