Life Beyond Distributed Transactions: An Apostate's Implementation - A Primer

For those working with SQL databases, working with transactions is more or less a given. The most we may need to worry about is:

  • Using the appropriate isolation level
  • Not doing too much in a single transaction to prevent excessive locks

The vast majority of applications I see can blissfully ignore the inner workings of transactions in the database. We take for granted that our operations on one or more rows are either committed or rolled back together.

Things get more complicated when we start dealing with transactions no longer confined to a single resource, or in the case of many NoSQL databases, multiple entities. In many NoSQL databases, transactions are limited to a single entity/record. And if multi-entity transactions are supported, there are a number of limitations that might make that choice undesirable.

When we have two non-transactional resources, we have a number of overall patterns for trying to coordinate these actions (covered in my Refactoring Towards Resilience series).

All of these options assume "I must have these two actions temporally coupled": the two actions have to happen at the same time.

But what if that wasn't the case? What if we moved away from trying to coordinate two actions, and had more loose coupling between our resources? What might that look like?

And that's the main scope of Pat Helland's paper, Life Beyond Distributed Transactions: An Apostate's Opinion. In this paper, Pat describes a mechanism to overcome the fundamental issue of coordinating actions between resources when our transactions only cover a single entity: messaging!

We use some sort of messaging, where messages addressed to other entities are saved inside the entity that sends them.

Since the scope of a transaction is a single entity, if we need to affect other entities, we can't do that directly. Instead, we store the intent to effect change as a message inside our entity. The transaction then covers both our business data and our communication to the outside world.
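To make the idea a little more concrete, here is a minimal sketch of an entity that carries its own outgoing messages. Everything in it is an assumption for illustration: the names EntityWithOutbox, StockDeductionRequested, LineItem, Send and MarkDispatched are hypothetical, the string product ID is a guess, and this is not the implementation built later in the series. The only point is that the status change and the message live in the same object, so a single-entity write persists both or neither.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message: the recorded intent to change another entity.
public sealed record StockDeductionRequested(Guid Id, string ProductId, int Quantity);

// Hypothetical line item; the string product ID is an assumption.
public sealed record LineItem(string ProductId, int Quantity);

// Hypothetical base class: an entity that stores its outgoing messages
// alongside its business data, so one write saves both.
public abstract class EntityWithOutbox
{
    private readonly List<object> _outbox = new();

    // Saved as part of the same entity/document as the business data.
    public IReadOnlyList<object> Outbox => _outbox;

    // Records a message; nothing is delivered at this point.
    protected void Send(object message) => _outbox.Add(message);

    // A dispatcher would call this once the message has been delivered.
    public bool MarkDispatched(object message) => _outbox.Remove(message);
}

public sealed class OrderRequest : EntityWithOutbox
{
    public string Status { get; private set; } = "New";
    public List<LineItem> Items { get; } = new();

    public void Approve()
    {
        // Approving twice must not record the stock messages twice.
        if (Status == "Approved") return;

        Status = "Approved";

        // Instead of touching stock directly, record what we want to happen.
        foreach (var item in Items)
        {
            Send(new StockDeductionRequested(Guid.NewGuid(), item.ProductId, item.Quantity));
        }
    }
}
```

Saving the OrderRequest after calling Approve() would then write the new status and the pending messages together. Something else, outside that transaction, is responsible for reading the outbox and delivering the messages later.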

This certainly isn't the only way to tackle this problem, as I could use the Saga pattern as a means to manage failures between multi-entity activities. Caitie McCaffrey has a great talk about this, but for my situation, I couldn't directly use the Saga pattern, since that implied there was some sort of logical "undo". Instead, I wanted to be able to use any of the many coordination patterns available to me, including the Saga pattern.

Before we get into implementation details, let's look at some real code in a real database that doesn't completely allow multi-entity transactions.

Real World Example

Let's suppose I have an ecommerce application, where I can view products, add items to a cart, check out, then finally, approve orders. As part of approving an order, I need to decrement stock. We'll skip any complex business rules, like negative stock, reservations, and the like. Just to keep things simple, when we order, we just subtract the quantity ordered from our stock reserve:

var orderRequest = await _orderRepository.GetItemAsync(id);

orderRequest.Approve();

// First write: the approved order is saved on its own.
await _orderRepository.UpdateItemAsync(orderRequest);

foreach (var lineItem in orderRequest.Items)
{
    // Look up the single stock record for this product.
    var stock = (await _stockRepository
        .GetItemsAsync(s => s.ProductId == lineItem.ProductId))
        .Single();

    stock.QuantityAvailable -= lineItem.Quantity;

    // One more separate write per line item; nothing ties it to the order update.
    await _stockRepository.UpdateItemAsync(stock);
}

In my case, I'm using Azure Cosmos DB, which supports a variety of consistency levels. Azure Cosmos DB also supports multi-document transactions, but only in the form of stored procedures and triggers, and even then, only with some limitations. Once we introduce other resources into the mix, like Azure Service Bus or Azure SQL Database, or really anything outside a single partition key, we can no longer use transactions.

With this in mind, we look back at our original code, and ask ourselves, "Does this action need to couple these resources, or can we decouple them?" There are certainly cases where we need to coordinate two actions (using a coordinator as we saw before), but there are many cases where we don't, and where we don't want to incur the cost of a larger-scoped transaction.

For our above case, do we need to deduct stock right at the time we approve an order? Or can it happen later? According to the business, deducting stock doesn't need to happen immediately, but it does need to happen, eventually.

In the next few posts, I'll walk through building out a mechanism for communication with other entities (and even other resources), to see how we might build out an atomic communication system with Azure Cosmos DB as our example.