Refactoring Towards Resilience: Evaluating Stripe Options

In the last post, I looked at a common problem in web applications - accepting payments via 3rd party APIs (in this case, Stripe). We saw a fundamental issue in our design - because Stripe does not participate in our transactions, any failures after a call to Stripe will result in money being lost in the ether. For reference, our original code:

public async Task<ActionResult> ProcessPayment(CartModel model) {
    var customer = await dbContext.Customers.FindAsync(model.CustomerId);
    var order = await CreateOrder(customer, model);
    var payment = await stripeService.PostPaymentAsync(order);
    await sendGridService.SendPaymentSuccessEmailAsync(order);
    await bus.Publish(new OrderCreatedEvent { Id = order.Id });
    return RedirectToAction("Success");
}

How can we address this "missing money" problem? With each interaction, we have four basic options:

  1. Ignore
  2. Retry
  3. Undo
  4. Coordinate

We need to look at each of our Stripe interactions and based on which of these options are available, decide what coordination action we need to take. Above all though, we need to make sure the end user's expectations are still met. It's all for naught if we implement a back-end process that isn't explicitly communicated to the user.

Up until this point, our code picked option #1, "Ignore", for any failures. If a later failure occurred, we ignored the result and our action remained "successful". We can do better of course, and Stripe has a number of options available to us. Let's look at these options one at a time.

Retry

First, what about "Retry"? We could retry our Stripe payment, but that would result in two payments charged to the customer! Not exactly what we want. It turns out, however, that Stripe can handle idempotent requests if we pass in a client-generated idempotency key. From the docs, idempotency keys passed to Stripe expire after 24 hours. This means we can safely retry as many times as we want within 24 hours.

For our idempotency key, we can look at something unique in the cart, perhaps the Cart's ID? In this case, our code would look slightly different in our payments:

public async Task<ActionResult> ProcessPayment(CartModel model) {
    var customer = await dbContext.Customers.FindAsync(model.CustomerId);
    var order = await CreateOrder(customer, model);
    var payment = await stripeService.PostPaymentAsync(order, model.CartId);
    await sendGridService.SendPaymentSuccessEmailAsync(order);
    await bus.Publish(new OrderCreatedEvent { Id = order.Id });
    return RedirectToAction("Success");
}
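The two-argument PostPaymentAsync overload isn't shown here, so as a rough sketch of what it might do under the covers: Stripe reads the key from an "Idempotency-Key" request header. The StripeRequests class, the BuildChargeRequest name and its parameters are hypothetical, and authentication is left out entirely. The helper only builds the request, it doesn't send it.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Net.Http;

public static class StripeRequests
{
    // Hypothetical helper: builds (but does not send) a charge request.
    // The API key / Authorization header is deliberately omitted.
    public static HttpRequestMessage BuildChargeRequest(
        long amountInCents, string currency, string stripeCustomerId, string idempotencyKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, "https://api.stripe.com/v1/charges")
        {
            Content = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                ["amount"] = amountInCents.ToString(CultureInfo.InvariantCulture),
                ["currency"] = currency,
                ["customer"] = stripeCustomerId,
            })
        };

        // Repeating the request with the same key should not create a second charge
        // while Stripe still remembers the key.
        request.Headers.Add("Idempotency-Key", idempotencyKey);
        return request;
    }
}
```

Using the cart ID as the key assumes a cart is only ever paid for once. If a cart could legitimately be charged twice, we'd need a different key.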

With this in place, we can retry our action and our payment will be posted only once. But wait - who is retrying? What's the mechanism for our payment to be retried? We have a couple of options here. First, we can bake in some sort of "retry" in our Stripe service. Try X number of times, and finally fail and throw an exception if the payment didn't work.
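A minimal sketch of that "try X number of times" idea, assuming a small helper of our own (the Retry class and ExecuteAsync name are made up for illustration, not part of any Stripe library). It retries on every exception, which is too blunt for real use: we'd likely want to retry only transient failures such as timeouts, not declined cards.

```csharp
using System;
using System.Threading.Tasks;

public static class Retry
{
    // Runs the action up to maxAttempts times, waiting `delay` between attempts.
    // The exception from the final attempt propagates to the caller.
    public static async Task<T> ExecuteAsync<T>(Func<Task<T>> action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Not the last attempt yet: swallow the failure, wait, and loop again.
                await Task.Delay(delay);
            }
        }
    }
}
```

Inside the Stripe service, the call could then be wrapped as Retry.ExecuteAsync(() => PostPaymentAsync(order, model.CartId), 3, TimeSpan.FromSeconds(1)). This is only safe because the idempotency key stays the same across attempts.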

That won't help in the case of a subsequent step failure, with SendGrid or RabbitMQ, however. If we wanted to retry the entire action, we'd likely need something on the client side to detect failures and retry the POST as necessary. In reality, we'd want to have more control over this retry process than driving it from the browser.

Undo

Undo is interesting in that it allows us to perform a compensating action in case of subsequent failures. The "undo" action is highly dependent on what our action is in the first place. In the case of a Stripe payment, what would an "undo" action be? Like all payment gateways I'm aware of, Stripe allows us to refund any transaction. We only need to refund the customer if anything goes wrong:

public async Task<ActionResult> ProcessPayment(CartModel model) {
    StripePayment payment = null;
    try {
        var customer = await dbContext.Customers.FindAsync(model.CustomerId);
        var order = await CreateOrder(customer, model);
        payment = await stripeService.PostPaymentAsync(order);
        await sendGridService.SendPaymentSuccessEmailAsync(order);
        await bus.Publish(new OrderCreatedEvent { Id = order.Id });
    } catch {
        if (payment != null) {
            await stripeService.RefundPaymentAsync(payment);
        }
        throw;
    }
    return RedirectToAction("Success");
}

This doesn't quite address the problem, though, as something could still go wrong with our transaction. We really need to handle the database transaction explicitly ourselves. We'd want to extend our solution just a bit, devising a way to tell our MVC transaction filter to not open a transaction if we're explicitly opening one in our action:

[ExplicitTransaction]
public async Task<ActionResult> ProcessPayment(CartModel model) {
    StripePayment payment = null;
    try {
        dbContext.BeginTransaction();
        var customer = await dbContext.Customers.FindAsync(model.CustomerId);
        var order = await CreateOrder(customer, model);
        payment = await stripeService.PostPaymentAsync(order);
        await sendGridService.SendPaymentSuccessEmailAsync(order);
        await bus.Publish(new OrderCreatedEvent { Id = order.Id });
        await dbContext.CommitTransactionAsync();
    } catch {
        if (payment != null) {
            await stripeService.RefundPaymentAsync(payment);
        }
        await dbContext.RollbackTransactionAsync();
        throw;
    }
    return RedirectToAction("Success");
}

Now we've got explicit control over our database transaction, and in the case of failure, we undo any Stripe transaction with a refund, and roll back our database transaction. All is tidied up! Well, unless the refund fails, but that's a different problem.
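One way to keep a failed refund from also skipping the rollback is to collect the undo steps and run each one independently. This is a sketch of the general compensating-action idea, not something from Stripe or MVC; the CompensationScope class and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class CompensationScope
{
    private readonly Stack<Func<Task>> undoActions = new Stack<Func<Task>>();

    // Register how to undo a step right after that step succeeds.
    public void OnFailure(Func<Task> undo) => undoActions.Push(undo);

    // Runs the undo actions newest-first. A failing undo does not stop the others;
    // its exception is returned so the caller can log or escalate it.
    public async Task<IReadOnlyList<Exception>> CompensateAsync()
    {
        var failures = new List<Exception>();
        while (undoActions.Count > 0)
        {
            var undo = undoActions.Pop();
            try
            {
                await undo();
            }
            catch (Exception ex)
            {
                failures.Add(ex);
            }
        }
        return failures;
    }
}
```

In our action, we'd register the rollback after beginning the transaction and the refund after the payment succeeds, then call CompensateAsync in the catch block. A refund failure would come back in the returned list for someone to follow up on, while the rollback still runs.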

Coordinate

Stripe cannot participate in a two-phase commit - or can it? A two-phase commit with a payment gateway would look something like:

  1. Authorize the card for a charge
  2. Charge the card

It turns out that credit cards do support this sort of approach; it can just look weird in your account. You might see "pending transactions", and in fact, you're probably already used to seeing this. Many hotels and rental companies authorize a charge on your card, then either charge the card for incidentals or just let the authorization expire.

Stripe has this option as well, with the auth-capture flow. Uncaptured authorizations will be reversed after 7 days, and on top of that, we can go into the Stripe admin site to see any uncaptured authorizations and capture them as necessary. We can't really do two-phase commit inside our web flow (since we don't have an actual coordinator), but we can come close. In this new flow, we'd authorize the card up front and capture the payment only after everything else succeeded:

[ExplicitTransaction]
public async Task<ActionResult> ProcessPayment(CartModel model) {
    StripePayment payment = null;
    try {
        dbContext.BeginTransaction();
        var customer = await dbContext.Customers.FindAsync(model.CustomerId);
        var order = await CreateOrder(customer, model);
        payment = await stripeService.PostAuthorizationAsync(order);
        await sendGridService.SendPaymentSuccessEmailAsync(order);
        await bus.Publish(new OrderCreatedEvent { Id = order.Id });
        await dbContext.CommitTransactionAsync();
    } catch {
        await dbContext.RollbackTransactionAsync();
        throw;
    }
    try {
        await stripeService.PostPaymentCaptureAsync(payment);
    } catch {
        Logger.Warning("Stripe capture failed; capture manually via Stripe UI");
    }
    return RedirectToAction("Success");
}

Once everything's succeeded on our side, we can capture the authorized payment against Stripe. If anything goes wrong with our "PostPaymentCaptureAsync" call, we'll just log and swallow the error, since we can always go into the Stripe UI and capture the payment ourselves.
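For a sense of what the two calls might send, here's a sketch assuming Stripe's Charges API, where a charge created with capture=false is only authorized and a later POST to the charge's capture endpoint completes it. The StripeAuthCapture class and its method names are hypothetical, authentication is omitted, and the requests are only built, not sent. Newer Stripe integrations may use a different API for the same flow.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Net.Http;

public static class StripeAuthCapture
{
    private const string BaseUrl = "https://api.stripe.com/v1";

    // Phase one: place a hold on the card without taking the money.
    public static HttpRequestMessage BuildAuthorizationRequest(
        long amountInCents, string currency, string stripeCustomerId)
    {
        return new HttpRequestMessage(HttpMethod.Post, BaseUrl + "/charges")
        {
            Content = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                ["amount"] = amountInCents.ToString(CultureInfo.InvariantCulture),
                ["currency"] = currency,
                ["customer"] = stripeCustomerId,
                ["capture"] = "false", // authorize only
            })
        };
    }

    // Phase two: capture the previously authorized charge by its ID.
    public static HttpRequestMessage BuildCaptureRequest(string chargeId)
    {
        var url = BaseUrl + "/charges/" + Uri.EscapeDataString(chargeId) + "/capture";
        return new HttpRequestMessage(HttpMethod.Post, url);
    }
}
```

The charge ID returned by the authorization is what ties the two phases together, so it's worth persisting with the order in case the capture has to be done later by hand.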

In actuality, none of these options is that great, but they're what we have to work with if I want to preserve the behavior from the client of "card will be charged at button click". In most e-commerce apps I've been involved with, we tend not to charge right at button click since failures are difficult to recover from.

In the next post, we'll look at our options with SendGrid.