Sagas: the Search for Logical Consistency

Antonio Alexander
7 min read · Feb 17, 2023

Getting into the microservice architecture is a kind of rabbit hole; I think the deeper you get into microservices, the more likely you are to suggest monoliths…unless you’re a masochist. So you’ve reached the zenith of microservices: you’ve got a bunch of services that have incredibly specific responsibilities and are air-gapped for all intents and purposes; but now you have some business case(s) that require interactions between services. So now…you have to solve the problem you created in the first place because we can’t just string those operations together…or can we?

I came to the conclusion that “data consistency isn’t logical consistency”, and it’s the foundation upon which I put together this article. The nuance between the two is the context you can use to answer the question, “Will sagas solve my problem?” Sagas specifically DO NOT solve data consistency problems; they solve logical consistency problems.

TL;DR: Sagas solve problems with operations that must maintain logical consistency across services that are totally separate; that separation is a tenet of microservices that you don’t necessarily have to follow. Sagas solve logical consistency problems by performing dependent mutations in a specific order and, upon failure (or identification of a logical inconsistency), performing compensating operations to undo the mutations that succeeded. Instead of sagas, it may be possible to maintain logical consistency by interacting with your data under the same mutex/transaction and ensuring data consistency; by coupling those previously separate databases, you can empower certain code to enforce data consistency without having to perform compensating operations outside the scope of the data store.
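To make the "ordered mutations plus compensation" idea concrete, here is a minimal sketch of an orchestrated saga. Everything in it is an assumption for illustration: `SagaStep`, `run_saga` and the in-memory `state` dictionary are hypothetical stand-ins, not a real saga library, and a real implementation would also have to deal with compensations that fail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]      # performs the mutation
    compensate: Callable[[], None]  # undoes the mutation


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # Undo only the steps that succeeded, newest first.
            for done in reversed(completed):
                done.compensate()
            return False
        completed.append(step)
    return True


# Hypothetical in-memory stand-ins for the inventory and payment services.
state = {"inventory": 5, "paid": False}


def reserve_inventory() -> None:
    state["inventory"] -= 1


def release_inventory() -> None:
    state["inventory"] += 1


def charge_payment() -> None:
    raise RuntimeError("payment service unavailable")


def refund_payment() -> None:
    state["paid"] = False


ok = run_saga([
    SagaStep("reserve inventory", reserve_inventory, release_inventory),
    SagaStep("charge payment", charge_payment, refund_payment),
])
```

Here the payment step fails, so the inventory reservation is released again and `run_saga` reports failure; the failed step itself is not compensated because it never succeeded.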

I’ll provide a disclaimer in that I’m doing this for the first time. One of the most important things I learned…is that sagas aren’t the only way to solve logical consistency problems; you can also solve them…with more data consistency! I’m going to introduce a handful of examples and describe why sagas could be a good solution (or why not):

Imagine that you run some super cool e-commerce business and you decided to “roll your own”, so you’ve created a number of microservices to separate responsibilities:

  • Inventory: this service manages inventory for the shop; it can catalog inventory, as well as increase and decrease inventory
  • Payment: this service manages payments; payments can be made, queried for status as well as cancelled and refunded
  • Shipping: this service manages shipping inventory; inventory can be shipped, delivered and returned
  • Order: this service manages orders; orders use the inventory, payment and shipping services to achieve the business case of getting inventory from the warehouse to someone’s home…

This application has a pretty obvious happy path, but there are some business cases that could cause logical inconsistencies between the different services; consider how we’d handle the following situations:

  • What if the user cancels an order? After payment? Before shipping? After shipping?
  • What if the inventory is zero when the payment is made, even though it wasn’t zero while the order was being created?
  • What if one of the services is unavailable (e.g., the payment service is down when the order is in-progress and waiting for shipment)?
  • What if inventory is removed for an in-progress order?

This hypothetical application has some obvious pitfalls. Just as the problem you’re trying to solve sits somewhere between a data consistency problem and a logical consistency problem, there’s a spectrum of solutions, and sagas are on the far end of this spectrum because they solve an incredibly specific problem AND they’re complex to boot. I think whenever a saga comes up as a possible solution, the appropriate response is: “How can I NOT use a saga to solve my problem?” Sagas are born from the tenet that each microservice should have its own database or store that is completely separate from other microservices. Sagas are unique in that they don’t use distributed transactions and are relatively asynchronous to maintain performance. But what if I told you that this orthodox practice was unnecessary?

In the “real world”, you’ll find that databases are generally the one thing that isn’t really containerized (to the extent of a microservice); it’s not something ephemeral that can be scaled up or down at a whim, it’s very much like the monolithic applications microservices attempt to subvert. Any DBA worth their salt will be against having separate databases for every microservice; they might even tell you to go fuck yourself.

In the article “Race Conditions: threads passing in the night”, the point is that mutexes and locks are only as good as the scope of their environment; as the database is generally the only scalar, and IF it’s the scalar between microservices…you can scope the mutex to the database. “And there you have it!”

An alternative to using a saga to resolve logical consistency is to use database transactions and appropriate relational models (e.g., normalization) to provide more data consistency. This obviously has some caveats; there are situations where you should separate microservices’ databases (e.g., human resources info shouldn’t be in the same database as secret product information) when their contexts are vastly different from the perspective of content, maintenance and performance. So at a high level, those are the exact situations where you’ll need sagas; you can’t use the alternative of “more” data consistency when the data is in separate databases.
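As a rough illustration of “more data consistency”, the sketch below puts the order and inventory tables in one SQLite database and places an order inside a single transaction. The schema, the `place_order` helper and the `CHECK` constraint are assumptions made for the example, not something prescribed by the services described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (
        item_id  INTEGER PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity >= 0)
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        item_id  INTEGER NOT NULL REFERENCES inventory(item_id),
        status   TEXT NOT NULL
    );
    INSERT INTO inventory (item_id, quantity) VALUES (1, 1);
""")


def place_order(conn: sqlite3.Connection, item_id: int) -> bool:
    """Create the order and decrement inventory atomically."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders (item_id, status) VALUES (?, 'placed')",
                (item_id,),
            )
            conn.execute(
                "UPDATE inventory SET quantity = quantity - 1 WHERE item_id = ?",
                (item_id,),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative quantity; the order row
        # inserted above was rolled back along with it.
        return False


first = place_order(conn, 1)   # the last unit is sold
second = place_order(conn, 1)  # nothing left, so nothing is written
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

No compensating operation is needed for the second attempt: the database rolls back the partially applied work on its own.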

Reviewing the questions I posited earlier about our examples, here are some solutions with the assumption that since all these services are within the same context, they can be within the same database:

What if the user cancels an order? After payment? Before shipping? After shipping?

Normally, if we were using sagas and a user cancelled an order, we would start a saga to create compensating transactions to update the inventory, shipping and payment services such that their responsibilities are in a sane state: (1) inventory is incremented by one, (2) payments are refunded and (3) if already shipped, the shipment is cancelled. Or maybe, if it’s already been shipped, you can only return it rather than not ship it, which is logically different. With all of these tables in the same database, those same updates can instead happen inside a single transaction.
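Under the same-database assumption, a cancellation could look something like the sketch below. The table layout, the status strings and the `cancel_order` function are all hypothetical; the point is only that the three updates either all happen or none do, and that an order that has already shipped takes a different (return) path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, item_id INTEGER NOT NULL,
                            status TEXT NOT NULL);
    CREATE TABLE payments  (order_id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE shipments (order_id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    INSERT INTO inventory VALUES (1, 4);
    INSERT INTO orders    VALUES (100, 1, 'placed'), (101, 1, 'placed');
    INSERT INTO payments  VALUES (100, 'paid'), (101, 'paid');
    INSERT INTO shipments VALUES (100, 'pending'), (101, 'shipped');
""")


def cancel_order(conn: sqlite3.Connection, order_id: int) -> str:
    """Cancel an unshipped order in one transaction; shipped orders need a return."""
    with conn:
        row = conn.execute(
            "SELECT status FROM shipments WHERE order_id = ?", (order_id,)
        ).fetchone()
        if row is not None and row[0] == "shipped":
            return "return_required"  # logically different from cancelling
        conn.execute(
            "UPDATE inventory SET quantity = quantity + 1 "
            "WHERE item_id = (SELECT item_id FROM orders WHERE order_id = ?)",
            (order_id,),
        )
        conn.execute("UPDATE payments SET status = 'refunded' WHERE order_id = ?", (order_id,))
        conn.execute("UPDATE shipments SET status = 'cancelled' WHERE order_id = ?", (order_id,))
        conn.execute("UPDATE orders SET status = 'cancelled' WHERE order_id = ?", (order_id,))
    return "cancelled"


pending_result = cancel_order(conn, 100)
shipped_result = cancel_order(conn, 101)
```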

What if the inventory is zero when the payment is made, even though it wasn’t zero while the order was being created?

This is a little more complicated because it’s not driven by the order service (which would be a hypothetical saga orchestrator); we’d need to have the inventory service communicate when inventory has changed and maintain logical consistency (you can’t ship something you don’t have).
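If the inventory table is reachable in the same database, one way to avoid acting on a stale inventory count is to check and decrement in a single statement at the moment it matters. This is a sketch under that assumption; the `reserve` function and the schema are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL)")
conn.execute("INSERT INTO inventory VALUES (1, 1)")
conn.commit()


def reserve(conn: sqlite3.Connection, item_id: int, count: int = 1) -> bool:
    """Decrement only if enough stock exists right now; report whether it worked."""
    with conn:
        cur = conn.execute(
            "UPDATE inventory SET quantity = quantity - ? "
            "WHERE item_id = ? AND quantity >= ?",
            (count, item_id, count),
        )
        # rowcount is 0 when the WHERE clause matched nothing (not enough stock).
        return cur.rowcount == 1


first = reserve(conn, 1)   # the last unit is reserved
second = reserve(conn, 1)  # stock is already zero, so nothing changes
```

Because the check and the mutation are the same statement, there is no window between “looked at inventory” and “took inventory” for another order to slip through.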

What if one of the services is unavailable (e.g., the payment service is down when the order is in-progress and waiting for shipment)?

This situation, unlike the previous, would be driven by the order service. In the event the service was unavailable, the order service would need to “store” the state of the order (e.g., pending payment) and continue to attempt the operation until successful. In addition to this, there may be use in implementing a circuit breaker pattern.
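A minimal sketch of that idea follows: the order keeps its stored state while payment attempts fail, and a small circuit breaker stops hammering a service that is clearly down. The `CircuitBreaker` class, its thresholds, the `attempt_payment` helper and the fake clock are all assumptions for illustration, not a specific library’s API.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[[], None]) -> None:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None


def attempt_payment(order: dict, breaker: CircuitBreaker,
                    charge: Callable[[], None]) -> bool:
    """Try to pay once; the order keeps its stored state if the attempt fails."""
    try:
        breaker.call(charge)
    except Exception:
        return False  # still "pending_payment"; a later retry can pick it up
    order["status"] = "paid"
    return True


def payment_down() -> None:
    raise ConnectionError("payment service unavailable")


def payment_up() -> None:
    return None


now = [0.0]  # fake clock so the example does not need to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
order = {"status": "pending_payment"}

results = [attempt_payment(order, breaker, payment_down) for _ in range(2)]
results.append(attempt_payment(order, breaker, payment_up))  # rejected: circuit open
now[0] = 11.0                                                # reset timeout elapsed
results.append(attempt_payment(order, breaker, payment_up))  # trial call succeeds
```

In a real order service the stored state would live in the order’s own table rather than a dictionary, and something (a scheduler, a queue) would drive the retries.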

What if inventory is removed altogether for an in-progress order?

The core of this question is a data consistency question: could you create an order with a product that doesn’t exist in the inventory, and what if it’s removed AFTER adding it to an order? Aside from being able to communicate inventory changes, logical consistency must be maintained by ensuring that (1) there is inventory for all items in an order and (2) the inventory for an order exists.
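With both tables in one database, the “inventory for an order exists” rule can be delegated to a foreign key. The sketch below assumes SQLite and a hypothetical `order_items` table; note that SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL);
    CREATE TABLE order_items (
        order_id INTEGER NOT NULL,
        item_id  INTEGER NOT NULL REFERENCES inventory(item_id) ON DELETE RESTRICT,
        PRIMARY KEY (order_id, item_id)
    );
    INSERT INTO inventory VALUES (1, 3);
    INSERT INTO order_items VALUES (100, 1);
""")


def try_execute(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """Run one statement in a transaction; False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# An order cannot reference an item that is not in the inventory...
orphan_added = try_execute(conn, "INSERT INTO order_items VALUES (?, ?)", (101, 999))
# ...and an item cannot be removed while an order still references it.
removed_in_use = try_execute(conn, "DELETE FROM inventory WHERE item_id = ?", (1,))
# Once the order no longer references the item, removal is allowed.
try_execute(conn, "DELETE FROM order_items WHERE order_id = ?", (100,))
removed_after = try_execute(conn, "DELETE FROM inventory WHERE item_id = ?", (1,))
```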

I think there are some obvious concerns about creating a monolithic database layer, both in the spirit of microservice architecture, but also in the perception of all the things you “avoid” by having separate databases (i.e., coupling). I think the most important thing that you should take away about databases is:

Databases are created with the express purpose of concurrent usage, data consistency and maintaining relational models (that’s what they’re made for).

Databases enable the very thing we need sagas to do under most cases: maintain logical consistency through data consistency. They’re not just a tool to store data and access it relationally, they’re as important as the language(s) you’re writing your microservices in. So I think the real question becomes: “If I couple my services at the database level, how can I ensure that they’re not coupled within the code?”. And I think for most situations, that’s totally possible by following these rules:

  • A service should only interact with tables it owns (to limit its responsibility)
  • If you want to prevent leaky abstractions, database errors dealing with foreign key constraints should be wrapped
  • If an operation involves interacting with tables within the same database owned by different services, a “new” service should be made that combines the two and if mutating data, a transaction should be used
  • If an operation involves interacting with tables within different databases owned by different services, sagas should be employed (whether or not you create a service depends on your implementation)
  • Sagas aren’t necessary if your operation is read-only; you still SHOULD create a new service to prevent coupling, but sagas are SPECIFICALLY for when you have to perform an operation that mutates data dependently
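The first three rules above can be sketched roughly as follows. The class names (`InventoryService`, `OrderPlacementService`), the `ItemInUseError` exception and the schema are hypothetical and only meant to show the shape: each service touches its own table, the foreign key error is wrapped in a domain error, and the operation that spans both tables lives in a separate service that uses a transaction.

```python
import sqlite3


class ItemInUseError(Exception):
    """Domain-level error so callers never see the storage-level constraint."""


class InventoryService:
    """Only touches the inventory table, which it owns."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def remove_item(self, item_id: int) -> None:
        try:
            with self.conn:
                self.conn.execute("DELETE FROM inventory WHERE item_id = ?", (item_id,))
        except sqlite3.IntegrityError as err:
            # Wrap the foreign key failure instead of leaking it to the caller.
            raise ItemInUseError(f"item {item_id} is referenced by an order") from err


class OrderPlacementService:
    """Hypothetical "new" service for the operation that spans both tables."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def place_order(self, order_id: int, item_id: int) -> None:
        with self.conn:  # both mutations commit together or not at all
            self.conn.execute(
                "INSERT INTO order_items (order_id, item_id) VALUES (?, ?)",
                (order_id, item_id),
            )
            self.conn.execute(
                "UPDATE inventory SET quantity = quantity - 1 WHERE item_id = ?",
                (item_id,),
            )


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL);
    CREATE TABLE order_items (
        order_id INTEGER NOT NULL,
        item_id  INTEGER NOT NULL REFERENCES inventory(item_id),
        PRIMARY KEY (order_id, item_id)
    );
    INSERT INTO inventory VALUES (1, 3);
""")

OrderPlacementService(conn).place_order(100, 1)
try:
    InventoryService(conn).remove_item(1)
    wrapped = False
except ItemInUseError:
    wrapped = True
```

The services stay decoupled in code (neither imports the other), while the database is what actually enforces the relationship between their tables.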

Don’t develop microservices on hard mode. The saga design pattern is HARD MODE; it should be a solution you employ after doing everything you can to avoid it. Microservices is an idea, not a religion; although there are some surefire rules for starting out (i.e., the orthodox method of having a separate database per microservice), once you have to implement a monolithic idea it’s really easy to see how the microservice architecture can quickly fall apart. Like any foray into programming, the more experience you get, the more you realize the rules you can bend and the ones you can outright break.
