Events Patterns: What is Transactional Outbox and why it may be critical for your data consistency
Asynchronous communication between services often is critical in distributed systems. It allows to build loosely coupled services that may work almost independently, just exchanging events. This way of communication is so robust and flexible that there are even types of architectures fully built on events: Events Sourcing and Event-Driven Architecture (EDA). But it’s not the topic of the article for now.
Today we’ll talk about data consistency while transferring information through events. A common reason to send an event is a change of an entity in one service that other services are interested in. Sometimes it’s not that important to deliver all the events, especially when there are millions of them and losing one will change really nothing. But sometimes, e.g. working with payments, you can’t lose even a single event, since it may cause serious consequences, so you need a guarantee that consumers will eventually receive all the changes made on the producer’s services side. And of course the received event should mean that change had actually happened and been committed (sounds obvious, but it’s important as we’ll see further).
What’s wrong with the simplest approach?
— I will just send my events into Kafka (Rabbit, NATS, etc.) and it will do all the work, — some inexperienced developer may say.
But the problem is how to do it safely. What if the data was already changed in the database, but the message broker is not available (it’s down, network issues, etc.)?
— Ok, I can wrap the call to the message broker with the database transaction, responsible for data change. In that case, when sending events to the broker fails, it rolls back the whole transaction. (The same inexperienced developer’s possible solution).
But it may be even more dangerous. The event can’t be returned back if the transaction failed as it was sent to the broker before the transaction was committed or rolled back. And it’s not all the possible issues with this approach. If your broker is fast enough and your transaction is slow enough, then consumers may receive and handle the event before related data is saved in the database. It may be a big issue if the event handler depends on the newly changed data.
These two problems should be clear from the pictures below.
An additional problem with both solutions is that the application should know where to send the event and do it synchronously. It may be not that significant, but anyway it’s an additional dependency and additional network call.
Solution
Fortunately, the problem of saving several chunks of data together is not new and was solved millions of times. And the best solution for it was already mentioned — it is a database transaction. Transactional systems in most of the RDBMS follow principles of ACID, of which we need only A that stands for Atomicity. The only thing that should be changed is that events should be stored inside the same database.The most common way to do it is to create a separate table looking roughly like that:
Using this table we get a simpler picture of the previous scheme:
Now the event is guaranteed to be created with the related data or reverted in case of failure. This solution is so common in the world of distributed systems that it has already become a pattern called Transactional Outbox.
Let’s sum up what this pattern brings us:
- guarantees event creation in case of transaction success
- guarantees that no event will be sent if the transaction fails
- no need to use 2PC or other complex technologies
- allows to create an event in the middle of the transaction without worrying about the result
- removes the dependency of application code on the message broker
What’s next
Ok, should I now remove my message broker, which took so much time to set it up (properly)? Don’t hurry! As you may notice, Transactional Outbox pattern solves only part of the problem — it creates events, but doesn’t deliver them to the destination. Events are safely stored in the outbox table but then somehow should be transferred to the consumers. And here you have several options, which I will just briefly mention:
- still use the message broker, publishing events there from outbox table
- allow other services to read the outbox table directly (not recommended)
- create separate service, that will push data to the consumers
- create polling consumers, constantly asking for new events
Every option is a discussion topic on its own and should be chosen depending on the size and complexity of your project.
Links
https://microservices.io/patterns/data/transactional-outbox.html