April 10, 2021
I want to work on my reading skills. By now I should have read many more books and articles on system design, and understood many more of its concepts. I need to put in some hard work.
Putting your events on a diet
Anybody can write code that will work for a few weeks or months, but what happens when that code is no longer your daily focus and the cobwebs of time start to sneak in? What if it’s someone else’s code? How do you add new features when you need to relearn the entire codebase each time? How can you be sure that making a small change in one corner won’t break something elsewhere?
Complexity and coupling in your code can suck you into a slow death spiral toward the eventual Major Rewrite. You can attempt to avoid this bitter fate by using architectural patterns like event-driven architecture. When you build a system of discrete services that communicate via events, you limit the complexity of each service by reducing coupling. Each service can be maintained without having to touch all the other services for every change in business requirements.
But if you’re not careful, it’s easy to fall into bad habits, loading up events with far too much data and reintroducing coupling of a different kind. Let’s take a look at how this might happen by analyzing the Amazon.com checkout process and discussing how you could do things differently.
What do I mean by event?
Before we get to the checkout process, let me be specific about what I mean by the word event. An event has two features: it has already happened and it is relevant to the business. A customer has registered, an order was placed, a new product was added—these are all examples of events that carry business meaning.
Compare this to a command. A command is a directive to do something that hasn’t happened yet, like place an order or change an address. Often, commands and events come in pairs. For example, if a PlaceOrder command is successful, an OrderPlaced event can be published, and other services can react to that event.
Commands only have one receiver: the code that does the work the command wants done. For example, a PlaceOrder command only has one receiver because there’s only one chunk of code capable of placing the order. Because there’s only one receiver, it’s quite easy to evolve the command and its handling code in lockstep.
Events, however, can be consumed by multiple subscribers. There might be two, five, or fifty pieces of code that react to the OrderPlaced event, such as payment processing, item shipping, and warehouse restocking. Because so many places can subscribe to the event, modifying it can have a large ripple effect through multiple systems, as you’ll see shortly.
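The difference between the two message kinds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MessageBus and its methods are hypothetical names invented for this sketch, not the API of NServiceBus or any other library, and everything runs in memory.

```python
from collections import defaultdict
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class PlaceOrder:
    """Command: a request to do something that has not happened yet."""
    order_id: UUID


@dataclass(frozen=True)
class OrderPlaced:
    """Event: a business fact that has already happened."""
    order_id: UUID


class MessageBus:
    """Hypothetical in-memory bus, for illustration only."""

    def __init__(self):
        self._handlers = {}                    # one receiver per command type
        self._subscribers = defaultdict(list)  # many subscribers per event type

    def register_handler(self, command_type, handler):
        if command_type in self._handlers:
            raise ValueError(f"{command_type.__name__} already has a receiver")
        self._handlers[command_type] = handler

    def subscribe(self, event_type, subscriber):
        self._subscribers[event_type].append(subscriber)

    def send(self, command):
        # A command goes to its single receiver.
        self._handlers[type(command)](command)

    def publish(self, event):
        # An event fans out to every subscriber.
        for subscriber in self._subscribers[type(event)]:
            subscriber(event)


bus = MessageBus()
reactions = []

# Sales is the only receiver of PlaceOrder and publishes the paired event.
bus.register_handler(PlaceOrder, lambda cmd: bus.publish(OrderPlaced(cmd.order_id)))

# Several services react to the same event.
bus.subscribe(OrderPlaced, lambda evt: reactions.append("billing"))
bus.subscribe(OrderPlaced, lambda evt: reactions.append("shipping"))
bus.subscribe(OrderPlaced, lambda evt: reactions.append("restocking"))

bus.send(PlaceOrder(uuid4()))
print(reactions)
```

Changing PlaceOrder affects one handler, while changing OrderPlaced affects every function passed to subscribe, which is the asymmetry the rest of this post is about.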
Let’s buy something
Let’s go to Amazon to buy Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf, which is valuable reading for anybody building distributed systems. You visit Amazon, put the book in your shopping cart, and then proceed to checkout. What happens next?
Note: Amazon’s actual checkout process is more complex than presented here, and it changes all the time. This simplified example will be good enough to illustrate the point without getting too complicated.
As you’re guided through the checkout process, Amazon will gather a bunch of information from you in order to place the order. Let’s briefly consider what information will be necessary for your order to be completed:
- The items in your shopping cart
- The shipping address
- Payment information, including payment type, billing address, etc.
When you reach the end of the checkout process, all of this information will be displayed for your review, along with a “place your order” button. When you click the button, an OrderPlaced event will be raised containing all of the order information you provided, along with an OrderId to uniquely identify the order. The event could look something like this:
class OrderPlaced {
    Guid OrderId
    Cart ShoppingCart
    Address ShippingAddress
    PaymentDetails Payment
}
In your Amazon-like system, there will be subscribers for this event that will spring into action once it has been published: billing the order, adjusting inventory levels, preparing the item for shipment, and sending an email receipt. There could be additional subscribers that manage customer loyalty programs, adjust item prices based on popularity, update “frequently bought with” associations, and countless other things. The important thing is that, a few days later, a new book arrives in a box on the doorstep.
So everything is great, right?
Event bloat
This OrderPlaced event decouples the web tier from the back-end processing, which makes you feel good about yourself but hides more insidious coupling that could get you into trouble later. It’s like overeating at a big family gathering: it feels good in the moment, but eventually you’re going to have a stomachache.
An event such as this robs each service of autonomy because they are all dependent upon the Sales service to provide the data they need. These different data items are locked together inside the OrderPlaced event contract. So, if Shipping wants to add a new Amazon Prime delivery option, that information needs to be added to the OrderPlaced event. Billing wants to support Bitcoin? OrderPlaced needs to change again. Because the Sales service is responsible for the OrderPlaced event, every other service is dependent upon Sales.
With each change to the OrderPlaced event, you’ll need to analyze every subscriber, seeing if it needs to change as well. You may end up having to redeploy the entire system, and that means testing all of the affected pieces as well.
So really, you don’t have autonomous services. You have a tangled web of interdependent services. The aim of event-driven architecture was to decouple the system so that changes to business requirements could be implemented with targeted changes to isolated services. But with a fat event like the one shown above, this becomes impossible.
Congratulations, you’ve created Frankenstein’s monster. In essence, you traded a monolithic system for an event-driven distributed monolithic system. What if you could untangle these systems so that they are truly autonomous?
Time for a diet
To trim down the event and get it into fighting shape, you need to put it on a diet. To do that, let’s start over and analyze each piece of information in the OrderPlaced event and assign it to a specific service.
OrderId and ShoppingCart relate to selling the product, so those can be owned by Sales. ShippingAddress, however, relates to shipping the products to the customer, so it should be owned by a Shipping service. Payment relates to collecting payment for the products, so let’s have that belong to a Billing service.
class OrderPlaced {
    Guid OrderId                // Sales
    Cart ShoppingCart           // Sales
    Address ShippingAddress     // Shipping
    PaymentDetails Payment      // Billing
}
With these boundaries drawn, we can review the checkout process and see if there’s a way to improve things.
Slimming down
The trick to slimming down our events and reducing coupling between services is to create the OrderId upfront. There’s no law that all IDs must come from a database. An OrderId can be created when the user starts the checkout process.
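As a minimal sketch of what creating the ID upfront could look like, the web tier can mint a random UUID itself. The function name begin_checkout is an assumption made for this example, and Python's uuid module stands in for whatever Guid facility your platform provides.

```python
from uuid import UUID, uuid4


def begin_checkout() -> UUID:
    """Mint the order's identity in the web tier, before anything is persisted."""
    # A version 4 UUID is random, so no database round trip is needed to get one.
    return uuid4()


order_id = begin_checkout()
print(order_id)
```

Every later message in the checkout can then carry this same value, which is what lets each service file its own data under the same order.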
You can start the checkout process by sending a CreateOrder command to the Sales service to define the OrderId and the items in the cart:
class CreateOrder {
    Guid OrderId
    Cart ShoppingCart
}
The next step of the checkout process was selecting the shipping address. Rather than adding that data to the OrderPlaced event, what if you instead created a separate command?
class StoreShippingAddressForOrder {
    Guid OrderId
    Address ShippingAddress
}
You can send the StoreShippingAddressForOrder command from the web application straight to the Shipping service that owns the data. The order hasn’t even been placed at this point, so no packages are getting shipped just yet. When it does come time to ship the order, the Shipping service will already know where to send it.
If the customer never finishes the order, there’s no harm in having completed these steps already. In fact, there are valuable business insights to be gained from analysis of abandoned shopping carts, and having a process to contact users who have abandoned shopping carts can prove to be a valuable way to increase sales.
Next in the checkout process, you must collect payment information from the customer. Since Payment is owned by the Billing service, you can send this command to Billing:
class StoreBillingDetailsForOrder {
    Guid OrderId
    PaymentDetails Payment
}
The Billing service will not charge the order yet—just record the information and wait until the order is placed. If your organization does not want to bear the security risk of storing credit card information, the payment can be authorized now and captured after the order is placed.
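Here is one way the three commands so far could be handled, sketched in Python. The service classes, their in-memory dictionaries, and the placeholder string and tuple types standing in for Cart, Address, and PaymentDetails are all assumptions made for illustration. The point is only that each service records its own slice of the order under the shared OrderId and does nothing else yet.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class CreateOrder:
    order_id: UUID
    shopping_cart: tuple  # placeholder for the Cart type


@dataclass(frozen=True)
class StoreShippingAddressForOrder:
    order_id: UUID
    shipping_address: str  # placeholder for the Address type


@dataclass(frozen=True)
class StoreBillingDetailsForOrder:
    order_id: UUID
    payment: str  # placeholder for the PaymentDetails type


class SalesService:
    def __init__(self):
        self.carts = {}

    def handle(self, command: CreateOrder):
        self.carts[command.order_id] = command.shopping_cart


class ShippingService:
    def __init__(self):
        self.addresses = {}

    def handle(self, command: StoreShippingAddressForOrder):
        # Only record the address; nothing ships until the order is placed.
        self.addresses[command.order_id] = command.shipping_address


class BillingService:
    def __init__(self):
        self.payment_details = {}

    def handle(self, command: StoreBillingDetailsForOrder):
        # Only record the details; nothing is charged until the order is placed.
        self.payment_details[command.order_id] = command.payment


sales, shipping, billing = SalesService(), ShippingService(), BillingService()

order_id = uuid4()  # created upfront by the web tier
sales.handle(CreateOrder(order_id, ("enterprise-integration-patterns",)))
shipping.handle(StoreShippingAddressForOrder(order_id, "123 Example Street"))
billing.handle(StoreBillingDetailsForOrder(order_id, "card-token-placeholder"))
```

No service ever sees data it does not own, so none of them has to change when another service's data changes shape.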
All that’s left is to place the order. By creating the OrderId upfront, we were able to remove most of the data that was in the original OrderPlaced event, sending it instead to other services that own those pieces of information. So the Sales service can now publish an extremely simple OrderPlaced event:
class OrderPlaced {
    Guid OrderId
}
This slimmed-down OrderPlaced event is a lot more focused. All the unnecessary coupling has been removed. Once this event is published by Sales, Billing will take the payment information, which it has already stored, and charge the order. It will publish an OrderBilled event when the credit card is successfully charged. The Shipping service will subscribe to OrderPlaced from Sales and OrderBilled from Billing, and once it receives both, it will know that it can ship the products to the user.
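The "once it receives both" rule in Shipping can be sketched as a small piece of state keyed by OrderId. The class below is a hypothetical in-memory illustration, not a production process manager: it assumes the two events may arrive in either order and may be delivered more than once, and it ships exactly once per order.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class OrderPlaced:
    order_id: UUID


@dataclass(frozen=True)
class OrderBilled:
    order_id: UUID


class ShippingService:
    def __init__(self):
        self.addresses = {}        # stored earlier, during checkout
        self._placed = set()       # orders for which OrderPlaced was seen
        self._billed = set()       # orders for which OrderBilled was seen
        self._shipped_ids = set()  # guards against shipping twice
        self.shipped = []          # (order_id, address) pairs, in shipping order

    def store_address(self, order_id, address):
        self.addresses[order_id] = address

    def on_order_placed(self, event: OrderPlaced):
        self._placed.add(event.order_id)
        self._ship_if_ready(event.order_id)

    def on_order_billed(self, event: OrderBilled):
        self._billed.add(event.order_id)
        self._ship_if_ready(event.order_id)

    def _ship_if_ready(self, order_id):
        ready = order_id in self._placed and order_id in self._billed
        if ready and order_id not in self._shipped_ids:
            self._shipped_ids.add(order_id)
            self.shipped.append((order_id, self.addresses[order_id]))


shipping = ShippingService()
order_id = uuid4()
shipping.store_address(order_id, "123 Example Street")

shipping.on_order_placed(OrderPlaced(order_id))  # not billed yet, nothing ships
shipping.on_order_billed(OrderBilled(order_id))  # both seen, the order ships
```

Neither event carries anything beyond the ID, because Shipping already holds the address it needs.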
Let’s take a look at the two versions of the OrderPlaced event again:
// Before
class OrderPlaced {
    Guid OrderId                // Sales
    Cart ShoppingCart           // Sales
    Address ShippingAddress     // Shipping
    PaymentDetails Payment      // Billing
}
// After
class OrderPlaced {
    Guid OrderId
}
Which event would be less risky to deploy to production? Which would be easier to test? The answer is the smaller event, with all of the unnecessary coupling removed.
Fighting shape
The benefit to slimming down our events is getting them into fighting shape to tackle changes in business requirements that are sure to come down the line. If we want to introduce Amazon Prime shipping or support Bitcoin as a form of payment, it’s now a lot easier to do it without having to modify the Sales service at all.
To support Prime shipping, we would send a StoreShippingTypeForOrder command to the Shipping service during the checkout process. It would look something like this:
class StoreShippingTypeForOrder {
    Guid OrderId
    int ShippingType
}
This would be the second command we send to the Shipping service, along with StoreShippingAddressForOrder. The addition of Prime shipping will change how the Shipping service prepares an order, but there’s no reason to touch the OrderPlaced event or any of the code in the Sales service.
In similar fashion, we could implement Bitcoin, a concern of the Billing service, in a few different ways. We could add Bitcoin properties to the PaymentDetails class used in the StoreBillingDetailsForOrder command. Or we could devise a new command specifically for Bitcoin and send that instead of StoreBillingDetailsForOrder. In this case, Billing would not publish an OrderBilled event unless payment had been made in one of the two forms. After all, the Shipping service just cares that the order was paid for. It doesn’t care how.
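The second option, a dedicated Bitcoin command, might look like the sketch below. StoreBitcoinDetailsForOrder, its wallet_address field, and the BillingService methods are hypothetical names chosen for this example, and the actual charging step is deliberately left out. What matters is that both payment forms end in the same OrderBilled event.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass(frozen=True)
class StoreBillingDetailsForOrder:
    order_id: UUID
    payment: str  # placeholder for the PaymentDetails type


@dataclass(frozen=True)
class StoreBitcoinDetailsForOrder:  # hypothetical new command
    order_id: UUID
    wallet_address: str


@dataclass(frozen=True)
class OrderPlaced:
    order_id: UUID


@dataclass(frozen=True)
class OrderBilled:
    order_id: UUID


class BillingService:
    def __init__(self, publish):
        self._publish = publish  # callable that publishes an event
        self._payment_methods = {}

    def store_card_details(self, command: StoreBillingDetailsForOrder):
        self._payment_methods[command.order_id] = ("card", command.payment)

    def store_bitcoin_details(self, command: StoreBitcoinDetailsForOrder):
        self._payment_methods[command.order_id] = ("bitcoin", command.wallet_address)

    def on_order_placed(self, event: OrderPlaced):
        if event.order_id not in self._payment_methods:
            return  # no payment in either form, so no OrderBilled
        # Charging is omitted here; a real service would call a payment
        # provider and publish only after the charge succeeds.
        self._publish(OrderBilled(event.order_id))


published = []
billing = BillingService(publish=published.append)

card_order, bitcoin_order = uuid4(), uuid4()
billing.store_card_details(StoreBillingDetailsForOrder(card_order, "card-token-placeholder"))
billing.store_bitcoin_details(StoreBitcoinDetailsForOrder(bitcoin_order, "wallet-placeholder"))

for placed_id in (card_order, bitcoin_order):
    billing.on_order_placed(OrderPlaced(placed_id))
```

Subscribers to OrderBilled cannot tell which payment form was used, which is why Shipping needs no changes.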
In any case, support for Bitcoin would be implemented solely by changing elements of the Billing service. Sales and Shipping would remain completely unchanged and would not have to be retested or redeployed. With less surface area affected by each change, we can adapt to changing business requirements much more quickly.
And that was kind of the point of using event-driven architecture in the first place.
Summary
In event-driven systems, large events are a design smell. Try to keep events as small as possible. Services should really only share IDs and maybe a timestamp to indicate when the information was effective. If it feels like more data than this needs to be shared between services, take it as an indication that perhaps the boundaries between your services are wrong. Think about your architecture based on who should own each piece of data, and put those events on a diet.
For more information on how to create loosely-coupled event-driven systems, check out our NServiceBus step-by-step tutorial.