May 13, 2021
Introduction
It is the most challenge thing to learn by book reading. I like to use blogger to help me, track every minute and also highlight things new to me.
CQRS document by Greg Young
Page 46/56
Event Storage as a Queue
It has been previously discussed that the events coming out of a domain are also an [Integration Model]. Very often these events are not only saved but also published to queue where they are dispatched asynchronously to listeners either within the same system (the reporting model is a good example) or to other applications. An issue that exists with many systems publishing events is that they require a two phase commit between whatever storage they are using (Relational or otherwise) and the publishing of their events to the queue.
The reason that the two-phase commit is needed is that a catastrophe could occur during the small period of time between when the write to the data storage commits and when the write to the queue commits. If a failure were to happen during this period the message would not be published on the queue (or if the other direction it may be published but the change may not be saved). If either case were to happen the listeners of the events would be out of sync with the producer.
The two-phase commit can be expensive but for low latency systems there is a larger problem when dealing with this situation. Generally the queue itself is persistent so the event becomes written on disk twice in the two-phase commit, once to the Event Storage and once to the persistent queue. Given for most systems having dual writes is not that important but if you have low latency requirements it can become quite an expensive operation as it will also force seeks on the disk. Figure 5 illustrates the two phase commit between data storage and a publishing queue.
Some try to get around this problem by only writing to a queue then have something on the other side of the queue update the data storage with the changes represented by the events, this however has some issues. The largest issue is that not all of the events will be able to be written to the storage, eventual consistency has been introduced and it is possible that an optimistic concurrency problem will occur on the write of the events. Dealing with this problem in a production system is non-trivial.
Many organizations do the opposite, use the event storage as a queue. Adding a sequence number to the Events table previously discussed allows the use the Event Storage as a queue. Figure 5 illustrates the change to the schema of the Events table.
The database would insure that the values of sequence number would be unique and incrementing, this can be easily done using an auto-incrementing type. Because the values are unique and incrementing a secondary process can chase the Events table, publishing the events off to the queue. The chasing process would simply have to store the value of the sequence number of the last event it had processed, it could even update this value with a two-phase commit bringing the update and the publish to the queue into the same transaction. This process can be seen in Figure 7.
The work has been taken off of the initial processing in a known safe way. The publish can happen asynchronously to the actual write. This lowers the latency of completing the initial operation, it also will limit the number of disk writes in the processing of the initial request to one. This strategy can be extremely valuable when dealing with low latency requirements as it allows much of the work on the initial processing to be offloaded to another process asynchronously and in a safe way, there is little difference whether the publish happens as part of the initial processing or asynchronously as generally messages are published asynchronously anyways, using the Event Store as a queue just raises the time until the message is actually published slightly, this can be viewed as slightly raising the SLA.
CQRS and Event Sourcing
CQRS and Event Sourcing become most interesting when combined together. This chapter looks at the intersection of these two concepts within a system where Domain Driven Design has been applied.
CQRS and Event Sourcing have a symbiotic relationship. CQRS allows Event Sourcing to be used as the data storage mechanism for the domain. One of the largest issues when using Event Sourcing is that you cannot ask the system a query such as “Give me all users whose first names are ‘Greg’”. This is due to not having a representation of current state. With CQRS the only query that exists within the domain is GetById which is supported with Event Sourcing.
Event Sourcing is also very important when building out a non-trivial CQRS based system. The problem with integration between the two models is a large one. The maintaining of say relational models, one for read and the other for write, is quite costly. It becomes especially costly when you factor in that there is also an event model in order to synchronize the two. With Event Sourcing the event model is also the persistence model on the Write side. This drastically lowers costs of development as no conversion between the models is needed.
The original stereotypical architecture with using commands in Figure 1 can be compared to Figure 2 CQRS with Event Sourcing and found to be roughly equivalent amounts of work.
Cost Analysis
The client will be identical amounts of work between the two architectures. This is because the client operates in the exact same way. In both architectures the client receives DTOs and produces Commands that tell the Application Server to do something.
The queries between the two models will also be very similar in terms of cost. In the stereotypical architecture the queries are being built off of the domain model, in the CQRS based architecture they are being built by the Thin Read Layer projecting directly to DTOs. As was discussed in “Command and Query Responsibility Segregation” the Thin Read Layer should be equally or in some cases less expensive.
The main differentiation between the two architectures when looking at cost is in the domain model and persistence. In the stereotypical architecture an ORM was doing most of the heavy lifting in order to persist the domain model within a Relational Database. This process introduces an Impedance Mismatch between the domain model and the storage mechanism, the Impedance Mismatch as discussed in “Events as a Storage Mechanism” can be highly costly both in productivity and the knowledge that developers need to have.
The CQRS and Event Sourcing based architecture does not have an Impedance Mismatch between the domain model and the storage mechanism on the Write side. The domain produces events, these same events are all that is stored. The usage of events is all that the domain model knows about. There is however an impedance mismatch in the read model. The Event Handlers must take events and update the read model to its concept of what the events mean. The Impedance Mismatch here is between the Events and the Relational Model.
The Impedance Mismatch between events and a Relational Model is much smaller than the mismatch between an Object Model and a Relational Model and is much easier to bridge in simple ways. The reason for this is that the Event Model does not have structure, it is representing actions that should be taken within the Relational Model.
Looked at from this perspective, the two architectures have roughly the same amount of work being done. Its not that its a lot more work or a lot less work; its just different work. The event based model may be slightly more costly due to the need of definition of events but this cost is relatively low and it also offers a smaller Impedance Mismatch to bridge which helps to make up for the cost. The event based model also offers all of the benefits discussed in “Events” that also help to reduce the overall initial cost of the creation of the events.
Integration
Everything up until this point has been comparing the systems in isolation. This rarely happens within an organization. More often than not organizations do only rely on systems but on systems of systems that are integrated in some way.
With the stereotypical architecture no integration has yet been supported, except of course perhaps integration through the database which is a well established anti-pattern for most systems. Integration is viewed as an afterthought.
The integration must be custom written. Many organizations choose to build services over the top of their model to allow for integration. Some of these services may be the same services that the clients use but more often than not there is additional work that must be done in order to support integration.
A larger problem exists when the product is being delivered to many customers. It is the teams responsibility to provide hooks for all of the customers and how they would like to integrate with thesystem. This often becomes a very large and unwieldy piece of code, especially on systems that are installed at hundreds or thousands of different clients all of which have different needs. The business model here tends to be to bill the client for each piece of custom integration, this can be quite profitable but it is a terrible model in terms of software.
With the CQRS and Event Sourcing based model, integration has been thought of since the very first use case. The Read side needs to integrate and represent what is occurring on the Write Side, it is an integration point. The integration model is “production ready” all throughout the initial building of the system and it is being tested throughout by the integration with the Read Side.
The event based integration model is also known to be complete as all behaviors within the system have events. If the system is capable of doing something, it is by definition automatically integrated. In some circumstances it may be desirable to limit the publishing of events but it is a decision to limit what is published as opposed to needing to write code to publish something.
The event based model is also by nature a push model that contains many advantages over the pull model. If the stereotypical architecture desired a push based model then there would be large amounts of work added to track events and ensure that they were synchronized with what the system recorded in its own data model.
Differences in Work Habits
The two architectures also differ greatly in parallelization of work. In the stereotypical architecture work is generally done in vertical slices. There are four common methodologies used.
- Data Centric: Start with database and working out.
- Client Centric: Start with client and work in.
- Façade/Contract First: Start with façade, then work back to data model then work finally implement client
- Domain Centric: Start with the domain model, work out to the client then implement data model
- Technical Proficiency
- Knowledge of the Business Domain
- Cost
- Soft Skills
No comments:
Post a Comment