Event Sourcing
"[E]very operational command executed on any given Aggregate instance in the domain model will publish at least one Domain Event that describes the execution outcome. Each of the events is saved to an Event Store in the order in which it occurred. When each Aggregate is retrieved from its Repository, the instance is reconstituted by playing back the Events in the order in which they previously occurred... To avoid this bottleneck we can apply an optimization the uses Aggregate state snapshots." [1]
CQRS
Instead of having one data store for everything, "the change that CQRS introduces is to split that conceptual model into separate models for update and display, which it refers to as Command and Query respectively". [2]
I've seen this work well in a project that used a Lambda Architecture. Here, risk data was being written slowly but surely to a Hadoop cluster. When a day's data was complete, the data could be converted into a format that was digestible by the Risk Managers.
"Having separate models raises questions about how hard to keep those models consistent, which raises the likelihood of using eventual consistency." [2] For us this was fine as the Risk Managers were in bed while the munging was taking place.
"Interacting with the command-model naturally falls into commands or events, which meshes well with Event Sourcing." [2] And this is where
Kafka came in.
Martin Fowler is cautious about using CQRS [2] but we had a good experience. In our domain, the Risk Managers were not interested in the Query model after two weeks and so it could be thrown away. Had it been needed at any time in the future, it could always have been generated again from the Command model.
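As a rough sketch of the shape this takes (illustrative names and numbers, not our actual risk schema), the query model is just a projection folded from the command model's log, which is why it is cheap to throw away and regenerate:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative command-side event; in our case events like this flowed through Kafka.
record RiskRecorded(String desk, double exposure) {}

public class CqrsSketch {

    // The query model is a read-optimised projection, built by folding over the
    // command model's log. Because it is derived, it can be discarded and
    // regenerated from the command model at any time.
    static Map<String, Double> project(List<RiskRecorded> log) {
        Map<String, Double> view = new HashMap<>();
        for (RiskRecorded e : log) {
            view.merge(e.desk(), e.exposure(), Double::sum);
        }
        return view;
    }

    public static void main(String[] args) {
        // Command model: an append-only log of what happened (the system of record).
        List<RiskRecorded> commandLog = new ArrayList<>();
        commandLog.add(new RiskRecorded("rates", 1_000_000));
        commandLog.add(new RiskRecorded("credit", 250_000));
        commandLog.add(new RiskRecorded("rates", -400_000));

        // Query model: regenerated here on demand; in practice it would be refreshed
        // asynchronously, which is where the eventual consistency comes from.
        Map<String, Double> exposureByDesk = project(commandLog);
        System.out.println(exposureByDesk); // net exposure per desk
    }
}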
Lambda Architecture
Lambda Architectures consist of three main pieces:
- The batch layer, managing the master dataset (an immutable, append-only set of raw data) and pre-computing batch views.
- The serving layer, indexing batch views so that they can be queried in a low-latency, ad-hoc way.
- The speed layer, dealing with recent data only, and compensating for the high latency of the batch layer.
As mentioned in the CQRS subsection above, "even if you were to lose all your serving layer datasets and speed layer datasets, you could reconstruct your application from the master dataset. This is because the batch views served by the serving layer are produced via functions on the master dataset, and since the speed layer is based only on recent data, it can construct itself within a few hours." [3]
Since Hadoop has an append-only, highly available file system (HDFS), it is an obvious choice for the batch layer.
"The serving layer is a specialized distributed database that loads in a batch Batch layer view and makes it possible to do random reads on it (see figure 1.9). When new batch views are available, the serving layer automatically swaps those in so that more up-to-date results are available. A serving layer database supports batch updates and random reads. Most notably, it doesn’t need to support random writes." [3] Cassandra was our choice for this layer.
Finally, "the speed layer only looks at recent data, whereas the batch layer looks at all the data at once. Another big difference is that in order to achieve the smallest latencies possible, the speed layer doesn’t look at all the new data at once. Instead, it updates the realtime views as it receives new data instead of recomputing the views from scratch like the batch layer does. The speed layer does incremental computation instead of the recomputation done in the batch layer." [3]
Lambda Architecture shares some similarities with CQRS but is very different in that it never updates data (although it can add data that renders old data obsolete).
Feature Branching
The idea here is that each developer works on his own branch. He regularly syncs with the main branch to lessen the pain of later merging his work into it. With Git, he'd regularly run:
> git checkout master        # switch to the main branch
> git pull                   # bring it up to date with the remote
> git checkout BRANCH_NAME   # return to the feature branch
> git merge master           # fold the latest main-branch changes into it
> git push                   # publish the updated feature branch
There is some debate on whether Feature Branching is an anti-pattern. Ideally, the system should be architected such that a feature is isolated to a particular silo of the code and does not spill out into other components. A developer can then merrily work on that silo and check into the main branch all the time. Mainly for political reasons, though, this is not always possible. Software like Stash, which integrates with Jira, can make code reviews quite pleasant even within a distributed team.
[1] Implementing Domain-Driven Design, Vaughn Vernon
[2] Martin Fowler's blog
[3] Big Data, Nathan Marz