In recent years there has been a lot of hype around non-relational databases, and deciding which persistence model to choose has become a complex task.
The best way to keep the state of your domain objects depends on your project, but we cannot ignore the options available, and we need to weigh their pros and cons.
I would like to share a reflection on the use of NoSQL databases for persisting aggregates. Maybe it can help you with your decision.
So that we share the same concepts throughout this post: according to Martin Fowler, an aggregate is a collection of domain objects that can be treated as a unit.
I want to start this reflection with the relational database model, where you have to split your business model into tables, columns and relationships. With this type of data model, you always have to make an infrastructure effort to translate the rows of a result set into your domain objects, even if you are using an ORM. It is also very common to use design patterns to transform the data into something meaningful in your domain; the factory pattern is one example.
The image below, by Martin Fowler, shows this complexity very clearly.
In this scenario, we can identify three steps to model a system with a relational database: create the tables and the relationships between them, create the strategy to get the data, and create the strategy to translate this data into your domain objects.
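To make these three steps concrete, here is a minimal sketch using Python and its built-in sqlite3 module. The SalesOrder and OrderLine classes, the table names and the sample rows are all my own assumptions for illustration; a real project would probably hide part of this behind an ORM, but the translation work would still be there.

```python
import sqlite3
from dataclasses import dataclass, field


# Hypothetical aggregate: a sales order and its lines, treated as a unit.
@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: float


@dataclass
class SalesOrder:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)

    def total(self):
        return sum(line.quantity * line.unit_price for line in self.lines)


# Step 1: create the tables and the relationship between them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_order (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    CREATE TABLE order_line (
        order_id   INTEGER NOT NULL REFERENCES sales_order(id),
        product    TEXT    NOT NULL,
        quantity   INTEGER NOT NULL,
        unit_price REAL    NOT NULL
    );
    INSERT INTO sales_order VALUES (1, 'Alice');
    INSERT INTO order_line VALUES (1, 'keyboard', 1, 50.0);
    INSERT INTO order_line VALUES (1, 'mouse', 2, 30.0);
""")


def load_sales_order(conn, order_id):
    # Step 2: the strategy to get the data (one join over both tables).
    rows = conn.execute(
        "SELECT o.id, o.customer, l.product, l.quantity, l.unit_price "
        "FROM sales_order o LEFT JOIN order_line l ON l.order_id = o.id "
        "WHERE o.id = ?",
        (order_id,),
    ).fetchall()
    if not rows:
        return None

    # Step 3: translate the rows of the result set into domain objects.
    order = SalesOrder(order_id=rows[0][0], customer=rows[0][1])
    for _, _, product, quantity, unit_price in rows:
        if product is not None:  # an order without lines yields one NULL row
            order.lines.append(OrderLine(product, quantity, unit_price))
    return order


order = load_sales_order(conn, 1)
```

Even in this tiny example, most of the code exists only to move data between the table shape and the object shape.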
On the other hand, NoSQL databases offer a schemaless way to store and restore this data, thanks to their key/value simplicity.
There are several products implementing this idea, with small variations, but many of them share the same basic concept: storing documents (usually JSON) identified by a key.
This model is great for applications because storing and restoring data happens in a way that is natural in most programming languages: the aggregate state is serialized to JSON, and the JSON is converted back into objects.
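As a rough illustration of this store/restore cycle, the sketch below serializes a whole aggregate to JSON and keeps it under its key. A plain Python dict stands in for the database, and the SalesOrder class, its fields and the "SO-1" key format are assumptions of mine, not the API of any particular NoSQL product.

```python
import json
from dataclasses import dataclass, field, asdict


# Hypothetical aggregate, used only for this example.
@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: float


@dataclass
class SalesOrder:
    order_id: str
    customer: str
    status: str = "open"
    lines: list = field(default_factory=list)


# Stand-in for a key/value or document store: key -> JSON string.
store = {}


def save(store, order):
    # The whole aggregate is written as one document under one key.
    store[order.order_id] = json.dumps(asdict(order))


def load(store, order_id):
    # Restoring is a single lookup by key plus JSON -> objects.
    data = json.loads(store[order_id])
    lines = [OrderLine(**line) for line in data.pop("lines")]
    return SalesOrder(lines=lines, **data)


order = SalesOrder("SO-1", "Alice", lines=[OrderLine("keyboard", 1, 50.0)])
save(store, order)

# A typical transactional action: restore, change the status, save again.
restored = load(store, "SO-1")
restored.status = "cancelled"
save(store, restored)
```

There are no tables to design and no join to write; the aggregate goes in and comes out as a unit.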
NoSQL databases also offer a great scalability option for your business, since they are designed so that the data can be distributed and processed across clusters. There is a detailed article by IBM covering this topic, including most NoSQL players, which can also help with your decision.
The downside of this model lies in the way you read the data. If we look at a collection of “SalesOrders” and need to know the orders placed on a specific day, or all orders with a total value greater than US$ 100, we have no key defined for those queries, and we end up with the equivalent of a “full table scan” on every query.
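The difference between the two kinds of read can be sketched as follows. Again a dict stands in for the store, and the documents, dates and totals are made-up sample data; the point is only that a lookup by key touches one document, while a question about the content has to open every document.

```python
import json

# Stand-in store with three hypothetical sales orders (sample data).
store = {
    "SO-1": json.dumps({"order_id": "SO-1", "date": "2024-03-01", "total": 80.0}),
    "SO-2": json.dumps({"order_id": "SO-2", "date": "2024-03-01", "total": 150.0}),
    "SO-3": json.dumps({"order_id": "SO-3", "date": "2024-03-02", "total": 120.0}),
}


def get_order(store, order_id):
    # Lookup by key: exactly one document is read.
    return json.loads(store[order_id])


def orders_with_total_over(store, minimum):
    # No key answers this question, so every document must be
    # deserialized and inspected: the "full table scan".
    matches = []
    scanned = 0
    for raw in store.values():
        scanned += 1
        doc = json.loads(raw)
        if doc["total"] > minimum:
            matches.append(doc["order_id"])
    return matches, scanned


matches, scanned = orders_with_total_over(store, 100)
```

Here the scan reads all three documents to find the two that match, and that cost grows with the size of the collection.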
The example above involves analytical questions asked on top of the data. Questions like these may be better answered by an analytical database, such as a data warehouse. The transactional side would then be left with only transactional actions, such as restoring a sales order, changing its status, cancelling it, and so on.
In the end, it looks reasonable to use NoSQL to store the state of your aggregates, given the low maintenance cost, the scalability, and the ease of implementing a solution. The only restriction is in the way you will query the data.
If there is something relevant that was not covered in this reflection, please leave a comment.
This article is also available in Portuguese.