Wednesday, September 26, 2012

DDD on top of Services, DDD inside Services - A Domain Driven approach to SOA

This post is about Domain Driven Design (DDD) and Service Oriented Architecture (SOA) and how the former can be used to build services for the latter. But it starts off with a few observations about the current state of the union.

The "DDD needs a database" assumption
I've come to find it a pretty common understanding that software systems using DDD building blocks have to be backed by a database and, even more specifically, have repository implementations using an ORM framework. This leads to the assumption that domain entities, once added to the repository, are attached to an EntityManager that is responsible for "auto-magically" saving changed data as transactions are completed. I find these assumptions placing far too many constraints on a domain driven system architecture, without any need. In fact, I can't find anything in the DDD approach, as put forward by Eric Evans, that supports this understanding. Instead I see the domain model as a piece of code completely free of any explicit or implicit dependencies on its surroundings. For a couple of years I've been working with this approach, and in this post I will describe the architecture of such a system, where repository implementations rely on services published by other systems as well as storing part of the data in a dedicated database.

SOA as spaghetti on top of CRUD
I've also found Service Oriented Architectures (SOA) to often be implemented as little more than CRUD operations for fine-grained data objects, or as services with business logic, sometimes really complex logic, firmly placed in Transaction Scripts (TS). TS might be good enough for simple things, but the code quickly turns into spaghetti as complexity rises. Complex business logic is where DDD really shines, but given the assumptions discussed above, many people don't see it as an option when there is no underlying database, only a set of CRUD-like services.
And there we have another problem with many service implementations. When just using simple DTOs as parameters and/or return values, and presenting the client with a CRUD-like service API, what support are we giving the client-side developer? A bunch of getters and setters that could be set in any combination! How is one to know what (from a business point of view) makes up a valid object? And how to know which modifications are valid? Such APIs force the client-side developer to have intimate knowledge about the server-side implementation, and changes to the API won't show up as compiler errors. When APIs that I depend on change, I like the compiler to be able to notify me.
I think we can do much better and offer the client-side developer far stronger support by extending the service API with some DDD building blocks, like Entities and Value Objects. In this post I will explain how we did that, in effect implementing a published domain.

Separation of concerns - The independent domain model
One of the principles of DDD that I think is most important to apply is "Separation of concerns". It is also part of Robert C Martin's SOLID principles, under the name "Single responsibility principle". In short, every piece of code should deal with one problem only and have only one reason to change. It is also an important part of writing clean code, since clarity increases when a piece of code only does one thing. From the DDD perspective I apply this principle by separating the code that models the business domain in order to solve business problems from the technical code that glues the application together or handles communication with the outside world, such as databases or services exposed by other systems.
This leads to a system with a traditional layered architecture with a slight twist: the domain model is kept in the center with no outgoing dependencies whatsoever. The domain model is concerned only with solving business problems, while the surrounding integration layer (or "anti-corruption layer" in Evans' terms) is concerned only with technical issues. The domain model is built using the DDD building blocks (entities, value objects, repositories, services and factories), with repositories (and sometimes also services and factories) represented as interfaces used by the domain but implemented in the integration layer. That creates a dependency from the infrastructure layer to the domain, and that's the twist! A small code sketch after the list below illustrates the idea. This is perhaps an unusual architectural style, but it is not new. Similar models have previously been described as “Hexagonal Architecture” and “Onion Architecture”, among others. Robert C Martin has a nice summary of the history of this architectural style on his blog.
This architectural style brings the following benefits:
- Easy to read business logic, since it is not mixed with code for interacting with databases, messaging services or other technical concerns. This leads to fewer mistakes as the software evolves.
- Easy to deploy. The domain code can be deployed in different runtime environments without changes, since it has no outgoing dependencies. At one point we made great use of this as we were moving to a new deployment platform while at the same time developing new functionality for the business.
- Easy to test business logic. All business logic can be tested by automation without any need for a runtime container. I don't really have to argue why that is good...
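To make the direction of that dependency concrete, here is a minimal sketch with invented names (the domain side of an imagined customer model). The repository exists in the domain as an interface only; the integration layer supplies the implementation, as sketched further down in this post.

    // --- domain package: business concepts only, no outgoing dependencies ---
    public class Customer {
        protected String customerNumber;   // protected rather than private,
        protected String creditRating;     // for reasons explained in the repository section below

        public Customer(String customerNumber) {
            this.customerNumber = customerNumber;
            this.creditRating = "UNRATED";
        }

        public void improveRating() {      // business behaviour lives on the entity
            this.creditRating = "GOOD";
        }
    }

    public interface CustomerRepository {  // the domain only states what it needs
        Customer findByNumber(String customerNumber);
        void store(Customer customer);
    }

    // The integration layer implements CustomerRepository, so the compile-time
    // dependency points from infrastructure to domain, never the other way around.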

Repository on top of services
Since the domain is only concerned with business logic, and the business doesn't care how entities are stored, only that they can be stored and at what step in the business process that happens, all repositories in the domain are represented as interfaces. It is then up to the integration layer to provide implementations for those repositories.
Regardless of whether entities are to be stored in a database, in files, or by calling a remote service (in my case it was services exposed via a customer-specific API on top of stateless EJB, i.e. basically RMI, but it could be web services or any other protocol as well), or a combination, a repository needs access to the internal state of the entity in order to get values (for storing) and set them (for re-constructing). Typically most of that state isn't public in the domain model, so the repository implementation needs to gain access somehow. Depending on security settings, reflection might be an alternative. In most cases it is not. Instead we made the members of the entity protected and created a specific sub-class of the entity in the integration layer, which we used both for reading state that was not public and for re-constructing entities on read requests.
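Continuing the sketch from above (names still invented), such a sub-class could look like this:

    // --- integration layer ---
    public class StorableCustomer extends Customer {

        // re-construct an entity from data read from the remote service or the database
        public StorableCustomer(String customerNumber, String creditRating) {
            super(customerNumber);
            this.creditRating = creditRating;
        }

        // expose the otherwise hidden state to the repository implementation
        public String storedCustomerNumber() { return customerNumber; }
        public String storedCreditRating()   { return creditRating; }
    }

Since the repository re-constructs every persisted entity as a StorableCustomer, it can also cast back to this type when it needs to read state for storing.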
With the problem of gaining access to internal state solved, it is pretty easy to write a repository implementation that reads and writes most of the data over one or several remote service calls and keeps the rest in a few database tables in a dedicated database. This split storage model is often a result of existing external services not fully supporting the needs of the domain model. Another reason might be performance. In some cases we had to cache carefully selected pieces of data in our own database to ensure timely retrieval. The beauty of this approach is that we can do all those tricks inside the repository implementation without having any of it leak into the domain model code; the separation of concerns is total.
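Putting the pieces together, a repository implementation illustrating the caching variant might look roughly like this (the service client and cache below are hypothetical stand-ins for whatever remote protocol and persistence is actually used):

    // --- integration layer ---
    public class CustomerRepositoryImpl implements CustomerRepository {

        private final CustomerServiceClient remote; // client for the external service
        private final CreditRatingCache localCache; // a few tables in our own dedicated database

        public CustomerRepositoryImpl(CustomerServiceClient remote, CreditRatingCache localCache) {
            this.remote = remote;
            this.localCache = localCache;
        }

        @Override
        public Customer findByNumber(String customerNumber) {
            // carefully selected data is kept locally for timely retrieval...
            String rating = localCache.findRating(customerNumber);
            if (rating == null) {
                // ...everything else, and cache misses, go over the wire
                rating = remote.fetchCreditRating(customerNumber);
                localCache.saveRating(customerNumber, rating);
            }
            return new StorableCustomer(customerNumber, rating); // re-construct via the sub-class
        }

        @Override
        public void store(Customer customer) {
            // assumes every Customer handed to the repository is a StorableCustomer;
            // a freshly created entity could be copied into one here instead
            StorableCustomer sc = (StorableCustomer) customer;
            remote.storeCreditRating(sc.storedCustomerNumber(), sc.storedCreditRating());
            localCache.saveRating(sc.storedCustomerNumber(), sc.storedCreditRating());
        }
    }

    // hypothetical collaborators
    public interface CustomerServiceClient {
        String fetchCreditRating(String customerNumber);
        void storeCreditRating(String customerNumber, String rating);
    }

    public interface CreditRatingCache {
        String findRating(String customerNumber);  // null when nothing is cached
        void saveRating(String customerNumber, String rating);
    }

None of this is visible from the domain model; it only ever sees the CustomerRepository interface.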

Published domain and services with business meaning
As mentioned above, CRUD-oriented services that let a client store and retrieve DTOs with getters and setters for every attribute push a great burden onto the developer of the client code. If we use DDD to implement the service internally, we can do better by offering that domain knowledge to the client, packaged as a published domain and as services that carry business meaning. A CRUD-service will never have a place outside the integration layer of the client, while the signature of a service well crafted in business terms might be a good candidate for a service API used directly in the domain model of the client, of course in the shape of an interface, with a small implementation in the integration layer to do the technical heavy lifting of making a remote call. But the point is, the service signature can be used verbatim and the integration layer can be kept thin, since the client domain doesn't have to re-define what the service means in business terms; hard-earned domain knowledge is reused.
The same goes for the business objects. If we publish those parts of the domain that can be used outside our service implementation, we let the client-side developer directly benefit from the domain knowledge we gathered while building our service. But what parts can be published? Well, if you think about it, the most useful part of a good API is the limitations it imposes, i.e. the help you get in avoiding stupid mistakes. By publishing a model of domain objects as plain Java objects, with constructors and accessors ("getters", but in business terms and not necessarily following the Java Beans convention), we help the client-side developer avoid constructing invalid objects and accidentally changing state that, from a business perspective, should be immutable. With only this much, the client-side developer gets far better support than with the raw data format offered by our stateless services. In addition, it might also be appropriate to offer a thin layer on top of the service calls that exposes the services in terms of these domain objects and takes care of transforming them to/from the raw format used to go over the wire.
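As a small illustration (the names and the format rule are invented), a published value object can enforce its invariants in the constructor and expose state only through business-term accessors, and a published service interface can be expressed in those terms rather than in raw DTOs:

    // published domain: part of the service API offered to client-side developers
    public final class AccountNumber {

        private final String value;

        public AccountNumber(String value) {
            // the constructor refuses to build an invalid object
            if (value == null || !value.matches("\\d{10}")) {
                throw new IllegalArgumentException("An account number is exactly ten digits");
            }
            this.value = value;
        }

        public String asText() { return value; }  // business-term accessor, and no setters
    }

    // published service interface, speaking in published domain terms
    public interface AccountService {
        AccountNumber openAccount(String ownerName);
        void closeAccount(AccountNumber account);
    }

The client can pass an AccountNumber around in its own code knowing it is valid by construction, instead of second-guessing a String field on a DTO.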
Having a published domain gives the client-side developer the choice to either model transformations of it into a model more suitable for the client context or, if the contexts are closely related, take on a conformist approach and extend the domain objects with additional functionality. I've done both in different contexts, and it is so much better than having to experiment with DTOs to find out which combinations of attribute values are valid.

Conclusion
DDD is suitable for implementing domains on top of external services. It is also suitable for the implementation of such services, and if we carefully select parts of the domain model to publish we offer great help to the client-side developer.

Saturday, September 1, 2012

When is it appropriate to design a service?


In a system architecture using Domain Driven Design (DDD) you typically find a few building blocks (stereotypes) - Entities, Value Objects, Repositories and Domain Services - where the first two are stateful objects and the rest are stateless services, implemented either as infrastructure services outside the domain (Repositories) or inside the domain, containing only business logic (Domain Services). A common question is “When is it appropriate to design a service instead of placing the business logic in an entity or value object?”

For developers more used to building procedural designs, rather than object-oriented ones, it seems more natural to place logic in stateless services and have them operate on objects that are no more than data containers. This is what is commonly referred to as an “anemic domain model”. It is considered an anti-pattern in the DDD community since it decouples data from behavior and thereby produces a much less expressive and knowledge-dense domain model.

I'm not a fan of stateless services in the domain model. Instead I try to favor bringing business logic into the Entity or Value Object that holds the information needed; in OO terms this is called the "information expert". In most cases it isn't that hard, especially if functionality is decomposed into short methods, each allocated to the object representing the concept at the heart of the functionality.

A specific type of functionality that might be trickier to handle is entity creation. Who is to be responsible? For sure, it can’t be the entity itself. In general my experience is that there will be some sort of hierarchy between concepts. E.g. a SalaryPeriod might be connected to an existing RegistrationPeriod, which also contains submitted Timesheets. Then it is a good fit to have the RegistrationPeriod handle creation of the SalaryPeriod. So it is almost always possible to find a good home for the rules regarding creation of an entity in a parent concept. The same goes for functionality that has to operate over several entities of a given type.
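Sticking with that example, a minimal sketch (the business rule here is invented for illustration) could look like this, with the creation rule living on the parent concept rather than in a stateless service:

    import java.time.LocalDate;
    import java.util.ArrayList;
    import java.util.List;

    public class RegistrationPeriod {

        private final LocalDate start;
        private final LocalDate end;
        private final List<Timesheet> submittedTimesheets = new ArrayList<Timesheet>();

        public RegistrationPeriod(LocalDate start, LocalDate end) {
            this.start = start;
            this.end = end;
        }

        public void submit(Timesheet timesheet) {
            submittedTimesheets.add(timesheet);
        }

        // the parent concept owns the rule for when and how the child entity is created
        public SalaryPeriod createSalaryPeriod() {
            if (submittedTimesheets.isEmpty()) {
                throw new IllegalStateException("No timesheets submitted for this period");
            }
            return new SalaryPeriod(start, end, new ArrayList<Timesheet>(submittedTimesheets));
        }
    }

    public class Timesheet { /* details left out of this sketch */ }

    public class SalaryPeriod {
        private final LocalDate start;
        private final LocalDate end;
        private final List<Timesheet> basis;

        // package-private: the only way to get a SalaryPeriod is through a RegistrationPeriod
        SalaryPeriod(LocalDate start, LocalDate end, List<Timesheet> basis) {
            this.start = start;
            this.end = end;
            this.basis = basis;
        }
    }

Keeping the SalaryPeriod constructor package-private also makes the creation rule hard to bypass.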

However, there might be cases where no suitable parent concept exists in the domain. That is one of the cases where I find designing a domain service appropriate. And there are others. Here is a small extract from a previous blog post of mine where I briefly describe another situation where I think domain services are a good choice:
 "In general I think you could talk about two types of systems, or parts of systems; those mostly concerned with changes in object state and those mostly concerned with processing data streaming through the system. In the first case entities, aggregates and repositories are a natural fit, in the second I think transaction scripts (in DDD context called domain services, since they do only concern domain logic, no infrastructure code [..]) are a nice fit. When the most important feature is to crunch some data, perhaps modify it and then route it further to some recipient (like another system or some persistent store) I think it is the "processing pipeline", i.e. the stateless service code, which should be emphasized. So in those cases the internals of the data isn't very interesting and might be better left in some simple DTO format."

To complete the list of cases where designing a service is appropriate, I’ll end by adding a few lines on external services. These services get injected into the domain. In the domain I would have only an interface describing the service in terms of the domain. This is the way to integrate with surrounding infrastructure or other systems. It is the same pattern as with Repositories; in fact, a Repository is just a specialized service.

Another example of an external service is when some part of the business logic, e.g. a calculation, is broken out into a separate service, implemented using another programming language or paradigm. From a domain point-of-view it is as if we get that calculation from another system instead of doing it ourselves. The reason might be performance, or it might be that the logic already exists and we want to continue using it instead of re-implementing it. However, each time this happens the maintenance burden increases a bit, so it should be done with careful consideration.
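A sketch of what that might look like (the interest calculation is just an invented example): the domain sees only an interface phrased in its own terms, while the integration layer hides the call to the external system.

    // --- domain package: the external capability described purely in domain terms ---
    public interface InterestCalculator {
        double yearlyInterest(double principal, int riskClass);
    }

    public class Loan {
        private final double principal;
        private final int riskClass;

        public Loan(double principal, int riskClass) {
            this.principal = principal;
            this.riskClass = riskClass;
        }

        // the domain object uses the interface without knowing where the answer comes from
        public double costPerYear(InterestCalculator calculator) {
            return calculator.yearlyInterest(principal, riskClass);
        }
    }

    // --- integration layer: the implementation calls the other system ---
    public class RemoteInterestCalculator implements InterestCalculator {
        @Override
        public double yearlyInterest(double principal, int riskClass) {
            // placeholder for the actual remote invocation (web service, RMI, ...)
            return callLegacyCalculationEngine(principal, riskClass);
        }

        private double callLegacyCalculationEngine(double principal, int riskClass) {
            return principal * 0.05; // stand-in value; the real engine lives elsewhere
        }
    }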

To sum up, favor placing business logic in the domain objects, not in services. It leads to a richer and therefore more useful domain model, which is easier to keep consistent than logic spread over disparate services. I’m convinced that in the long run this approach makes maintenance of the system easier.

Wednesday, August 8, 2012

How to handle reporting with Domain Driven Design?


A pretty common question regarding Domain Driven Design (DDD) is how to handle reporting functionality in a system using a DDD-approach. As in most cases, the answer is "it depends".

If what we are aiming for is easy-to-change reporting, I would transfer data into a BI system and run reports from there. There is absolutely nothing to gain in designing our own BI tool. There are plenty of them on the market, and in combination with some common data warehouse design patterns they do a good job of extracting and storing data and of providing both prepared and ad-hoc reporting. If we do not need a fancy reporting interface, a separate relational database schema would do for running some SQL queries. The key is to keep reporting separate from the business system. This is a good rule whether using DDD or not.

If it is more of "a small summary" or some accumulated totals that should be shown inside the business application, I'd try to keep it inside the domain model. Most such values, just like any attribute, fit nicely inside an entity. E.g. if we need to calculate an OrderSummaryByStock, it might be placed as Stock.runningOrderSummary(). This makes it readable right off the entity it belongs to. If running into performance problems, I'd look into keeping those numbers up to date as part of the command or update transaction, storing the accumulated numbers in the database as part of the entity.
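A small sketch of that idea (all details invented):

    import java.util.ArrayList;
    import java.util.List;

    public class Stock {

        private final String stockId;
        private final List<Order> openOrders = new ArrayList<Order>();

        public Stock(String stockId) {
            this.stockId = stockId;
        }

        public void register(Order order) {
            openOrders.add(order);
        }

        // the summary reads just like any other attribute of the entity
        public OrderSummary runningOrderSummary() {
            double total = 0;
            for (Order order : openOrders) {
                total += order.amount();
            }
            return new OrderSummary(openOrders.size(), total);
        }
    }

    public class Order {
        private final double amount;
        public Order(double amount) { this.amount = amount; }
        public double amount() { return amount; }
    }

    // small immutable value object holding the accumulated numbers
    public class OrderSummary {
        private final int numberOfOrders;
        private final double totalAmount;

        public OrderSummary(int numberOfOrders, double totalAmount) {
            this.numberOfOrders = numberOfOrders;
            this.totalAmount = totalAmount;
        }

        public int numberOfOrders() { return numberOfOrders; }
        public double totalAmount() { return totalAmount; }
    }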

If you see overall problems with read performance due to multiple reads per write, I would consider a CQRS approach (with separate read views kept up to date by events exported from the domain model as it gets updated), at least for the problematic views.
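A very small sketch of that idea (names invented): the domain exports an event when it changes, and a handler outside the domain keeps a denormalized read view up to date.

    import java.util.HashMap;
    import java.util.Map;

    // event exported from the domain model when an order is registered
    public class OrderPlacedEvent {
        public final String stockId;
        public final double amount;

        public OrderPlacedEvent(String stockId, double amount) {
            this.stockId = stockId;
            this.amount = amount;
        }
    }

    // read side: kept up to date by the events, queried by the views with read problems
    public class OrderTotalsReadModel {

        private final Map<String, Double> totalPerStock = new HashMap<String, Double>();

        public void handle(OrderPlacedEvent event) {
            Double current = totalPerStock.get(event.stockId);
            totalPerStock.put(event.stockId, (current == null ? 0 : current) + event.amount);
        }

        public double totalFor(String stockId) {
            Double total = totalPerStock.get(stockId);
            return total == null ? 0 : total;
        }
    }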

Saturday, January 28, 2012

Domain Driven Design and batch processing

In DDD the design of a system is very object-centric and therefore focuses on individual objects (or aggregates of objects) that interact by sending messages (commands, queries and events). This is very unlike traditional batch processing, where one or several functions are applied iteratively to a batch of input data. Between the two there is a significant mismatch, but nevertheless, once in a while we need to offer a batch-oriented interface to our domain logic, or need to call a service that offers a batch interface.

Implementing a batch interface
This is the easiest part. We just need a thin script that manages the iteration over the batch, makes a call to the proper application service (tasked with coordinating calls on the domain model) for each entry, and collects any response data returned. All business logic needed to perform the batch operation is held inside the domain package, as usual. From a domain model point-of-view, the batch processing script is just another client calling the same application services as any online client would to perform the same task. It is just that this one makes many requests over a short period of time.
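A minimal sketch of such a script (all names are invented):

    import java.util.ArrayList;
    import java.util.List;

    // thin batch script: no business logic, just iteration and delegation
    public class PaymentBatchJob {

        private final PaymentApplicationService applicationService; // the same service online clients use

        public PaymentBatchJob(PaymentApplicationService applicationService) {
            this.applicationService = applicationService;
        }

        public List<PaymentResult> run(List<PaymentRequest> batch) {
            List<PaymentResult> results = new ArrayList<PaymentResult>();
            for (PaymentRequest request : batch) {
                results.add(applicationService.registerPayment(request)); // one call per entry
            }
            return results;
        }
    }

    // hypothetical application service, coordinating calls on the domain model
    public interface PaymentApplicationService {
        PaymentResult registerPayment(PaymentRequest request);
    }

    public class PaymentRequest { /* input data for one entry in the batch */ }

    public class PaymentResult { /* response data for one entry */ }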

Calling a batch interface
In general, a call to a batch interface is just an asynchronous call. Yes, you combine several requests into one, but a call with just one request in the batch would still be a valid one. That is, as long as it doesn't matter which requests are made together, and as long as the responses have no relevance coming back together or aggregated in some way. But then it isn't truly a batch; it is just a single request with many input parameters. In the following I will discuss a possible solution for when the service is a true batch, i.e. serving many unrelated requests in one call, asynchronously.
Let's consider a case where we have a type of domain object with a method that is to be implemented with a call to an external service, a service that happens to have a batch processing interface. From a domain perspective the nature of this method is asynchronous; we just don't know how long we will have to wait for the result. But the fact that the calls are made in batches, and not one by one, is an implementation detail to be handled by the infrastructure layer.
The domain should be fully decoupled from the batch handling, which should be handled by infrastructure code. In the domain we define a service interface that takes the request for one domain object and returns nothing. This service could be called by an application service, by the domain object, or by any other object or service in the domain. Then we define an event handler that gets notified when the result of the request arrives. The event handler is responsible for taking the appropriate action on the domain object, depending on the result.
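In the domain this doesn't have to be more than two small interfaces (names invented; here the request is imagined to be a credit check). The implementation of the handler takes whatever business action the result calls for.

    // --- domain package ---

    // request side: ask for one credit check, expect no immediate answer
    public interface CreditCheckService {
        void requestCreditCheck(String applicationId);
    }

    // result side: called by the infrastructure when the answer for one request arrives
    public interface CreditCheckResultHandler {
        void creditCheckCompleted(String applicationId, boolean approved);
    }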
In the infrastructure layer we will implement the service with a message queue and on the other side of that queue some code that combines individual requests into suitably sized batch calls. The frequency and size of those batches might be tuned for performance and response times. E.g. one batch for every X number of requests, but at least one batch every Y minutes provided at least one request has been made.
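On the infrastructure side, the service implementation can simply queue the requests and let a separate routine drain the queue into batch calls of suitable size. A rough sketch (the sizes, names and external interface are invented), meant to be run repeatedly from a scheduler:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    // infrastructure implementation of the domain interface: just enqueue the request
    public class QueueingCreditCheckService implements CreditCheckService {

        private final BlockingQueue<String> pending = new LinkedBlockingQueue<String>();

        @Override
        public void requestCreditCheck(String applicationId) {
            pending.add(applicationId);
        }

        // drains the queue into one batch call; called repeatedly by a scheduler
        public void sendBatch(ExternalCreditCheckBatchService external) throws InterruptedException {
            List<String> batch = new ArrayList<String>();
            String first = pending.poll(5, TimeUnit.MINUTES); // at least one batch every Y minutes...
            if (first == null) {
                return;                                       // ...provided at least one request was made
            }
            batch.add(first);
            pending.drainTo(batch, 99);                       // at most X = 100 requests per batch
            external.submit(batch);                           // the asynchronous batch call
        }
    }

    // hypothetical external system offering only a batch interface
    public interface ExternalCreditCheckBatchService {
        void submit(List<String> applicationIds);
    }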
Then we need a batch-driven routine on the response side that splits each batch response into individual messages and places them on a response queue, for the event handler in the domain to process.