Wednesday, September 26, 2012

DDD on top of Services, DDD inside Services - A Domain Driven approach to SOA

This post is about Domain Driven Design (DDD) and Service Oriented Architecture (SOA), and how the former can be used to build services for the latter. But it starts off with a few observations about the current state of the union.

The "DDD needs a database" assumption
I've found it to be a pretty common understanding that software systems using DDD building blocks have to be backed by a database and, even more specifically, have repository implementations that use an ORM framework. This leads to the assumption that domain entities, once added to the repository, are attached to an EntityManager that is responsible for "auto-magically" saving changed data as transactions are completed. I find that these assumptions put too many constraints on a domain driven system architecture without any need. In fact, I can't find anything in the DDD approach, as put forward by Eric Evans, that supports this understanding. Instead I see the domain model as a piece of code completely free of any explicit or implicit dependencies on its surroundings. I've been working with this approach for a couple of years, and in this post I will describe the architecture of such a system, where repository implementations rely on services published by other systems as well as storing part of the data in a dedicated database.

SOA as spaghetti on top of CRUD
I've also found Service Oriented Architectures (SOA) to be implemented mostly as CRUD operations on fine-grained data objects, or as services with business logic, sometimes really complex, firmly placed in Transaction Scripts (TS). TS might be good enough for simple things, but the code quickly turns into spaghetti as complexity rises. Complex business logic is where DDD really shines, but given the assumptions discussed above, many people don't see it as an option when there is no underlying database, only a set of CRUD-like services.
And there we have another problem with many service implementations. When we just use simple DTOs as parameters and/or return values and present the client with a CRUD-like service API, what support are we giving the client-side developer? A bunch of getters and setters that can be called in any combination! How is the developer to know what (from a business point of view) makes up a valid object? And which modifications are valid? Such APIs force the client-side developer to have intimate knowledge of the server-side implementation, and changes to the rules behind the API won't show up as compiler errors. When APIs that I depend on change, I like the compiler to be able to notify me.
I think we can do much better and offer the client-side developer much better support by extending the service API with some DDD building blocks like Entities and Value Objects. In this post I will explain how we did that, in effect implementing a published domain.

Separation of concerns - The independent domain model
One of the principles of DDD that I think is most important to apply is "Separation of concerns". It also appears among Robert C Martin's SOLID principles under the name "Single responsibility principle". In short, every piece of code should deal with one problem only and have only one reason to change. It is also an important part of writing clean code, since a piece of code that does only one thing is easier to understand. From the DDD perspective I apply this principle by separating the code that models the business domain in order to solve business problems from the technical code that glues the application together or handles communication with the outside world, such as databases or services exposed by other systems. This leads to a system with a traditional layered architecture with a slight twist: the domain model is kept in the center with no outgoing dependencies whatsoever. The domain model is concerned only with solving business problems, while the surrounding integration layer (or "anti-corruption layer" in Evans' terms) is concerned only with technical issues. The domain model is built using the DDD building blocks (entities, value objects, repositories, services and factories), with repositories (and sometimes also services and factories) represented as interfaces used by the domain but implemented in the integration layer. The dependency therefore points from the integration layer to the domain, and that's the twist! This is perhaps an unusual architectural style, but it is not new. Similar models have previously been described as “Hexagonal Architecture” and “Onion Architecture” among others. Robert C Martin has a nice summary of the history of this architectural style on his blog.
This architectural style brings the following benefits:
- Easy to read business logic, since it is not mixed with code for interaction with databases, messaging services or other technical concerns. This leads to fewer mistakes as the software evolves.
- Easy to deploy. The domain code can be deployed in different runtime environments without changes, since it has no dependencies on its surroundings. At one point we made great use of this as we were moving to a new deployment platform while at the same time developing new functionality for the business.
- Easy to test business logic. All business logic can be covered by automated tests without any need for a runtime container. I don't really have to argue why that is good...
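The direction of the dependencies described above can be illustrated with a minimal sketch. Everything in it (the `Account` entity, the `AccountRepository` interface, the `WithdrawalService` and the in-memory repository) is hypothetical and chosen only for illustration; it is not code from the system described in this post. The point is that the domain code compiles and can be exercised without any database or container, because the only thing it knows about storage is an interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// --- Domain layer: no outgoing dependencies ---

/** Entity holding business rules only (hypothetical example). */
class Account {
    private final String accountNumber;
    private long balanceInCents;

    Account(String accountNumber, long openingBalanceInCents) {
        if (accountNumber == null || accountNumber.isEmpty()) {
            throw new IllegalArgumentException("account number is required");
        }
        if (openingBalanceInCents < 0) {
            throw new IllegalArgumentException("opening balance must not be negative");
        }
        this.accountNumber = accountNumber;
        this.balanceInCents = openingBalanceInCents;
    }

    String accountNumber() { return accountNumber; }
    long balanceInCents() { return balanceInCents; }

    /** Business rule: an account may not be overdrawn. */
    void withdraw(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (amountInCents > balanceInCents) {
            throw new IllegalStateException("insufficient funds");
        }
        balanceInCents -= amountInCents;
    }
}

/** The repository as the domain sees it: an interface, nothing more. */
interface AccountRepository {
    Optional<Account> findByNumber(String accountNumber);
    void store(Account account);
}

/** Domain service that depends only on the repository interface. */
class WithdrawalService {
    private final AccountRepository accounts;

    WithdrawalService(AccountRepository accounts) { this.accounts = accounts; }

    void withdraw(String accountNumber, long amountInCents) {
        Account account = accounts.findByNumber(accountNumber)
                .orElseThrow(() -> new IllegalArgumentException("unknown account"));
        account.withdraw(amountInCents);
        accounts.store(account);
    }
}

// --- Integration layer: depends on the domain, never the reverse ---

/** One possible implementation; a test double here, but it could wrap anything. */
class InMemoryAccountRepository implements AccountRepository {
    private final Map<String, Account> byNumber = new HashMap<>();

    @Override
    public Optional<Account> findByNumber(String accountNumber) {
        return Optional.ofNullable(byNumber.get(accountNumber));
    }

    @Override
    public void store(Account account) {
        byNumber.put(account.accountNumber(), account);
    }
}
```

Swapping `InMemoryAccountRepository` for an implementation backed by a database or a remote service would not require touching `Account` or `WithdrawalService`.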

Repository on top of services
Since the domain is only concerned with business logic and the business doesn't care how entities are stored, only the fact that they can be stored and at what step in the business process that happens, all repositories in the domain are represented as interfaces. It is now up to the integration layer to provide implementations for those repositories.
Regardless of whether entities are to be stored in a database, in files, by calling a remote service, or by a combination of these, a repository needs access to the internal state of the entity in order to get (for storing) and set (for re-constructing) values. (In my case the services were exposed via a customer-specific API on top of stateless EJB, i.e. basically RMI, but it could just as well be web services or any other protocol.) Typically most of that state isn't public in the domain model, so the repository implementation needs some way to gain access. Depending on security settings, reflection might be an alternative, but in most cases it is not. Instead we made the members of the entity protected and created a specific sub-class of the entity in the integration layer, which we used both for reading state that was not public and for re-constructing entities on read requests.
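A rough sketch of the sub-class technique might look like the following. The `Customer` entity, the `CustomerRecord` raw format and the `StoredCustomer` sub-class are hypothetical names, and the protected copy constructor is my assumption about one way to let the sub-class read an existing entity; the original system may have solved that detail differently. The classes are shown in one file for brevity, but the sketch only uses accesses that remain legal in Java when the sub-class lives in a different package from the entity.

```java
// --- Domain layer (its own package in a real project) ---

/** Entity without public setters; state is protected rather than private. */
class Customer {
    protected String customerId;
    protected String name;
    protected boolean creditBlocked;

    public Customer(String customerId, String name) {
        this.customerId = customerId;
        this.name = name;
    }

    /** For sub-classes that re-construct an entity from stored state. */
    protected Customer() { }

    /** Copy constructor: lets a sub-class wrap an existing entity to read it. */
    protected Customer(Customer other) {
        this.customerId = other.customerId;
        this.name = other.name;
        this.creditBlocked = other.creditBlocked;
    }

    public String name() { return name; }
    public void blockCredit() { creditBlocked = true; }
    public boolean mayPlaceOrder() { return !creditBlocked; }
}

// --- Integration layer (a separate package in a real project) ---

/** Raw data as a remote service or a table row might carry it (hypothetical). */
class CustomerRecord {
    String id;
    String name;
    boolean blocked;
}

/** Sub-class used only by the repository implementation. */
class StoredCustomer extends Customer {

    /** Re-constructs an entity from stored state on a read request. */
    StoredCustomer(CustomerRecord record) {
        this.customerId = record.id;
        this.name = record.name;
        this.creditBlocked = record.blocked;
    }

    /** Wraps an existing entity so its non-public state can be read. */
    StoredCustomer(Customer source) {
        super(source);
    }

    /** Reads the inherited protected state into the raw storage format. */
    CustomerRecord toRecord() {
        CustomerRecord record = new CustomerRecord();
        record.id = this.customerId;
        record.name = this.name;
        record.blocked = this.creditBlocked;
        return record;
    }
}
```

A repository implementation would call `new StoredCustomer(customer).toRecord()` before storing and return `new StoredCustomer(record)` when reading; callers only ever see the `Customer` type.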
With the problem of gaining access to internal state solved, it is pretty easy to write a repository implementation that reads and writes most of the data over one or several remote service calls and keeps the rest in a few tables in a dedicated database. This split storage model is often a result of the fact that existing external services might not fully support the needs of the domain model. Another reason might be performance. In some cases we had to cache carefully selected pieces of data in our own database to ensure timely retrieval. The beauty of this approach is that we can do all the tricks needed in the repository implementation without having any part of it leak into the domain model code; the separation of concerns is total.
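To make the split storage idea concrete, here is a small sketch under stated assumptions: `RemoteCatalog` stands in for a client to an external service and `ReorderLevelTable` for a table in the dedicated database. Both are hypothetical interfaces invented for this example, with in-memory fakes so the sketch can run on its own. For brevity the entity exposes simple accessors instead of using the sub-class technique.

```java
import java.util.HashMap;
import java.util.Map;

// --- Domain layer ---

class Product {
    private final String sku;
    private final String description; // owned by the remote system
    private final int reorderLevel;   // only needed by this domain

    Product(String sku, String description, int reorderLevel) {
        this.sku = sku;
        this.description = description;
        this.reorderLevel = reorderLevel;
    }

    String sku() { return sku; }
    String description() { return description; }
    int reorderLevel() { return reorderLevel; }

    boolean needsReorder(int unitsInStock) { return unitsInStock <= reorderLevel; }
}

interface ProductRepository {
    Product find(String sku); // returns null when the product is unknown
    void store(Product product);
}

// --- Integration layer ---

/** Stand-in for a client to a remote service (hypothetical). */
interface RemoteCatalog {
    String fetchDescription(String sku);
    void saveDescription(String sku, String description);
}

/** Stand-in for a table in the dedicated database (hypothetical). */
interface ReorderLevelTable {
    Integer read(String sku);
    void write(String sku, int reorderLevel);
}

/** Assembles one entity from two storage locations. */
class SplitStorageProductRepository implements ProductRepository {
    private final RemoteCatalog catalog;
    private final ReorderLevelTable table;

    SplitStorageProductRepository(RemoteCatalog catalog, ReorderLevelTable table) {
        this.catalog = catalog;
        this.table = table;
    }

    @Override
    public Product find(String sku) {
        String description = catalog.fetchDescription(sku);
        if (description == null) {
            return null;
        }
        Integer level = table.read(sku);
        // Assumption for this sketch: a missing local row means reorder level 0.
        return new Product(sku, description, level == null ? 0 : level);
    }

    @Override
    public void store(Product product) {
        catalog.saveDescription(product.sku(), product.description());
        table.write(product.sku(), product.reorderLevel());
    }
}

/** In-memory fakes so the sketch is runnable without external systems. */
class FakeCatalog implements RemoteCatalog {
    private final Map<String, String> data = new HashMap<>();
    public String fetchDescription(String sku) { return data.get(sku); }
    public void saveDescription(String sku, String description) { data.put(sku, description); }
}

class FakeReorderLevelTable implements ReorderLevelTable {
    private final Map<String, Integer> data = new HashMap<>();
    public Integer read(String sku) { return data.get(sku); }
    public void write(String sku, int reorderLevel) { data.put(sku, reorderLevel); }
}
```

Nothing in `Product` reveals that its data comes from two places. Note that the sketch ignores what should happen if one of the two writes fails; a real implementation would have to decide how to handle that.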

Published domain and services with business meaning
As mentioned above, CRUD-oriented services that let a client store and retrieve DTOs with getters and setters for every attribute push a great burden onto the developer of the client code. If we use DDD to implement the service internally, we can do better by offering that domain knowledge to the client, packaged as a published domain and services that carry business meaning. A CRUD service will never have a place outside the integration layer of the client, while the signature of a service well crafted in business terms might be a good candidate for a service API used directly in the domain model of the client. It would of course take the shape of an interface, with a small implementation in the integration layer to carry out the technical lifting of making a remote call. But the point is that the service signature can be used verbatim and the integration layer can be kept thin, since the client domain doesn’t have to re-define what the service means in business terms; hard-earned domain knowledge is reused.
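As a sketch of what a service with business meaning could look like on the client side, consider the following. The `SubscriptionService` interface, the `CustomerOffboarding` class and the `RemoteTransport` abstraction are all hypothetical; `RemoteTransport` merely stands in for whatever protocol carries the call (RMI, web services or something else).

```java
import java.time.LocalDate;

/** Published service API, phrased in business terms (hypothetical). */
interface SubscriptionService {
    void cancelSubscription(String customerId, LocalDate effectiveDate);
}

/** Client domain code uses the published signature verbatim. */
class CustomerOffboarding {
    private final SubscriptionService subscriptions;

    CustomerOffboarding(SubscriptionService subscriptions) {
        this.subscriptions = subscriptions;
    }

    /** Client-side rule for this sketch: cancel from the day after the last day. */
    void customerLeaves(String customerId, LocalDate lastDay) {
        subscriptions.cancelSubscription(customerId, lastDay.plusDays(1));
    }
}

/** Stand-in for the wire protocol; not a real library API. */
interface RemoteTransport {
    void invoke(String operation, String... arguments);
}

/** Thin integration-layer implementation: technical lifting only. */
class RemoteSubscriptionService implements SubscriptionService {
    private final RemoteTransport transport;

    RemoteSubscriptionService(RemoteTransport transport) {
        this.transport = transport;
    }

    @Override
    public void cancelSubscription(String customerId, LocalDate effectiveDate) {
        // No business decisions here, only translation to the wire call.
        transport.invoke("cancelSubscription", customerId, effectiveDate.toString());
    }
}
```

Compare this with a CRUD-style `update(SubscriptionDto)` call, where the client would have to know which status and date fields to set, and in which combination, for the update to count as a cancellation.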
The same goes for the business objects. If we publish those parts of the domain that can be used outside our service implementation, we let the client-side developer benefit directly from the domain knowledge we have gathered in building our service. But what parts can be published? Well, if you think about it, the most useful part of a good API is the limitations it imposes, i.e. the help you get to avoid doing stupid things. By publishing a model of domain objects as plain Java objects, with constructors and accessors ("getters", but in business terms and not necessarily following the Java Beans convention), we help the client-side developer to avoid constructing invalid objects and accidentally changing state that, from a business perspective, should be immutable. With only this much, the client-side developer gets much better support than with only the raw data format offered by our stateless services. In addition, it might also be appropriate to offer a thin layer on top of the service calls that exposes the services in terms of these domain objects and takes care of transforming them to and from the raw format used to go over the wire.
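A published domain object of this kind might be sketched as follows. The `StayPeriod` value object, the `Booking` entity and the `BookingWireFormat` translation layer are hypothetical, and the `Map<String, String>` with keys `ref`, `from` and `to` is only an assumed stand-in for a raw wire format.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;

/** Value object: immutable, and impossible to construct in an invalid state. */
final class StayPeriod {
    private final LocalDate checkIn;
    private final LocalDate checkOut;

    StayPeriod(LocalDate checkIn, LocalDate checkOut) {
        if (checkIn == null || checkOut == null) {
            throw new IllegalArgumentException("both dates are required");
        }
        if (!checkOut.isAfter(checkIn)) {
            throw new IllegalArgumentException("check-out must be after check-in");
        }
        this.checkIn = checkIn;
        this.checkOut = checkOut;
    }

    LocalDate checkIn() { return checkIn; }
    LocalDate checkOut() { return checkOut; }

    /** Accessor in business terms rather than a raw field getter. */
    long nights() { return ChronoUnit.DAYS.between(checkIn, checkOut); }
}

/** Published entity: accessors only, no setters. */
final class Booking {
    private final String reference;
    private final StayPeriod period;

    Booking(String reference, StayPeriod period) {
        if (reference == null || reference.isEmpty()) {
            throw new IllegalArgumentException("booking reference is required");
        }
        if (period == null) {
            throw new IllegalArgumentException("stay period is required");
        }
        this.reference = reference;
        this.period = period;
    }

    String reference() { return reference; }
    StayPeriod period() { return period; }
}

/** Thin layer translating between the raw wire format and the published domain. */
final class BookingWireFormat {
    private BookingWireFormat() { }

    static Booking fromWire(Map<String, String> raw) {
        StayPeriod period = new StayPeriod(
                LocalDate.parse(raw.get("from")),
                LocalDate.parse(raw.get("to")));
        return new Booking(raw.get("ref"), period);
    }

    static Map<String, String> toWire(Booking booking) {
        Map<String, String> raw = new HashMap<>();
        raw.put("ref", booking.reference());
        raw.put("from", booking.period().checkIn().toString());
        raw.put("to", booking.period().checkOut().toString());
        return raw;
    }
}
```

With a DTO, nothing stops a client from setting a check-out date before the check-in date; here that mistake fails at construction time, and there is no setter to break the object afterwards.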
Having a published domain gives the client-side developer a choice: either transform it into a model more suitable for the client context or, if the contexts are closely related, take a conformist approach and extend the domain objects with additional functionality. I've done both in different contexts, and it is so much better than having to experiment with DTOs to find out which combinations of attribute values are valid.

DDD is suitable for implementing domains on top of external services. It is also suitable for the implementation of such services, and if we carefully select parts of the domain model to publish we offer great help to the client-side developer.

1 comment:

  1. This is quite similar to the approach that is being followed by the team of Apache ISIS, I think. I have recently taken a deeper look at that framework, and it is becoming really interesting.

    I really like the way the Domain Entities are exposed through REST, including their operations, automatically (but this can be changed programmatically).

    It also has an authorization module that exposes only the authorized operations to the logged user.

    Also persistence is automated, with JPA or other persistence mechanisms.

    Previous versions of that framework were only judged from the automatic User Interface generation, which, in my opinion, is good for prototyping but it's the least important part of the framework objectives (despite the opinion also of some of the team members, as I have read).