Wednesday 23 June 2010

One team, one language

In a previous post I discussed, among other things, how code often doesn't represent the business properly. The code "satisfies" the business requirements but doesn't express them very well. The main reason is that we developers like to abstract business terms and rules into technical implementations and patterns.

Very often, we discuss the user stories (requirement documents, use cases, or whatever the methodology uses) with the "domain experts" and, as soon as we understand what needs to be done, we map the requirements to a technical design (actions, services, entities, helpers, DAOs, etc.) that is completely meaningless to the domain experts. Sometimes, even among developers themselves, different names and expressions are used to refer to the same thing. The main reason is that different developers talk to different domain experts and come up with different abstractions. As a result, using different terms to describe the same requirements leads to confusion, duplication of code and unpredictable behaviour in the system.

The first step towards an expressive and domain-focused design is to have a common language among ALL members of the team. ALL means ALL: developers, domain experts (business analysts, users, product owner, etc), testers, project manager and anyone else involved in the project.

Developers very often say that domain experts don't understand objects and databases and that, because of that, developers need to "translate" business requirements into software design. However, we developers don't understand the business as well as the domain experts do, which, more often than not, leads to imperfect and confusing abstractions.

The Ubiquitous Language

A language structured around the domain model and used by all team members to connect all the activities of the team with the software.
The Ubiquitous Language is one of the most important things, if not the most important, in Domain-Driven Design. The main idea is that the whole team speaks a single language, which is the business language.



Business terms related to the software to be implemented must enter the ubiquitous language and each of these terms must be understood clearly by all members. During requirements gathering sessions and planning meetings, these terms must be captured and made available to everybody. Technical terms from the development team and business terms not relevant for the piece of software being implemented MUST NOT enter the ubiquitous language. 

Capturing the ubiquitous language

In order to capture the ubiquitous language, it is mandatory that you work in an iterative software development environment. Trying to capture the ubiquitous language up-front could straitjacket the whole process, inhibiting team members from making the necessary changes along the way. As the language is used to express the business requirements, it is natural that it evolves during the lifetime of the project, as terms are added, deleted and redefined.


Methodologies like Extreme Programming (XP) say that the only documentation should be the code. Not even comments in the code are appreciated, since they can easily get out of sync with the code. The code should be the only documentation since it is the only one that represents exactly what the system does.

On the other hand, we have UML (Unified Modeling Language), which in theory should be a great candidate to document the ubiquitous language, since the whole purpose of UML was to document requirements and express them in a language that is common to developers and business people.

In summary, showing code to domain experts, testers and other members of the team during a design session is not exactly a fantastic idea. Using just UML, with its bureaucracy, rules and details, is also a bad idea. The whole UML notation could easily straitjacket the creative process during the exercise.

There are many discussions about what would be the best way to capture the ubiquitous language. My preferred way is to draw diagrams (boxes and arrows mixed with some well understood UMLish notation) where each box represents a "domain object" (aka domain concept). A domain object can be anything that is expressed by the business, like client, organisation, product, route specification, sales system, etc. They would all be boxes. Add a few arrows linking the boxes, each with just a couple of words explaining how the boxes relate to each other. Sometimes a mixture of a class and sequence diagram (or an activity diagram) can be very helpful, but don't get too picky about any notation. Preferably, draw on a white board, take a picture and store it on the wiki. For further sessions, just open the wiki and re-draw just the bit of the design that is important to the feature being discussed. Make the necessary adjustments on the white board, take another picture and store it on the wiki again. There is much more to it, if you want to dive into agile modeling, but I will leave that for another post.

This goes way beyond transforming nouns and verbs into classes and methods. Taking the examples above, client, organisation and product could be transformed into entities; route specification could be a strategy class used by a routing service (that would also need to be added to the diagram and to the ubiquitous language); sales system would be an external system that we need to integrate with, etc.
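As a rough illustration of the route specification idea, here is a minimal sketch in Java. All names in it (Leg, RouteSpecification, ArriveWithinHours, RoutingService and their methods) are hypothetical and chosen only for the example; a real model would use whatever terms the team agreed on in the ubiquitous language.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain concept: one leg of a journey between two locations.
final class Leg {
    final String from;
    final String to;
    final int hours;

    Leg(String from, String to, int hours) {
        this.from = from;
        this.to = to;
        this.hours = hours;
    }
}

// The "route specification" box from the diagram, expressed as a strategy.
interface RouteSpecification {
    boolean isSatisfiedBy(List<Leg> route);
}

// One possible specification (an assumption): the whole route must fit a deadline.
final class ArriveWithinHours implements RouteSpecification {
    private final int maxHours;

    ArriveWithinHours(int maxHours) {
        this.maxHours = maxHours;
    }

    @Override
    public boolean isSatisfiedBy(List<Leg> route) {
        int total = 0;
        for (Leg leg : route) {
            total += leg.hours;
        }
        return total <= maxHours;
    }
}

// The routing service is a domain concept too, so it carries a business name.
final class RoutingService {
    // Keeps only the candidate routes that satisfy the given specification.
    List<List<Leg>> selectRoutes(List<List<Leg>> candidates, RouteSpecification specification) {
        List<List<Leg>> selected = new ArrayList<>();
        for (List<Leg> route : candidates) {
            if (specification.isSatisfiedBy(route)) {
                selected.add(route);
            }
        }
        return selected;
    }
}
```

The point of the sketch is that a domain expert could read the class and method names aloud and still recognise the conversation that happened at the white board.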

Making the code more expressive

The code should reflect all concepts exposed by the model. Classes and methods should be named according to the names defined by the domain. Associations, compositions, aggregations and sometimes even inheritances should be extracted from the model. 

Sometimes, during implementation, we realise that some of the domain concepts discussed and added to the model don't actually fit well together and some changes are necessary. When it happens, developers should discuss the problems and/or limitations with the domain experts and refactor the domain model in order to favour a more precise implementation, without ever distorting the business significance of the design. 

Ultimately, the code is the most important artefact of a software project and it needs to work efficiently. Regardless of what many experts in the subject say, code implementation will have some impact on the design. However, we need to be careful and very selective about which aspects of the code can influence design changes. As a rule, try as much as you can to never let the limitations of technical frameworks influence your design, but as we know, every rule has exceptions.

A common implementation problem that very often gets in the way is mapping objects to databases using ORM tools. In this case, bending the model a little bit in favour of a more realistic relation among entities is not a bad thing. Just make sure that changes like that are represented in the model and understood by everyone involved.

DOs and DON'Ts 
  • Don't try to model everything. Focus on the core of your application. We are not working on a waterfall or Unified Process project here.
  • Try to model just the key concepts of the domain problem;
  • Do not clutter your models with too many details. Keep them focused on the main responsibilities;
  • Limit your discussions and changes in the model to the business concepts (domain objects) related to the user story being discussed;
  • Don't add architectural concepts like DAOs, Actions, etc. We are not writing implementation diagrams. 
  • As your application grows, break the application into multiple models (domains), explicitly defining the context and boundaries within which a model applies. This is called Bounded Context in Domain-Driven Design.
  • Avoid thinking purely about the implementation when designing your model. Understanding and modeling the business is the most important thing here.

Challenges of a Model-Driven Design

Model-Driven Design (MDD) is more an art than a science. It takes a lot of practice and willingness to get it going and get it right. Refactoring towards deeper insights must be seen as a positive and essential part of the project development. TDD and continuous integration are also essential for any agile and domain-driven application.

The goal of MDD is having the software expressing a deep and supple design.
Last but not least, developers with good Object-Oriented Design skills are needed in the project. The lack of design skills could easily turn ANY software project into a total failure in the long term.
Source

Saturday 12 June 2010

The Wolf in Sheep's Clothing

In the last few years, I've noticed that the majority of the projects that I've participated in roughly followed the same design. Speaking to colleagues and friends, the great majority said that they were also following the same design. It's what I call "ASD" design (Action-Service-DAO).

By Action we mean a Struts Action, Spring Controller, JSF backing bean, etc. Any class that handles an action triggered by the user interface.

We can argue that there is nothing really wrong with this approach since, if we map it to a three-tier architecture, which is followed by the majority of applications, the ASD classes would be in the right places, keeping a good isolation between the tiers.


One of the problems of this approach is when services are created randomly, with different granularities, low cohesion and a weak representation of the business domains. Services end up being function libraries, almost like a utility class where all "functions" related to a specified "module" are grouped. When it comes to DAOs, the situation is not very different. DAOs are developed in a way that they become utility classes, where sometimes we find a single DAO with all queries, inserts, deletes and updates for an entire module, and sometimes we find one DAO per entity. Either way, regardless of what the query returns (or what its intention is), or of any rules related to updates or deletes, the methods will go to the same class. As the application grows, the code becomes something like this:


Looking at the picture above, can we really say that it is object-oriented programming? As services and DAOs don't strongly represent business concepts, the code becomes procedural. I don't want to bang on about the advantages of OOP over Procedural code. I believe that too many people have already discussed that over the past 30 years. My point here is to discuss what a "design" like that can do to an application.

Following bad examples


Unfortunately, in software development, bad examples are easier to follow than good examples. With a non-expressive business model like that, the services get overloaded with methods. When adding new features to the application, developers need to go through all the methods of a service to see if there is already a method that does what they need. Due to the lack of cohesion and the method explosion in the services, many developers will just add a new method in there, which can be very similar to existing ones, instead of refactoring the existing ones. This leads to services with confusingly similar methods and a lot of duplication. As a chain of "bad" events, the more methods a service has, the more classes in the application will depend on it. Many dependencies mean that refactoring the class becomes harder, discouraging anyone willing to improve the quality of the code.

Duplication does not happen just inside a single service. It's also very common to find different services with very similar methods, if not the same. In general, this is due to the lack of clarity about the responsibility of each service. According to the feature that developers are working on, they will choose a service to add (or reuse) the methods they need. If the responsibility and granularity of each service are not clear, business rules related to a certain area of the business will be spread all over the place, since different developers will think that the method belongs to different classes.


Exposing DAOs to the presentation tier

Another side effect of this approach is that, since the DAOs are also "utility classes" with no business meaning (they just have an architectural meaning), some developers can easily expose the DAOs to the actions, without going through the services. Let's give more meaningful names and more details to the Action 4 and DAO 4 shown above. Let's call them ClientAction and ClientDAO.



Here the user interface needs to display a list of clients. ClientDAO has a method that returns a list of clients. In theory this may make sense. The most used argument in favour of this approach instead of adding a "ClientService" in between the ClientAction and the ClientDAO is that the service would have a method that just delegates the call to the DAO. The service layer, in this case, would be considered redundant and just pointless extra work. 


Looking at this isolated scenario, having a service in the middle really looks a bit overkill and unnecessary. However, we will probably want to do more things with a client. We will probably want to create a new client, delete or update an existing client.


Here is where the problems begin. Very rarely do we have a pure CRUD application. In general, many business rules have to be performed before or after any changes are made in the persistence tier (generally a database). For example, before inserting a new client, a credit check needs to be performed. When deleting a client, we need to make sure that there is no outstanding balance for that client and close his or her account. We also need to archive all the orders for this client, or do whatever else the application's business rules dictate. Multiple entities (or database operations) may be involved in an operation like that. We need to be able to define transaction boundaries. Besides CRUD operations, we will also have all the business methods, like checking if a client has credit, adding orders to a client, etc. This is too much for a single DAO to handle. DAOs should not perform any business logic and should be hidden from the presentation tier. They should be hidden (encapsulated) even from different services. DAOs should just deal with persistence. Business logic and transaction boundaries should be controlled by a class with business responsibilities, which in the case of this "poor" design would be a service.



So even if a few methods in the service just delegate the responsibility to a DAO, it is still a price worth paying. The service should hide (encapsulate) the details of its implementation. Client code does not need to know how and where the data is persisted.
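To make the argument a bit more concrete, here is a minimal sketch in Java of a service that hides its DAO. ClientDAO is the name used above; everything else (Client's fields, CreditChecker, InMemoryClientDAO and the method names) is an assumption made up for the example, and transaction handling is only indicated in comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client entity, reduced to what the example needs.
final class Client {
    final String name;
    final int outstandingBalance;

    Client(String name, int outstandingBalance) {
        this.name = name;
        this.outstandingBalance = outstandingBalance;
    }
}

// Persistence only: no business rules live here.
interface ClientDAO {
    void insert(Client client);
    void delete(Client client);
    List<Client> findAll();
}

// Hypothetical collaborator that performs the credit check.
interface CreditChecker {
    boolean hasGoodCredit(Client client);
}

// The only class that talks to ClientDAO. In a real application the
// transaction boundaries would be declared around these methods.
final class ClientService {
    private final ClientDAO clientDAO;
    private final CreditChecker creditChecker;

    ClientService(ClientDAO clientDAO, CreditChecker creditChecker) {
        this.clientDAO = clientDAO;
        this.creditChecker = creditChecker;
    }

    // Plain delegation today, but callers never learn that a DAO exists.
    List<Client> listClients() {
        return clientDAO.findAll();
    }

    // Business rule before the insert: the credit check.
    void registerClient(Client client) {
        if (!creditChecker.hasGoodCredit(client)) {
            throw new IllegalStateException("Credit check failed for " + client.name);
        }
        clientDAO.insert(client);
    }

    // Business rule before the delete: no outstanding balance.
    void removeClient(Client client) {
        if (client.outstandingBalance > 0) {
            throw new IllegalStateException(client.name + " still has an outstanding balance");
        }
        clientDAO.delete(client);
    }
}

// In-memory stand-in for a database-backed DAO, for illustration only.
final class InMemoryClientDAO implements ClientDAO {
    private final List<Client> clients = new ArrayList<>();

    @Override
    public void insert(Client client) {
        clients.add(client);
    }

    @Override
    public void delete(Client client) {
        clients.remove(client);
    }

    @Override
    public List<Client> findAll() {
        return new ArrayList<>(clients);
    }
}
```

A ClientAction written against ClientService keeps working unchanged when new rules are added to registerClient or removeClient, which is not the case if it calls ClientDAO directly.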

OK, enough of this. Exposing DAOs to the presentation tier is WRONG. Let's move on.

Moving away from the procedural code (baby steps)

The first thing to do to move away from the procedural code is to make your code more expressive and aligned to the business. We need to narrow the gap between developers and domain experts (business analysts, users, product owner or whoever knows the business rules) so we all start speaking the same language. The code must be written in a way that expresses the business and not just architectural concepts that mean nothing to the business. Actions, Services and DAOs are technical terms with no connection to any business term.

There are a few techniques that can be applied (or even combined) in order to make it possible. In future posts, I'll be talking about some of these techniques in more detail. I know, it's frustrating that I will not tell you right now how to do it after having spent all this time criticising the ASD procedural design. The important thing for now is to know that what looks like a good solution, since it is used by many different people and in many different projects, may not be as good as it seems.

In the meantime, there is something that we can start doing to our code in order to reduce the mess and make the necessary refactorings easier in the future.

Preparing your code for an easier transition

The following advice will help us to get our code to a point where it can be easily refactored into a more domain (business) focused approach in the future. It will still be a bit procedural but will be much better organised, and a notion of components (business components) will start to emerge.

1. Identify key concepts (domains) in your application. (e.g. Client, Order, Invoice, Trip, Itinerary, etc)
2. You can create / refactor services for each one of the key domains.
3. Try to keep a well balanced granularity for all services.
4. Avoid services for small parts of the domain. For example, do not create a service for line items. Line items should be handled by the OrderService.
5. Services are not allowed to manipulate multiple domains (with the exception of whole-part relations, i.e. compositions).
6. DAOs are encapsulated by the services and never accessed by any other class.
7. Not every service must access a DAO, but any DAO must be accessed by one and only one service.
8. Services must delegate operations to other services if the operation is related to a different domain from the one the service is handling.
9. Services talk to each other. DAOs never talk to each other.
10. Parts (like a line item) are never exposed by a service. Service always exposes the whole (like Order). Parts are accessed via the whole. (e.g. order.getLineItems();)

With the rules above, our procedural code starts looking more like meaningful objects, with a reasonably well defined interface and some significance in terms of business.
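Some of these rules can be sketched in a few lines of Java. The sketch below is only an illustration under assumptions: OrderService, Order and line items come from the rules themselves, while OrderDAO, the reduced ClientService interface, hasCredit, placeOrder and findOrder are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A part: only reachable through the Order that owns it (rule 10).
final class LineItem {
    final String product;
    final int quantity;

    LineItem(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

// The whole: the only thing OrderService exposes.
final class Order {
    private final String id;
    private final String clientId;
    private final List<LineItem> lineItems = new ArrayList<>();

    Order(String id, String clientId) {
        this.id = id;
        this.clientId = clientId;
    }

    String getId() {
        return id;
    }

    String getClientId() {
        return clientId;
    }

    void addLineItem(LineItem item) {
        lineItems.add(item);
    }

    // Read-only view, so parts are changed only via the whole.
    List<LineItem> getLineItems() {
        return Collections.unmodifiableList(lineItems);
    }
}

// Hypothetical in-memory DAO: persistence only, used by OrderService alone (rules 6 and 7).
final class OrderDAO {
    private final Map<String, Order> orders = new HashMap<>();

    void save(Order order) {
        orders.put(order.getId(), order);
    }

    Order findById(String id) {
        return orders.get(id);
    }
}

// Another domain's service, reduced to the one operation needed here.
interface ClientService {
    boolean hasCredit(String clientId);
}

final class OrderService {
    private final OrderDAO orderDAO = new OrderDAO();
    private final ClientService clientService;

    OrderService(ClientService clientService) {
        this.clientService = clientService;
    }

    Order placeOrder(String orderId, String clientId, List<LineItem> items) {
        // Rule 8: credit belongs to the client domain, so delegate to its service.
        if (!clientService.hasCredit(clientId)) {
            throw new IllegalStateException("Client " + clientId + " has no credit");
        }
        Order order = new Order(orderId, clientId);
        for (LineItem item : items) {
            order.addLineItem(item);
        }
        orderDAO.save(order);
        return order;
    }

    Order findOrder(String orderId) {
        return orderDAO.findById(orderId);
    }
}
```

Note that there is no LineItemService and no way to reach OrderDAO from outside OrderService; callers get an Order and ask it for its line items.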

  
As mentioned before, this is still a bit procedural, but this design already solves some of the problems discussed earlier, for example by having more meaningful and specific services. The responsibility and boundaries of each service are better defined, making it easier for developers to look for implemented methods and re-use what is in there, reducing duplication.

In future posts I'll be talking about the next steps towards a more expressive and business focused code, less procedural and more object-oriented.  


If you are curious about the next steps and can't wait for my posts, have a look at:
http://domaindrivendesign.org/
http://en.wikipedia.org/wiki/Behavior_Driven_Development
http://en.wikipedia.org/wiki/Model-Driven_Architecture
http://en.wikipedia.org/wiki/OOD

Sunday 6 June 2010

Object-Oriented Design Principles - Part 2

In the first part of this Object-Oriented Design principles series, I covered the Single Responsibility Principle (SRP). I had also covered the Liskov Substitution Principle (LSP) in a previous post. So let's move on to another Object-Oriented Design Principle.

The Open Closed Principle (OCP)

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification

We know that requirements will change, new requirements will come and we will need to change some existing code during the life of a project. However, when a change in one place results in a cascade of changes in other classes, that's a sign that the design is not quite right. Changes causing side effects are undesirable since they make the program unstable, rigid and fragile, with parts that cannot be re-used.

The Open Closed Principle targets all the problems described above. Being open for extension means that modules can be extended in order to make them behave in a new or different way. Closed for modification means that we should not change existing code unless we are fixing a bug.

There were two proposed ways to achieve the OCP and both use inheritance:
  1. Dr. Bertrand Meyer, who coined the Open Closed Principle in 1988, proposed the use of inheritance. When a new feature or a change to an existing feature is needed, a new class must be created, inheriting from the old one. The new class does not necessarily keep the same interface.
  2. A few other authors redefined the OCP in the 1990s. Basically they suggested the use of abstract base classes and concrete implementations. The abstract class would define the interface that would be closed to modifications. The concrete classes would implement the abstract class interface (open for extension) and multiple implementations could be created and polymorphically substituted for each other.

Nowadays, with the evolution of frameworks with dependency injection capabilities, a better approach would be to have a client class depend on a plain Java interface and inject the implementations.
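A minimal sketch of the interface-plus-injection approach could look like the code below. The names (DiscountPolicy, NoDiscount, PercentageDiscount, Checkout) are hypothetical, and the injection is done by hand through the constructor; a dependency injection framework would do the same wiring from configuration.

```java
// The abstraction the client depends on: closed for modification.
interface DiscountPolicy {
    int discountFor(int orderTotal);
}

// Existing behaviour.
final class NoDiscount implements DiscountPolicy {
    @Override
    public int discountFor(int orderTotal) {
        return 0;
    }
}

// New behaviour added later as a new class: open for extension.
final class PercentageDiscount implements DiscountPolicy {
    private final int percent;

    PercentageDiscount(int percent) {
        this.percent = percent;
    }

    @Override
    public int discountFor(int orderTotal) {
        return orderTotal * percent / 100;
    }
}

// The client class only knows the interface; the implementation is injected.
final class Checkout {
    private final DiscountPolicy discountPolicy;

    Checkout(DiscountPolicy discountPolicy) {
        this.discountPolicy = discountPolicy;
    }

    int amountToPay(int orderTotal) {
        return orderTotal - discountPolicy.discountFor(orderTotal);
    }
}
```

Adding PercentageDiscount required no change to Checkout or NoDiscount; only the wiring decides which implementation is used.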

OCP is an old OOP design principle and is one of the most important ones, due to the problems it solves. It can be an interesting approach in cases where the bureaucracy of changing old code and deployment to the production environment is very high. Some companies may ask for code reviews, a long test cycle (non-automated), documentation, etc. With OCP, no existing code is changed and just new code is added.

However, in a more agile environment, with good test coverage and IDEs that have good refactoring tools (like Eclipse, IDEA, etc.), I wouldn't be too worried about changing classes and interfaces. This does not invalidate the advantages of the OCP.

Source
http://en.wikipedia.org/wiki/Solid_%28object-oriented_design%29
http://en.wikipedia.org/wiki/Open/closed_principle
http://www.objectmentor.com/resources/articles/ocp.pdf
http://en.wikipedia.org/wiki/Design_by_Contract
http://en.wikipedia.org/wiki/Information_hiding

Thursday 3 June 2010

Object-Oriented Design Principles - Part 1

Almost every developer that I know would be able to give a reasonable explanation of inheritance, encapsulation and polymorphism. However, there is much more to Object-Oriented Programming (OOP) than that. In order to come up with a good and clean design, we need to bear in mind some Object-Oriented Design (OOD) principles. Although many of these principles have already been published in books and blogs, they are known to just a very small percentage of developers. I don't want to show my age here, but we very rarely find young developers talking about OOD Principles.

There are quite a few OOD principles out there, created/coined by different developers and academics, but I will be listing here just the OOD principles that I consider to be the most important ones.

The following 5 principles together are known by the mnemonic acronym "SOLID". They were first put together by Robert C. Martin (Uncle Bob) in the early 2000s and should be applied at class level. They are:

The Single Responsibility Principle (SRP)

There should never be more than one reason for a class to change. 

The SRP is the simplest OOD Principle and probably one of the hardest to get right. It is also one of the most violated principles.

Let's have a look at the following class:

public class TripItinerary {
    public void addPlace(Place place) { ... }
    public void removePlace(Place place) { ... }
    public List<Place> getPlaces() { ... }
    public void displayOnMap() { ... }
    public Place findPlaceByName(String placeName) { ... }
}

The class above has 3 different responsibilities, that is, 3 different reasons to change:
  1. Store the places to be visited.
  2. Display the itinerary on a map.
  3. Find a place by name.

Storing places on the itinerary may have rules like not adding repeated places, keeping the sequence that places will be visited, etc.
Displaying on the map may vary according to the map API being used, like Google Maps, Bing Maps, Yahoo Maps, etc.
Finding a place by name may involve calling a web service to see if the place exists, checking if there is more than one place with the same name, and checking the type of place (city, town, waterfall, monument, castle, etc).

Ideally we would have three different classes to do that, each one with its own responsibility.

public class TripItinerary {
    public void addPlace(Place place) { ... }
    public void removePlace(Place place) { ... }
    public List<Place> getPlaces() { ... }
}

public class ItineraryMapService {
    public void displayOnMap(TripItinerary tripItinerary) { ... }
}

public class PlaceService {
    public List<Place> findPlaceByName(String placeName) { ... }
}

With this approach, we can change the internals of each class without the risk of breaking the behaviour of the other classes. Without this separation, the design becomes fragile and might break in unexpected ways when changed.

When thinking about a single responsibility, think cohesion at class level. Before creating a class, we need to define what its responsibility should be and the reason for its existence. Before adding any other public method to an existing class, check if this new method relates to the other public methods (the class interface). When creating public methods for a class, have them at least at a communicational cohesion level.

On the next posts, I'll be covering the remaining SOLID OOD principles.

Object-Oriented Design Principles - Part 2

Source
http://en.wikipedia.org/wiki/Solid_%28object-oriented_design%29
http://en.wikipedia.org/wiki/Single_responsibility_principle
http://www.objectmentor.com/resources/articles/srp.pdf

Wednesday 2 June 2010

The Liskov Substitution Principle (LSP)

The Liskov Substitution Principle was initially introduced by Barbara Liskov in a 1987 conference keynote address entitled Data abstraction and hierarchy. LSP is also part of SOLID, a group of five Object-Oriented Design Principles put together by Robert C. Martin in the early 2000s. 

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

The LSP's summary above looks quite obvious. If the calling code was written to use a base class, replacing it with a sub-class (inheritance) should make no difference to the calling code. That means, the calling code should not be changed and should be totally agnostic about which implementation is being used.

LSP's importance is noticed mainly when it is violated. If a subclass causes changes to the calling code, that is, if the calling code needs to test which subclass it is dealing with (instanceof, casting, etc), the code is violating the Liskov Substitution Principle and also the Open Closed Principle. This violation causes high coupling, low cohesion and a cascade of changes.

Violation of Liskov Substitution Principle

public void drawShape(Shape s) {
    if (s instanceof Square) {
        drawSquare((Square) s);
    } else if (s instanceof Circle){
        drawCircle((Circle) s);
    }
}
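One common way to remove the type checks is to let each shape draw itself, so the calling code never needs to know which subclass it received. The sketch below assumes a hypothetical draw() method that returns a description instead of painting anything, just to keep the example self-contained; Canvas is also a made-up name.

```java
// Every shape knows how to draw itself.
interface Shape {
    String draw();
}

final class Square implements Shape {
    @Override
    public String draw() {
        return "square";
    }
}

final class Circle implements Shape {
    @Override
    public String draw() {
        return "circle";
    }
}

final class Canvas {
    // No instanceof and no casting: any Shape can be substituted.
    String drawShape(Shape s) {
        return s.draw();
    }
}
```

Adding a new shape now means adding a new class, with no change to drawShape.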

The Liskov Substitution Principle also imposes a few rules that the sub-classes must obey. It bears a certain resemblance to Bertrand Meyer's Design by Contract in that it considers the interaction of subtyping with pre- and postconditions.

...when redefining a routine [in a derivative], you may only replace its precondition by a weaker one, and its postcondition by a stronger one.

This means the sub-classes must accept everything that the base class accepts (pre-condition) and must conform to all postconditions of the base class.
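The rule about pre- and postconditions can be illustrated with a small sketch. The classes below (Account, LimitedAccount, AuditedAccount) are hypothetical and exist only for this example: one subclass strengthens the precondition of deposit and so breaks the contract, while the other keeps the precondition and only adds a guarantee.

```java
class Account {
    private int balance;

    // Precondition: amount > 0.
    // Postcondition: the balance has grown by exactly amount.
    void deposit(int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        balance += amount;
    }

    int getBalance() {
        return balance;
    }
}

// Violation: rejects amounts that the base class accepts (stronger precondition).
class LimitedAccount extends Account {
    @Override
    void deposit(int amount) {
        if (amount > 100) {
            throw new IllegalArgumentException("amount must not exceed 100");
        }
        super.deposit(amount);
    }
}

// Conforming: same precondition, and the base postcondition still holds.
// It additionally guarantees that every deposit is counted.
class AuditedAccount extends Account {
    private int depositCount;

    @Override
    void deposit(int amount) {
        super.deposit(amount);
        depositCount++;
    }

    int getDepositCount() {
        return depositCount;
    }
}
```

Code written against Account can call deposit(500) safely on an AuditedAccount, but the same call fails on a LimitedAccount, which is exactly the kind of surprise the principle is meant to prevent.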

Example of a more subtle violation
public class Rectangle {
    private int height;
    private int width;

    public Rectangle(int height, int width) {
        this.height = height;
        this.width = width;
    }
    public int getHeight() {
        return this.height;
    }
    public void setHeight(int height) {
        this.height = height;
    }
    public int getWidth() {
        return this.width;
    }
    public void setWidth(int width) {
        this.width = width;
    }
}

public class Square extends Rectangle {

    public Square(int size) {
        super(size, size);
    }
    // height and width are private in Rectangle, so go through its getters
    public int getHeight() {
        return super.getHeight();
    }
    public int getWidth() {
        return super.getWidth();
    }
}

In the code above, a Square IS A Rectangle. Note that a rectangle can have different sizes for width and height, but in a square, height and width must be of the same size. What would happen if the following code is executed?

Rectangle r = new Square(1); // Square only has a constructor that takes a size
r.setHeight(5);
r.setWidth(6);

We should not allow this to happen since that would make the Square object invalid. In a square, height and width must always be the same. A quick fix for this would be to override the setter methods:

public class Square extends Rectangle {

    public Square(int size) {
        super(size, size);
    }
    // height and width are private in Rectangle, so go through its accessors
    public int getHeight() {
        return super.getHeight();
    }
    public int getWidth() {
        return super.getWidth();
    }
    // Each setter changes both dimensions to keep the square valid
    public void setHeight(int height) {
        super.setHeight(height);
        super.setWidth(height);
    }
    public void setWidth(int width) {
        super.setWidth(width);
        super.setHeight(width);
    }
}

Although this approach would fix the problem, it is a violation of the Liskov Substitution Principle, since the methods will weaken (violate) the postconditions for the Rectangle setters, which state that dimensions can be modified independently.

One interesting thing to note is that if we analyse the Rectangle and Square classes in isolation, they are consistent and valid. However, when we look at how the classes can be used by a client program, we realise that the model is broken.
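A small client makes the problem visible. The sketch below repeats condensed versions of Rectangle and a Square with overridden setters so that it is self-contained; AreaClient and its method are hypothetical names for a piece of calling code that was written with only Rectangle in mind.

```java
class Rectangle {
    private int height;
    private int width;

    Rectangle(int height, int width) {
        this.height = height;
        this.width = width;
    }

    int getHeight() {
        return height;
    }

    void setHeight(int height) {
        this.height = height;
    }

    int getWidth() {
        return width;
    }

    void setWidth(int width) {
        this.width = width;
    }
}

// Square with the "quick fix": each setter changes both dimensions.
class Square extends Rectangle {
    Square(int size) {
        super(size, size);
    }

    @Override
    void setHeight(int height) {
        super.setHeight(height);
        super.setWidth(height);
    }

    @Override
    void setWidth(int width) {
        super.setWidth(width);
        super.setHeight(width);
    }
}

final class AreaClient {
    // Assumes the Rectangle contract: height and width change independently.
    static int resizeAndComputeArea(Rectangle r) {
        r.setHeight(5);
        r.setWidth(6);
        return r.getHeight() * r.getWidth();
    }
}
```

With a Rectangle the client gets the area it expects, 30. With a Square the second setter silently overwrites the height, so the same call returns 36, even though neither class looks wrong on its own.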

Every time you find yourself adapting a sub-class so that it does not break, or so that its state does not become invalid, when used in the context of a super-class, this is a clue that the sub-class should not be a sub-class at all.

In this example, maybe a Square is not a Rectangle. Square should not have height and width to start with. It should just have size. And since a Square is not a Rectangle, it should never be used in a Rectangle context.

public class Square {
    int size;
    
    public Square(int size) {
        this.size = size;
    }  
    public int getSize() {
        return this.size;
    }
    public void setSize(int size) {
        this.size = size;
    }
}

We cannot validate a model in isolation. A model should be validated only in terms of its clients.

Source
http://en.wikipedia.org/wiki/Liskov_substitution_principle
http://www.objectmentor.com/resources/articles/lsp.pdf