Saturday 24 April 2010

Cohesion - The cornerstone of OO

Cohesion is probably the most important concept in object-oriented programming, since it promotes good encapsulation, separation of concerns and responsibilities, reusability and maintainability.

Definition


Cohesion (noun) : when the members of a group or society are united.
Cohesive (adjective) : united and working together effectively.
Cambridge Dictionary



In computer programming, cohesion is a measure of how strongly-related and focused the various responsibilities of a software module are.
Wikipedia

Cohesion at method level
  • Coincidental (worst): Performs multiple operations, some of which are unrelated to each other.
  • Conditional: Depending on an if statement, different attributes are modified or different values are assigned to them.
  • Iterative: Several attributes (array variables) are modified inside a loop.
  • Communicational: More than one attribute is modified based on a single input.
  • Sequential: Modifications to more than one variable (object) result in a change to only one attribute.
  • Functional (best): The method modifies fewer than two attributes.
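
To make two of these levels concrete, here is a minimal sketch that contrasts a conditionally cohesive method with functionally cohesive ones. The Account class and all of its members are hypothetical, made up only for illustration.

```java
// Hypothetical class, used only to contrast two of the levels listed above.
class Account {
    private long balanceInCents;
    private boolean locked;

    // Conditional cohesion: a flag decides which attribute gets modified,
    // so this is really two unrelated operations hiding behind one name.
    void update(boolean lock, long amountInCents) {
        if (lock) {
            locked = true;
        } else {
            balanceInCents += amountInCents;
        }
    }

    // Functional cohesion: each method modifies a single attribute
    // and its name says exactly what it does.
    void deposit(long amountInCents) {
        balanceInCents += amountInCents;
    }

    void lock() {
        locked = true;
    }

    long balanceInCents() {
        return balanceInCents;
    }

    boolean isLocked() {
        return locked;
    }
}
```

A caller reading account.update(true, 0) has to open the method to find out what happens, while account.lock() needs no explanation.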

Cohesion at class level
  • Coincidental (worst): Methods are grouped arbitrarily and have no significant relationship (E.g. Util classes with methods handling strings, lists, mathematical calculations).
  • Logical: Methods are grouped because they are logically categorized as doing the same thing, even if they are different by nature (E.g. grouping all I/O handling routines, all database selects, inserts, etc.).
  • Temporal: Methods are grouped because they are processed at a particular time in program execution (E.g. validates data, persists the data, creates audit information, notifies the user via email).
  • Procedural: Methods are grouped because they always follow a certain sequence of execution (E.g. verifies that the user exists, performs login, writes logs, retrieves the user's details).
  • Communicational: Methods are grouped because they work on the same data (E.g. insert, delete, validate, update a client entity).
  • Sequential: Methods are grouped because the output of one method can be used as the input of other methods (E.g. reads a file, processes the file).
  • Functional (best): Methods are grouped because they all contribute to a single well-defined task (E.g. parsing XML).
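
The two extremes of this list can be sketched as follows. Both classes are hypothetical. The post's own example of a single well-defined task is parsing XML; I use a very simplified comma-separated parser here instead, only because it fits in a few lines (it does not handle quoted fields).

```java
// Coincidental cohesion: hypothetical catch-all class whose methods
// share nothing except the file they live in.
class Util {
    static String capitalize(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Assumes a non-empty array.
    static int max(int[] values) {
        int max = values[0];
        for (int v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    static double celsiusToFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }
}

// Functional cohesion: every method contributes to one well-defined task,
// splitting a single line of separated values into clean fields.
class CsvLineParser {
    private final char separator;

    CsvLineParser(char separator) {
        this.separator = separator;
    }

    String[] parse(String line) {
        // Quote the separator so characters like '|' are not treated as regex syntax;
        // the -1 limit keeps trailing empty fields.
        String regex = java.util.regex.Pattern.quote(String.valueOf(separator));
        String[] fields = line.split(regex, -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = clean(fields[i]);
        }
        return fields;
    }

    private String clean(String field) {
        return field.trim();
    }
}
```

Nothing in Util would break if any of its methods moved elsewhere, whereas removing either method from CsvLineParser leaves the task incomplete.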

Applying cohesion in the real world

Depending on the type of software you are writing, you will need to compromise a little. It is not always possible to have all methods and classes at the highest cohesion level (functional).

If you are building a framework or a very generic part of your system, chances are that the majority of your classes and methods will be at the sequential and functional levels. However, when writing a more commercial application, that is, an application with business logic, database access, users, etc., there is a good chance that many of your classes and methods will sit at the communicational level.

Put more simply, each class and each method should have a single responsibility. A technique I use is to write a brief description (javadoc) for each class and method before writing the methods of the class or the body of the method. This forces me to think about what the responsibility of the class or method I'm creating is. As soon as I realise that the class or method is doing more than what I described, I know that I need to break it down into more classes or methods (private or public ones).
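
As a rough illustration of the description-first technique, the sketch below shows what the result might look like. UserRegistration and its methods are hypothetical names, and the e-mail check is deliberately naive.

```java
/**
 * Keeps track of which e-mail addresses have been registered.
 * (This description was written before any method existed.)
 */
class UserRegistration {
    private final java.util.List<String> registeredEmails = new java.util.ArrayList<>();

    /**
     * Registers the given e-mail address, rejecting invalid or duplicate addresses.
     *
     * @return true if the address was registered, false if it was rejected
     */
    boolean register(String email) {
        if (!isValidEmail(email) || registeredEmails.contains(email)) {
            return false;
        }
        registeredEmails.add(email);
        return true;
    }

    /**
     * Checks that the address has text on both sides of a single '@'.
     * Extracted because the details of the format check did not fit in
     * the one-sentence description of register().
     */
    private boolean isValidEmail(String email) {
        int at = email.indexOf('@');
        return at > 0 && at == email.lastIndexOf('@') && at < email.length() - 1;
    }
}
```

If register() later needed to send a welcome e-mail as well, its description would need an "and", which is the signal to move that work into another method or class.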

Some people use other criteria, such as the number of public methods per class or the number of lines per method. These metrics are helpful because they make you re-analyse your code and can be a good indicator that something is not quite right. A class with many public methods may be doing too much and lack a single responsibility, which means low cohesion. A method with many lines may also be doing too much. One problem with using these metrics to measure cohesion is knowing how much is too much. Are 10 lines of code per method too many? What about 20? Are 10 public methods in a class too many? The number of methods per class or lines of code per method doesn't necessarily tell you much about how cohesive the class or method is, but it can be used as a "smell detector". Thinking of a single purpose for each class and each method before you implement them will help you keep your classes and methods small without much effort.
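
A "smell detector" of this kind can be sketched with standard Java reflection. The class names and the threshold below are assumptions for illustration only; as discussed above, there is no agreed number that counts as "too much".

```java
// Hypothetical helper: counts the public methods a class declares itself
// (inherited methods are not included).
class PublicMethodCounter {
    static int count(Class<?> type) {
        int publicMethods = 0;
        for (java.lang.reflect.Method method : type.getDeclaredMethods()) {
            boolean isPublic = java.lang.reflect.Modifier.isPublic(method.getModifiers());
            // Skip compiler-generated methods, which nobody wrote by hand.
            if (isPublic && !method.isSynthetic()) {
                publicMethods++;
            }
        }
        return publicMethods;
    }

    // The threshold is arbitrary: a hit means "look again", not "this class is wrong".
    static boolean smells(Class<?> type, int threshold) {
        return count(type) > threshold;
    }
}

// Hypothetical class under inspection: two public methods, two non-public ones.
class SampleService {
    public void create() {}
    public void delete() {}
    void audit() {}
    private void log() {}
}
```

For example, PublicMethodCounter.smells(SampleService.class, 10) would not flag the class, while a threshold of 1 would.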

The more cohesive your code is, the more reusable, robust, easy to test and reliable it is. 

Source
http://en.wikipedia.org/wiki/Cohesion_%28computer_science%29
http://www.waysys.com/ws_content_bl_pgssd_ch06.html
http://en.wikipedia.org/wiki/Single_responsibility_principle