It is very interesting to observe how often we forget the "why's" compared to the "what's" when talking about approaches and methods. Today we will tackle one such area that software teams should be very familiar with: system design and what we call the "best practices" of designing code (no irony here; it's just that "best practices" sometimes sounds like an oxymoron).
Businesses evolve, and development teams have to catch up with, or even drive, this constant change: implementing new products, maintaining them through their life cycle, and, if possible, doing all of this efficiently. And as complex as code is per se, teams should not so much try to "design it right the first time" as learn how to keep it healthy throughout the life of a product. One of the primary methods of sanitizing "code smells" is refactoring, which can be applied at different levels of abstraction, from the entire system down to a very low level (class, method, or statement). The method itself has been known for a long time, and many books and blog posts have been published on the topic (some of them deserve very high praise, like Martin Fowler's "Refactoring" or Bob Martin's "Clean Code"; I love these books and recommend them).
But here comes one big problem: both programming paradigms (namely OOP) and the refactoring patterns that reflect best practices within those paradigms create a gap between what a system should do (let's call these scenarios for now, e.g. a "login scenario" or a "book a flight scenario") and the way it is implemented. One may say: well, code is the low level of abstraction that implements the much more "abstract" requirements. That is only partially true, since OO analysis suggests that entities of the business domain should be reflected in the system's object model and their behaviors encapsulated accordingly. In other words, if we are implementing a CRM system, there are very likely to be classes like "Account", "Contact", "Lead", "Pipeline", and so on. Of course there will be more: in fact, a whole bunch of auxiliary objects that perform system-level functions (controlling the UI, accessing external web services, persisting data to a DB, etc.), but the point is that the domain gets reflected in the code. And that is without doubt a good thing.

The problem, though, is that none of these classes (domain entities) directly cause defects in software. Defects do not live in entities, in other words; defects live in scenarios. And unlike domain entities, which are represented as solid "chunks" of code, scenarios are chaotically spread across the system and involve multiple different entities, in a different context each time. To illustrate: if you ask a tester on the team what the most severe defect in the current system build is, he would say something like "the zip code is not validated during sign-up", actually referring to a scenario. Whereas if you ask a developer where exactly this defect is, it would make her think harder, because the scenario is not a specific entity or structure in the code she could logically point to.
So what are scenarios, really, from the code perspective? For those more or less familiar with programming, the easiest way to articulate a scenario is as the program stack trace produced while the user interacts with the system in the context of that specific scenario. In other words, it is a (typically long) list of class methods called one after another (typically nested), and in a reasonably mature system a single user scenario may involve tens or even up to a hundred method calls. A pretty big number! No wonder it is hard to diagnose a problem in a scenario quickly if it takes 30 method calls to go end to end. This spawns the first question:
why does it take so many method calls, and how can we reduce that number?
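To see what "a scenario is a stack trace" means in practice, here is a deliberately simplified sketch in Python. All class and method names (`App`, `SignUpController`, `ContactForm`, `Validator`) are hypothetical, not from any real codebase; the point is only that a single sign-up scenario crosses four classes before the defect-prone zip validation even runs.

```python
import traceback

class Validator:
    def validate_zip(self, zip_code):
        # Show how deep the call stack already is at this point.
        print(f"validate_zip at stack depth {len(traceback.extract_stack())}")
        return zip_code.isdigit() and len(zip_code) == 5

class ContactForm:
    def __init__(self):
        self.validator = Validator()

    def check(self, zip_code):
        return self.validator.validate_zip(zip_code)

class SignUpController:
    def __init__(self):
        self.form = ContactForm()

    def submit(self, zip_code):
        return self.form.check(zip_code)

class App:
    def __init__(self):
        self.controller = SignUpController()

    def handle_request(self, zip_code):
        return self.controller.submit(zip_code)

app = App()
print(app.handle_request("9021"))   # → False: the sign-up scenario fails
```

Even in this toy version, "where is the zip code defect?" has no single home: the tester sees a broken sign-up, while the developer has to walk the whole `handle_request` → `submit` → `check` → `validate_zip` chain.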
The first thing that pops up is how small or big those methods are. And here's the problem: one of the conceptual commandments of refactoring suggests keeping method bodies, as well as the list of input parameters, short. Quite often this is taken to the extreme, where the code becomes a huge pile of 5-line methods. And if you ask "why?", you will most likely never hear a reasonable explanation. But it "looks good", as if someone put effort into keeping the code "well structured". The other extreme is huge methods. Then, of course, the stack trace of a user scenario may consist of only 5-7 calls, but now the problem is finding a defect in a long method body (one that also accepts 15 parameters and depends on another 10 private fields of its class). Now we can apply this logic to designing and refactoring a software system:
design the system as a balance between its behavior and its structure. Apply OO design principles and approaches so that the code not only looks good but is also relatively easy to track and reason about as part of specific scenarios.
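To make the "pile of tiny methods" extreme concrete, here is a small sketch with purely hypothetical names: the same price calculation written as a chain of one-line delegating methods, and then as a single method that still fits in one glance.

```python
# Exaggerated extreme: every trivial step is its own method, so following
# the pricing scenario means chasing five calls through the class.
class OverFragmentedPricer:
    def total(self, price, qty):
        return self._with_tax(self._subtotal(price, qty))

    def _subtotal(self, price, qty):
        return self._multiply(price, qty)

    def _multiply(self, a, b):
        return a * b

    def _with_tax(self, amount):
        return self._apply_rate(amount, 1.2)

    def _apply_rate(self, amount, rate):
        return amount * rate

# Balanced version: one short method that reads as the scenario step
# it implements, with no loss of clarity.
class Pricer:
    TAX_RATE = 1.2

    def total(self, price, qty):
        return price * qty * self.TAX_RATE

assert OverFragmentedPricer().total(10, 3) == Pricer().total(10, 3) == 36.0
```

Both classes compute the same number; the difference is how many stack frames a reader must traverse to understand one step of a scenario.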
This applies to the whole set of OO design principles and concepts. For instance, you can make good use of the Single Responsibility Principle to split up a big "stinky" class that has been around for quite a while and that everyone just keeps adding to. But taken to the extreme, it may spawn tons of classes, deep hierarchies, and long usage chains that simply make it impossible to track the logic when applying a change or even debugging a problem.
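A minimal SRP sketch, with all names hypothetical: an `Account` class that had accumulated both domain logic and report formatting gets split so each class has one reason to change, but only into two focused classes rather than a deep hierarchy of one-method wrappers.

```python
class Account:
    """Domain entity: keeps the behavior that belongs to the domain."""
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

class AccountReport:
    """Presentation responsibility, split out of Account."""
    def summary(self, account):
        return f"{account.owner}: {account.balance:.2f}"

acct = Account("Acme Corp", 100.0)
acct.deposit(25.5)
print(AccountReport().summary(acct))   # → Acme Corp: 125.50
```

Splitting further (say, a separate class per field or per format detail) would satisfy the letter of SRP while making any real scenario harder to follow.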
This idea also reveals the other problem with refactoring: unfortunately, it very often happens in a "context-independent" manner. In other words, it doesn't really make much sense to go ahead and restructure the "Account" class (sticking with our CRM example) just because we think we can make it better. That may even turn out to be dangerous. Instead, we refactor it in the context of a scenario. This way we also think of the other scenarios that the change may affect. A good approach is to pick the user scenario you are currently working on, see which classes are involved, and if, for example, the "Lead" class has a big method that is called as part of the scenario, then it is just the right time to split it. We split it only to the extent that adds enough clarity but does not spawn a ton of unnecessary methods, each containing 3 lines of code or so.
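A sketch of what such scenario-driven splitting might look like (the `Lead` internals here are invented for illustration): a long `qualify` method is split only into steps that mirror the scenario's own vocabulary, validate and score, so the top-level method still reads like the scenario itself.

```python
class Lead:
    def __init__(self, email, budget):
        self.email = email
        self.budget = budget

    def qualify(self, threshold):
        # Top-level method mirrors the "qualify lead" scenario step by step.
        return self._is_valid() and self._score() >= threshold

    def _is_valid(self):
        # One extracted step per scenario concept, not per line of code.
        return "@" in self.email and self.budget > 0

    def _score(self):
        # Hypothetical scoring rule: budget in thousands, capped at 100.
        return min(self.budget / 1000, 100)

lead = Lead("buyer@example.com", 50000)
print(lead.qualify(threshold=10))   # → True
```

If a tester later reports "lead qualification is broken", the developer can start from `qualify` and see the whole scenario in two short hops instead of reconstructing it from scattered fragments.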
We haven't touched on automated testing as another important aspect of this problem; we will surely do that in the next post or two.
Finally, just to put it straight: I am a big supporter of Object-Oriented Programming, Analysis, and Design, and I honestly think it was one of the most significant advancements in the industry. It's not that some other paradigm (like Functional Programming) would solve this problem. At the same time, we must be aware of the real gap described above between system behavior and its implementation. The point is not just that we know about it, but that we now also know how to significantly mitigate its impact: we should code and refactor in the context of specific scenarios, as opposed to modifying entities in isolation from the behaviors associated with them.
To sum up:

- There's a gap between what should be implemented (scenarios) and how it is implemented (system design)
- Unlike the domain entities, scenarios don't explicitly exist in the code
- Each scenario is typically spread unevenly across the codebase and is hard to keep track of, modify, or debug
- An easy way to illustrate the problem is to look at a scenario's stack trace: it is typically way too long
- Therefore system design should support a balance between manageable implementation of the scenarios and OOP principles
- To support such a design, refactoring should be performed in the context of a scenario
- This also protects us from taking refactoring methods to the extreme, which has a hidden negative effect on system maintainability even though the code may look well-structured and elaborate