Cohesion, Coupling, Entropy, Evolution, Locality in Software
Tue Nov 01 00:00:00 UTC 2022

I want to tie together these five seemingly unrelated concepts to answer two simple questions:

  • What is good code?
  • How do you write good code?

Someone once told me bad code is code other people write. It was said in jest, but it demonstrates how subjective code quality is. No one purposely writes incomprehensible code. Code that is easy for one person to understand might be incomprehensible to another. It would be like judging a poem written in Chinese to be terrible because you don't know Chinese. Just because you do not understand the poem does not mean that it is bad. Understandability is not a good way to judge code quality. Is there an objective way to judge code quality?

Note that the word software has the word soft at its root. The soft indicates that it is malleable and can be changed. Contrast that with hardware, which is hard and costly to change. This malleability of software is the source of its power and its complexity. Making software easy to change is hard. Therefore, our definition of good code is code that is malleable and adaptable. When requirements change, good code can change and adapt at low cost. In contrast, bad code has a high cost of change. Bad code might as well be hardware because it is too costly to change. This leads us to more questions:

  • What makes code hard to change?
  • Do you need to understand code in order to change it?

The first question, 'What makes code hard to change?', has already been answered by a design principle: 'High Cohesion and Loose Coupling'. Kent Beck recently said he spent the last 17 years learning how to explain cohesion and loose coupling in software design. If it took Kent that long to learn how to explain such a simple idea, I have little hope of doing a better job, but I shall try.

Cohesion and coupling are both measures of dependencies in code. A high number of dependencies within a module (high cohesion) is good because a change within the module is less likely to have a domino effect on things outside the module. The module may expose some interface points so other modules can use it. The risk of change is isolated to those interface points, so the interface of a module should be kept small to minimize that risk.

What is the difference between cohesion and coupling? Whenever I learn something new, I try to find similar concepts in different systems. In particular, I like to look for examples in natural systems such as physics, chemistry, and biology. Mother Nature has been creating things since before humans existed, so whatever problem humans have designed a solution for, Mother Nature has most likely already solved it.
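To make "measures of dependencies" a little more concrete, here is a rough sketch in Python. It assumes the usual reading that cohesion is about dependencies inside a module and coupling is about dependencies that cross module boundaries. The module names, the function names, and the edge list are all hypothetical, and counting edges is only a crude proxy, not a standard metric.

```python
from collections import defaultdict

# Hypothetical dependency edges (caller, callee), each named "module.function".
DEPENDENCIES = [
    ("orders.create", "orders.validate"),
    ("orders.create", "orders.price"),
    ("orders.price", "orders.tax"),
    ("orders.create", "billing.charge"),  # crosses a module boundary
    ("billing.charge", "billing.ledger"),
]


def module_of(name):
    """Return the module part of a 'module.function' name."""
    return name.split(".", 1)[0]


def count_dependencies(edges):
    """Count edges inside each module and edges that cross a module boundary."""
    internal = defaultdict(int)  # rough stand-in for cohesion
    external = defaultdict(int)  # rough stand-in for coupling
    for caller, callee in edges:
        src, dst = module_of(caller), module_of(callee)
        if src == dst:
            internal[src] += 1
        else:
            # A crossing edge touches both modules involved.
            external[src] += 1
            external[dst] += 1
    return dict(internal), dict(external)


if __name__ == "__main__":
    internal, external = count_dependencies(DEPENDENCIES)
    print("internal:", internal)
    print("external:", external)
```

In this toy graph, most edges stay inside a module and only one crosses between orders and billing. A change to orders.tax can only ripple through orders, while a change to billing.charge is the one place where both modules are at risk.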

One of the biggest challenges in maintaining code is complexity.

A low-entropy system is devoid of meaningful structure. Putting everything in one place works well at the beginning, but as the code develops you break it into more namespaces. You increase the entropy of the system as a whole by decreasing the entropy within modules. This is how structure forms.