Friday, May 06, 2005

Single Codebase - How many Codelines?

On the YahooGroup for the 2nd edition of Kent Beck's book Extreme Programming Explained, Kent described a practice he calls Single Code Base:

There is only one code stream. You can develop in a temporary branch, but never let it live longer than a few hours. Multiple code streams are an enormous source of waste in software development. I fix a defect in the currently deployed software. Then I have to retrofit the fix to all the other deployed versions and the active development branch. Then you find that my fix broke something you were working on and you interrupt me to fix my fix. And on and on.

There are legitimate reasons for having multiple versions of the source-code active at one time. Sometimes, though, all that is at work is simple expedience, a micro-optimization taken without a view to the macro-consequences. If you have multiple code bases, put a plan in place for reducing them gradually. You can improve the build system to create several products from a single code base. You can move the variation into configuration files. Whatever you have to do, improve your process until you no longer need them. [... example removed ...]

Don't make more versions of your source code. Rather than add more codebases, fix the underlying design problem that is preventing you from running from a single code base. If you have a legitimate reason for having multiple versions, look at those reasons as assumptions to be challenged rather than absolutes. It might take a while to unravel deep assumptions, but that unraveling may open the door to the next round of improvement.
Kent is equating creation of a new codeline with establishing a new code base within the same repository. He does so with good reason: creating a new codeline for supporting a legacy release and/or multiple customer variants is indeed creating a new project instance, complete with its own separately evolving copy of the code.

I posted a lengthy response to Kent's initial description of a Single Code Base. I feel I understand Kent's position well. At the same time, I think that having to support one or more legacy releases is a business constraint that is far harder to avoid than Kent's post seems to suggest. I summarized my opinion as:
  1. Transient branches are fine (even ones that last more than a few hours) and do not cause the waste/retrofitting described. But you do need to follow some rules regarding integration structure and frequency.
  2. Variant branches are "evil", and should be replaced by good architecture/factoring, or else by configuration that is bound at a later time (build, install, or run time rather than branch time).
  3. Multiple release branches are often a necessity in order to support multiple releases. Supporting multiple releases is highly undesirable, but it is often unavoidably mandated by the business/customer.
Under the heading of "Transient Branching", I include patterns like Private Branch, Task Branch, and "organizational coping" branches like Docking Line, and the Release-Line + Active-Development-Line pair. Another example (though not short-lived) is Third Party Codeline. And of course if any branching is done, then proper use of a Mainline is essential (I think Mainline does for branching what refactoring does for source code).
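
To make point 2 concrete, here is a minimal sketch (in Python) of handling customer variants through late-bound configuration instead of variant branches. Everything in it is hypothetical and for illustration only: the variant names "acme" and "globex", the VariantConfig fields, and the format_invoice function are assumptions, not taken from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantConfig:
    """Settings that differ per customer variant (all fields hypothetical)."""
    name: str
    currency: str
    audit_logging: bool


# One code base: each variant is a piece of data, not a separate branch.
VARIANTS = {
    "acme": VariantConfig(name="acme", currency="USD", audit_logging=True),
    "globex": VariantConfig(name="globex", currency="EUR", audit_logging=False),
}


def load_variant(name: str) -> VariantConfig:
    """Select the variant at run time (late binding) rather than at branch time."""
    try:
        return VARIANTS[name]
    except KeyError:
        raise ValueError(f"unknown variant: {name!r}") from None


def format_invoice(amount: float, variant: VariantConfig) -> str:
    """Shared logic: a fix made here reaches every variant at once."""
    line = f"{amount:.2f} {variant.currency}"
    if variant.audit_logging:
        line += " [audited]"
    return line
```

With this shape, a defect fixed in format_invoice does not need to be retrofitted into a per-customer codeline; only the configuration differs between customers. In practice the VARIANTS table would more likely live in a configuration file than in the source.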

So I vigorously agree with the desire to avoid adding new codelines for supporting multiple releases, and it is certainly a good idea to question whether doing so is truly necessary (and to fight "like heck" against using branches as a long-term solution for handling multiple variants). But once you understand the business need that is driving the multiple-maintenance constraint, I don't think challenging it too vehemently is a great idea.

We might still disagree with doing it, but at that point I think we need to "bite the bullet" and do it while perhaps exploring softer communication alternatives to persuade the business in the future. Part of that can be getting the business to:
  • Fully acknowledge and appreciate that each additional supported release/variant is a bona fide new project with all the associated implications of added cost, effort, management, and administration. (Often a new variant-line or release-line adds 40%-80% additional effort to support and coordinate.)
  • Agree that if Multi-Tasking an individual decreases productivity and flow by increasing interruptions and context-switching, then Multi-Project-ing the same team/codebase is an even grander black hole that sucks away resources and productivity for many of the same reasons.
  • Agree that when we do decide to support an additional release/variant, the new project should have some sort of charter and/or service-level-agreement (SLA) that clearly defines the scope and duration of the agreed upon effort and its associated costs.

For some additional reading, see DualMaintenance and UseOneCodeLine on the original Wiki-web, and BranchingAndMerging, ContinuousIntegration, and AgileSCMArticles on the CMWiki Web.
