A discussion of development against the odds, describing process, personal and environmental barriers between the people and the job they are trying to do.
This article seeks to explore why things are not always done 'properly'. Based on personal experience working in various teams, I have noticed a common problem: some people in the team 'know' what is idiomatic, what good and bad practices are, and yet the code still rots. I would like to explore with you why this might be, and thus draw some conclusions for both the software developers and the project managers.
Fundamentally, I have identified that there are various 'barriers' in place when a developer is given a task. These barriers can be to do with the process, the tools in use, the other people on the project, or the developer themselves. Typically these barriers are of the kind that narrow the amount a developer touches a code base, and I will argue that this is often just the opposite of what is needed.
The situation where barriers exist tends to encourage practices which I identify as anti-patterns. That is, they are things which occurred frequently, in response to the forces in play, but because the forces in this case were the barriers hampering good practice, the result was a degradation of the code base.
The most significant barrier I encountered was the disabling of multiple checkouts in the source control system. By this, I mean the ability for more than one developer to checkout a given file simultaneously, such that the can all make changes independently.
Let's explore what the barriers might be, with an attempt to categorise them:
Process
- Very bureaucratic change process
- Poor access to files e.g. multiple checkouts not allowed, remote working connection speed problems
- Restrictions on code changes. E.g. in later phases, management decide to touch as little as possible
- Lack of automated testing
Personal
- Lack of ability to make the right decisions about changes
- Fear of failure
- Wanting to keep a low profile
- Laziness
Environment
- Long build times
- High dependencies among modules
I decided to write this article based on a small subset of these barriers, which I had identified as particularly prevalent in one place of work. However, thinking about the topic, I identified other barriers too, which might be causing similar problems for other teams.
So, the talented developer can often be frustrated at the system, his colleagues or his manager. The question I think is important is: why are these barriers a problem? What effect do they have exactly?
Let's examine the kinds of changes that are made to a code base once the project is well under way, since these are often different to the changes made in the opening or closing stages of the project, and this is when the majority of the changes are made. There is compilable code which correctly implements some desired features. There is more functionality to add, existing functionality to change as requirements change, and there are bugs to find in what exists. Thus, changes are afoot, rife perhaps, to existing code. We are actually in maintenance mode as well as new development mode. Assumptions will change and some new features will pose challenges to the designs in place.
So, let's consider a developer who is assigned a task. The product has a graphical front end and works with a hierarchy of objects. We display the name of the item in the list view, but for some items, we wish to display something else. Perhaps a different kind of name, for example a name embellished with other information like the count of sub-objects. The experienced among us might immediately pull out new concepts from this. Aha, we have the concept of displayed name, or list view display value, or display value parameterised by view type. We might have a session at the whiteboard and/or pub to see which concept best fits the expected future changes (perhaps other views might want to display different information too).
Our programmer, however, does not think like that, or at least does not act like that. They want to have an easy life, sign off the requirement and go to the pub. What is the easiest/quickest way to get this done?
Answer: Go to the list view code, and where the name is being pulled off the items, do a dynamic cast to see if we have the special type. If so, call a function on it to get the extra information to display and append this to its name. One function, plus a #include statement: one source file changed. Ah, and no one has that file checked out. Magic!
Someone else given this task would have approached it differently. They would have introduced the concept of different display properties, perhaps parameterised by view type, or something. Ok, so we add some new virtual functions to the object base class. Provide a suitable implementation for most items, and for the special item, we provide the desired implementation. Maybe all the view classes need to be given knowledge of this new concept too. Without getting into a discussion about the merits of implementations in virtual function hierarchies, etc., I hope you can see where I am coming from. The changes necessary here are now much wider, in that they affect both more files directly through code changes, and indirectly through dependencies, most notably the object hierarchy. We may even end up with a fundamental change to both the objects and the view. Heaven forbid!
What have I explained here? That the changes you can make to software can be done in different ways. I think there is often a correlation between the correctness of a change and the amount it affects the code base. Here is a hierarchy of changes, ordered by ascending effect on dependent files:
- Code change in one function, as outlined above. Note the nasty smell of a dynamic cast, or other form of type ascertainment.
- Add a parameter with a default value to a function that needs to do something different in some circumstances, and call it with a non-default value where the new behaviour is required. (requires change to two files for the changed class and files for calling classes where some different behaviour is required, and rebuild of dependent things).
- Add a new function (same overhead as above, but with additional potential responsibilities of documentation, etc. and the visibility of having changed that interface more).
- Add a new virtual function to a base class (requires access to the base class, and the derived class for which the specialised version is required, requiring access to more files and a longer rebuild). Perhaps other families of objects need to change too to know of the new concept in town.
In the team where this kind of thing was most prevalent, there were two main barriers in play. One was personal: laziness, lack of thought and ability to pull out concepts. The other was process - multiple checkouts were not allowed. As a result, the codebase was already in decline a long time before the end of the project. Particularly prevalent were functions which had a Boolean parameter at the end, one of the lower impact changes mentioned above. This was my anti-pattern nemesis.
Anti-Pattern - Parameterise Function with Boolean (often with default value)
Forces: Some functionality exists, but a change in behaviour is needed in certain circumstances. Other places using that functionality are required to remain the same.
Solution: Add a Boolean parameter to the function. Give it a default value, and in the function body, preserve the current flow through the code for that value. For the non-default value, do something different as required.
The parameter would often be called doX or dontDoX, indicating that you could get it to do something subtly different with either true or false. There would often be a default so that existing code would compile and do the same as it did. Most people reading this article would try and think of more graceful ways to solve this problem; use of Template Method design pattern perhaps, or perhaps two separate functions, each of which calls a helper to do the common bits, the names of these functions clearly indicating which behaviour they cause. The point is that that process of thinking simply did not occur. Sometimes it was lack of ability or experience, sometime laziness.
Consequences: The results of this anti-pattern are mysterious calls to functions. GetName( false ); Whilst totally comprehensible to the developer at the time, should either the developer or the time change, comprehensibility declines. One could use techniques designed for comment-less coding to mitigate the mess somewhat, such as declaring a named constant variable and passing this in, or using an enum, but I feel this is really tidying up after a mess, rather than avoiding creating that mess.
But there were code reviews; surely only good code will get checked in?
At least, the process included a mandatory code review for each checkin. The reviewer had to check a box on the requirement sign-off page. Unfortunately, this review more often than not ended up as simply reading the code. Any questions asked were dismissed with lame excuses. "Oh, yes, I could do it that way, but this works and is much simpler." "That would be overkill." "The files were checked out."
I managed to get a reputation for being a difficult reviewer, because I would demand changes where I felt that things were not being done properly. A few people in the team really liked this, and we formed a band offering good constructive criticism of each other's designs and code. Other team members sought reviewers of the old school type so they didn't have much hassle.
Now for the second main barrier in place: the lack of multiple checkouts. This was rigidly enforced and defended by the project management team for fear of collisions of code changes causing compilation failures and bugs. It is certainly true that when checking in a file that someone has changed under your feet, you have to go through a process of reconciling those changes and making sure your own changes still make sense. But, that is something you have to do anyway when making a checkin, because the same issue occurs between different source files. For example, the same issue still applies if you decided to call a function which no longer exists on a class, or for which someone has changed the signature.
So, their worry was really not fully founded in my opinion.
Furthermore, it had exactly the opposite effect to that intended. The more diligent programmers did try to do things properly and, as discussed above, often needed to make small changes to a larger number of files. We often ended up in deadlock. Because someone knows they won't be able to checkout a file if someone else grabs it first, they check it out as soon as they can and hog it. So, the second person needing it does the only sane thing and makes their own copy writeable so as they can get on with their own changes. There are ways to make this more robust. These might range from simply writing down which files are thus affected, to making a temporary source code control system for such situations. However it is done, we have the same risks as when the main tool is used to do multiple checkouts, but with extra overhead and risks as well! These extra risks are things like the developer accidentally overwriting their working copy, or forgetting to integrate the changes made in the meantime. The multiple checkout facility, in this case of SourceSafe, is specifically there to help manage this situation for the developers, but it wasn't being used, for fear of the very thing which it helps to avoid.
Now, it also had the effect of changing the nature of changes made to the files. Things tended to back up on developers' machines who were not lucky enough to get all the files they needed, thus changes were often much larger than they could have been. Several requirements might be addressed in one checkin, for example. With multiple checkouts allowed, this would not be the case, as you have the ability to keep checking in your atomic changes, and keeping them concerned with one issue at a time. So, we had a vicious circle. Lack of multiple checkouts made the lack of multiple checkouts much worse to live with!
What can we learn?
Any constraints on developers will have a tendency to reduce their effectiveness, when considering the project as a whole, evolving thing. Constraints can be those imposed by the management via process, or the tools in use, through to the developers themselves, through laziness, lack of ability, or lack of experience.
If you are a project manager, be aware that the constraints you impose on your team must be those which will encourage better code being checked in, not merely conform to some loose idea of how things should be done. A good constraint might be "any code checked in must have unit tests". A bad constraint might restrict access to files (as with the well-intended lack of multiple checkouts).
If you are a developer, think hard about what you check in. Is it a result of some compromise because of file availability, or because you didn't feel like changing base object class for some reason? Think how your changes will affect the future of the code, and if you can make that future brighter, then consider doing so. Fight your corner where you can see that well-meant processes are causing a derogatory effect.
There will be barriers in place at your place. Hopefully they will be lower than you can jump.