Wednesday 19 November 2014

Make Concepts Explicit when Fixing Bugs

Dear Junior

To find implicit concepts and make them explicit is a very powerful way to improve code. But, sometimes it is hard to judge beforehand which concept is important enough to make explicit. We try to make good judgements, but sometimes we miss.

But, no reason to despair, we can take Time as our helpful aid. So, instead of everything perfect from start, we continuously perfect it whenever we find a reason. Two of the topmost reasons are either a bug appearing or further development of the code.

Let me for now fokus on the case of when a bug appears. If the code does not behave the way we intended it to do, then there is most often a place in the code that was to convoluted. In that convoluted state, subtle misunderstandings and simple mistakes could hide. The solution is to clarify the code until it is obvious whereof the mistake consists.

Things get clearer by examples, so let me clarify what I mean with a very minimal example, the story about a bug and its fix.

In real life, more or less a decade ago, I worked on a project where we did a batched import of records coming from an external source. The code looked roughly like this following in the class ImportCron.

public class ImportCron {

    private RecordImporter importer;
    private Logger logger;

    void runImport()  {
        List records = importer.listWaitingRecords();
        logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn
   …
    }

Now, the import was run once a minute, but most of the time there where non records waiting.  Problem was the import log became filled with rows saying "Importing records: 0" so we had to grep out the non-zero lines each time we wanted to look at what was happening.

One of those days I got tired of this, fired up the editor, and wrapped the logging in an if statement to filter out all those zero-import writes.

    void runImport()  {
        List records = importer.listWaitingRecords();
        if (records.size() == 0)
            logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn
   …
    }

A few days later we released during lunch, which was the best time for our users. The deploy went smooth, apps started up as expected, and soon it was time for the first import "tick".

Within a minute, import was run the first time, and the log line read:
<timestamp> Importing records: 0
While standing confused, a minute went by, and the log line read:
<timestamp> Importing records: 0
Another minute went by … and no log line.
While firing up my editor to check the code, another minute went by and a third log line appeared:
<timestamp> Importing records: 0
A colleague of mine looking in the database calmly reported "We just imported three records".
It only took me a split of a second to realise my mistake once I had the code in front of me.

    void runImport()  {
        List records = importer.listWaitingRecords();
        if (records.size() == 0)
            logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn

    }

Obviously my spine reflex for coding "check for zero" made me write the boolean expression the wrong way around. By the way, did you capture that mistake at first glance in the code on my first description? Confirmation bias is a nasty thing.

Now, I must say this kind of bugs are pretty uncommon, bugs that are typos or simply mispunching the keys. Most bugs in my opinion are rather that pieces of code subtly misunderstand each other.

Well there are at least two ways of fixing this code. The obvious, and fastest would be to quickly change "==" to "!=". However, humbled by the mistake I had done in the first place, I realised that that kind of hasty coding was what got me in the trouble in the first place.

One of my coding mantras since long have been "Code should mean something, not just do something". So, a better way out would be to make the code more meaning-ful. From Eric Evans I learned the phrase "Make implicit concepts explicit", which says the same thing in this context, but gives a better guiding direction forward.

I took to the challenge of fixing the bug by finding what implicit concepts had been missed, and making them explicit until it was blatantly obvious that the code was wrong. An added benefit would be to be able to make a test proving the code was wrong.

Extracting the boolean condition to a method of its own would force me to spell out the meaning of that piece of code.

    void runImport()  {
        List records = importer.listWaitingRecords();
        if (containsRecords(records))
            logger.log(Level.INFO, "Importing records: " + records.size());
        for ( Object rec : records) // imports each of the records in turn

    }

    static boolean containsRecords(List records) {
        return records.size() == 0;
    }

Granted, this is more code than I started with - but I think the code is more "to the point" (phrased inspired by Rickard Öberg, another of the great programmers).

At least this made me able to write a test

    @Test
    public void shouldConsiderEmptyListNotContainingRecords() {
       Assert.assertFalse(ImportCron.containsRecords(Collections.EMPTY_LIST));
    }

Well, at least this is what the test would look today - at the time JUnit looked slightly different.

Anyway, now I have a failing test. Claiming "the empty list does not contain records" is simply false - as proven by the test. We safely update the code to fix the bug.

    static boolean containsRecords(List records) {
        return records.size() != 0;
    }

Upon which the test switched to green as expected.

During this "refactor -> put under test -> fix" something interesting has happened. The concept "import list contains records" that hitherto had been implicitly represented by the technical "records.size() != 0" has now been made explicit and given a name "containsRecords".

It might be claimed that the main benefit in this example was derived from the drive to put code under test before making a change. Undeniably, that is a point, but I think it only tells half the tale. 

Simply putting things under test can be done in a myriad of ways, and I have seen several very technology-based efforts. I do not think those efforts have paid off handsomely. Mostly the code get cut along technical lines to push in tests in the gaps. But, the code does not get more comprehensible over time.

Focusing on "make implicit concepts explicit" have been a more productive course for me when working with code.

Yours

   Dan

ps My team-mates did not force me to skip lunch to fix the bug. We went out together, and when we came back satisfied and rested, I sat down and fixed the code. We released the fix during lunch-time the day thereafter.