100% Code Coverage via automated tests in Symfony applications

On 2017-04-26 the Symfony User Group Cologne gave me the opportunity to speak about Code Coverage.

If you create software that is planned for continuous enhancements and maintenance over several years then you’ll need a sustainable strategy for code quality.
Code coverage by automated tests is one important metric, and its value should equal 100% at all times.
My talk shows why this is so and how it can be achieved in PHP / Symfony based applications.

See the full presentation at Speaker Deck

Natural Enemies of “Moving Fast” in Software Development Projects, Part 1

“Writing tests is the natural enemy of moving fast.”

Some of our dev teams create software using Adobe Experience Manager (AEM, formerly known as CQ).
When you do AEM or Apache Sling then the annual adaptTo tech meetup in Berlin is the place to go.

This year, i.e. September 2016, I heard an interesting phrase.
Twice.
“Writing tests is the natural enemy of moving fast.”

I was surprised.
Sure, if you’ve written only production code in a project and only later on start writing the tests, then that will take a lot of time, it will likely be painful, and it will slow down progress.

But the 2nd time I heard the phrase was in a talk about TDD, and that’s when I was really surprised.
I’ve been doing XP and TDD since 2005 and – as far as my experience goes – writing tests (and preferably early) is key if you want to move fast in any project of considerable size (… and AEM projects usually are of considerable size).

But I might be wrong, so let’s do an example coding task.


The irritat0r coding task

Requirements

Let’s create a functionality that displays personalized texts to web users
in order to irritate the hell out of them
(which is my understanding of what advertising is basically all about).

The text shall look like this:
“Hey ${salutation}, did you know that ${message}?”

  • If the user is known, e.g. a logged-in principal, then salute her by her first name.
  • If the user is anonymous then salute her with just “you”.
  • As for the message, it shall be possible to send a String message.

Examples:

  • Hey Andreas, did you know that the longest recorded flight of a chicken was 13 seconds?
  • Hey you, did you know that it is 8 times more likely to get killed by a pig than by a shark?

In version 1 we will implement the following message types:

  • a text message, i.e. a fixed text, like on http://www.did-you-knows.com/
  • a message that displays how many kilometers the Earth has moved around the sun since the user’s birthday, which of course only applies to logged-in users who provided their birthday.

To keep things properly irritat1ng, the application shall choose the message randomly from a pool of all appropriate messages.

In version 1 the functionality shall be available as

  • an HTTP response of type text/plain.
  • a JSON representation
  • an AEM component

As for non-functional requirements:

  • Production code shall be 100% covered by tests
  • The code shall be part of a larger system
  • The code shall allow for being changed and extended for new functionalities

 

Solution Outline

First, let’s do a short design session (a rough code sketch follows the list):

  • We’ll create a Java / Maven OSGi bundle called “irritat0r”.
  • We’ll create an AEM servlet called “Irritat0rAemServlet” which allows access to the HttpSession and will send the irritat0r text via the HttpServletResponse.
  • We’ll get the user’s id from the HttpSession if the HttpSession contains such an id.
  • If it does we’ll use the (existing) UserService and get the User object by that id.
  • We’ll create a function that creates the salutation (for known user or anonymous)
  • We’ll create a function that gets 1 appropriate message from the pool of messages.
  • We’ll create a function that assembles the text out of the salutation and the message.
  • (intentionally left blank… will follow in part 2)
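
To make the outline a little more tangible, here is a minimal sketch of how the core pieces could fit together. All names here are hypothetical, mine rather than the actual Part 2 code:

interface User {
    String getFirstName();
}

interface MessagePool {
    // Randomly picks 1 appropriate message; a null user means an anonymous visitor.
    String randomMessageFor(User userOrNull);
}

class Irritat0r {
    private final MessagePool messagePool;

    Irritat0r(final MessagePool messagePool) {
        this.messagePool = messagePool;
    }

    // Assembles the text out of the salutation and the message.
    String irritate(final User userOrNull) {
        return "Hey " + salutationOf(userOrNull)
                + ", did you know that " + messagePool.randomMessageFor(userOrNull) + "?";
    }

    // Known user -> first name, anonymous user -> just "you".
    private String salutationOf(final User userOrNull) {
        return userOrNull == null ? "you" : userOrNull.getFirstName();
    }
}

Note that nothing in this sketch knows about OSGi, servlets or HTTP – and that will matter in a minute.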

So, where should we start?

We could start with the OSGi part

To do so we would need some OSGi container which needs to be set up and started.
Luckily, OSGi requires us to start it only once – but we’d need some way to deploy the bundle into the container.

We could deploy manually

  • Compile the code
  • Package it as a jar including the meta data
  • Open Felix (or Equinox etc)
  • Choose Bundle
  • Upload
  • Restart Bundle

That’s what needs to be done each time we’ve changed the code and we want to see the effect.

I guess we agree that this is a waste of time so let’s state that
“Manual work is a natural enemy of moving fast.”

 

We could use Maven

(or Gradle etc)

So the manual part is transformed into an automated process. Good.

But we’d still need to run Maven each time we’ve changed the code and we want to see the effect.

You may have become accustomed to this procedure but, believe me, somewhere out there PHP, Ruby and JavaScript developers and even HTML coders are laughing at a random Java hacker right now.
How long do your Maven scripts take? 30 seconds? 2 minutes? Now multiply that by the number of changes you make to your code.

I believe that
“Deploying to verify code changes is a natural enemy of moving fast.”

 

We could fake it

There’s a reason why ZeroTurnaround has been successful with JRebel, and the reason is Maven. And Ant. And, you name it.

Using JRebel would indeed remove the need to run Maven for verifying code changes – at least for code without OSGi annotations. We’ve been using JRebel in AEM projects. It works for “normal” code but it cannot (reliably) redeploy OSGi services and Servlets, so it has been Maven time for us again.

Also, JRebel is not cheap – it will cost you ~ $475 (as of 2016-10-10).
Per year.
Per seat.

 

But there’s another reason

and it’s less obvious: It’s what happens to the way you think.

You probably know that writing tests for an OSGi service is not the easiest task you can think of.
You might need to install OSGi Mocks (which are in fact stubs, not mocks), Google for code examples and fiddle around with them.
You’ll probably not reach 100% coverage but you’ll have an excuse because, hey, “that’s hard to test” or “that part needn’t be tested”.

Eventually, you’ll get used to a code coverage below 100% before the project has even really started.

And you’ll be so happy you made it run after all the hard work that you might feel inclined to add more code, i.e. domain logic, to the OSGi service.
As if to make all that pain pay off.

Chances are that this additional code will not be fully covered with tests either, partly because it will be inside of a class that is hard to test, partly because you have already put up with less than 100% coverage.
Test coverage will be the first “victim” of that approach, and the Broken Window Theory will take its toll on you.

Second, putting domain logic into the OSGi service would break Clean Code practices like the Single Responsibility Principle.
Also, as OSGi is the container and hence by definition located on the outside layer of the application, it would violate the Clean Design principle that “All dependencies must only point inwards”.
See also “Ports & Adapters”.
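
As a minimal sketch of what that could look like in this project (again with hypothetical names, reusing the Irritat0r core sketched above): the servlet is a mere adapter on the outside layer, and its dependency points inwards.

import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Outer layer: a thin adapter that translates HTTP concerns into one call
// to the plain-Java domain core. No domain logic lives out here.
class Irritat0rAemServlet extends HttpServlet {

    // The dependency points inwards, towards the domain.
    private final Irritat0r irritat0r;

    Irritat0rAemServlet(final Irritat0r irritat0r) {
        this.irritat0r = irritat0r;
    }

    @Override
    protected void doGet(final HttpServletRequest request,
                         final HttpServletResponse response) throws IOException {
        final HttpSession session = request.getSession(false);
        response.setContentType("text/plain");
        response.getWriter().write(irritat0r.irritate(resolveUserOrNull(session)));
    }

    private User resolveUserOrNull(final HttpSession sessionOrNull) {
        // Hypothetical: read the user's id from the session (if any) and
        // resolve it via the existing UserService; null means anonymous.
        return null;
    }
}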

Those violations will eventually lead to messed up dependencies, and, boy, if you really want to slow down development then screwing up your dependencies is a pretty efficient way to do so.

That’s why
“Losing control of dependencies is a natural enemy of moving fast.”

And thus
“Binding domain logic to external devices, ports etc is a natural enemy of moving fast.”

Third, if you need to deploy to verify your code you’ll probably verify your code less often.
If you know that running Maven takes 2 minutes you might not want to run it after each code change.
Instead you’ll keep piling up code changes and deploy them all at once, every 15 minutes or so, and go get a cup of coffee while Maven is running.
Sounds familiar?
(Maybe Maven is secretly sponsored by the coffee machine industry, but that’s just a theory.)

Chances are that parts of the code you created within those 15 minutes turn out to be flawed. Sometimes it will be “just a bug”.
But sometimes you will realize that you’ve chosen the wrong approach and now have to delete, change and redo a major part of the code. And as you piled up code changes this might lead to a significantly large piece of rework. That’s of course a waste of time and it will slow down progress.

And that’s why
“Verifying code changes in chunks of more than just a few lines is a natural enemy of moving fast.”

 

…and there’s nothing to learn here

Lastly, why would we want to start by testing OSGi?

It’s safe to assume that OSGi works, that’s why we use it in the first place.
There’s nothing to learn about the domain here.


We could start with the Servlet/JSP part

Alternatively we could start by coding the Servlet or maybe some JSP.

However, we’d be facing the same problems:

  • We’ll need to deploy each time to see the effect.
  • Testing web output will require additional effort like setting up Selenium, HtmlUnit etc
  • Getting 100% coverage will be hard to do and we might put up with less than 100%.
  • Again, you might want to make the pain pay off and add more code, i.e. domain logic, to the Servlet or – even worse – the JSP.

There may be something to learn here, though, i.e. the look and feel of the front end.

 

…but you might be F5’ing a lot

Making sure the front end looks the way it should is an important task, and I admit it’s still pretty hard to do so without manual work. (BTW: has anyone evaluated Galen yet? I’m curious…)

However, if that requires running mvn install each time you want to preview your UI then that’s a waste of time. (HTML coders like immediate feedback, e.g. by using Firebug, brackets.io etc.)

Also, once you have set up this process you might feel inclined to use the UI to also test functionality, not just look and feel.
As a result you’ll be pressing F5 a lot (-> waste of time), comparing the browser’s content to some expectation about the outcome.
In order to do so you’ll be using – your eyes.

First, it’s a manual process and we already know that’s a natural enemy.

Second, it’s exhausting. You might feel as if you were moving fast.
You’re not.
In fact you’re just keeping yourself busy and the way you experience time will change.

Third, your eyes will only see what they are looking for.
If your code change has broken stuff that your eyes are not looking for then you’ll miss that bug.
Let alone if stuff has broken in other areas of the site, i.e. on pages you did not F5. You’ll miss those bugs as well.

And delivering a bug, even more so an avoidable bug, then having testers report it back and fixing it, is a HUGE waste of time (cf. Cost to Fix), so
“Bugs are natural enemies of moving fast.”

And that’s why
“Testing domain functionality via the UI is a natural enemy of moving fast.”


First pay-off

We haven’t written a single line of test or production code yet.
However, the Test Driven Development approach has already paid off, I think:
TDD kept us from starting with the wrong tasks.
It prevented us from using “Natural Enemies of Moving Fast”.


For some real coding, stay tuned for Part 2, which I’ll publish soon and which will include a lot of code examples, I promise.

 

Spoiler alert

Where should you start?
Start at the core, i.e. the domain logic.

You’ll easily be at 100% coverage right from the beginning, and you’ll stay at 100% from that point on.
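
Just as a taste (hypothetical code against the Irritat0r core sketched earlier, not the actual Part 2 code), the very first test could be as plain as this – no container, no deployment, feedback within milliseconds:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class Irritat0rTest {

    @Test
    public void salutesAnonymousUserWithYou() {
        // A stub message pool that always "randomly" picks the same message.
        final Irritat0r irritat0r = new Irritat0r(
                user -> "the longest recorded flight of a chicken was 13 seconds");

        assertEquals(
                "Hey you, did you know that the longest recorded flight of a chicken was 13 seconds?",
                irritat0r.irritate(null));
    }
}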


What do you think?
Please leave a comment and let me know.

If you like this post then please share it.

Happy hacking!

The REAL reason why developers need Soft Skills

Developers like to code.
All day, sometimes all night, too.

Developers like playing around with things.
New frameworks, yeah. New tools, whoopie.
We like to see a one-liner have big effects, or a small piece of configuration dramatically change the behavior of a program.
We look for that Arthur C. Clarke moment when “Any sufficiently advanced technology is indistinguishable from magic”.

When it comes to Soft Skills, well, that can wait for later when none of the IMPORTANT stuff is left.
Right?


Picture a new development project.

You do some analysis, some solution design and you get a rough idea about how much effort it will take.

Let’s say the effort needed is 2,100 person days.
So if 1 person does all the work, and we assume 21 work days per month then this 1 person will need 100 months to finish the project.

That’s 8 years.

The processes will be super-efficient, though: no overhead whatsoever!
But you’ll need to wait 8 years before anyone can use the software.
Just think of the major versions of your favorite programming languages, application servers, operating systems, tools, libraries etc 8 years ago and you know how outdated the software will have become by the time it’s done.

And what about that poor guy or girl hacking code on the same project for 8 years using stuff that has become outdated years ago?
Unless you find a way to legalize slavery, odds are that this developer will quit at some point during the project.

OK, new idea.
Let’s have 100 developers work on the project.
This should make the project get done in only 1 month!
Right?
I guess you already know that this is a fallacy. Planning, coordination, integration, QA, communication paths etc: the overhead will explode instead.
And how many of those 100 people did you have in mind when you estimated the 2,100 days?
Probably you were thinking about the effort it takes to design, implement and test the software but chances are you weren’t thinking about all the additional overhead needed to make 100 people work together.
I’m, of course, referring to “The mythical man-month” by Fred Brooks.

So what is the ideal number of people to work on such a project?
Well, it’s more than 1 and less than 100, this much we know by now.
As a matter of fact, our industry is too young to know that number for sure, based on hardcore scientific research at least.

There’s one rule-of-thumb, though, that I stumbled upon some years ago and I like to use in such situations. (I think it originates from the works of Brooks, Rausch-Schott and Retschitzegger but I can’t find the reference, sorry).

The math goes like this:
${ideal number of people on the team} := square root of ${number of person months}
In our scenario the effort was 100 person-months, i.e. √100 yields 10 people.*
If that was my project I might suggest setting up a team of 10 and let them work for something like 1 year. (Actually, I prefer slightly smaller teams.)
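
Written out in my own notation (not from the original source), with the neat side effect that the nominal duration in months equals the square root as well:

\[
n_{\text{team}} = \sqrt{E},
\qquad
t_{\text{months}} = \frac{E}{n_{\text{team}}} = \sqrt{E},
\qquad
E = 100 \;\Rightarrow\; n_{\text{team}} = 10,\ t = 10.
\]

Here E is the effort in person-months. So 10 people for 10 months, plus some slack for overhead, is exactly how “a team of 10 for about a year” falls out of the rule.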

When 10 people work closely together for 1 year straight then they’d better know how to get along with each other.
They’ll need commitment, empathy, respect, reliability, team spirit, collective ownership and focus on the domain. They’ll need the ability to help, criticize and take criticism, make rules and follow them and most importantly:
They need the willingness and the skills to communicate about the project every day. Well. A lot. Even face-to-face.
That’s the Soft Skills required for such projects.

What if you don’t have these skills?
Well, according to the rule of thumb you will be able to work only on projects that can be done by 1 person.
Do the math, allow some rounding and you’ll get 2 months max. (√2 months => ~ 1.4 people)

So if you don’t take Soft Skills seriously you will be limited to projects that require up to 2 months effort. Maybe 3 months; it’s a rule-of-thumb after all.
But still, if you’d like to grow into more complex challenges, well, there aren’t that many projects that are complex, challenging and take just 2, maybe 3 months to get done.

Frameworks, tools, libraries come and go but good communication and team skills will permanently add to your profile. They are the really important stuff you need for complex tasks.

 

What do you think?
Please leave a comment and let me know.

If you like this post then please share it.

Happy hacking!


* And minus 10 people, of course (square roots have two solutions, after all).
But, personally, I don’t think that firing 10 people will get the project done in 10 months…

Functional DAOs: Amazingly easy way to make Java faster and less hungry for RAM

Some weeks ago I did a code review and I saw something like this in the source:

import java.util.ArrayList;
import java.util.List;

class UserExporter {
    private final User2CsvConverter user2CsvConverter = new User2CsvConverter();

    private final UserDao userDao;

    public UserExporter(final UserDao userDao) {
        this.userDao = userDao;
    }

    public List<String> getCsvLines() {
        final List<User> userList = userDao.loadAllUsers();
        final List<String> lines = new ArrayList<>();
        for (User userItem : userList) {
            lines.add(user2CsvConverter.convert(userItem));
        }

        return lines;
    }
}

So there’s some data storage with “Users”. The userDao (DAO = Data Access Object) is responsible for getting the data and then returning it as a list of Users.
The next step loops through all those User items in the list we just got from the DAO and executes some magic on each of them, i.e. in this case it converts each User object into a String, obviously in order to export all those Users as lines of a CSV file.
Each CSV line will, again, be added to a List of Strings.
Later on the actual CSV export happens:

    int process() {
        final List<String> csvLines = userExporter.getCsvLines();

        int linesWritten = 0;
        csvWriter.writeLine("First Name;Last Name;Email");
        for (String line : csvLines) {
            csvWriter.writeLine(line);
            linesWritten++;
        }

        return linesWritten;
    }

So the List that the UserExporter.getCsvLines() method returned will, again, be looped over in order to write each line to some external output device, e.g. a file or System.out.

The code is straightforward, it has a clean layering (the DAO abstracting the storage, the CsvWriter abstracting the actual output device) and it even has 100% test coverage.
So what the heck could I possibly be complaining about, right?

Well, first of all not only have I seen a lot of similar code like this but I actually used to code like this myself.
Also, this is obviously a simplification of the real code (which I cannot show you, of course).

There are 2 important things to know about the real code:
1) The real data storage is pretty big so in some cases the DAO may return a lot of items.
2) The real code does much more magic on the lists.
As a result I know that we shot down the server a bunch of times when we started big exports. It struggled for hours and then failed with OutOfMemoryErrors. (And as web client users got impatient they repeated the export a few more times, only to find out that it – surprisingly – didn’t make the server work faster.)

Someone suggested the usual “we need more RAM” thing but there’s an easier and better solution at hand.
The biggest problem is all those ArrayLists that are being created all over the place.
The DAO creates one and returns it to the UserExporter, which “only” loops over it but creates its own ArrayList, and so on.
Every time, you use up a lot of memory for those arrays and a lot of CPU for looping.

For the complete original, imperatively coded example, cf. https://github.com/acnebs-com/clean-code/tree/master/functional-dao/src/main/java/com/acnebs/posts/functionaldao/imperative

So how can we fix it?
Let’s get rid of the lists!
How do we do it?
We use functional style programming.

Here’s how you can use Java 8 to get an equivalent solution with NO (in numbers: 0) Lists at all.
Instead of returning a List, let’s make the DAO take a Java 8 Consumer that will be called back for each “User” item:

import java.util.function.Consumer;

interface UserDao {
    void loadAllUsers(Consumer<User> userConsumer);
}

Now the Exporter can do this:

    public void getCsvLines(final Consumer<String> lineConsumer) {
        userDao.loadAllUsers(
                (user) -> {
                    final String line = user2CsvConverter.convert(user);
                    lineConsumer.accept(line);
                }
        );
    }

It calls the DAO, and for each User item the DAO “returns”, or rather “offers” (line 3), the converter can do its magic (line 4). No List, no loop needed.
The CSV line that is being created for each User item doesn’t get added to a list either. Instead it will be “returned” or “offered”, well, handed over to a Consumer that is called back for each CSV line (line 5).

This String Consumer is passed to this method by the outer layer of this application:

    void process() {
        csvWriter.writeLine("First Name;Last Name;Email");
        userExporter.getCsvLines(csvWriter::writeLine);
    }

Again, no Lists, no loops.

When you use this approach you kind of execute the “front end”/output code when you access your data while still keeping a clean architecture.

Or, as another way to look at it: In imperative programming, when your method returns a value, you force the method’s clients to “live” with this value no matter what. If that return value happens to be a big Collection then there’s no other way for the client but to use this Collection with high memory consumption and all.
But if your method takes a Consumer instead then the clients can choose how to handle the data. If all they need is just a list after all then they can pass a plain and simple consumer that will do the trick. Actually that’s what I do in my unit tests (lines 4+5):

    @Test
    public void testLoadAllUsers_ok() throws Exception {
        UserDaoJsonImpl dao = new UserDaoJsonImpl("/users-integration-test.json");
        List<User> data = new ArrayList<>();
        dao.loadAllUsers(data::add);
        assertEquals(3, data.size());
    }

But if they want some special handling, like in this example application, then the clients will have the freedom to choose a more clever way.
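
For instance (a hypothetical client, not taken from the example repo), the line counting that the imperative process() did can be restored without any List, by streaming each line straight to the writer and counting on the way:

import java.util.concurrent.atomic.AtomicInteger;

//...

    int processCounting() {
        final AtomicInteger linesWritten = new AtomicInteger();
        csvWriter.writeLine("First Name;Last Name;Email");
        userExporter.getCsvLines(line -> {
            csvWriter.writeLine(line);          // straight to the output device
            linesWritten.incrementAndGet();     // no buffering, just counting
        });
        return linesWritten.get();
    }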

This gets even more exciting when you think of how to handle special cases:

    public List<User> loadAllUsers() {
        try {
            final URL resource = this.getClass().getResource(path);
            if (resource == null) {
                throw new FileNotFoundException(path);
            } else {
                final InputStream is = resource.openStream();
                final ObjectMapper mapper = new ObjectMapper();
                return mapper.readValue(is, new TypeReference<List<User>>() {});
            }
        } catch (IOException e) {
            throw new RuntimeException("loadAllUsers: IOException e", e);
        }
    }

In this solution, when an IOException occurs, the loadAllUsers method throws a RuntimeException, so every client has to live with that behavior.
Alternatively the method could have caught the IOException and logged a message to a logging framework.
Or it could have swallowed the IOException, leaving no clue for client code about what happened.
Whatever solution we choose – the client code will just have to accept it.

Now think of giving the client code freedom to choose how to handle special cases:

    @Override
    public void loadAllUsers(final Consumer<User> userConsumer) {
        loadAllUsers(
                userConsumer,
                e -> {throw new RuntimeException("loadAllUsers: Exception e: " + e.getMessage(), e);}
        );
    }

    public void loadAllUsers(final Consumer<User> userConsumer, final Consumer<Exception> errorConsumer) {
        try {
            final URL resource = this.getClass().getResource(path);
            if (resource == null) {
                throw new FileNotFoundException(path);
            } else {
                final InputStream is = resource.openStream();
                final ObjectMapper mapper = new ObjectMapper();
                final TypeReference<User> valueTypeRef = new TypeReference<User>() {};
                final JsonNode jn = mapper.readValue(is, JsonNode.class);
                final Spliterator<JsonNode> jsonNodeSpliterator = jn.spliterator();

                StreamSupport.stream(jsonNodeSpliterator, true)
                        .forEach(jsonNode -> {
                            try {
                                final JsonParser jsonParser = jsonNode.traverse(mapper);
                                final User user = jsonParser.readValueAs(valueTypeRef);
                                userConsumer.accept(user);
                            } catch (IOException e) {
                                errorConsumer.accept(e);
                            }
                        });
            }
        } catch (IOException e) {
            errorConsumer.accept(e);
        }
    }

In line 28 an IOException is caught. The client of this method can now choose what to do in such an event, cf. line 5:
It can choose to re-throw the Exception, aborting the whole export even if only 1 User item is broken, as it does in the example above.
Or it could choose to ignore broken items, just log them and continue:

    @Override
    public void loadAllUsers(final Consumer<User> userConsumer) {
        loadAllUsers(
                userConsumer,
                e -> {LOGGER.error("An oopsie occurred: " + e.getMessage(), e);}
        );
    }

Or, how about a progress indicator that should be triggered on each item in order to output some info?
Just hand in one more Consumer for that:

    public void getCsvLines(final Consumer<String> lineConsumer, final Consumer<User> progressConsumer) {
        userDao.loadAllUsers(
                (user) -> {
                    progressConsumer.accept(user);
                    final String line = user2CsvConverter.convert(user);
                    lineConsumer.accept(line);
                }
        );
    }

For the full code please cf. https://github.com/acnebs-com/clean-code/tree/master/functional-dao/src/main/java/com/acnebs/posts/functionaldao/functional8

OK, I can hear you: “We’d like to use Java 8 but they don’t let us”.
Fair enough, but actually I have been lying to you – there’s hardly any “real” Java 8 in the solution I’m proposing.
You can use the same approach and architecture in Java as early as probably 1.0.
Sure, there’s no Consumer interface in Java 7 but so what: just create it yourself.
Actually I should have done that in my Java 8 code, too, because now my public methods have a dependency on java.util.function.
This is generally not the cleanest way to do it, just as you should not use Maps, Lists or Sets in your public method signatures.

interface UserConsumer {
    void doOnUser(User user);
}

//...

interface LineConsumer {
    void doOnLine(String line);
}

//...

class UserExporter {
    private final User2CsvConverter user2CsvConverter = new User2CsvConverter();

    private final UserDao userDao;

    public UserExporter(final UserDao userDao) {
        this.userDao = userDao;
    }

    public int getCsvLines(final LineConsumer lineConsumer) {
        // One-element final array as a mutable counter, because the anonymous
        // class below can only capture final local variables.
        final int[] out = new int[]{0};

        userDao.loadAllUsers(
                new UserConsumer() {
                    @Override
                    public void doOnUser(final User user) {
                        final String line = user2CsvConverter.convert(user);
                        lineConsumer.doOnLine(line);
                        out[0] = out[0] + 1;
                    }
                }
        );

        return out[0];
    }
}

So you can use the functional way in Java 7, too. You just cannot use lambdas.
And, the code looks somewhat awkward, right?
(For more details cf. the complete Java 7 solution)

That raises an interesting question:
If high memory consumption and expensive loops in Java have been bothering us for years while there has been a way to solve it using Callback interfaces – why aren’t we using them in Java 7 all the time?
Is it because developers are too lazy to write those extra lines of code? Hmm.

Maybe it’s just because it feels wrong.
The Java 7 solution just looks unnatural, unfamiliar, it feels like you’re coding wrongly.
So is using your intuition the problem?

Of course you should listen to your intuition.
When you start coding you need to realize what you don’t know in order to find out what to learn.
You will travel from this “unconscious incompetence” to “conscious incompetence”, then “conscious competence” until you eventually know all those things while you’re not even aware of them (“unconscious competence”). (See Four stages of competence for more about this)

It’s this unconscious competence that’s yelling at you from the inside when you’re about to do something awkward, weird, and most of the time this intuition is right: you’re about to do something wrong, so stop it.
But sometimes it’s just another unconscious incompetence.
I learned imperative style programming some centuries ago so it feels natural and familiar. Functional is new so of course it feels strange at first.

And, programming is using a language to utter ideas. So just like human languages, programming languages influence the way you think. If your programming language has no “natural” support for functional programming it’s no wonder you don’t think of it all the time.

So I think we should go easy on ourselves. It’s OK to do things like imperative programming for a while and then improve your style later.
If it wasn’t OK how would we ever start? There will always be a next big thing we cannot afford to wait for.

In fact, I try utilizing this effect: I try to make the good things easy, familiar and natural to use while making the bad things hard.
For example, decoupling by using interfaces is a good thing.
I encourage my developers to create dependencies only toward interfaces. That’s why I want interfaces to be named in a pretty way, like “Exporter”, so it “feels right” to use them. If they were called “IExporter” or “ExporterInterface” then developers might avoid using them.
Implementations, however, should be called in a way that makes developers know immediately that they should not bind their code directly to something like an “ExporterJava8FooBarUsingJsonCsvImpl”.
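
A tiny sketch of that nudge (hypothetical types, in the spirit of the example above):

// Pretty interface name: depending on it "feels right".
interface Exporter {
    void export();
}

// Deliberately clunky implementation name: binding to it directly feels wrong,
// which is exactly the point.
class ExporterJava8FooBarUsingJsonCsvImpl implements Exporter {
    @Override
    public void export() {
        // ...
    }
}

class ReportJob {
    // Client code depends on the interface, never on the Impl.
    private final Exporter exporter;

    ReportJob(final Exporter exporter) {
        this.exporter = exporter;
    }
}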

What do you think?
Please leave a comment and let me know.

If you like this post then please share it.

Happy hacking!