In Defense of Human Testing

Published on December 4, 2017 · 8 mins

Developers are hackers — good developers are, at least.

Hackers hate repetition. They hate doing the same thing over and over again, because it dulls the mind and takes time away from the interesting stuff.

It follows that, when a developer has the chance to delegate something to a computer, they’ll take it without hesitation and write a program that will do the heavy lifting for them. This is great: it’s the main reason software exists — to free up the limited time and resources of the human race.

This is how test automation was born, around 20 years ago — probably one of the greatest inventions in our industry. Test automation allows us to write and refactor code with confidence, while also helping us document complex flows and providing greater assurances to clients. It’s a tree that grows more branches every day: unit testing, integration testing, functional testing, acceptance testing, TDD… Each of these has its own place in the software development workflow.

What was once a niche has become the standard in software development, thanks to languages like Ruby, which promote test automation and treat it as a first-class citizen in the developer’s ecosystem. I myself consider a good test suite paramount to a solid development workflow, and I do my best to promote testing at all levels of an organization.

And yet, I’m here today to tell you that test automation is not the silver bullet we like to think it is.

The Illusion of Safety

The fundamental problem with automated tests is that they lie. It doesn’t matter how well written and thoroughly maintained your test suite is: sooner or later it will stay green while your code is broken.

Unfortunately, many developers, especially the most inexperienced, trust their test suite blindly: green means it’s working. This can give you a false assurance that your product is bug-free, when in fact it has bugs just like any other piece of software — but they will be found by your users.

There are many different reasons automated tests are not enough to ensure the health of a codebase — almost all of them boil down to the fact that machines are fast and stupid, while people are slow and smart:

  • Automated tests don’t catch errors you don’t test for. This one is obvious: your test suite will only catch bugs in the conditions and modules you test. You might be testing that your users can pay with a credit card, but what happens when they choose to pay with a prepaid card with no funds on it instead? Will you handle the error gracefully, or will you crash in a rather user-unfriendly way? To automate this test you first have to think of it, and developers are not terribly good at coming up with test scenarios; when they do, they’re often biased (more on this below).
  • It’s almost impossible to do holistic automated testing. Even though we have integration and system tests, it’s very difficult to exercise a complete system the way a real user would. This means that certain bugs that only occur under very specific conditions, often as a result of a system being used over an extended period of time, are difficult, if not impossible, to identify with automated tests. One example is the class of issues caused by data corruption.
  • Some tests can only be performed by humans. Machines cannot test for good UX because, by definition, UX is a human’s world. Perhaps one day AI will reach a point where it’s able to differentiate a good experience from a bad one, but we’re definitely not there yet. Machines do not currently exhibit the empathy that is required to assess a system’s usability.
  • Automated tests are biased. I put this last, but it’s probably the biggest problem with test automation: since automated tests are usually written by the same developers who implemented the features, they are inherently biased by those developers’ understanding of the aspect being tested. If a developer gets a feature wrong, their tests for that feature will be wrong too, and will still pass. In the best case, this means time and money are lost correcting both the feature and the test. In the worst case, a broken feature with passing tests slips into production. Even when you adopt acceptance tests, which check a feature against agreed acceptance criteria, you’re bound to miss some scenarios along the way. The sketch after this list shows how a biased, incomplete test can stay green while the code crashes.
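
To make this concrete, here is a minimal Ruby sketch of that failure mode, using Minitest; the Checkout class and its API are invented for illustration. The developer only considered the credit card flow, so the suite stays green even though a prepaid card without a balance crashes at runtime:

    require "minitest/autorun"

    # Hypothetical checkout logic, invented for illustration.
    class Checkout
      def pay(card)
        return "Payment accepted" if card[:type] == :credit

        # Prepaid cards were never fully considered: card[:balance] can
        # be nil, and `nil >= 10` raises NoMethodError in production.
        card[:balance] >= 10 ? "Payment accepted" : "Insufficient funds"
      end
    end

    class CheckoutTest < Minitest::Test
      # The only scenario the developer thought of. It passes and keeps
      # the suite green, but Checkout.new.pay(type: :prepaid) blows up.
      def test_credit_card_payment
        assert_equal "Payment accepted", Checkout.new.pay(type: :credit)
      end
    end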

This paints a pretty grim picture of test automation and might discourage you from adopting the practice, but that’s definitely not my goal. In fact, automated tests have several advantages over manual tests: they are cheaper, much faster and, when written well, fully reproducible and deterministic. They can dramatically bring down an organization’s development costs and defect density, while at the same time increasing development speed.

We should be able to find a middle ground, so that we can take advantage of both automated and manual testing, using each methodology where and when it makes sense. So how do we do that?

Good Testing Starts with People

There’s really no way around this: if you want good (manual and automated) tests, you need to start with people and ideas, not tools and processes. Here are some suggestions to make your testing practice more human:

  • Identify your MWP. Your MWP (Minimum Working Product) is made up of those features of your product that absolutely must work at all times if you want to retain your users. For a document editor, for instance, this group would include the ability to create a new document and edit an existing one. These are the features you want to test both manually and automatically. The best input here comes from the people who defined the product and know the why, which brings us to the next point.
  • Involve stakeholders in testing. Testing is not something just for devs and QA people. No matter how hard it is, you need to include key decision makers in testing as much as possible. Hopefully, these are the people who breathe the product and understand the users, and are likely to provide valuable insight to the development team. You must also encourage these people to train everyone else on the principles behind the product, so that others can follow in their footsteps.
  • Retain the human component. QA teams are often seen as an inefficiency, but in my experience they are amongst the greatest contributors to a product’s quality and innovation. A good QA team doesn’t limit itself to mindlessly going through a checklist: instead, it regularly conducts exploratory testing to identify potential flaws and areas for improvement. QA engineers see the product from the perspective of the end user, but they also have the technical know-how to report problems and suggest solutions. This is something a machine will not be able to do.
  • Automate as soon as possible, but not sooner. There comes a point when there is no reason to keep running the same test manually over and over, because its steps and expected outcomes have been pinned down. This is when you should write an automated scenario for it, but not sooner. Don’t get ahead of yourself and automate a test before you fully understand its implications. If possible, spend time with the people doing the testing to understand what they look for and how they reason. Once you grasp this, you will be able to replicate it; the sketch after this list shows what the result can look like.
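
As an example of what such an automated scenario might look like, here is a minimal sketch with RSpec and Capybara, assuming a Rails app where Capybara is already configured; the route, field labels and copy of this hypothetical document editor are invented for illustration:

    require "rails_helper" # assumes an existing Rails + Capybara setup

    # Automates the manual MWP scenario "a user can create a document",
    # now that QA has pinned down its steps and expected outcome.
    RSpec.feature "Creating a document" do
      scenario "the user creates a document and sees it" do
        visit "/documents/new"
        fill_in "Title", with: "Q4 report"
        click_button "Create document"

        expect(page).to have_content "Q4 report"
      end
    end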

How We Test at Batteries911

At Batteries911, we have a dedicated QA team of three, but everyone at the company knows what our users expect of us and what must work at all costs. These core flows have automated end-to-end tests, which are a pretty good approximation of the tests our QA team runs manually. This means we can complete a huge refactoring with the confidence that we haven’t broken anything.

Our suite of automated tests covers pretty much everything at the unit, integration and system level, though not always in exhaustive detail. When we find a new scenario that can be automated, or an improvement we can make to an existing test, we go ahead and expand the suite, which frees up the QA team’s time to think of more ways to… break our code.
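
Continuing the hypothetical checkout sketch from earlier: once exploratory testing uncovers the prepaid-card crash, the scenario is well defined and can join the automated suite as a regression test. The fix shown here, treating a missing balance as zero, is likewise invented for illustration:

    require "minitest/autorun"

    # The checkout from the earlier sketch, fixed after the crash was
    # found: a missing prepaid balance is now treated as zero.
    class Checkout
      def pay(card)
        return "Payment accepted" if card[:type] == :credit

        (card[:balance] || 0) >= 10 ? "Payment accepted" : "Insufficient funds"
      end
    end

    class CheckoutRegressionTest < Minitest::Test
      # Pins down the graceful failure that QA asked for, so the crash
      # cannot silently come back in a future refactoring.
      def test_prepaid_card_without_balance_is_rejected_gracefully
        assert_equal "Insufficient funds", Checkout.new.pay(type: :prepaid)
      end
    end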

Of course, this was not our first attempt at testing. Previously, we put way too much confidence in our automated tests, and our defect density went through the roof. With our current methodology, everyone does what they do best: humans think, machines execute. As a result, we have seen the quality of our work grow tremendously, along with the happiness of our users.

For us, there are still a lot of open questions: for instance, we are still trying to identify which metrics really matter in QA and how we can best track them. We’d also love for developers to be more involved in QA, but we don’t want to use their time in the wrong way, so there is a balance that we have to strike. Finally, we’re looking into ways to keep a better record of our QA tests and runs, experimenting with JIRA and Zephyr.

As you might expect from a startup, our entire workflow is in constant evolution, but we are now much more confident that we can ship the best possible version of our product, because we have found a testing methodology whose outcomes we can trust.

If you’re struggling with testing and QA in your organization, I encourage you to try this approach — you might find that it gives you a fresh perspective on your work, while also making your users much happier.
