Nobody has time for manual end-to-end testing



The argument for automated end-to-end testing

It shouldn’t be a surprise that nobody has time for manual testing. And if you are familiar with how we do things here at Pixo, then you know that we are BIG into unit tests and integration tests. But if you need a refresher…

“…unit tests test a single function, they’re faster and cheaper — so it’s widely advised that unit tests should account for 70% of automated tests in your software. Automated integration tests should account for 20%, and end-to-end tests should account for 10%.”
— Landi Najarro, Lead engineer at Pixo (“What is Behat? An intro to the BDD framework”)

The value and reliability of automated unit tests and automated integration tests are undeniable. Yet when it comes to end-to-end testing, the choice between manual and automated sometimes requires a bit of convincing. The value of automated end-to-end tests becomes much more obvious when looking at the ways that manual end-to-end testing often falls short.

Disclaimer: Manual end-to-end testing is not a bad thing

Manual end-to-end testing is not a bad thing. In fact, a manual end-to-end test is often the first step in automating an end-to-end test. The automation process aside, there are also instances where manual end-to-end testing is the only way to ensure good quality. This is especially true for matters of accessibility, such as WCAG 2.1 standards. Automated accessibility tools like axe-core will politely remind us of that with statements like, “Please note that only 20% to 50% of all accessibility issues can automatically be detected. Manual testing is always required.” So yes, some aspects of quality assessment will always need the attention of a real human doing manual end-to-end testing. It is absolutely valuable to spend time performing manual tests when it makes sense to do so. What isn’t valuable is relying exclusively on manual testing when an automated test can produce the same or better results.

Related terms

  • Behavior Driven Development: Developing software by defining the intended user/system behavior. For an in-depth look, see: “What is Behavior Driven Development, and why is it valuable?”.
  • End-to-end testing: Any software test that requires a user interface (usually a web browser, but not always). This can include things like acceptance testing, cross-browser testing, functional testing, usability testing, and dozens of other terms that may or may not have slightly overlapping definitions.


Why manual end-to-end testing will always fall short

It’s tempting to think that a small system, under just the right conditions, can probably be end-to-end tested manually… but that is rarely the case. That “small” system quickly becomes much larger than anticipated, and those “just right” conditions can quickly change for the worse. And really, isn’t that exactly why we test in the first place?

If we wanted to pretend that things will always operate smoothly, then there wouldn’t be a need to test anything. There would never be a miscommunication about how a certain feature should operate, there would never be cross-browser problems, and there would never be any code refactoring — but pretending that won’t happen won’t change reality. All of those things can definitely happen. And, more often than not, they do. What’s worse is that manual testing can do very little to prevent these problems. 

There are two big reasons why exclusive use of manual end-to-end testing will always fall short: 

  1. False assumptions: Manual testing makes it easy to falsely rely on the “common sense” of a human tester. Because we assume the tester will know what to do and what to expect, we take less care with defining clear steps and clear outcomes.     
  2. Logistical impracticalities: Manual testing requires people and time. The obvious constraints inevitably result in cutting corners (fewer people testing less of the system) or ever-rising maintenance costs (more and more people testing to keep pace with a growing system).

Manual end-to-end testing enables false assumptions 

Having an in-depth conversation about the expected behaviors of a complex system is not everyone’s idea of fun (well, it is for me, but I’m not everyone). Uncovering all of the nuanced details for every possible scenario takes time, takes patience, and requires diligent documentation. It can be difficult to communicate everything, making it tempting to avoid it altogether.

Manual end-to-end testing is logistically impractical 

If someone tells you that they are manually doing their end-to-end testing, it likely means one of two things: either a lot of people are doing all of the things, or one person is trying to do more than one person can.

Scenario 1: Many people testing everything

In some cases, there might be a huge team of people who systematically check every function and workflow in a monotonous mechanical way (checking both legacy functions and newly iterated features). That workload will always grow with every new iteration, and the team will constantly need to grow along with it. Aside from how unfulfilling that kind of robotic work must be for the team, it also continues to cost more and more to sustain.

Scenario 2: A few people test some things

As terrible as that first option sounds, the other extreme is far worse. Imagine one person who frantically attempts to manually test all of the new features from the most recent iteration. Then, they might do the same tests on a second browser (but they probably won’t have time). This method of “manual testing” is especially troublesome for a lot of reasons. The biggest one, however, is that it ignores all of the legacy features. Now imagine the horror of discovering that the most important legacy feature in your system, the checkout paywall, has malfunctioned over the weekend (gulp!). Believe it or not, legacy features are definitely capable of breaking when code gets refactored during a new iteration. Testing only “the new stuff” won’t ensure system stability.

Well, what does work?

  • Behavior Driven Development
  • Automated end-to-end testing
    • Use clear criteria to build automated pass/fail tests where scripted actions imitate user behavior with greater speed and consistency than manual testing. (Stay tuned for a future post on this topic.)
    • Weed out cross-browser and cross-platform woes by running cloud-based automated tests on a variety of machine configurations simultaneously. (Stay tuned for a future post on this topic too.)
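
To make the first bullet concrete, here is a minimal sketch of what “clear criteria as a pass/fail test” means. A real suite would drive an actual browser through a tool like Selenium, Playwright, or Behat; to keep this self-contained, a plain dictionary stands in for the page, and every name here (the checkout page, its fields) is hypothetical.

```python
def run_checkout_test(page: dict) -> list[str]:
    """Walk scripted steps against a page and collect precise failures,
    instead of relying on a human tester's 'common sense' about what
    should happen. An empty list means the test passed."""
    failures = []

    # Step 1: the cart must list the item the scripted user added.
    if "widget" not in page.get("cart", []):
        failures.append("cart does not contain 'widget'")

    # Step 2: the total must reflect the item price exactly.
    if page.get("total") != 9.99:
        failures.append(f"expected total 9.99, got {page.get('total')}")

    # Step 3: the pay button must be present and enabled.
    if not page.get("pay_button_enabled", False):
        failures.append("pay button is missing or disabled")

    return failures


# A healthy page passes; a regressed one fails with a specific reason,
# catching the broken legacy checkout before a weekend outage does.
good_page = {"cart": ["widget"], "total": 9.99, "pay_button_enabled": True}
broken_page = {"cart": ["widget"], "total": 0.0, "pay_button_enabled": True}

print(run_checkout_test(good_page))
print(run_checkout_test(broken_page))
```

Because every step and expected outcome is written down, the same checks run identically on every iteration and every browser configuration, with no “common sense” left to assumption.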