Thursday, July 16, 2015

Separation of Tests

Go to any automation conference and there will no doubt be a couple of talks on flaky tests; they are one of the bigger pain points when dealing with automation.  There are a number of approaches to reducing the problem of flaky tests, and I would suggest watching talks from previous GTAC or Selenium conferences.  Here I would like to talk about splitting up the test results, or the test runs themselves.

We frequently had a problem where the build was constantly red because at least one test had failed.  From a PM's point of view the benefit of automation stops being visible, testers start to lose hope that they will ever see a green build, and worst of all the results stop being valued by the team.

We started by moving tests that were flaky or had defects attached to them into a separate run. We continued to execute these defect/flaky tests, looking to see whether the defect tests failed earlier or started to pass, and making sure that the flaky tests really were flaky.
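As a rough sketch of how this quarantine can look (assuming a pytest-style suite, which may well not match your stack; the marker names, test names and ticket id here are purely illustrative), tests can be tagged and then included or excluded per run:

import pytest

# Markers would be registered in pytest.ini to silence the unknown-mark warning.

@pytest.mark.defect(reason="JIRA-123: account totals are rounded incorrectly")
def test_account_totals():
    ...

@pytest.mark.flaky(reason="intermittent timeout loading the accounts grid")
def test_accounts_grid_loads():
    ...

def test_create_account():
    # untagged tests stay in the main regression run
    ...

The regression run then excludes the quarantined tests with -m "not flaky and not defect", while -m flaky and -m defect keep them executing in their own runs.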

Unfortunately, testers are quite often pressed for time during projects, so with this setup we still had tests sitting in the regression suite waiting for someone to investigate whether the failure was the result of flakiness or a real defect.  This led us to create a final run called investigation, in which all tests that failed in the previous regression run are rerun, usually straight after the regression run completes.  The results from this hopefully enable us to allocate each test into the correct run (flaky or defect).
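Sticking with the pytest assumption, the orchestration can be as simple as two invocations, one straight after the other.  This is only a sketch; the script name, report names and options are illustrative rather than our actual tooling:

# run_suites.py: a minimal orchestration sketch, assuming pytest and the
# illustrative flaky/defect markers from above.
import subprocess

def run(args):
    return subprocess.run(["pytest", *args]).returncode

if __name__ == "__main__":
    # Main regression run: everything that is not quarantined.
    regression = run(["-m", "not flaky and not defect",
                      "--junitxml=regression.xml"])

    # Investigation run, straight afterwards: pytest's cache remembers what
    # failed, so --last-failed reruns only those tests.
    if regression != 0:
        run(["--last-failed", "--junitxml=investigation.xml"])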

In the future we hope to automate the process of allocating the tests into the correct run.
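One possible shape for that automation (again only a sketch, assuming JUnit-style XML reports named regression.xml and investigation.xml as in the snippets above) is to compare the two reports: a test that failed in regression but passed on the immediate rerun is a flaky candidate, while a test that failed both times looks more like a genuine defect.

# allocate_tests.py: a sketch of the kind of allocation we have in mind,
# not our actual tooling.
import xml.etree.ElementTree as ET

def outcomes(junit_xml_path):
    """Map each test (classname::name) to 'passed' or 'failed'."""
    results = {}
    for case in ET.parse(junit_xml_path).iter("testcase"):
        test_id = f"{case.get('classname')}::{case.get('name')}"
        failed = (case.find("failure") is not None
                  or case.find("error") is not None)
        results[test_id] = "failed" if failed else "passed"
    return results

def allocate(regression_xml, investigation_xml):
    regression = outcomes(regression_xml)
    investigation = outcomes(investigation_xml)
    flaky, defect = [], []
    for test_id, outcome in investigation.items():
        if regression.get(test_id) != "failed":
            continue  # only interested in tests that failed the regression run
        # Passed on the immediate rerun: likely flaky.
        # Failed both times: more likely a genuine defect.
        (flaky if outcome == "passed" else defect).append(test_id)
    return flaky, defect

if __name__ == "__main__":
    flaky, defect = allocate("regression.xml", "investigation.xml")
    print("candidates for the flaky run:", flaky)
    print("candidates for the defect run:", defect)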

Monday, July 13, 2015

Breaking vs Failing Tests

So your test failed; what does it mean?  One of the first things you need to determine is whether it failed on the test itself or on the setup.  Sadly, I've frequently found that tests fail before they ever reach the part of the system they are testing, especially with GUI testing.

The problem with tests failing in this way is that we often report them as failures, and this can massively skew our results.  If we have a suite of 50 tests targeted at the Accounts functionality of our system, but the Accounts tab has been removed so we are unable to navigate to it, should this be reported as 1 failure or 50?

By marking one test as failing, because it checks that the Accounts tab is available, and marking the rest as breaking, we solve this problem.  Suddenly we have 1 test failure and 49 grouped breakages, which is far more indicative of the actual state of the system.
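One way to get this split almost for free (a sketch under the assumption of a pytest suite; FakeApp and its methods are invented stand-ins, not a real page-object API) is to do the navigation in setup, so that only the dedicated availability check can genuinely fail while everything downstream breaks:

# test_accounts.py: a minimal sketch, assuming pytest. FakeApp is an invented
# stand-in for a real Selenium-driven application under test.
import pytest

class FakeApp:
    """Invented stand-in for the real application under test."""
    tabs = {"Accounts": ["savings"]}

    def has_tab(self, name):
        return name in self.tabs

    def open_tab(self, name):
        # Raises if the tab is missing; this is where tests would "break".
        return self.tabs[name]

@pytest.fixture
def app():
    return FakeApp()

def test_accounts_tab_is_available(app):
    # The one test that genuinely *fails* if the tab has gone missing.
    assert app.has_tab("Accounts")

@pytest.fixture
def accounts_page(app):
    # Navigation happens in setup: if the tab is missing, every test using
    # this fixture *breaks* here (pytest reports a setup error, not a failure).
    return app.open_tab("Accounts")

def test_savings_account_is_listed(accounts_page):
    assert "savings" in accounts_page

Remove the Accounts entry from FakeApp.tabs and pytest reports one failure and one error rather than two failures, which is exactly the breakdown described above.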

So a breaking test is a test that fails before it gets to what it is checking/testing/asserting.  I highly recommend incorporating this breakdown into your automation reports, so that breaking tests are counted separately from genuine failures.
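If your runner writes JUnit-style XML, where assertion failures appear as failure elements and setup problems as error elements (pytest behaves this way), the breakdown takes only a few lines.  This is a sketch and the report name is illustrative:

# report_breakdown.py: a sketch of the summary described above.
import xml.etree.ElementTree as ET

def breakdown(junit_xml_path):
    passed = failed = broken = skipped = 0
    for case in ET.parse(junit_xml_path).iter("testcase"):
        if case.find("failure") is not None:
            failed += 1   # the check itself did not hold
        elif case.find("error") is not None:
            broken += 1   # never reached the check (setup problems and the like)
        elif case.find("skipped") is not None:
            skipped += 1
        else:
            passed += 1
    return passed, failed, broken, skipped

if __name__ == "__main__":
    passed, failed, broken, skipped = breakdown("regression.xml")
    print(f"{passed} passed, {failed} failed, {broken} broken, {skipped} skipped")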