Thursday, March 1, 2012

Chasing order-related failures in JUnit tests

Today we were confronted with a tricky situation: a few JUnit tests failed on our Hudson build server, whereas everything went fine locally. The failing tests hadn't been touched recently, and neither had the types under test.
As I'm pretty sure many of you have run into the same problem before, here's a short transcript of how it was solved:

1.) Analyze what changed (as always, right?)
In our case a new unit test for another type had been added. A new test causes existing ones to fail?!

2.) Spot the difference between the central and the local build
As I was sure no resources were missing, the potential culprit was the order of execution. Reading the logs, I discovered that NewlyAddedTest was executed before NowFailingTest on Hudson, whereas the order was reversed locally.

3.) Reproduce the problem locally
To avoid misusing Hudson as a trampoline, I wanted to provoke the failure locally. To do that, I had to reproduce the problematic order. This can easily be done using a test suite, which runs its classes in the order they are listed:


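import org.junit.runner.RunWith;
import org.junit.runners.Suite;

// Runs the listed classes in exactly this order, mirroring the Hudson build.
@RunWith(Suite.class)
@Suite.SuiteClasses({NewlyAddedTest.class, NowFailingTest.class})
public class TestSuite {
    // intentionally empty: the annotations define the suite
}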


4.) What's actually the problem?
Normally such a situation indicates that something set up in NewlyAddedTest was not properly torn down. So I first @Ignore(d) the test method of NewlyAddedTest, so that only its @Before and @After methods were executed in the suite. NowFailingTest passed successfully. Then I started to analyse how far I had to proceed in the test method in order to produce the error. It turned out to be the next-to-last line...

To make a long story short: the newly added test uses a utility class with static methods that delegate to a JSR-330 @Singleton. The problem was that the utility class kept that singleton in a static variable that was populated only once, when the class was loaded. If another implementation of that singleton was registered later, the utility class would still use the previous one. Potentially a severe bug...
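
The following is a minimal sketch of that kind of bug, not our actual code. All names (Registry, Greeter, CachingUtil, LookupUtil) are hypothetical, and a plain map stands in for the JSR-330 container so that the example runs without any dependencies. CachingUtil captures the implementation once in its static initializer, while LookupUtil shows one possible way around the problem: resolving the instance on every call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the container that hands out the singleton.
final class Registry {
    private static final Map<Class<?>, Object> BINDINGS = new HashMap<>();

    static <T> void register(Class<T> type, T impl) {
        BINDINGS.put(type, impl);
    }

    static <T> T lookup(Class<T> type) {
        return type.cast(BINDINGS.get(type));
    }
}

// Hypothetical service interface that plays the role of the singleton.
interface Greeter {
    String greet();
}

// Problematic variant: the implementation is captured exactly once,
// when the JVM initializes this class.
final class CachingUtil {
    private static final Greeter GREETER = Registry.lookup(Greeter.class);

    static String greet() {
        return GREETER.greet();
    }
}

// One possible alternative: resolve the implementation on every call.
final class LookupUtil {
    static String greet() {
        return Registry.lookup(Greeter.class).greet();
    }
}

public class StaleSingletonDemo {
    public static void main(String[] args) {
        Registry.register(Greeter.class, () -> "first");
        // First use of CachingUtil triggers its static initializer.
        System.out.println(CachingUtil.greet());

        // A later test registers a different implementation...
        Registry.register(Greeter.class, () -> "second");

        // ...but CachingUtil still holds on to the one it saw first.
        System.out.println(CachingUtil.greet());
        System.out.println(LookupUtil.greet());
    }
}
```

In this sketch, whichever caller touches CachingUtil first decides which implementation it is stuck with. That would explain why the result depends on the order in which the test classes run.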

Lessons learned:
Unit tests are your friend! Even though in this case it was not a dedicated unit test for the utility class that revealed the error, we were lucky to have tests that exposed the problem.