This is a journey into flakiness, i.e. randomly failing RSpec tests, and how to find and fix them.
Today I encountered a problem with my RSpec tests: right after implementing and finishing a decorator in TDD manner, just about to commit, I ran all the specs by hitting ENTER in my guard/spring environment and noticed that some specs didn’t pass anymore. Taking a look at them I realized they were exactly the ones I had just implemented, so I opened match_decorator_spec.rb and looked at what failed: one method returned a link that suddenly had http://localhost:3000/ in front, which was not the case (and worked fine) before. Without changing anything I saved the file, prompting guard to run the decorator spec again, and it passed. Huh? I hit ENTER again and all specs now passed. Even more huh?! I repeated this a few more times and they all kept passing.
Of course this was a sign that my commit wasn’t that wrong, so I committed my changes. However, it was also a sign that something was off; maybe just a hiccup? So I decided to continue working.
Later I created another commit in another decorator and the problem popped up again, playing out just like the first time. I committed again, but now I was sure that something evil was going on.
First I thought it had something to do with spring, because, by accident, it appeared to “always” happen when I restarted guard (which restarts spring): on the second run all tests passed. However, it turned out that my spot check simply wasn’t big enough. After some more tests and reruns the problem also appeared within an already “used” guard and spring.
A Tool For Finding Dependencies
A quick search on the web revealed the small tool dep_detect.rb. In order to make it usable for me I tweaked it a bit: write its files to
tmp, allow setting some known failing seeds, and, last but not least, use spring.
It’s a cool tool which can be very handy, but the problem was that it didn’t reveal any single test that could be marked as “the one dependency” making the relevant test fail. :sad:
Or Doing It By Hand
So I had to dig into it by myself. Here are the steps:
Step 1: Reduce Tests In Question
I grepped through all
rspec_fail_*.txt files to find out in which seed my affected test appears topmost, i.e. where the fewest tests run before it.
$ grep -n ^ParticipationDecorator tmp/rspec_fail_*.txt | sort -n -t ':' -k2,2
The first match is the file of choice. I noted all specs running before my affected test inside this file and found only 3.
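The same lookup can be done in plain Ruby. Here is a small helper (my own sketch, not part of dep_detect.rb) that does the job of the grep/sort pipeline above: scan the failure dumps and return the file in which the affected spec shows up at the lowest line number, i.e. with the fewest specs running before it.

```ruby
# Scan the given dump files and return [file, line_no] for the file in
# which a line starting with `pattern` appears earliest; nil if absent.
def earliest_occurrence(pattern, files)
  files.filter_map { |file|
    line_no = File.readlines(file).index { |line| line.start_with?(pattern) }
    [file, line_no + 1] if line_no # 1-based, like grep -n
  }.min_by { |_file, line_no| line_no }
end

# Usage (file names as produced by my tweaked dep_detect.rb):
#   file, line = earliest_occurrence("ParticipationDecorator",
#                                    Dir["tmp/rspec_fail_*.txt"])
```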
Step 2: Running Only The Tests In Question
I ran bin/rspec --order defined --format documentation ... on the command line with only those test files in the given order, expecting to see my reduced test plan fail. \o/
Or not? – Not always?! Huh?
Well, I also found out that
--order defined doesn’t really mirror the order of the files given on the command line. How’s that? I really don’t know, but I do know that adding the seed of the failing run helps in fixing at least some order. It’s not the same order as when running the whole test suite, but since only a couple of tests are running now, you’ll very likely find a working seed (i.e. one letting the tests fail) quickly.
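Why does passing a seed pin the order at all? As far as I understand it, RSpec’s random ordering is essentially a seeded shuffle, so the same seed reproduces the same order every run. A sketch in plain Ruby (file names made up):

```ruby
# RSpec-style reproducible ordering: shuffling with the same seeded RNG
# yields the same permutation every time.
specs = %w[a_spec.rb b_spec.rb c_spec.rb d_spec.rb]

order1 = specs.shuffle(random: Random.new(1234))
order2 = specs.shuffle(random: Random.new(1234))

puts order1.inspect
puts(order1 == order2) # same seed, same order -> reproducible failures
```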
Step 3: Selecting The Important Ones
I played around a bit (really only a few tries) to find a combination where only 2 specs run before my affected test while still making it fail.
Interestingly, there was another combination that made my test fail with two completely different tests in front. That’s actually why
dep_detect.rb didn’t reveal any single explicit test causing the problem: there seem to be at least a few of them.
So for today I found out that probably one test creating a factory that also sets a user’s roles causes my flakiness problem.
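I don’t know the exact mechanism yet, but the general shape of such an order dependency is easy to reproduce. Here is a self-contained sketch (all names invented; the module stands in for something like a globally configured URL host or a cached role) showing how one test’s leaked state flips another test’s result:

```ruby
# Global, mutable state shared across the whole test process.
module UrlConfig
  class << self
    attr_accessor :host
  end

  def self.link_to_match(path)
    host ? "http://#{host}#{path}" : path
  end
end

# Run in the "good" order: the later spec sees a relative link.
puts UrlConfig.link_to_match("/matches/1")

# Some earlier spec sets the host and never resets it...
UrlConfig.host = "localhost:3000"

# ...and now the very same call returns an absolute URL, so a spec that
# expects "/matches/1" only fails when the polluting spec ran first.
puts UrlConfig.link_to_match("/matches/1")
```

That pattern matches the symptom exactly: the decorator spec passes in isolation and only fails when it runs after the polluter, which is why the failure seems random under `--order random`.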
I will now have to investigate further and report what exactly is going on.