Andy McKay :: Declaring war on the zamboni test suite (pt. 1)

Nov 20, 2012

We have this awesome project, zamboni, that powers the Firefox Marketplace and Add-ons. But it has a problem that has been annoying me for too long, and it's now at the point of embarrassment: the test suite is too damn slow. How slow?

On my Macbook Air, it's: 1949 tests in 874.476s.

Over 14 minutes. That's about 0.45 seconds per test, way beyond what it should be. Solitude, another Django project, clocks in at 260 tests in 13.965s, which is 0.05 seconds per test. Why is zamboni more than 8x slower per test?

How fast do I want it to go? Well, really fast, but at the moment I'd settle for the following benchmark. It takes me 5 minutes to walk out of my home office, get a cup of tea and come back. In that time I expect a full test suite run to finish. That means getting the tests to average 0.15 seconds each, a significant improvement. This is the "cup of tea" milestone.
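For anyone who wants to check the arithmetic, here is a quick sketch using only the figures quoted above. The variable and function names are mine, chosen for illustration.

```python
# Figures quoted in the post.
zamboni_tests, zamboni_seconds = 1949, 874.476
solitude_tests, solitude_seconds = 260, 13.965


def per_test(seconds, tests):
    """Average wall-clock seconds per test."""
    return seconds / tests


zamboni_avg = per_test(zamboni_seconds, zamboni_tests)
solitude_avg = per_test(solitude_seconds, solitude_tests)

# "Cup of tea" budget: the full zamboni suite in 5 minutes.
target_avg = per_test(5 * 60, zamboni_tests)

print(f"zamboni:  {zamboni_avg:.3f}s per test")
print(f"solitude: {solitude_avg:.3f}s per test")
print(f"ratio:    {zamboni_avg / solitude_avg:.1f}x")
print(f"target:   {target_avg:.3f}s per test")
print(f"speedup needed: {zamboni_avg / target_avg:.1f}x")
```

The last line is the number that matters: the suite has to get roughly three times faster to hit the milestone.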

Notes:

yes, from a pure testing point of view our tests do too much
no, I'm not going to mock the database
no, I'm not going to mock Elastic Search (although I'd like to)
maybe I'll mock internal bits that are slow, but I'll try to resist that
no, I don't think this is all due to fixtures
yes, I'm trying to break zamboni down into smaller pieces over time

We've got our measurement and our target. Let's go.

First steps

While preparing for this I found a few things I could clean out along the way. To our surprise, some code paths were making external HTTP calls to other servers. In most cases these came from signals that fired as a side effect, for example fetching icons, which had no meaning to the tests and could pass or fail without anyone noticing.

So I wrote nose-blockage, a nose plugin that blocks all HTTP and SMTP connections by raising an error. If the HTTP call matters to the test, mock it; if you don't, you get an error. If the call doesn't matter, the error means it exits quickly instead of waiting on the network.
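To give a feel for the idea, here is a minimal sketch of blocking connections by patching the standard library. This is not nose-blockage's actual code: `BlockedConnection`, `_refuse` and `block_network` are hypothetical names, and it only catches code that goes through `http.client.HTTPConnection.connect` or `smtplib.SMTP.connect`, so libraries that open sockets their own way would slip past it.

```python
import http.client
import smtplib
from unittest import mock


class BlockedConnection(Exception):
    """Raised when a test tries to open a blocked connection (hypothetical name)."""


def _refuse(kind):
    """Build a replacement connect() that always raises."""
    def connect(self, *args, **kwargs):
        raise BlockedConnection(f"{kind} connection blocked during tests")
    return connect


def block_network():
    """Start the patches and return them; call .stop() on each to undo."""
    patchers = [
        mock.patch.object(http.client.HTTPConnection, "connect", _refuse("HTTP")),
        mock.patch.object(smtplib.SMTP, "connect", _refuse("SMTP")),
    ]
    for patcher in patchers:
        patcher.start()
    return patchers


# Demo: the request fails immediately instead of touching the network.
patchers = block_network()
try:
    conn = http.client.HTTPConnection("example.com")
    conn.request("GET", "/")  # request() calls connect(), which now raises
except BlockedConnection as exc:
    print(exc)
finally:
    for patcher in patchers:
        patcher.stop()
```

A test that really needs the response has to mock it explicitly; everything else fails fast and loudly.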

This didn't make a huge difference, but it made me feel better.

SQL queries

SQL may or may not be the problem, but it's easy to count the SQL queries and hammer that number down, so let's look at that. To count the queries I used django-statsd with its database patch. I picked a slow test to see how many queries it runs.

    @patch('mkt.developers.tasks._fetch_content', _mock_fetch_content)
    def test_imageassets(self):
        asset_count = ImageAsset.objects.count()
        self.create_app()
        eq_(ImageAsset.objects.count() - len(APP_IMAGE_SIZES), asset_count)

How many SQL queries? 584

Where are they all? test-utils did one SQL query per model table on start up. That's around 150 queries for the entire suite (not per test). Forget that. Removing that takes us down to 430.

Let's be sophisticated and run each line in turn, counting the SQL queries as we go (I inserted a return after each line and re-ran).

    @patch('mkt.developers.tasks._fetch_content', _mock_fetch_content)
    def test_imageassets(self):
        # By this point 242 queries.
        asset_count = ImageAsset.objects.count()
        # By this point 243 queries.
        self.create_app()
        # By this point 429 queries.
        eq_(ImageAsset.objects.count() - len(APP_IMAGE_SIZES), asset_count)
        # By this point 430 queries.

Something is wrong with our fixtures (242 queries before the test body even starts) and then with the create_app code (another 186 queries for one call).
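The checkpoint technique is easy to reproduce outside zamboni. The sketch below is a stand-in, not what django-statsd does: it uses the standard library's sqlite3 and a hypothetical `CountingConnection` wrapper, and the `create_app` here is a toy with made-up tables. In a Django project, `assertNumQueries` on the test case or `CaptureQueriesContext` from `django.test.utils` should give you the same kind of number without the return-and-re-run dance.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that records every statement sent through execute()."""

    def __init__(self, conn):
        self._conn = conn
        self.statements = []

    def execute(self, sql, params=()):
        self.statements.append(sql)
        return self._conn.execute(sql, params)

    @property
    def count(self):
        return len(self.statements)


def create_app(db):
    """Toy stand-in for an expensive helper: one app row plus one row per image size."""
    db.execute("INSERT INTO app (name) VALUES (?)", ("demo",))
    for size in (16, 48, 64):
        db.execute("INSERT INTO image_asset (size) VALUES (?)", (size,))


db = CountingConnection(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE app (name TEXT)")
db.execute("CREATE TABLE image_asset (size INTEGER)")

checkpoints = []


def checkpoint(label):
    """Record the running total; the cost of a step is the difference between totals."""
    checkpoints.append((label, db.count))


checkpoint("after setup")
asset_count = db.execute("SELECT COUNT(*) FROM image_asset").fetchone()[0]
checkpoint("after first count")
create_app(db)
checkpoint("after create_app")

previous = 0
for label, total in checkpoints:
    print(f"{label}: {total} total (+{total - previous})")
    previous = total
```

Reading the deltas rather than the totals is what points the finger: a big jump at "after setup" means fixtures, a big jump after the helper means the helper.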

Miscellany

We use waffle switches in our code to allow feature switching. Tests turn these on or off a lot to ensure that we test the correct configuration. So let's just create those in the cache instead of the db, which removes a db hit.
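Roughly, the trick looks like the toy model below. It is an illustration of the cache-first idea only, not waffle's API or its cache key scheme: `SwitchStore`, `create_switch_in_cache` and the switch name are all hypothetical.

```python
class SwitchStore:
    """Toy feature-switch lookup: check the cache first, fall back to the database."""

    def __init__(self, db_rows):
        self._db_rows = db_rows  # stands in for the switch table
        self.cache = {}          # stands in for the cache backend
        self.db_hits = 0

    def is_active(self, name):
        if name in self.cache:
            return self.cache[name]
        # Cache miss: this is the query we want tests to avoid.
        self.db_hits += 1
        active = self._db_rows.get(name, False)
        self.cache[name] = active
        return active


def create_switch_in_cache(store, name, active=True):
    """What a test helper might do: seed the cache and skip the database entirely."""
    store.cache[name] = active


store = SwitchStore(db_rows={})
create_switch_in_cache(store, "some-feature")
print(store.is_active("some-feature"), store.db_hits)

# Without seeding, the same lookup costs one database hit.
cold = SwitchStore(db_rows={"some-feature": True})
print(cold.is_active("some-feature"), cold.db_hits)
```

One query per switch is small, but it is paid in every test that flips a switch, so it adds up across the suite.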

Next: what's up with those fixtures?