subversion


With the new design of svnmock stabilised, I spent some time yesterday putting it into service, testing a new Subversion-targeting backend for Cypress. There’ve been some minor refinements (e.g., making the args parameter to MockSession.add optional), and here are my observations thus far:

  • The low-level nature of svnmock is both a blessing and a curse. The blessing: exact, super-precise assertions as to which API functions should be called. The curse: you effectively end up duplicating the code you’re testing. I’m considering adding support for something like, “I don’t care what the next four API calls are, but the fifth one needs to be X(Y, Z)”, but I’m not sure that’s a road I want to head down. On the other hand…

  • …it’s gratifyingly easy to build macro-methods on top of these low-level primitives, and my test cases have started sprouting private methods to manipulate the appropriate MockSession objects. Rather than endlessly duplicate the MockSession.add() calls needed to test method X, those calls get moved into a mock_X() method on the test case. While this certainly cleans things up, I’m looking for a way to shift more of this burden to the svnmock.mock module.
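The macro-method idea can be sketched roughly as follows. This is only an illustration: `RecordingSession` is a hypothetical stand-in that merely records expectations (it is not svnmock's MockSession), function names are plain strings rather than svnmock function objects, and the three-call sequence inside `mock_open_repo()` is an assumed example, not taken from Cypress.

```python
import unittest


class RecordingSession:
    """Hypothetical stand-in for a mock session: it only records expectations."""

    def __init__(self):
        self.expected = []

    def add(self, func_name, args=None):
        # Record the expected call; args is optional, as in the refinement above.
        self.expected.append((func_name, list(args or [])))
        # Hand back a unique placeholder standing for the call's return value.
        return object()


class RepoTestCase(unittest.TestCase):
    def setUp(self):
        self.ses = RecordingSession()

    def mock_open_repo(self, path):
        """Macro-method: queue every call expected when opening a repository."""
        pool = self.ses.add("svn_pool_create", [None])
        repo = self.ses.add("svn_repos_open", [path, pool])
        return self.ses.add("svn_repos_fs", [repo])

    def test_open_repo_expectations(self):
        # One line in the test replaces three repeated add() calls.
        self.mock_open_repo("/tmp/repo")
        names = [name for name, _ in self.ses.expected]
        self.assertEqual(
            names, ["svn_pool_create", "svn_repos_open", "svn_repos_fs"])


# Run the single test case explicitly instead of calling unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RepoTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test that needs an opened repository then calls `self.mock_open_repo(...)` once, and any change to the expected call sequence is made in a single place.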

In the process of testing Cypress, I discovered bugs in Python, Subversion’s Python API and SWIG, the tool used to automatically generate Subversion’s bindings for Python, Perl and Ruby. The Subversion bug: there’s a function in the Subversion Python bindings with an illegal name, svn.client.import (“import” is a keyword in Python, meaning it’s illegal to use it as an identifier). The Python bug: the internal function used to register C-language extension modules, Py_InitModule, doesn’t check to make sure that functions/methods/whatevers use legal names. The SWIG bug: SWIG fails to warn adequately when you try to generate an illegally-named function.
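The effect of a keyword-named function can be reproduced in pure Python. In this small sketch, `client` is a throwaway module object created with `types.ModuleType`, standing in for the real extension module (it is not svn.client), and the lambda is a dummy. The attribute can be created and reached dynamically, but ordinary dotted access never gets past the parser.

```python
import keyword
import types

# "import" is a reserved word, so it can never be used as a bare identifier.
print(keyword.iskeyword("import"))

# A throwaway module standing in for an extension module (hypothetical).
client = types.ModuleType("client")

# setattr() accepts any string as an attribute name, so a keyword-named
# function can still end up registered on the module.
setattr(client, "import", lambda *args: "imported")

# Ordinary attribute syntax is rejected when the source is compiled.
try:
    compile("client.import('path', 'url')", "<example>", "eval")
    dotted_access_ok = True
except SyntaxError:
    dotted_access_ok = False
print(dotted_access_ok)

# The function is only reachable through getattr().
import_func = getattr(client, "import")
print(import_func("path", "url"))
```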

I first reported the bug to the python-dev mailing list, offering to patch it myself if there was interest in a fix. The word I got there was that, given this problem’s extreme rarity, it wasn’t worth the time (which I agree with). Next, I sent an email to the general Subversion mailing list, pointing out that this function needed to be renamed; the only response I’ve received to date was that the new name I proposed wouldn’t work, but nothing more constructive. Finally, I posted a bug report to the SWIG project’s bug tracker; the developer in charge of the Python backend commented quickly on the bug report, saying that it’s a well-established behaviour of SWIG to allow you to create illegal names, though it does warn about it.

Not exactly the response I was hoping for. Sigh.

While on the train back from Frankfurt this morning, I realised that I had been designing the svnmock API all wrong. The old design was based on simply populating a mock repository in RAM, then letting the SVN-emulation layer interact with that repository. I hadn’t gotten very far into implementing the design, but already I could tell that it was going to be a massive, unwieldy operation. Worse yet, I realised that it wouldn’t offer the level of control I myself would want as a tester. The rest of the trip was spent looking at snow-covered Hessian woods and restarting the API design process from scratch.

While talking over possible design issues with a guy I graduated from uni with, Tyler Hall, I had a brainstorm. Within about 15 minutes, I had ripped out the entire old design, several hundred lines of complicated class interaction (which was only a tiny fraction of what would be needed), and had replaced it with totally new workings. Result: in less than 100 lines, I had a mock up of the entire Subversion API. Better yet, it would be totally future-proof: if the Subversion python API changes radically, I don’t have to release a new version — the existing code will adapt on its own.

Where the old design relied on me knowing what every single function in the API does, the new design assumes that you know what all these silly procedures do. It works like so:

  1. You create a new MockSession object.

  2. You tell the MockSession object that you expect function X to be run with parameters Y and Z, and that it ought to return 7; anything else is to be considered an error, and it should blow up.

An example:

from svnmock import mock, core, repos, fs

ses = mock.MockSession()

pool = ses.add(core.svn_pool_create, [None])
scratch_pool = ses.add(core.svn_pool_create, [pool])

We’ll take this slow:

  1. Line 1: svnmock.mock is the module used to populate the testing environment. svnmock.core, svnmock.repos and svnmock.fs are modules holding the emulated API functions and constants. We need to import them so we can refer to their functions later.

  2. Line 3: create a new MockSession object. MockSession instances are used to organise our test environments; having them be objects — as opposed to a more imperative-style interface — allows us to easily swap test environments in and out, making them reusable.

  3. Lines 5-6: here’s where things start to get interesting. In line 5, we specify that the first command to be executed must be core.svn_pool_create, with the sole parameter of None. Line 6 specifies that core.svn_pool_create will be run again, but this time the parameter will — nay, must — be the return value from the first svn_pool_create() call; anything else is to be treated as an error.
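For readers curious how such an expectation queue might behave, here is a toy sketch. It is not svnmock's implementation: `ToySession`, `ReturnToken`, `call()` and `finished()` are hypothetical names, functions are identified by plain strings rather than function objects, and the code under test is replayed by hand through `call()`.

```python
class ReturnToken:
    """Placeholder for the value an expected call will return."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"<ReturnToken {self.label}>"


class ToySession:
    """Toy expectation queue; a stand-in, not svnmock's MockSession."""

    def __init__(self):
        self._expected = []  # (name, args, token), in the order expected

    def add(self, name, args=None):
        """Expect `name` to be called next with `args`; return its token."""
        token = ReturnToken(f"{name}#{len(self._expected)}")
        self._expected.append((name, list(args or []), token))
        return token

    def call(self, name, *args):
        """Replay one call, failing loudly if it is not the expected one."""
        if not self._expected:
            raise AssertionError(f"unexpected call: {name}{args}")
        exp_name, exp_args, token = self._expected.pop(0)
        if name != exp_name or list(args) != exp_args:
            raise AssertionError(
                f"expected {exp_name}{tuple(exp_args)}, got {name}{args}")
        return token

    def finished(self):
        """True once every expected call has been consumed."""
        return not self._expected


# Same shape as the example above: the second call must receive the
# return value of the first.
ses = ToySession()
pool = ses.add("svn_pool_create", [None])
scratch_pool = ses.add("svn_pool_create", [pool])

p = ses.call("svn_pool_create", None)
s = ses.call("svn_pool_create", p)
print(ses.finished())

# Passing anything other than the first call's return value is an error.
bad = ToySession()
first = bad.add("svn_pool_create", [None])
bad.add("svn_pool_create", [first])
bad.call("svn_pool_create", None)
try:
    bad.call("svn_pool_create", None)
    rejected = False
except AssertionError:
    rejected = True
print(rejected)
```

Because tokens compare by identity, the only argument that satisfies the second expectation is the exact object handed back by the first call.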

See the doc/mock_session.py file in the SVN repository for further examples of the new syntax.

One of the problems with writing code that interacts with a database (for example) is that it’s hard to test; do you have your test suite create, populate and destroy whole databases to test your code against? That’s an awful lot of work. Then there’s the question of testing those tricky edge cases, like certain error conditions: if you’re creating new databases, how do you simulate, say, your database server suffering a catastrophic failure? I can think of one solution, but it would get pretty expensive pretty quickly.

Fortunately, since this is a common enough problem (testing database interaction), smart people have already solved it. In the Perl community, for example, DBD::Mock (full disclosure: I work on DBD::Mock some) is one fairly well-known way of testing interaction with database systems. It allows you to say, “Make sure the following SQL statements are executed in the following order, and return these results”. It can also be used to simulate what happens when the connection to the database dies mid-query, for example.

Once you move away from databases to other, still complex systems, though, the range of testing options becomes more limited. In my case, I wanted to write several applications and support libraries on top of the Python bindings for Subversion, a popular revision control system. When it came time to write tests for the Subversion-facing code, I faced a dilemma: was I back to creating full-blown SVN repositories for each test case?

My solution was svnmock. The svnmock package does for Subversion’s Python API what DBD::Mock does for Perl’s DBI: it makes testing easy. As with DBD::Mock, svnmock allows the test writer to say, “I want the following series of function calls, with this set of parameters and this return value”. While this may seem fairly low-level, it is trivial to write macro-like constructs on top of the current set of primitives. One of my favourite features of svnmock is that it allows you to specify that the return value from api_func_1() must be used as a parameter to api_func_2().

svnmock’s project website may be found here.