revision control


One of SVK’s features that I really like is that there aren’t any bookkeeping directories in your checkouts to get in the way of grep and find. Problem is, when you delete a checkout in the filesystem, there’s nothing to notify SVK of the deletion. It’s not a big deal, it just takes some extra space in my ~/.svk/ directory. I says to myself, I says, “it would be really nice to be able to do svk checkout --list and get a listing of all checkouts”.

Turns out, SVK beat me to the punch:

~] svk co --list
Depot Path                     Path
================================================================
//IPC-DirQueue/local           /home/collin/src/IPC-DirQueue
//Net-Google/local             /home/collin/src/Net-Google
//SOAP-Lite-Mock/local         /home/collin/src/SOAP-Lite-Mock
//WWW-Google-News/local        /home/collin/src/WWW-Google-News
//adaptive_parsing/local       /home/collin/src/adaptive_parsing
//c2-ecore/local               /home/collin/src/c2
//coverme/local                /home/collin/src/coverme
//e2-ecore/local               /home/collin/src/e2-ecore
//everydevel/local             /home/collin/src/everydevel
//functional/local             /home/collin/src/functional
//personal/local               /home/collin/doc
//python-3000/main             /home/collin/src/python-3000
//python-peps/local            /home/collin/src/python-peps
//svk/local/main               /home/collin/src/svk
//svnmock/local                /home/collin/src/svnmock
//test_support/local           /home/collin/src/test_support
//typecheck/local              /home/collin/src/typecheck
//unittest/local               /home/collin/src/unittest
~] 

Beautiful.

Now all I need is an svk checkout --cleanup macro around --list that will purge SVK of any checkouts that have already been removed from the filesystem.
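A rough sketch of the detection half of that idea, in Python: it takes the text printed by svk co --list (in the layout shown above) and reports which checkout paths no longer exist on disk. The function names are made up for illustration, and the parsing assumes depot paths contain no whitespace. It doesn't touch SVK's own records; it only finds the stale entries.

```python
import os


def parse_checkout_listing(listing):
    """Return (depot_path, checkout_path) pairs from `svk co --list` output.

    Assumes the layout shown above: a header line, a row of '=' characters,
    then one checkout per line with the depot path in the first column.
    """
    pairs = []
    in_body = False
    for line in listing.splitlines():
        if not in_body:
            # Everything up to and including the '=====' rule is header.
            in_body = line.startswith("====")
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("/"):
            pairs.append((parts[0], parts[1].strip()))
    return pairs


def stale_checkouts(listing, exists=os.path.isdir):
    """Return checkout paths from the listing that are gone from the filesystem."""
    return [path for _depot, path in parse_checkout_listing(listing)
            if not exists(path)]
```

Feeding it the captured output of svk co --list gives the list of paths that a cleanup command would need to purge.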

Since I’m forever reinventing stuff that SVK already does, I thought I’d ask before coding up this idea:

A while ago, I got tired of reading through commit messages and not knowing which directory or branch they occurred in. Trying to make things a bit easier, I started prefixing each commit message with the path the change is occurring in. For example, if I’m changing these files

trunk/lib/Foo/Bar/Baz.pm
trunk/lib/Foo/Bar.pm
trunk/lib/Foo/Qux.pm

I’d use trunk/lib/Foo/ as the path.

I’ve now gotten pretty tired of having to type this in for every commit, and I’m looking for a better way. Ideally, there would be a way of specifying, either per-user or system-wide, that SVK should filter all commits through a given module. That module would be passed a list of all files involved in the commit, the draft commit message, etc., and it could then generate a new commit message. I would use this to have SVK figure out my little path scheme, saving my fingers from having to do that extra typing. Someone else who might benefit from making it easier to generate strictly-formatted commit messages: Subversion.
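To make the path scheme concrete, here is a small Python sketch of what such a filter would compute. It is not an SVK hook (as far as I know there isn't one to plug into); the function names and the "path: message" separator are my own assumptions, and it expects repository-relative paths with forward slashes.

```python
import posixpath


def common_dir(files):
    """Return the deepest directory shared by all files, with a trailing slash.

    Returns an empty string when there is no shared directory, for example
    when one of the files sits at the top level.
    """
    dirs = [posixpath.dirname(f) for f in files]
    if not dirs or "" in dirs:
        return ""
    return posixpath.commonpath(dirs) + "/"


def prefix_message(files, draft):
    """Prefix the draft commit message with the shared directory, if any."""
    prefix = common_dir(files)
    # The ': ' separator is an assumption made for this example.
    return f"{prefix}: {draft}" if prefix else draft
```

For the three files above, common_dir() comes out as trunk/lib/Foo/, which is the prefix I have been typing by hand.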

Why not do this on the repository side with a commit-hook, you ask? Easy: you can’t. Read the section in the SVN manual on hook scripts and scan for the big red warning about trying to use hook scripts to modify a transaction.

I’m pretty sure that SVK doesn’t already have a facility like this in place, but I’ve thought that before, as my pile of SVK gadgets will attest. So I’m asking this time.

When I posted the first version of svk-init, my handy-dandy SVK initialisation script, I mentioned that I wanted to have the script figure out the short name for me, based on the repository URL. Promises made, promises kept:

#!/usr/bin/perl

use warnings;
use strict;

my($source, $short) = @ARGV;

unless(defined $short)
{
  if($source =~ m{^[^:]+://(.+)$})
  {
    my @path = split('/', $1);

    for(my $i = 1; $i < @path; $i++)
    {
        if($path[$i] eq 'branches' || $path[$i] eq 'tags' || $path[$i] eq 'trunk')
        {
            $short = $path[$i-1];
            last;
        }
    }

    # Fallback: use the last part of the path
    $short ||= $path[-1];
  }
}

unless(defined $short)
{
  die('You need to provide a short name for the repository');
}

system("svk mirror $source //$short/main");
system("svk sync //$short/main");
system("svk cp //$short/main //$short/local "
                       ."-m 'Creating //$short/local'");
system("svk co //$short/local ~/src/$short");

This allows me to give svk-init a repository URL of

http://svn.python.org/projects/python/trunk/

and have svk-init correctly come up with “python” as the short name.
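For anyone who wants to poke at the deduction rule without running svk, here is the same logic restated as a Python sketch. The function name is made up for this example. One small difference from the Perl: empty path components are dropped everywhere, not only at the end.

```python
import re

# Directory names that conventionally sit directly below a project's root.
LAYOUT_DIRS = {"branches", "tags", "trunk"}


def short_name(url):
    """Guess a project short name from a repository URL, or return None."""
    match = re.match(r"^[^:]+://(.+)$", url)
    if not match:
        return None
    parts = [p for p in match.group(1).split("/") if p]
    # Start at 1: index 0 is the host name, which cannot be a layout directory
    # with a project name in front of it.
    for i in range(1, len(parts)):
        if parts[i] in LAYOUT_DIRS:
            return parts[i - 1]
    # Fallback: use the last part of the path.
    return parts[-1] if parts else None
```

Given the URL above it returns the component just before trunk; given svn+ssh://somehost/var/svn/functional it falls back to the last component.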


Relatedly, SVK’s author, Chia-liang Kao, mentioned in the comments that the first version of svk-init could be compressed to svk cp $source ~/src/$short, plus answering a few questions. I tried it out, and sure enough, it does indeed do the job. I’m going to stick with svk-init for two reasons: 1) svk cp uses a different mirror naming scheme, and 2) svk cp requires you to answer 3-4 questions, meaning I can’t start it up and forget about it like I do svk-init.

Tired of typing the same formulaic SVK commands over and over again whenever I need to mirror another repository to my system (especially in the wake of several successive laptop failures), I whipped up this little shell script. I have christened it svk-init.

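#!/bin/sh

# $1: URL to mirror from; $2: short name for the project
source="$1"
short="$2"

# Mirror the remote repository, fetch its history, branch it locally,
# then check the local branch out into ~/src
svk mirror "$source" "//$short/main"
svk sync "//$short/main"
svk cp "//$short/main" "//$short/local" -m "Creating //$short/local"
svk co "//$short/local" "$HOME/src/$short"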

The first parameter is the URL to mirror from, the second is a short name for the project. Usage goes something like this:

$ ./svk-init svn+ssh://somehost/var/svn/functional functional

When the script completes, I’ll be left with a working copy in my ~/src/functional/ directory.

Eventually, I’d like to make the short name optional, having the script deduce one for me if the second parameter is left unspecified.

As promised, a discussion of how to use SVK’s pull command:

For the longest time, I would type some variation of

$ svk sync //$project_name/main
$ svk sm //$project_name/main //$project_name/local
$ cd ~/src/$project_name
$ svk update

in order to pull changes from a remote repository to the mirror, merge from the mirror to my local branch, then update my working copy from the local branch.
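Spelled out as a Python sketch, the sequence above looks like this. The function names are invented for this example, the paths follow my //$project_name/main, //$project_name/local and ~/src/$project_name conventions, and nothing here is part of SVK itself.

```python
import os
import subprocess


def pull_steps(project, src_root="~/src"):
    """Return (argv, working_directory) pairs for the manual pull sequence."""
    main = f"//{project}/main"
    local = f"//{project}/local"
    checkout = os.path.join(os.path.expanduser(src_root), project)
    return [
        (["svk", "sync", main], None),       # remote repository -> mirror
        (["svk", "sm", main, local], None),  # mirror -> local branch
        (["svk", "update"], checkout),       # local branch -> working copy
    ]


def run_pull(project, runner=subprocess.check_call):
    """Run each step in order, stopping at the first command that fails."""
    for argv, cwd in pull_steps(project):
        runner(argv, cwd=cwd)
```

Only the last step needs a working directory, because svk update acts on the checkout you are standing in.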

That much typing sucks, and it was just as I was getting ready to write a macro for this task (in the vein of mymerge and mergeproject) that I discovered svk push. This led me to investigate pull, which is, as you might have guessed, the opposite of push.

At its core, pull can be thought of as a wrapper around that command sequence above: sync the mirror, merge to the local branch, then update the working copy. The simplest form of pull is this, issued from within a working copy:

svk pull

Like push, you can provide your own working copy path:

svk pull ~/src/project_x/

This has the same effect as if you had cd’d to the directory and then run the naked pull command.

Also like push, you can provide a depot path instead of a working copy path:

svk pull //project_x/local

This will sync the mirror and merge the new changes, but will not update any working copy.

I’ve been using push and pull for a few weeks now, and I’m thrilled by not having to type out a thousand commands over and over again or write a new macro.

After posting two entries about some custom SVK commands, it seems SVK had already beaten me to the punch.

If you recall, I originally wrote the mymerge and mergeproject commands because I was tired of typing this out every time I needed to merge local changes up to the main repository:

svk smerge -I //$project_name/local //$project_name/main

As it turns out, SVK already has a command to do this — and in a more flexible way, to boot: push, an smerge macro like mine.

push takes a single argument, the thing to push changes from. This thing can be either a working copy or a depot path (if you don’t supply this argument, the current directory is used). In either case, SVK will automatically figure out the target for the merge:

  • If you supply a working path, SVK will first figure out which depot path you used when checking out the working copy, then…

  • If you supply a depot path (or if you’ve fallen through from above), SVK will figure out where you copied the depot path from, then merge to that parent path. (Note that it’s meaningless to use push on a mirrored path; svk push should only be used on depot paths copied from somewhere else.)

Once SVK has figured out the source and target paths, it will perform an incremental smerge. Hmm…this sounds suspiciously like my commands!

So, to translate my own commands into svk push:

svk mymerge $project_name

is the same as

svk push //$project_name/local

while

svk mergeproject

is the same as

svk push

Unfortunately, push’s entry in the SVK book can be summed up as “TODO”, the word that dominates the page. I’ve added this to my todo list, but it’s a fairly low priority at the moment.

In a future entry, I’ll talk about svk pull, the handy-dandy, already-written spelling of another custom command I was close to writing.

Continuing my “Extending SVK for fun and profit” series, I present the mergeproject macro, which builds upon the mymerge command I talked about last time.

As I mentioned in the mymerge article, I name my local and mirrored repositories //$project_name/local/ and //$project_name/main/, respectively. In addition, I follow the convention of giving my checkout paths equally imaginative names, like /home/collin/src/$project_name/.

When last we left our heroes, I had managed to cut the command to sync my local repository to the mirrored repository down from a monstrous

svk sm -I //$project_name/local //$project_name/main

to a more lazy-coder-friendly

svk mm $project_name

That’s good, but we can go further.

Since all of my project checkouts follow the same naming conventions, and since most of my svk mm commands are issued from within the project’s checkout directory, there’s no reason for me to type $project_name each time. Some File::Spec incantations should be more than enough to figure this out for me.

After some digging around through SVK’s internals, I present you…mergeproject:

package SVK::Command::Mergeproject;
use strict;
use SVK::Version;  our $VERSION = $SVK::VERSION;

use base qw( SVK::Command::Mymerge );

use SVK::Util qw(splitdir catdir);
use SVK::I18N qw(loc);
use Cwd;

sub parse_arg {
    my $self = shift;
    my @arg = @_;
    return if @arg != 0;

    my $pwd = Cwd::cwd();
    my @dirs = splitdir($pwd);
    for(my $i = 0; $i < @dirs; $i++) {
        my $dir = catdir(@dirs[0..$i]);

        # See if the directory is a valid checkout path
        # If it's not, an error will be raised and $@ will be set.
        # If the directory is a valid checkout path, pass only the
        #  directory name -- ie, not the full path -- up to
        #  Mymerge, which will handle the rest.
        eval { $self->{xd}->find_repos_from_co($dir, 0) };
        unless ($@) {
            return $self->SUPER::parse_arg($dirs[$i]);
        }
    }

    die loc("Unable to find a checkout path while traversing %1\n",
                $pwd);
}

We use Cwd::cwd() to grab the absolute path to the current directory, then use SVK::Util::splitdir() (SVK::Util autoloads all the useful bits of File::Spec for us) to break the path into individual directory names. We then iterate over the list of directory names, building up longer and longer paths with SVK::Util::catdir(). For example, given the current working directory of /home/collin/src/svnmock/trunk/, we’d look in the following succession of directories for SVK checkouts:

/
/home/
/home/collin/
/home/collin/src/
/home/collin/src/svnmock/
/home/collin/src/svnmock/trunk/

stopping once SVK::XD::find_repos_from_co() reports that we have indeed found one. (In the above example, we’d end up stopping at /home/collin/src/svnmock/, the first directory that SVK can map to a repository.) The second argument of 0 to find_repos_from_co() tells SVK that we’re only interested in whether the checkout maps to a repository.

Once we’ve found a valid checkout path, the last directory in the series (the one that actually holds the checkout) is assumed to be the project name and so is passed up to mymerge.
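The walk itself is easy to sketch outside of SVK. In this Python version, is_checkout is a stand-in for the find_repos_from_co() test, and both function names are made up for illustration. It assumes Unix-style absolute paths.

```python
import posixpath


def candidate_dirs(cwd):
    """List every ancestor of cwd, from the root down to cwd itself."""
    parts = [p for p in cwd.split("/") if p]
    dirs = ["/"]
    for i in range(1, len(parts) + 1):
        dirs.append("/" + "/".join(parts[:i]))
    return dirs


def find_project(cwd, is_checkout):
    """Return the name of the first ancestor directory that is a checkout.

    is_checkout is a caller-supplied predicate; here it plays the role
    that SVK's own checkout lookup plays in the Perl code above.
    """
    for directory in candidate_dirs(cwd):
        if is_checkout(directory):
            return posixpath.basename(directory)
    raise LookupError(f"Unable to find a checkout path while traversing {cwd}")
```

Because the search runs from the root downwards, the outermost matching directory wins, which is what makes /home/collin/src/svnmock/ the answer in the example above.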

Let’s recap: we went from this:

svk sm -I //$project_name/local //$project_name/main

to this

svk mm $project_name

to now this (using the mp alias for mergeproject)

svk mp

Hooray, laziness!

If you’re interested in doing something similar, put this code in /usr/lib/perl5/site_perl/*/SVK/Command/Mergeproject.pm or wherever your SVK command modules happen to be. If you want to use a shortcut (I use mp), you’ll need to add a line to the %alias hash in SVK::Command, something like “mp mergeproject”.

All my projects use Subversion for revision control, but on my laptop, I use SVK so I can keep working and committing even when away from an Internet connection. (Also: SVK’s merge support and branch tracking beat SVN’s hands down).

(This isn’t an SVK tutorial. For that, you should check out Ron Bieber’s excellent series of SVK tutorials.)

One part of SVK’s everyday workflow is having a mirrored repository, which represents the remote SVN repository, and a local repository, where you normally commit to. You then merge between these repositories — from local to mirrored to push changes to the main repository, mirrored to local to sync with the main repository.

For each project I work on, I name the mirrored and local repositories //$project_name/main and //$project_name/local, respectively. This means every time I want to push changes up to the main SVN repositories, I type svk sm -I //$project_name/local //$project_name/main — automatically merge all changes between branches, incremental commits.

Because that command never changes, and because I’m lazy, I wrote a mymerge “macro” to save myself the trouble of all that typing. Now, instead of that big, long command, I type svk mm $project_name. Much better.

mymerge works by subclassing SVK’s SVK::Command::Smerge class and overriding the parse_arg() method. It does some monkeying around with the arguments, then hands control off to smerge.

package SVK::Command::Mymerge;
use strict;
use SVK::Version;  our $VERSION = $SVK::VERSION;

use base qw( SVK::Command::Smerge );

sub options { () }

sub parse_arg {
    my $self = shift;
    my @arg = @_;
    return if $#arg < 0;

    my $depot = $arg[0];

    $self->{incremental} = 1;
    return $self->SUPER::parse_arg("//$depot/local",
                                   "//$depot/main");
}

The $self->{incremental} = 1; assignment is the same as supplying the -I flag to smerge in the original example. The last line does the important argument-mucking before passing control up to smerge.

If you’re interested in doing something similar, put this code in /usr/lib/perl5/site_perl/*/SVK/Command/Mymerge.pm or wherever your SVK command modules happen to be. If you want to use a shortcut (I use mm), you’ll need to add a line to the %alias hash in SVK::Command, something like “mm mymerge”.

With the new design of svnmock stabilised, I spent some time yesterday putting it into service, testing a new Subversion-targeting backend for Cypress. There’ve been some minor refinements (e.g., making the args parameter to MockSession.add optional), and here are my observations thus far:

  • The low-level nature of svnmock is both a blessing and a curse. The blessing: exact, super-precise assertions as to which API functions should be called. The curse: you effectively end up duplicating the code you’re testing. I’m considering adding support for something like, “I don’t care what the next four API calls are, but the fifth one needs to be X(Y, Z)”, but I’m not sure that’s a road I want to head down. On the other hand…

  • …it’s gratifyingly easy to build macro-methods on top of these low-level primitives, and my test cases have started sprouting private methods to manipulate the appropriate MockSession objects. Rather than endlessly duplicate the MockSession.add() calls needed to test method X, those calls get moved into a mock_X() method on the test case. While this certainly cleans things up, I’m looking for a way to shift more of this burden to the svnmock.mock module.

In the process of testing Cypress, I discovered bugs in Python, Subversion’s Python API and SWIG, the tool used to automatically generate Subversion’s bindings for Python, Perl and Ruby. The Subversion bug: there’s a function in the Subversion Python bindings with an illegal name, svn.client.import (“import” is a keyword in Python, meaning it’s illegal to use it as an identifier). The Python bug: the internal function used to register C-language extension modules, Py_InitModule, doesn’t check to make sure that functions/methods/whatevers use legal names. The SWIG bug: SWIG fails to warn adequately when you try to generate an illegally-named function.
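The naming problem is easy to reproduce in plain Python without touching the Subversion bindings. In this sketch, client is a throwaway stand-in object (not the real svn.client module) and the helper names are invented for the example.

```python
import keyword
import types

# Stand-in for an extension module that registered a function named "import".
client = types.SimpleNamespace()
setattr(client, "import", lambda *args: "imported")


def is_legal_name(name):
    """True if name can be written as an ordinary identifier in source code."""
    return name.isidentifier() and not keyword.iskeyword(name)


def compiles(source):
    """True if source parses as a Python expression (it is never evaluated)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True
```

The attribute exists and getattr(client, "import") will fetch it, but writing client.import in source code is a syntax error, so the function can never be called the normal way.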

I first reported the bug to the python-dev mailing list, offering to patch it myself if there was interest in a fix. The word I got there was that, given this problem’s extreme rarity, it wasn’t worth the time (which I agree with). Next, I sent an email to the general Subversion mailing list, pointing out that this function needed to be renamed; the only response I’ve received to date was that the new name I proposed wouldn’t work, but nothing more constructive. Finally, I posted a bug report to the SWIG project’s bug tracker; the developer in charge of the Python backend commented quickly on the bug report, saying that it’s a well-established behaviour of SWIG to allow you to create illegal names, though it does warn about it.

Not exactly the response I was hoping for. Sigh.

While on the train back from Frankfurt this morning, I realised that I had been designing the svnmock API all wrong. The old design was based on simply populating a mock repository in RAM, then letting the SVN-emulation layer interact with that repository. I hadn’t gotten very far into implementing the design, but already I could tell that it was going to be a massive, unwieldy operation. Worse yet, I realised that it wouldn’t offer the level of control I myself would want as a tester. The rest of the trip was spent looking at snow-covered Hessian woods and restarting the API design process from scratch.

While talking over possible design issues with a guy I graduated from uni with, Tyler Hall, I had a brainstorm. Within about 15 minutes, I had ripped out the entire old design, several hundred lines of complicated class interaction (which was only a tiny fraction of what would be needed), and had replaced it with totally new workings. Result: in less than 100 lines, I had a mock up of the entire Subversion API. Better yet, it would be totally future-proof: if the Subversion python API changes radically, I don’t have to release a new version — the existing code will adapt on its own.

Where the old design relied on me knowing what every single function in the API does, the new design assumes that you know what all these silly procedures do. It works like so:

  1. You create a new MockSession object.

  2. You tell the MockSession object that you expect function X to be run with parameters Y and Z, and that it ought to return 7; anything else is to be considered an error, and it should blow up.

An example:

from svnmock import mock, core, repos, fs

ses = mock.MockSession()

pool = ses.add(core.svn_pool_create, [None])
scratch_pool = ses.add(core.svn_pool_create, [pool])

We’ll take this slow:

  1. Line 1: svnmock.mock is the module used to populate the testing environment. svnmock.core, svnmock.repos and svnmock.fs are modules holding the emulated API functions and constants. We need to import them so we can refer to their functions later.

  2. Line 3: create a new MockSession object. MockSession instances are used to organise our test environments; having them be objects — as opposed to a more imperative-style interface — allows us to easily swap test environments in and out, making them reusable.

  3. Lines 5-6: here’s where things start to get interesting. In line 5, we specify that the first command to be executed must be core.svn_pool_create, with the sole parameter of None. Line 6 specifies that core.svn_pool_create will be run again, but this time the parameter will — nay, must — be the return value from the first svn_pool_create() call; anything else is to be treated as an error.

See the doc/mock_session.py file in the SVN repository for further examples of the new syntax.
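To show the idea in isolation, here is a toy version of the expect-then-replay pattern. It is not svnmock’s implementation: ToySession, call() and verify() are names invented for this sketch, and functions are identified by plain strings to keep it self-contained.

```python
class ExpectationError(AssertionError):
    """Raised when a call does not match the next scripted expectation."""


_UNSET = object()


class ToySession:
    def __init__(self):
        self._expected = []

    def add(self, name, args=(), returns=_UNSET):
        """Script one expected call and hand back the value it will return."""
        if returns is _UNSET:
            # A unique placeholder, so later expectations can refer to it.
            returns = object()
        self._expected.append((name, tuple(args), returns))
        return returns

    def call(self, name, *args):
        """Replay one call; anything other than the next expectation blows up."""
        if not self._expected:
            raise ExpectationError(f"unexpected call to {name}: script is empty")
        exp_name, exp_args, returns = self._expected.pop(0)
        if (name, args) != (exp_name, exp_args):
            raise ExpectationError(
                f"expected {exp_name}{exp_args}, got {name}{args}")
        return returns

    def verify(self):
        """Fail if any scripted calls were never made."""
        if self._expected:
            raise ExpectationError(f"{len(self._expected)} expected call(s) never made")
```

As in the svnmock example, the value returned by the first add() can be passed as an expected argument to the second, which is how one call’s result gets pinned to the next call’s parameters.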
