March 2006


People are doing things with typecheck! Huzzah!

Most recently on my radar, Marc Rijken has contributed a package that combines typecheck with Zope interfaces (like Roles in Perl 6). I’ve imported his code into typecheck's SVN repository, meaning that it will be shipped with the next release of the package.

For those interested in trying it out now, the code is available from the SVN repository (svn://oakwinter.com//var/svn/typecheck) under trunk/contrib.

After quite a long recharging-the-batteries sabbatical from work on community2.com, coding has recommenced. At the moment, I’m going back through my unfinished branches in the SVN repository, seeing what still needs to be merged and whatnot.

Today’s major change: I’ve eliminated eCore’s dependence on the “gods” usergroup. The name “gods” was an unfortunate relic of the engine’s original design (way back in 1998-1999), which had a lot of role playing/Dungeons and Dragons-style overtones to its terminology (to wit: the names of the experience levels on Everything2, another eCore site). All the hardcoded references to “gods” have been scrubbed in favour of an admin_group key in the System Settings settings node. Note that if you don’t set this key, the system will default to using “gods” as the name of the group; this was done to preserve backwards compatibility.

The change looks to be holding; at least, I was able to nuke the “gods” group and nothing broke. For an eCore site, “nothing broke” is generally a pretty good sign.

or, Adventures in the Python Bytecode Compiler, episode 1

It’s been quiet around here lately…almost too quiet. I’ve spent the last 2.5 weeks (minus time spent battling the flu) investigating possibilities for optimising Python’s bytecode compiler and interpreter. One of the areas uncovered was the bytecode-level implementation of list comprehensions, which were being translated to some pretty hideous stuff.

[x for x in range(10) if x % 2]

translates to

      0 BUILD_LIST               0
      3 DUP_TOP
      4 LOAD_ATTR                0 (append)
      7 STORE_NAME               1 (_[1])
     10 LOAD_NAME                2 (range)
     13 LOAD_CONST               0 (10)
     16 CALL_FUNCTION            1
     19 GET_ITER
>>   20 FOR_ITER                31 (to 54)
     23 STORE_NAME               3 (x)
     26 LOAD_NAME                3 (x)
     29 LOAD_CONST               1 (2)
     32 BINARY_MODULO
     33 JUMP_IF_FALSE           14 (to 50)
     36 POP_TOP
     37 LOAD_NAME                1 (_[1])
     40 LOAD_NAME                3 (x)
     43 CALL_FUNCTION            1
     46 POP_TOP
     47 JUMP_ABSOLUTE           20
>>   50 POP_TOP
     51 JUMP_ABSOLUTE           20
>>   54 DELETE_NAME              1 (_[1])

The part I didn’t like was how the list was being built up. At the beginning, a bound append method on an empty list is stowed away, then invoked with each succcessive new element. That means that every element incurs the overhead of some extra stack manipulation and a method resolution+invocation (which isn’t cheap).

My solution was to resurrect an unused opcode, LIST_APPEND. Rather than take its list argument off the stack, it now takes an integer argument, indicating the number of items to look down the stack. This avoids a lot of unnecessary stack manipulation and function call overhead, since we end up calling PyList_Append directly.

The example from above, rewritten to use the new LIST_APPEND opcode:

      0 BUILD_LIST               0
      3 LOAD_NAME                0 (range)
      6 LOAD_CONST               0 (10)
      9 CALL_FUNCTION            1
     12 GET_ITER
>>   13 FOR_ITER                27 (to 43)
     16 STORE_NAME               1 (x)
     19 LOAD_NAME                1 (x)
     22 LOAD_CONST               1 (2)
     25 BINARY_MODULO
     26 JUMP_IF_FALSE           10 (to 39)
     29 POP_TOP
     30 LOAD_NAME                1 (x)
     33 LIST_APPEND              2
     36 JUMP_ABSOLUTE           13
>>   39 POP_TOP
     40 JUMP_ABSOLUTE           1

So, what does this change buy us? Besides shorter generated bytecode, a vastly cleaner C implementation, list comprehensions are now over 16% faster [1]. I’ve submitted this in patch form to SourceForge; hopefully it can make it into Python before 2.5 is released.


[1] - Benchmarking was done by taking all the listcomp examples from Lib/test/test_grammar.py and looping over them a few tens of thousands of times. The old version took 1.69 seconds to run; the new version took 1.41. The testing loop was executed 10000 times each trial.

Update - the patch was rejected. The compiler was apparently supposed to be doing something similar to this already, but that functionality had been left out when the new AST branch was merged in; it’s been reinstated.