June 2007


Just pushed out to PyPI

The contextlib module that ships with Python 2.5 and newer is pretty neat; for example, it includes a tool to transform a generator into a context manager. Not a bad trick, but not enough.

context_tools aims to pick up where contextlib leaves off. It includes tools for turning context managers into setUp() and tearDown() methods for unittest- and test_harness-based tests and into decorators for functions and generators.
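To make the idea concrete, here is a minimal sketch of both tricks: contextlib.contextmanager turning a generator into a context manager, and a hand-rolled helper turning that context manager into a function decorator. The names logged, as_decorator and work are made up for illustration; this is not context_tools' actual API, just one way such a tool could be built. On Python 2.5 the with statement additionally needs from __future__ import with_statement.

```python
import contextlib
import functools

events = []  # records what happened, in order


@contextlib.contextmanager
def logged(name):
    # Everything before the yield runs on entry, everything after on exit.
    events.append("enter " + name)
    try:
        yield
    finally:
        events.append("exit " + name)


def as_decorator(make_context):
    # Hypothetical helper: wrap a function so that every call runs inside
    # a fresh context manager produced by make_context().
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with make_context():
                return func(*args, **kwargs)
        return wrapper
    return decorate


@as_decorator(lambda: logged("work"))
def work(x):
    events.append("working")
    return x * 2


result = work(21)
```

The helper takes a factory rather than a context manager instance because a generator-based context manager can only be entered once; building a new one per call keeps the decorated function reusable.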

Interested? Grab the latest version. Both source and eggs for Python 2.5 are available, and just as soon as PyPI adds support for Python 3, a tarball for that will go up as well.

…to driving your Google to the Google to pick up some more Google. Sony, eat your heart out.

I’m getting pretty sick of seeing blog posts and mailing list threads endlessly bemoaning that “the core developers…are causing a huge risk to the Python community by splitting it asunder for a period of years”. Gloom, doom, pox and peril, blah blah blah.

The language has two choices: either continue to bear the burden of what are now considered poor design decisions (e.g., four forms of raise, syntax ambiguities in except statements) or suck it up and let us try and fix some of these problems. It’s like going to the dentist: it may hurt, but if that minor toothache goes untreated and develops into an abscess, you will wish you were dead.

There are two parts to the transition plan: syntactic transition and semantic transition. For syntactic transition, Guido and I have sunk a lot of time into 2to3, which will translate your Python 2.x code into 3.0’s freshly-polished syntax. When it comes to adjusting your code’s semantics, Python 2.6 will feature a Python 3000 compatibility mode, which when enabled will warn you when you do something that will need to be changed before moving to 3.0. Are these tools perfect? No; that’s the price you pay for using a language as flexible as Python. Are they pretty damn good? Yes. Combined, 2to3 and Python 2.6 will make the vast majority of 2.x -> 3.0 transitions as painless as we can make them. For that last little remnant, the code we simply cannot deal with, that’s what your test suites are for. I have absolutely no pity for anyone trying to migrate to Python 3 without a test suite; you’re doing something fundamentally stupid and we will not bend over backwards to save your dumb ass.

As for the observation that pugs, the Perl 6 compiler, will be able to handle Perl 5 source as input and why oh why can’t Python do that, too: Perl 5-on-Perl 6 is a neat trick born in an intersection of necessity and opportunity. The necessity is there because Perl 6 is a fundamentally different language than Perl 5 (or at least it was the last time I looked; they may have changed their minds over the last week), and Perl’s DWIM mentality would make it prohibitively difficult to mechanically translate the old to the new. The opportunity is there because there’s only one (serious) Perl 6 compiler, so it can afford to have a Perl 5 runtime built in: the development and maintenance cost for this extra runtime is isolated within a single project.

Python simply can’t do that. There are four credible implementations of Python I know of (CPython, Jython, IronPython, PyPy), and we can’t ask each one of these efforts to please please won’t you embed a Python 2 runtime in your system? Not going to happen, ever. Given these circumstances, the best we can do is to have a syntax translator that will work across all implementations, and a semantics checker that’s spec-driven and as implementation-agnostic as possible.

If you think you can do better, show us the code. Talk is cheap.

I know what you’re thinking: “what the hell? You can’t subclass modules!” Conventional wisdom == wrong.

import os

# ModuleMeta is defined in the next listing; it has to exist before this
# class statement runs.
class MyOS(os):
    __metaclass__ = ModuleMeta

    def lstat(self, arg):
        return 6

    def rmdir(self, arg):
        raise self.error("No such file or directory: %r" % arg)

Notice that we’re apparently subclassing a module. The metaclass will allow us to override whichever of the module’s functions we desire, leaving the others intact.

class ModuleMeta(type):
    def __new__(cls, name, bases, d):
        d["__getattr__"] = lambda x, y: getattr(bases[0], y)
        return type.__new__(cls, name, (object,), d)

This is the little beauty that makes the whole thing possible. Here, we stick a custom __getattr__() function into the class’s namespace, then replace the incoming bases tuple with our own. The bases we were passed will contain a module, and that will cause the runtime to complain if the module reaches type.__new__().

Some client code:

os = MyOS()
print os.lstat("foo")
print os.times()
os.rmdir("foo")

Our custom os-alike provides its own rmdir() and lstat() functions while using the times() function from the real os module. This works in Python 2.3, 2.4 and 2.5.

I see requests for this fairly regularly from people who want to stub out certain functions in a module for testing purposes. Of course, the easy way to do this isn’t to subclass the module at all: just create a class that does what you want.

import os

class MyOS:
    def lstat(self, arg):
        return 6

    def rmdir(self, arg):
        raise self.error("No such file or directory: %r" % arg)

    def __getattr__(self, attr):
        return getattr(os, attr)

No fuss, no muss, and it’s fully equivalent to the above magic metaclass incantations. I’ll talk more about this in a future post.
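As a rough sketch of how such a stand-in gets used in a test, the snippet below passes one to a function that takes its os dependency as an argument. StubOS mirrors the MyOS class above; remove_if_present and its os_module parameter are hypothetical names invented for this example.

```python
import os


class StubOS(object):
    """Stand-in for the os module: overrides two functions, delegates the rest."""

    def lstat(self, path):
        return 6

    def rmdir(self, path):
        raise self.error("No such file or directory: %r" % path)

    def __getattr__(self, attr):
        # Only called when normal lookup fails, so lstat() and rmdir() above
        # win and everything else falls through to the real os module.
        return getattr(os, attr)


def remove_if_present(path, os_module=os):
    # Hypothetical code under test: it receives the module it depends on
    # as an argument, so a test can hand it a stub instead.
    try:
        os_module.rmdir(path)
        return True
    except os_module.error:
        return False


stub = StubOS()
removed = remove_if_present("foo", os_module=stub)
```

Nothing on disk is touched: rmdir() raises the real os.error (found through __getattr__), so the except clause in the code under test still matches.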

If you’re in the Silicon Valley area, I’ll be speaking at the next BayPIGgies meeting about test_harness, my testing framework designed to (hopefully) replace unittest. The meeting is Thursday, 14 June at 7:30PM at Google. Check the BayPIGgies site for more details.

For those of a more European persuasion, I’ll be at EuroPython 2007 in Vilnius, Lithuania, giving much the same talk. Since the paper was accepted in the peer review track, I’ll have to science it up some, but the talk should stay pretty much the same. Except that it will be in Lithuanian.

So Tyler says to me, he says:

I went to the Nashville PHP group last night. The conversation turned to which languages are on the rise, and I threw Python into the mix. Problem is, I had very little ammo to arm myself with. Got a list of bullet points as to why Python is better?

Well, yes and no.

In terms of functionality, there’s very little difference between Perl 5, Python, PHP and Ruby. The reasons to choose one over the other are typically very domain-specific (hence subtle and of little use when fighting religious wars): Perl 5 makes text munging simple by having, e.g., regular expressions as first-class citizens; PHP makes web applications more natural because, well, that’s what it was designed to do.

I have nothing really positive (or negative) to say about Ruby. I can’t think of any special niche that it fills. Anonymous blocks? Perl 5 has them. Pure OO? Python has it. call/cc? If you think you need continuations, you probably don’t. You could argue that Ruby serves a purpose by combining all these things, but the number of people who sincerely need a pure OO language with anonymous blocks and continuations is probably around five.

The main negative I can think of with respect to Perl 5 and PHP is that it’s hard to do dependency injection-based testing in these languages. It’s so hard in Java, for example, that even Google has invented a tool to make Java DI easier. Python, on the other hand, makes this dead simple, which makes it so much easier to test your code from all perspectives. Hell, it’s so easy in Python, I didn’t even know there was a name for it until I came to Google. I don’t know how easy DI is in Ruby, but if it’s not Python-easy, Ruby loses.

That’s one criterion for programming languages that I don’t see discussed much: ranking languages by how easy the code is to test. One frequent example is mocking a global resource like a time source. C, C++ and Java all require you to come up with unnatural function signatures or link against special libraries when testing in order to gain control over time. It’s easier in Perl 5, but it still requires a good deal of specialized knowledge of how namespaces and module lookups work. Assuming the target library does something like import time at the top, here’s how you take control of a given module’s time source in Python:

>>> import some_module
>>> class StubTime:
...     def time(self):
...         return 3634634
...
>>> some_module.time = StubTime()

Done. No specialized knowledge of interpreter details, no crazy setup, just done. If mocking global resources isn’t that easy in PHP, Ruby or any other language, I have little use for it beyond toy projects. Testing is where I feel Python really stands out.
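For anyone who wants to try this without a real library at hand, here is a self-contained sketch. It builds a throwaway some_module in memory (the source string and the stamp() function are invented for the example), swaps in the stub, and puts the real time module back afterwards.

```python
import time
import types

# A stand-in for the target library: it does "import time" at the top
# and calls time.time() from one of its functions.
source = '''
import time

def stamp():
    return "created at %d" % time.time()
'''
some_module = types.ModuleType("some_module")
exec(source, some_module.__dict__)


class StubTime:
    def time(self):
        return 3634634


real_time = some_module.time
some_module.time = StubTime()    # take control of the module's time source
try:
    stamped = some_module.stamp()
finally:
    some_module.time = real_time  # always restore the real module
```

Only some_module's own reference to time is replaced; the real time module, and every other module that imported it, is untouched.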