python


I recently spent some time porting setuptools to the py3k-struni branch as a means of testing both 2to3 specifically and the porting process generally. What follows are the notes from the experience. Two things to keep in mind: first, the struni branch, though slated to become the “official” Python 3000 branch, is still very much in flux and currently has 30+ failing tests; needless to say, it’s not an ideal porting target. Secondly, I was attempting this without Python 2.6’s forward compatibility mode, which is still mostly unwritten. As both of these situations change, I’ll keep trying to port more code to test the general readiness of the 2.x -> 3.x migration strategy.

Things to do in your Python 2 code:

  • Don’t write code like this:

    class install(_install):
      new_commands = [
        ('install_egg_info', lambda self: True),
        ('install_scripts',  lambda self: True),
      ]
      _nc = dict(new_commands)
      sub_commands = [
        cmd for cmd in _install.sub_commands if cmd[0] not in _nc
      ] + new_commands

    That won’t work in Python 3000: list comprehensions now get their own scope, and names bound in a class body (here, _nc) aren’t visible from inside that scope. Move the new_commands and _nc declarations out of the class body.

  • Don’t rely on implicit relative imports. In Python 3000, all imports will be absolute by default; you should write one of

    from setuptools.dist import _get_unpatched
    # or
    from .dist import _get_unpatched

    instead of

    from dist import _get_unpatched
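To make the first item in the list above concrete, here is a minimal sketch of the failure and the fix. It assumes a released Python 3 rather than the struni branch, and `_install` is a hypothetical stand-in, not the real distutils command. Only the outermost iterable of a comprehension is evaluated in the class body, so it is the `_nc` lookup in the filter that fails.

```python
class _install(object):
    # Hypothetical stand-in for the distutils install command.
    sub_commands = [('install_lib', None), ('install_scripts', None)]

# Python 3: the comprehension cannot see the class-level name _nc.
try:
    class broken_install(_install):
        new_commands = [('install_scripts', lambda self: True)]
        _nc = dict(new_commands)
        sub_commands = [
            cmd for cmd in _install.sub_commands if cmd[0] not in _nc
        ] + new_commands
except NameError as exc:
    error = exc
else:
    error = None

# The fix: hoist the helper names out of the class body.
new_commands = [('install_scripts', lambda self: True)]
_nc = dict(new_commands)

class install(_install):
    sub_commands = [
        cmd for cmd in _install.sub_commands if cmd[0] not in _nc
    ] + new_commands
```

With the helpers at module level, the comprehension finds `_nc` through the normal global lookup and the class builds as it did under Python 2.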

Stuff that needs to be easier:

  • The fact that __cmp__ methods are going away sucks, plain and simple. This required me to manually implement four additional comparison methods for every class that had previously relied on __cmp__. I hope their removal will be rethought and retracted.

  • The struni branch currently has three different string-ish types: bytes, str (previously unicode) and str8. Guido has said that str8 will eventually go away, but its presence in unexpected places (like modules’ __file__ attributes) made for some needlessly frustrating debugging. Ignoring str8, the new bytes type is going to be a serious obstacle for anyone wanting to move their codebase to Python 3. Take the following two lines:

    data = open(some_file).read() # read in text mode
    # and
    data = open(some_file, "rb").read() # read in binary mode

    The first returns a str, the second a bytes object; these two types have incompatible APIs, and the current state of the struni branch makes it impossible to write code that operates on both. For example, the signatures of the types’ split() methods are different, and the bytes type lacks a splitlines() method. These aren’t hypothetical differences: I’ve run into both problems while trying to fix several of the tests in the standard library.

  • On the subject of bytes, I ran into two additional bytes/str incompatibilities when porting setuptools. First, when you iterate over str instances, you get single-character strs back; when you iterate over bytes instances, you get integers. Combine this with code that switches based on type, and you end up banging your head against the table when your code starts kicking out errors, complaining that Python can’t iterate over the number 91. Secondly, I am absolutely sick of seeing “cannot concatenate str and bytes types” errors; my general tactic is to start throwing str() calls around until the error goes away, but that kind of shotgun debugging hurts my soul.
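The two incompatibilities in the last item can be reproduced in a few lines. This sketch was written against a released Python 3, not the struni branch, so error messages and other details may differ from what the branch produced at the time. The variable names are illustrative only.

```python
text = "[a]"
raw = b"[a]"

# Iterating str yields one-character strs; iterating bytes yields ints.
text_items = list(text)   # ['[', 'a', ']']
raw_items = list(raw)     # [91, 97, 93]

# Mixing the two types in a concatenation raises TypeError.
try:
    text + raw
except TypeError as exc:
    concat_error = exc

# Rather than scattering str() calls around, decode once at the boundary.
# The encoding is an assumption here; use whatever the data really is.
joined = text + raw.decode("ascii")
```

In released Python 3, calling str() on a bytes object returns its repr (here, the text b'[a]' including the prefix and quotes), so the shotgun approach silences the error while corrupting the data. An explicit decode() avoids that.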

This needs to be easier. I hope Guido will release any notes he’s been taking while porting the standard library to use the new bytes and str types.

On the plus side, setuptools helped turn up a few bugs in 2to3, as well as some places where the translation could have been improved (and has been). I intend to repeat this experiment once the struni branch settles down and once 2.6’s py3k-compat mode works.

New new new new: context_tools 0.2. Shiny stuff in this release:

  • Support for Python 2.4. context_tools now supports Python 2.4, 2.5 and 3.0.

  • test_with() now supports multiple context managers (by request).

  • Bug fix: yield_with() now plays better with other decorators.

Download the latest release.

Just pushed out to PyPI

The contextlib module that ships with Python 2.5 and newer is pretty neat; for example, it includes a tool to transform a generator into a context manager. Not a bad trick, but not enough.
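For readers who haven’t used it, the generator-to-context-manager tool is contextlib.contextmanager. A minimal sketch follows; `logged` and `events` are names made up for the illustration.

```python
from contextlib import contextmanager

events = []

@contextmanager
def logged(name):
    # Code before the yield runs on entry to the with block...
    events.append("enter " + name)
    try:
        yield name
    finally:
        # ...and code after it runs on exit, even if the block raises.
        events.append("exit " + name)

with logged("demo") as value:
    events.append("body " + value)
```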

context_tools aims to pick up where contextlib leaves off. It includes tools for turning context managers into setUp() and tearDown() methods for unittest- and test_harness-based tests and into decorators for functions and generators.

Interested? Grab the latest version. Both source and eggs for Python 2.5 are available, and just as soon as PyPI adds support for Python 3, a tarball for that will go up as well.

I’m getting pretty sick of seeing blog posts and mailing-list threads endlessly bemoaning that “the core developers…are causing a huge risk to the Python community by splitting it asunder for a period of years”. Gloom, doom, pox and peril, blah blah blah.

The language has two choices: either continue to bear the burden of what are now considered poor design decisions (e.g., four forms of raise, syntax ambiguities in except statements) or suck it up and let us try and fix some of these problems. It’s like going to the dentist: it may hurt, but if that minor toothache goes untreated and develops into an abscess, you will wish you were dead.

There are two parts to the transition plan: syntactic transition and semantic transition. For syntactic transition, Guido and I have sunk a lot of time into 2to3, which will translate your Python 2.x code into 3.0’s freshly-polished syntax. When it comes to adjusting your code’s semantics, Python 2.6 will feature a Python 3000 compatibility mode, which when enabled will warn you when you do something that will need to be changed before moving to 3.0. Are these tools perfect? No; that’s the price you pay for using a language as flexible as Python. Are they pretty damn good? Yes. Combined, 2to3 and Python 2.6 will make the vast majority of 2.x -> 3.0 transitions as painless as we can make them. For that last little remnant, the code we simply cannot deal with, that’s what your test suites are for. I have absolutely no pity for anyone trying to migrate to Python 3 without a test suite; you’re doing something fundamentally stupid and we will not bend over backwards to save your dumb ass.

As for the observation that pugs, the Perl 6 compiler, will be able to handle Perl 5 source as input and why oh why can’t Python do that, too: Perl 5-on-Perl 6 is a neat trick born in an intersection of necessity and opportunity. The necessity is there because Perl 6 is a fundamentally different language than Perl 5 (or at least it was the last time I looked; they may have changed their minds over the last week), and Perl’s DWIM mentality would make it prohibitively difficult to mechanically translate the old to the new. Also, the Perl 6 compiler can afford to have a Perl 5 runtime built in because there’s only one (serious) Perl 6 compiler, and so the developer and maintenance cost for this extra runtime is isolated within a single project.

Python simply can’t do that. There are four credible implementations of Python I know of (CPython, Jython, IronPython, PyPy), and we can’t ask each one of these efforts to please please won’t you embed a Python 2 runtime in your system? Not going to happen, ever. Given these circumstances, the best we can do is to have a syntax translator that will work across all implementations, and a semantics checker that’s spec-driven and as implementation-agnostic as possible.

If you think you can do better, show us the code. Talk is cheap.

I know what you’re thinking: “what the hell? You can’t subclass modules!” Conventional wisdom == wrong.

import os

class MyOS(os):
    __metaclass__ = ModuleMeta

    def lstat(self, arg):
        return 6

    def rmdir(self, arg):
        raise self.error("No such file or directory: %r" % arg)

Notice that we’re apparently subclassing a module. The metaclass will allow us to override whichever of the module’s functions we desire, leaving the others intact.

class ModuleMeta(type):
    def __new__(cls, name, bases, d):
        d["__getattr__"] = lambda x, y: getattr(bases[0], y)
        return type.__new__(cls, name, (object,), d)

This is the little beauty that makes the whole thing possible. Here, we stick a custom __getattr__() function into the class’s namespace, then replace the incoming bases tuple with our own. The bases we were passed will contain a module, and that will cause the runtime to complain if the module reaches type.__new__().

Some client code:

os = MyOS()
print os.lstat("foo")
print os.times()
os.rmdir("foo")

Our custom os-alike provides its own rmdir() and lstat() functions while using the times() function from the real os module. This works in Python 2.3, 2.4 and 2.5.

I see requests for this fairly regularly when people are wanting to stub out certain functions in a module for testing purposes. Of course, the easy way to do this isn’t to subclass the module at all: just create a class that does what you want.

class MyOS:
    def lstat(self, arg):
        return 6

    def rmdir(self, arg):
        raise self.error("No such file or directory: %r" % arg)

    def __getattr__(self, attr):
        return getattr(os, attr)

No fuss, no muss, and it’s fully equivalent to the above magic metaclass incantations. I’ll talk more about this in a future post.
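As a rough sketch of how such a stub might be used in a test: the code under test accepts the module as a parameter, and the test hands it the stub instead. `StubOS` and `remove_dir` are hypothetical names invented for this example.

```python
import os

class StubOS(object):
    """Overrides lstat() and rmdir(); everything else comes from os."""

    def lstat(self, arg):
        return 6

    def rmdir(self, arg):
        # self.error is not defined on the stub, so __getattr__ below
        # fetches os.error from the real module.
        raise self.error("No such file or directory: %r" % arg)

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails.
        return getattr(os, attr)


def remove_dir(path, os_module=os):
    # Hypothetical code under test. It takes the module as a parameter
    # so that a test can pass in the stub.
    try:
        os_module.rmdir(path)
    except os_module.error:
        return False
    return True

stub = StubOS()
result = remove_dir("foo", os_module=stub)
```

Nothing touches the real filesystem here: rmdir() on the stub always raises, and attributes the stub doesn’t define (os.sep, os.times() and so on) still come from the real module.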

If you’re in the Silicon Valley area, I’ll be speaking at the next BayPIGgies meeting about test_harness, my testing framework designed to (hopefully) replace unittest. The meeting is Thursday, 14 June at 7:30PM at Google. Check the BayPIGgies site for more details.

For those of a more European persuasion, I’ll be at EuroPython 2007 in Vilnius, Lithuania, giving much the same talk. Since the paper was accepted in the peer review track, I’ll have to science it up some, but the talk should stay pretty much the same. Except that it will be in Lithuanian.

So Tyler says to me, he says:

I went to the Nashville PHP group last night. The conversation turned to which languages are on the rise, and I threw Python into the mix. Problem is, I had very little ammo to arm myself with. Got a list of bullet points as to why Python is better?

Well, yes and no.

In terms of functionality, there’s very little difference between Perl 5, Python, PHP and Ruby. The reasons to choose one over the other are typically very domain-specific (hence subtle and of little use when fighting religious wars): Perl 5 makes text munging simple by having, e.g., regular expressions as first-class citizens; PHP makes web applications more natural because, well, that’s what it was designed to do.

I have nothing really positive (or negative) to say about Ruby. I can’t think of any special niche that it fills. Anonymous blocks? Perl 5 has them. Pure OO? Python has it. call/cc? If you think you need continuations, you probably don’t. You could argue that Ruby serves a purpose by combining all these things, but the number of people who sincerely need a pure OO language with anonymous blocks and continuations is probably around five.

The one negative thing I can think of with respect to Perl 5 and PHP is that it’s hard to do dependency injection-based testing in these languages. It’s so hard in Java, for example, that even Google has invented a tool to make Java DI easier. Python, on the other hand, makes this dead simple, so it’s much easier to test your code from all perspectives. Hell, it’s so easy in Python, I didn’t even know there was a name for it until I came to Google. I don’t know how easy DI is in Ruby, but if it’s not Python-easy, Ruby loses.

That’s one criterion for programming languages that I don’t see discussed much: ranking languages by how easy the code is to test. One frequent example is mocking a global resource like a time source. C, C++ and Java all require you to come up with unnatural function signatures or link against special libraries when testing in order to gain control over time. It’s easier in Perl 5, but it still requires a good deal of specialized knowledge of how namespaces and module lookups work. Assuming the target library does something like import time at the top, here’s how you take control of a given module’s time source in Python:

>>> import some_module
>>> class StubTime:
...     def time(self):
...         return 3634634
...
>>> some_module.time = StubTime()

Done. No specialized knowledge of interpreter details, no crazy setup, just done. If mocking global resources isn’t that easy in PHP, Ruby or any other language, I have little use for it beyond toy projects. Testing is where I feel Python really stands out.
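Here is a self-contained version of the same trick that can be run as-is. Since some_module is only a placeholder, the sketch builds a throwaway module with that name; the stamp() function inside it is made up for the example. It also restores the real time module afterwards, which a real test would want to do.

```python
import time
import types

# Build a throwaway stand-in for "some_module": a module that does
# `import time` at the top and calls time.time() internally.
some_module = types.ModuleType("some_module")
exec(
    "import time\n"
    "def stamp():\n"
    "    return time.time()\n",
    some_module.__dict__,
)

class StubTime(object):
    def time(self):
        return 3634634

real_time = some_module.time
some_module.time = StubTime()
try:
    stubbed = some_module.stamp()
finally:
    # Put the real module back so later tests see the real clock.
    some_module.time = real_time

restored = some_module.stamp()
```

The stub only replaces the name `time` inside some_module’s namespace; every other module that imported time is unaffected.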

  • PEP 3129, “Class Decorators”. This has already been accepted and implemented, thanks to Jack Diederich.

  • PEP 3133, “Introducing Roles”. Roles are a competing idea to PEP 3119’s Abstract Base Classes.

Direct any discussion to python-3000.

There’s…too many of them!

Following Guido’s announcement/reminder that all Python 3000-related PEPs* have to be in by the end of April, PEPs have been coming out of the woodwork:

  • PEP 3119 - Guido’s abstract base classes PEP.

  • PEP 3120 - Using UTF-8 as the default source encoding

  • PEP 3121 - More flexible module Initialization and finalization.

  • PEP 3122 - Change how the “main” module is delineated. (This PEP has already been rejected.)

  • PEP 3141 - A proposal for a hierarchy of numeric base classes, based on PEP 3119.

There’re also several pre-PEPs being kicked around in the mailing lists:

*: PEPs impacting the stdlib don’t have to meet this deadline.

Two weeks or so ago, I brought up my unittest redesign on the new testing-in-python mailing list. A number of people were upset that in redesigning unittest, I had rejected nose and py.test; Titus Brown even wrote a few blog posts on the subject, in particular taking me to task for ignoring nose.

I’ll be honest: when I started redesigning unittest, I did ignore nose and py.test. I remembered looking at them a long time ago, when I was first getting frustrated with unittest, casting around for a better, more flexible alternative. py.test has no support for extensions and depends on the rest of the py library, so that’s out. nose has plugins, but my general impression was that it’s just a nice test discovery tool; since that wasn’t what I was looking for, I didn’t care. Thinking that perhaps the project has changed significantly since the last time I looked at it, I took another, closer look at nose’s infrastructure. Verdict: it’s still a nice test discovery tool, but since that’s still not what I’m looking for, I still don’t care.

And now we will have a brief intermezzo, and I will explain exactly why I’m redesigning unittest.

First of all, I didn’t start off with the intention of rewriting the whole module. I began by trying to change the existing design so that it would be easier to compose extensions. So I poked and I tweaked and prodded and twisted unittest until it was unrecognizable, until I was left with something that resembled the old version in name only. That is to say: this didn’t start out as a rewrite — it just ended up that way.

Now, what do I mean when I say “composing extensions”? Yes, unittest as-shipped allows you to extend its functionality by way of subclassing this bit and that bit, but the problem comes when trying to mash two extensions together: you can’t. You can’t put your unittest extensions — say, one that does refcount checking for C extensions or one that writes test results to a database — up on PyPI and have people be able to mix and match to create just the right testing environment for their project.

This all has one major design implication for your testing framework: extensions must operate without knowing anything about what other extensions might be running. The framework has to be designed so that extensions can operate by themselves just as well as they do with 15 others.

nose doesn’t come anywhere close to supporting this.

(Note: the following is based on my best understanding of nose’s codebase and on conversations with others. If I’ve gotten anything wrong, please let me know and I’ll gladly retract it.)

“That’s crap,” you say, “nose has plugins!” Ha. nose plugins don’t come anywhere close to achieving this level of independence. If I want to add a plugin to allow tests to be marked as TODO, there’s no way for this new kind of test-status to make its way into the various reporting plugins. As far as I can tell, just to get TODO tests not to show up as failures in the default console output, I’d have to:

  • Subclass nose.result.TextTestResult, overriding addError() so that it picks up the TODO-ness of the test.

  • Subclass nose.core.TextTestRunner, overriding _makeResult() so that it uses my TextTestResult subclass.

  • Subclass nose.core.TestProgram, overriding runTests() so that it uses my TextTestRunner subclass.

  • Replace nose.core.run() with a function that uses my TestProgram subclass.

    Of course, by the time my plugin is running and trying to do all this subclassing/replacing malarkey, nose.core.run() has already been called, so it’s too late.

By contrast, adding this kind of support to my unittest redesign is trivial. Omitting the TODO() decorator and exception classes (which you’d need for the nose version, too):

class TodoRunner(TestRunner):
  categories = ['todo pass', 'todo fail']

  def handle_exception(self, test, exc_info):
    exc_type = exc_info[0]
    if issubclass(exc_type, TodoPassed):
      self.log_exception('todo pass', test, exc_info)
    elif issubclass(exc_type, TodoFailed):
      self.log_exception('todo fail', test, exc_info)
    else:
      super(TodoRunner, self).handle_exception(test, exc_info)

  def was_successful(self):
    parent_success = super(TodoRunner, self).was_successful()
    return parent_success and not self.still_todo()

  def still_todo(self):
    return (self.exceptions['todo pass']
            or self.exceptions['todo fail'])

  def failure_label(self):
    if self.still_todo():
      return 'TODO'
    return super(TodoRunner, self).failure_label()

With those lines of code, all output extensions — console, database, XML, etc — will automatically recognize TODO tests and treat them as such. No fuss, no muss.
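To show the kind of composition being described, here is a rough sketch of two independent extensions combined through cooperative super() calls. This is not the redesign’s actual code: the `TestRunner` base class below is a guess at the minimum needed to make the example run, and `SkipRunner`, `Skipped` and `ProjectRunner` are invented for illustration.

```python
import sys

class TodoPassed(Exception): pass
class TodoFailed(Exception): pass
class Skipped(Exception): pass

class TestRunner(object):
    # Hypothetical base class; the real API may well differ.
    categories = ['failure']

    def __init__(self):
        # Collect the categories declared anywhere in the hierarchy.
        self.exceptions = {}
        for klass in type(self).__mro__:
            for name in vars(klass).get('categories', ()):
                self.exceptions.setdefault(name, [])

    def log_exception(self, category, test, exc_info):
        self.exceptions[category].append((test, exc_info))

    def handle_exception(self, test, exc_info):
        self.log_exception('failure', test, exc_info)

    def run(self, tests):
        for test in tests:
            try:
                test()
            except Exception:
                self.handle_exception(test, sys.exc_info())

class TodoRunner(TestRunner):
    categories = ['todo pass', 'todo fail']

    def handle_exception(self, test, exc_info):
        if issubclass(exc_info[0], TodoPassed):
            self.log_exception('todo pass', test, exc_info)
        elif issubclass(exc_info[0], TodoFailed):
            self.log_exception('todo fail', test, exc_info)
        else:
            super(TodoRunner, self).handle_exception(test, exc_info)

class SkipRunner(TestRunner):
    categories = ['skipped']

    def handle_exception(self, test, exc_info):
        if issubclass(exc_info[0], Skipped):
            self.log_exception('skipped', test, exc_info)
        else:
            super(SkipRunner, self).handle_exception(test, exc_info)

# Neither extension knows about the other; a project just mixes them.
class ProjectRunner(TodoRunner, SkipRunner):
    pass

def todo_test(): raise TodoFailed("not done yet")
def skipped_test(): raise Skipped("not applicable")
def failing_test(): raise ValueError("boom")
def passing_test(): pass

runner = ProjectRunner()
runner.run([todo_test, skipped_test, failing_test, passing_test])
counts = dict((name, len(items)) for name, items in runner.exceptions.items())
```

Each extension handles the exceptions it recognizes and passes everything else up the chain, so the order in which a project lists them doesn’t matter for this example.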

Now, all this isn’t to say that nose is crap. What I said earlier is still true: nose is a good test discovery tool. I even hope to borrow some of its discovery strategies for the new design. What nose is not, however, is an ultra-flexible test environment framework where extensions can be shared easily and openly, and that’s what I’m going for.
