January 2007
Monthly Archive
Thu 25 Jan 2007
Following up on my last post about catching exceptions in Python 3, here are some excerpts from the companion PEP I’m working on, which addresses “raise” statements.
There are simply too many forms to the raise statement in Python 2. Quoting from the reference manual:
If no expressions are present, raise re-raises the last exception that was active in the current scope…
Otherwise, raise evaluates the expressions to get three objects, using None as the value of omitted expressions. The first two objects are used to determine the type and value of the exception.
If the first object is an instance, the type of the exception is the class of the instance, the instance itself is the value, and the second object must be None.
If the first object is a class, it becomes the type of the exception. The second object is used to determine the exception value: If it is an instance of the class, the instance becomes the exception value. If the second object is a tuple, it is used as the argument list for the class constructor; if it is None, an empty argument list is used, and any other object is treated as a single argument to the constructor. The instance so created by calling the constructor is used as the exception value.
If a third object is present and not None, it must be a traceback object…and it is substituted instead of the current location as the place where the exception occurred… The three-expression form of raise is useful to re-raise an exception transparently in an except clause, but raise with no expressions should be preferred if the exception to be re-raised was the most recently active exception in the current scope.
That’s pretty complex, and it doesn’t even address string exceptions. Until I started digging around in the interpreter internals, I didn’t even know the three-object form was possible. Here’s what raise will look like in Python 3:
- raise (with no arguments) is used to re-raise the active exception in an except block.
- raise EXCEPTION is used to raise a new exception. This form has two sub-variants: EXCEPTION may be either an instance of BaseException or a subclass of BaseException (follows from PEP 352). If EXCEPTION is a subclass, it will be called with no arguments to obtain an exception instance.
To raise anything else is an error.
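A minimal sketch of these rules as they behave in released Python 3. FrobError and the helper functions are made-up names for illustration only, not anything from the PEP.

```python
class FrobError(Exception):
    """Hypothetical exception type used only for this illustration."""


def reraise_active():
    # A bare `raise` re-raises the exception currently being handled.
    try:
        raise FrobError("original")
    except FrobError:
        raise


def raise_instance():
    # EXCEPTION is an instance of a BaseException subclass.
    raise FrobError("from an instance")


def raise_class():
    # EXCEPTION is a class: Python calls it with no arguments.
    raise FrobError


def raise_other():
    # Anything that is not an exception class or instance is rejected.
    raise "a string is not an exception"


def outcome(func):
    """Run func and report the type name and args of whatever it raised."""
    try:
        func()
    except BaseException as exc:
        return type(exc).__name__, exc.args
    return None
```

Calling outcome(raise_other) reports a TypeError rather than the string, which is the "anything else is an error" rule in action.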
“But wait! That doesn’t allow me to supply a traceback!” Never fear, PEP 344 is here. It specifies that exceptions will grow a __traceback__ attribute, and this is how we’ll be able to raise exceptions with arbitrary tracebacks. What looked like this in Python 2
raise Type, Value, Traceback
will look like this in Python 3
e = Type(Value)
e.__traceback__ = Traceback
raise e
Or possibly this (per a suggestion from Guido):
raise Type(Value).set_traceback(Traceback)
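For anyone trying this on a released Python 3: the one-line spelling exists there as the with_traceback() method rather than set_traceback(). The sketch below shows both spellings side by side. The helper names are invented for illustration, and the traceback is captured from a throwaway KeyError.

```python
import sys


def capture_traceback():
    """Raise and catch a throwaway error just to obtain a traceback object."""
    try:
        raise KeyError("where it really went wrong")
    except KeyError:
        return sys.exc_info()[2]


def raise_with_attribute(tb):
    # The three-line spelling: assign __traceback__, then raise.
    e = ValueError("reported error")
    e.__traceback__ = tb
    raise e


def raise_with_method(tb):
    # The one-line spelling: with_traceback() returns the exception itself.
    raise ValueError("reported error").with_traceback(tb)


def function_names(exc):
    """List the function names recorded in an exception's traceback."""
    names = []
    tb = exc.__traceback__
    while tb is not None:
        names.append(tb.tb_frame.f_code.co_name)
        tb = tb.tb_next
    return names
```

In both cases the traceback seen by the handler includes the capture_traceback frame, so the supplied traceback was attached, with the raising frames added in front of it.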
I’m also relying on PEP 344 to replace Python 2’s raise Type, Instance variant. This is most often used to “cast” an exception instance from one type to another, such as this example from distutils.bcppcompiler:
try:
    self.spawn (['brcc32', '-fo', obj, src])
except DistutilsExecError, msg:
    raise CompileError, msg
PEP 344 introduces a raise ... from ... statement and a corresponding __cause__ attribute. Taking advantage of these new tools, the above Python 2 snippet translates to
try:
    self.spawn (['brcc32', '-fo', obj, src])
except DistutilsExecError as msg:
    raise CompileError from msg
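One detail worth keeping in mind: raise CompileError from msg instantiates CompileError with no arguments, so the original message is reachable only through __cause__. A small sketch of the chaining, using hypothetical stand-in classes (SpawnError, BuildError) instead of the real distutils ones, and passing the message along explicitly:

```python
class SpawnError(Exception):
    """Hypothetical stand-in for DistutilsExecError."""


class BuildError(Exception):
    """Hypothetical stand-in for CompileError."""


def spawn(cmd):
    # Pretend the external command failed.
    raise SpawnError("command %r failed" % (cmd,))


def compile_resource(src):
    try:
        spawn(["brcc32", src])
    except SpawnError as exc:
        # Copy the message explicitly; `from` records the original exception.
        raise BuildError(str(exc)) from exc


def describe(src):
    """Return the new exception's message and the type name of its cause."""
    try:
        compile_resource(src)
    except BuildError as err:
        return str(err), type(err.__cause__).__name__
```

The caller sees a BuildError, and the SpawnError that triggered it is still available as err.__cause__.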
While the main thrust of this work is to reduce the size of the language — the number of details and nuances you have to keep track of — there’s a more tangible benefit, as pointed out by A. M. Kuchling:
PEP 8 doesn’t express any preference between the two forms of raise statements:
raise ValueError, 'blah'
raise ValueError('blah')
I like the second form better, because if the exception arguments are long or include string formatting, you don’t need to use line continuation characters because of the containing parens.
Less line noise, a smaller language; what’s not to like?
Wed 24 Jan 2007
Lately, I’ve been working on a PEP to change how Python 3’s “except” statements work. The highlights:
(Anyone wanting to discuss these should join the python-3000 list and comment there.)
- The grammar for “except” statements will change from
except_clause: 'except' [test [',' test]]
in Python 2 to
except_clause: 'except' [test ['as' NAME]]
in Python 3. This is being done to eliminate a syntactic ambiguity where the parser can’t tell whether
except EXPRESSION, EXPRESSION:
should be interpreted as
except TYPE, TYPE:
or
except TYPE, TARGET:
Python 2 opts for the latter semantic, at the cost of requiring the former to be parenthesized.
Converting Python 2-style “except” statements to Python 3 can be handled automatically (for the most part) by Guido van Rossum’s 2to3 utility.
- As specified in PEP 352, the ability to treat exceptions as tuples will be removed, meaning this code will no longer work:
except os.error, (errno, errstr):
Because the automatic unpacking will no longer be possible by default, the ability to use tuples as “except” targets at all will be removed.
- PEP 344 specifies that exception instances in Python 3 will possess a __traceback__ attribute. The Open Issues section of that PEP includes a paragraph on garbage collection difficulties caused by this attribute, namely an “exception -> traceback -> stack frame -> exception” reference cycle, whereby all locals are kept alive until the next GC run. Python 3 will resolve this issue by making sure the target name is deleted at the end of the “except” suite, thus breaking the cycle.
This will be done by having the compiler emit appropriate bytecode to translate
try:
    try_body
except E as N:
    except_body
...
to this (in Python 2.5 terms):
try:
    try_body
except E, N:
    try:
        except_body
    finally:
        N = None
        del N
...
An implementation of this has already been checked into the p3yk [sic] branch.
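The three highlights can be tried together in a released Python 3. This is only a sketch: classify and target_survives are illustrative names, and the OSError branch reads the errno and strerror attributes as one replacement for the old tuple unpacking.

```python
def classify(func):
    """Call func and describe how its exception was handled."""
    try:
        func()
    except (KeyError, IndexError) as exc:
        # Catching several types requires a parenthesized tuple;
        # the name after `as` is always the target.
        return "lookup failed: " + type(exc).__name__
    except OSError as exc:
        # No tuple unpacking in the clause; read the attributes instead.
        errno_, strerror = exc.errno, exc.strerror
        return "os error %d: %s" % (errno_, strerror)
    return "ok"


def target_survives():
    """Report whether the `as` name is still bound after the except suite."""
    try:
        raise ValueError("boom")
    except ValueError as exc:
        message = str(exc)  # copy out whatever is needed later
    try:
        exc
    except NameError:  # UnboundLocalError is a NameError subclass
        return False, message
    return True, message
```

target_survives() shows the practical consequence of the implicit deletion: anything needed after the handler has to be copied to another name inside the suite.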
Thu 11 Jan 2007
A long time ago, in a blog post a few pages back in the archives, I spent a few paragraphs bemoaning Python’s unittest module and how it can’t be readily extended, nor can its extensions be easily composed. I gave as examples an extension that allows you to mark tests as “todo” and an extension that did reference counting around each test case (for C modules). While writing the extensions themselves was a little harder than I would have liked, the biggest problem was composing them — using both at the same time. Specifically, you can’t compose them, not without writing all-new code to merge the two functionalities. Consider:
TODO support:
    140 lines (5 core classes, 4 support classes/funcs)
Refcounting support:
    117 lines (4 core classes)
Composition:
    197 lines (6 core classes, 4 support classes/funcs)
    105 lines (3 classes of entirely new/rewritten code)
(All code snippets can be found in this directory. Code related to the old unittest design is in the before/ subdir, that related to the new design is in after/.)
test_harness, my new unittest package, was designed with flexibility and extensibility in mind. Using the same todo/refcounting examples from above:
TODO support:
    61 lines (1 core class, 4 support classes/funcs)
Refcounting support:
    36 lines (1 core class)
Composition:
    5 lines (1 core class, 3 imports)
That’s right: todo and refcounting support, with results written to stdout in five lines. And one of those lines is blank.
Where the new design really shines is in output. Unlike the old design — where you’d have to rewrite everything — changing your logging scheme from to-console to XML means changing this
from test_harness import TextRunner
from refcounting import RefcountRunner
from todo import TodoRunner, TODO
class OurRunner(TextRunner, RefcountRunner, TodoRunner):
    pass
to this:
from xmlrunner import XmlTestRunner
from refcounting import RefcountRunner
from todo import TodoRunner, TODO
class OurRunner(XmlTestRunner, RefcountRunner, TodoRunner):
    pass
That’s a two-line change, one that would have required a complete rewrite with the old system. Want both XML and to-console logging? Stick with the old unittest design and you’re looking at yet another rewrite. test_harness allows you to do this:
from test_harness import TextRunner
from xmlrunner import XmlTestRunner
from refcounting import RefcountRunner
from todo import TodoRunner, TODO
class OurRunner(TextRunner, XmlTestRunner, RefcountRunner, TodoRunner):
    pass
The biggest problem with the old unittest design is that, in trying to separate out the various concerns, it left the different components interconnected. TestCase objects depend on TestResult objects having certain methods; TestLoaders depend on your test case classes subclassing TestCase; TestRunners control which TestResult is used; etc. test_harness does away with this menagerie in favor of a single class: TestRunner. TestRunner objects are responsible for test suite iteration, running each individual test, collecting and categorizing any exceptions, and summarizing the results of the test run. Test loading/discovery is orthogonal to this process and as such is left to other packages, though rudimentary solutions are provided with the new package.
The biggest gripes about unittest I heard while researching its problems are that you have to a) subclass TestCase, and b) use TestCase methods to indicate test success/failure. In test_harness, there is no requirement to subclass TestCase (nor is there a TestCase class to subclass). Also, the usage of TestCase methods to signal failure (a consequence of the old TestCase/TestResult linkage) has been replaced with a test_harness.assertion submodule that contains functions like ok(), are_equal(), etc. Mapping old spellings to new:
self.failUnless() <=> ok()
self.assertEqual() <=> are_equal()
self.failIfEqual() <=> are_not_equal()
self.failUnlessAlmostEqual() <=> are_almost_equal()
self.assertRaises() <=> raises()
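To give a feel for what module-level assertion functions look like, here is a rough sketch. Only the function names come from the mapping above. The signatures, the messages, and the use of AssertionError to signal failure are assumptions, not the real test_harness.assertion code.

```python
def ok(value, message="expected a true value"):
    if not value:
        raise AssertionError(message)


def are_equal(first, second):
    if first != second:
        raise AssertionError("%r != %r" % (first, second))


def are_not_equal(first, second):
    if first == second:
        raise AssertionError("%r == %r" % (first, second))


def are_almost_equal(first, second, places=7):
    # Same rounding idea as unittest's failUnlessAlmostEqual.
    if round(abs(second - first), places) != 0:
        raise AssertionError(
            "%r != %r within %d places" % (first, second, places))


def raises(exc_type, func, *args, **kwargs):
    # Passes only if calling func raises exc_type.
    try:
        func(*args, **kwargs)
    except exc_type:
        return
    raise AssertionError("%s not raised" % exc_type.__name__)
```

A test written this way is just a plain function that calls, say, are_equal(parse("1"), 1), with no base class involved (parse being whatever is under test).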
Anyone interested is encouraged to play around with the new design. Comments to collinw at gmail point com
Fri 5 Jan 2007
Posted by Collin under python
I made my first commit to Python today.
If that doesn’t get the girls, I don’t know what will.
Thu 4 Jan 2007
To anyone planning to email me about how much you hate the syntax for return value annotations: don’t. Guido wants the -> arrow, so the arrow is what we’re getting. Your ideas for using returns or return or whatever else have already occurred to others — namely me — and were rejected months ago.
Guido’s the one you have to convince, not me, and he’s already made up his mind.
Wed 3 Jan 2007
Based on this python-3000 thread and a number of off-list emails, I’m dropping my earlier objection to PEP 3107. I hadn’t been convinced that there was a sufficiently broad spectrum of use-cases for function annotations to justify changing Python’s syntax. A bunch of people came out of the woodwork with viable uses for annotations, which is what I was looking for. Accordingly, I’ll be working up a patch to PEP 3107 to include a “Use Cases” section.
Thanks to everyone who emailed or commented, especially Phillip J. Eby, who led the python-3000 effort to convince me.
Tue 2 Jan 2007
A blogified version of a python-3000 post, in which the author of PEP 3107 revels in situational irony.
I was explaining function annotations to a friend this past weekend and found that, even though I had written a PEP on the subject and spent months debating the little details of “how are we going to make annotations work?”, I was hard-pressed to answer the question of “why are we doing this?”
The biggest problem I faced — then and now — is justifying the use-cases for annotations. Here’re the use-cases I could come up with off the top of my head: information for typecheckers; doc strings for parameters; extra information for IDEs; extra information for static analysis tools like pylint. These can all be addressed together:
Are the users clamoring for these things? Do these address real problems that users are having?
Not to my knowledge.
- Information for typecheckers
In a recent python-ideas post, Guido van Rossum said that “Collin’s existing type annotation library … could be made more elegant by attaching the types directly to the arguments”. As far as I can tell, the only gains in elegance are that you don’t have to repeat the names of a function’s parameters in the typechecking decorator. None of my users have ever complained about this tiny bit of repetition, and I’ve never felt it an undue burden in my own usage.
It could even be considered an advantage, since including the “annotations” in the typechecking decorator means all I have to do to remove typechecking from a function is delete a single line, rather than pick through a function’s declaration, removing the relevant bits.
- Doc strings for parameters
def foo(a: 'the object to be frobnicated',
        b: 'this controls the level of frobnication' = 7,
        c: ('the name of a color, in Spanish, that should '
            'be applied to the frobnicated thing') = 'Rojo'
        ) -> 'Returns an integer between 9 and 13':
    ''' Frobnicate an object in a Spanish way '''
    ...
What does this accomplish that can’t be achieved with any of the standard documentation syntaxes in existence today?
- Type information for IDEs
I can see it being genuinely useful to be able to get parameter/return type information in a tooltip message. But IDLE can do this already, without annotations.
- Type information for static analysis tools
Quoting Nick Coghlan, from August 2006: “annotations wouldn’t be useful for tools like pychecker … to be really useful for a tool like pychecker they’d have to be ubiquitous, and that’s really not Python any more”. Agreed.
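To make the typechecking comparison concrete, here is a sketch of the two styles side by side as they can be written in a released Python 3. Neither decorator is the API of any real typechecking library: accepts and typechecked are invented names, and the check is a bare isinstance test.

```python
import functools
import inspect


def _check(func, types, args, kwargs):
    """Bind the call's arguments to parameter names and test each type."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    for name, value in bound.arguments.items():
        expected = types.get(name)
        if expected is not None and not isinstance(value, expected):
            raise TypeError("%s must be %s, got %s" % (
                name, expected.__name__, type(value).__name__))


def accepts(**types):
    """Decorator style: parameter names are repeated in the decorator."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            _check(func, types, args, kwargs)
            return func(*args, **kwargs)
        return wrapper
    return decorate


def typechecked(func):
    """Annotation style: types are read from func.__annotations__."""
    types = {name: t for name, t in func.__annotations__.items()
             if name != "return"}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        _check(func, types, args, kwargs)
        return func(*args, **kwargs)
    return wrapper


@accepts(count=int, label=str)
def repeat_old(count, label):
    return label * count


@typechecked
def repeat_new(count: int, label: str) -> str:
    return label * count
```

Removing the checks from repeat_old means deleting the one decorator line. For repeat_new the decorator goes the same way, but the annotations stay behind in the signature unless they are picked out by hand.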
I could say You Aren’t Going to Need It, but that gets the tense wrong; we’re getting along without annotations quite nicely here in the present. In short: I think that PEP 3107 should be rejected as an overly specific, unnecessary addition to the language.
Anyone with thoughts on or responses to this article should post them to the python-3000 mailing list.