Mon 20 Nov 2006
As I mentioned, I’ve recently taken over ownership of a project at work that makes heavy use of Google’s SOAP APIs, using Net::Google to handle the SOAP stuff. A major part of my initial marching orders were to fix one example of Net::Google’s curious definition of “error handling”.
There’s a long-standing problem with Google’s search API that it throws “502 Bad Gateway” errors around 20% of the time. Net::Google’s attitude toward this (and all other errors) is to simply carp and carry on. From our standpoint, we want a clean way of being able to differentiate what the error was: was the user’s API key invalid? Have they reached their query limit? Did we get yet another 502? Overriding Perl’s warning mechanism does not qualify as clean or pretty or…really, any positive adjective, so I downloaded Net::Google’s latest release and started reading.
I have only recently stopped screaming in terror.
It soon became clear that my task was larger than “throw errors when Google’s SOAP server craps out”. Much larger. So large that I’m now putting together a 2.0.0 release of this package. In the spirit of learning from mistakes, let’s work through this poor module and look at ways some of this stuff could be done better:
Do not ignore errors
I don’t care what language you’re using, On error resume next is never a good idea. In Net::Google’s case, any errors at the SOAP layer are trapped and reissued as warnings, continuing on its merry way as if nothing had happened. This includes things like:
- “502 Bad Gateway” - for when the SOAP service is down
- “Invalid authorization key” - when the user had given us a bad API key
- “Daily limit of 1000 queries exceeded” - try again tomorrow
I want to be able to trap these and respond differently to each scenario. If I get a 502 error, I might sleep for a few seconds, then try again. If I get an “invalid authorization key”, I should inform the user so they can give me a valid key. The point is that I want to be able to respond to odd — one might say “exceptional” — behavior. Having these messages emitted as warnings makes that a lot harder.
Do not leave useless code lying around
From Net::Google::Search:
use constant RESTRICT_ENCODING => qw [ arabic gb ... a dozen more ... cyrillic utf8 ]; use constant RESTRICT_LANGUAGES => qw [ ar zh-CN ... a dozen more ... sv tr ]; use constant RESTRICT_COUNTRIES => qw [ AD ... dozens and dozens more ... ZA ZM ZR ]; use constant RESTRICT_TOPICS => qw [ unclesam linux mac bsd ]; use constant WATCH => "__estimatedTotalResultsCount";
Guess how many of these were used in the following code? That’s right, zero. None of them. Some of these arrays had several dozen elements, just taking up space to no effect. Several other submodules had similar sections, uselessly defining constants that would never be used in the code.
Worse, they took up brain cycles and time while I tried to figure out a) what they were for, then b) if I could safely remove them without breaking anything.
Do not modify constants
*sigh*. You’d think it would be obvious; these things are called constants for a reason. You don’t change them. You can’t change them. Ever. Yet what do I find in Net::Google::Service…
use constant SERVICE_CACHE => {};
followed a little later by
&SERVICE_CACHE->{$service} = "$dir/Net/Google/Services/".&SERVICES->{$service};
That’s right: modifying a constant. What makes this even better The Daily WTF fodder is the comment above this line, complaining about how using the & sigil is a work-around for a bug in perl 5.00502. In the spirit of the old “It hurts when I do this”-”Don’t do that” joke, here’s my advice about this whole mess: don’t do it. I have trouble imagining the thought process that concluded, “You know, a hash just won’t do”.
“Invalid input” does not mean “guess”
There’s an old saying ’round these parts, “in the face of ambiguity, refuse the temptation to guess“. When it comes to programming, you can swap out “ambiguity” for any number of things and generate valid maxims all day long. In this situation, I’ll take “an error”: in the face of an error, refuse the temptation to guess. Here’s one such offender, taken from Net::Google::Search:
if (int($max) < 1) {
carp "'$max' must be a int greater than 0";
$max = 1;
}
Bonus points: notice that the warnings produced look like “‘-5′ must be an int greater than 0″. a) the quote marks around -5 make it look like a string, and b) a much more helpful message would be “max_results must be an integer greater than 0″, i.e., don’t just tell me how I screwed up, tell me what I screwed up, too.
Documenting it doesn’t make it so
Unmodified:
=head2 $obj->starts_at($at)
Returns an int. Default is 0.
Returns undef if there was an error.
=cut
sub starts_at {
my $self = shift;
my $at = shift;
if (defined($at)) {
$self->{'_starts_at'} = $at;
}
return $self->{'_starts_at'};
}
Can someone please explain to me how this method “[r]eturns undef if there was an error”? No? I can’t either. Nor can I figure out what kind of errors could possibly arise in this code.
If you want a piece of code to do X, it helps to actually make it do X. If your docs say one thing and the code does another, the code is generally going to prevail when push comes to shove. It’s important to make sure your code does what you want it to; that’s where tests come in. Maybe this accessor method at one time had some kind of input validation and would in fact return undef if that validation failed, and just maybe that code accidentally got wiped out at some point. A test suite would let you know that.
My, what a nice segue for the next point..
Good tests are your friends
As shipped, Net::Google 1.0.1 has…let me count them…8 tests (12 if you include duplicates). And those 8 aren’t exactly worth much. Here’s one from t/002-cache.t:
my $search = $google->search(); isa_ok($search, 'Net::Google::Search'); $search->query(QUERY); $search->max_results(MAXRESULTS); $search->filter(1); my $results = $search->results(); is(ref($results), 'ARRAY', 'Got results for '.QUERY);
That’s the full extent of the testing for the interface to Google’s search API. Is $results tested to make sure it has elements? Why waste time! Are the elements of $results tested to make sure they’re filled out? Poppycock! Do we ever make sure that the accessors on $search work correctly? Hells no!
I know testing isn’t fun and isn’t glamorous. No-one ever got laid or got their picture in the paper for working on a test suite. That doesn’t mean you shouldn’t do it, though, ’cause you know what else doesn’t get you laid? Writing poorly-specified, untested software that’s incredibly difficult for other people to modify because they have no idea what it was supposed to do in the first place. Call it karma.
“Method X exists” is not documentation
One last documentation gem before we go:
=head2 $obj->filter($bool) Returns true or false. Returns undef if there was an error. =cut
Thank $diety the docs don’t tell me what this property controls or does; it’s not like I may need to use it. Dear reader: if you want me to use your software, please tell me what it does. If I have to read the code to figure out what an accessor method controls, you lose; I’ll use someone else’s package. I just don’t care that much.
Next time we’ll go over what Net::Google teaches us about accessor methods. There was so much material here, it took up its own post.
November 22nd, 2006 at 05:20
re returning undef: umm, this *is* perl, not python; return $self->{’_starts_at’};
returns undef if self doesn’t have an _starts_at member initialized, which is probably the error they’re talking about - you only get ‘Use of uninitialized value’ with -w, where in python you’d get an AttributeError or KeyError (depending on how you translated this code.)
November 22nd, 2006 at 15:05
I wasn’t aware that asking for the value of an uninitialized member is considered an error.