8.1. Typo Pathologies

Later we'll tackle the more complex issues of errors committed when you typed exactly what you intended to type, but the program still fails to work. This chapter discusses the humble typo: if you could see it, you'd know it was a mistake, and how to fix it.

The trick is to find the typo in the first place. We will unfurl an array of techniques for zeroing in on the miscreant. Fortunately, the scope of the typo is not unlimited; seldom does someone type a word entirely different from the one they were thinking of, for instance. The following quote from the Cambridge doctoral thesis of Stephen Moss appositely states the possible choices, although he was referring to the effect of channel errors on textual transmission:

  1. Deletion: “The Prime Minister spent the weekend in the country shooting peasants.”

  2. Insertion: “The walkway across the trout hatchery was supported on concrete breams.”

  3. Alteration: “Say it with glowers”; “For sale: Volvo 144 with overdrive, fuel infection, etc.”

  4. Transposition: “Yet, down the road, you will still find the corner shop where the lady behind the counter will lovingly warp your presents.”

Let's consider examples of each category of typo, and we'll show the strategy we followed to hunt them down and destroy them.

8.1.1. Deletion

The most common example of deletion must be the failure to place a semicolon between two statements. (Perl requires semicolons as statement separators, not statement terminators; the difference is that the last statement in a block does not require a trailing semicolon. We present this piece of wisdom to help you understand programs you may inherit that were written by people obsessed with minimizing typing: we recommend that you insert a semicolon after every statement, even at the end of a block, since you never know when you might come back later and insert another statement after it.)

$message = "Hello World";
print "$message\n"
exit;

Perl responds with

syntax error at line 3, next token "exit"

The Perl compiler, like every other compiler we know, often goes somewhat further in your program than the actual error before reporting a problem. Since we've never met a compiler that indicated an error for a line number preceding the first one containing an error, we can use Perl's report of the line number as an upper bound and work backwards from it.

The actual syntax error in your program could occur not just on the line reported by Perl, but on any line preceding it.


In this case, the line it reported was the last line of the file anyway (which often happens), but them's the breaks.

The strategy for fixing this problem: “syntax error” tells you that Perl has determined the problem to be a straightforward typo (count your blessings), so you know what you're looking for—something that doesn't look like valid Perl, on or before line 3. Perl gives us an additional piece of information, that the next token is exit. If we're lucky, that means that the error occurred immediately before that word, so that's where we start looking. Immediately, we notice a missing semicolon on line 2.

Success! Problem located and identified. Case closed. (No, there's no substitute for looking over the code with a Mark I eyeball. However, if you have reams of code to search, you can reduce the amount you have to check by removing or commenting out parts of it selectively until you see the error message change.)
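To watch the report change as you edit, it helps to compile a fragment without running it. From the command line, perl -c does this for a whole file. The sketch below does the same for a snippet held in a string. The helper name compile_error is our own invention for illustration; it wraps the snippet in an anonymous sub so that nothing executes, and uses a #line directive so that reported line numbers refer to the snippet rather than the wrapper.

```perl
use strict;
use warnings;

# Compile a snippet without running it; return Perl's complaint, or '' if none.
# (compile_error is a hypothetical helper for this illustration.)
sub compile_error {
    my ($code) = @_;
    no strict 'vars';   # the sample program uses an undeclared $message
    # The anonymous sub is compiled but never called, so the snippet's
    # own "exit" cannot fire. "#line 1" resets the line counter.
    my $compiled = eval "sub {\n#line 1\n$code\n}";
    return defined $compiled ? '' : $@;
}

# The program from the text, missing its semicolon on line 2.
my $broken = <<'END';
$message = "Hello World";
print "$message\n"
exit;
END

# The same program with the semicolon restored.
my $fixed = <<'END';
$message = "Hello World";
print "$message\n";
exit;
END

print "broken: ", compile_error($broken) || "compiles cleanly\n";
print "fixed:  ", compile_error($fixed)  || "compiles cleanly\n";
```

The exact wording of the message varies between Perl versions, but the broken snippet should yield a syntax error and the fixed one should not.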

Now let's consider another missing semicolon that's far more insidious:

use strict
my $message = "Hello World\n";
print $message;
exit;

This program compiles fine; it even runs without error. Unfortunately, it also runs without output. What happened to the contents of $message? A long or a short way can be used to debug this problem. Let's try the long way first (the reason for this choice will become apparent in due course).

We'll fire up the debugger (see Chapter 7). The first (meaningful) output from the debugger is

main::(-:3):    print $message;
DB<1>

Hmm. Something's wrong here. The debugger halted at the first executable line of code. (A use statement is a compile-time directive, so we don't expect to see that in the debugger.) Why aren't we seeing the assignment to $message? No wonder when we examine $message there's nothing there:

DB<1> x $message
 0  undef

For some reason, line 2 isn't considered executable. It isn't commented out, so it must be being executed at compile time instead. Hmm again. That rings a bell. Wait a minute, it comes right after a use statement. Could it be that Perl is considering it to be part of the use statement?

At this point, we discover the missing semicolon and probably are too relieved to care about why it happened. But for the curious among you, here's an explanation: the absence of the semicolon and the fact that all white space (including newlines) looks the same to Perl means that lines 1 and 2 parse as

use strict my $message = "Hello World\n";

If we look at the specification for the use statement, we see that it is

use Module LIST

So the name of the module can be followed by a list, and an assignment is a valid member of a list! Because evaluation occurs in the scope of the use statement, my $message = "Hello World\n"; ends up scoped inside the implicit BEGIN block created by use, and hence disappears by the next line because it was a lexical variable.

The debugger demonstration shows the long way to find our typo, but in this case a shorter error analysis method exists. The -w flag produces warnings for the more common Perl programming problems. Run the program with the -w flag and Perl returns

Name "main::message" used only once: possible typo at typo.pl
line 3.
Use of uninitialized value at typo.pl line 3.

As we now know, the assignment of $message happens in the wrong place due to the typo. The warning alerts us to the fact that when $message is used, it doesn't yet contain a defined value.

This example should trouble you. If you put use strict in all your programs (and if you aren't doing so already, you will by the time you finish this book), you should be wondering now whether the strictness is in effect. To check, remove my from the $message assignment. If strictness is in effect, Perl will complain that $message lacks an explicit package name.

use strict
$message = "Hello World";
print "$message\n";
exit;

On running, Perl returns: Hello World. Surprise! The typo prevents strictness from becoming enabled. The absence of the my keyword allows the assignment to succeed. (Since it's global, the fact that the assignment happens in an invisible BEGIN block doesn't matter.)

This is one of the most insidious errors conceivable: a single character mistake that causes no warning and no error yet prevents the very error checking the programmer was trying to enable.

8.1.2. A Puzzle

Here's another example of deletion. Can you figure out what's wrong?

use strict;
sub foo
   {
   my $sref = shift;
   foreach my $d (@)
      {
      find ($sref, $d);
      }
   }

sub find { }

The result:

Global symbol "$d" requires explicit package name at typo.pl
line 6.
syntax error at typo.pl line 7, near "}"
syntax error at typo.pl line 8, near "}"
Execution of typo.pl aborted due to compilation errors.

foreach my $d (@) should instead be foreach my $d (@_). Well, @) is a valid variable in Perl—a strange variable to be sure, and not the kind of thing you want to be using intentionally—but a variable nevertheless. This robbed the foreach list of its closing parenthesis and caused a wonderful cascade of nonsense.

8.1.3. Insertion

Insertion typos probably occur as often as deletion typos, though with different pathologies. An additional character in a variable name is flagged by Perl (if you used -w), and a misspelled command generates a run-time exception. An insertion typo becomes dangerous when the typo is still a valid Perl statement but executes an unexpected function.

Consider a program with one extraneous character:

use strict;
my $x = 1;
my $y = 2;
$x == $y;
print "$x,$y\n";

Run the program and Perl prints: 1,2. No errors or warnings are given; this is expected, since each program statement is valid Perl. Now consider a program that differs from the previous one by one character:

use strict;
my $x = 1;
my $y = 2;
$x = $y;
print "$x,$y\n";

When run, Perl prints: 2,2. Output from the second program shows the value of $x set to the value of $y, whereas the first performs no such operation. What did the first program do? In line 4, $x=$y was intended, but $x==$y was coded. The former is an assignment to modify $x, the latter a Boolean test that modifies neither $x nor $y—a particularly nasty problem as both commands are correct Perl syntax, and both execute without a complaint, with or without use strict.

All this is very elementary, of course, until you wonder “What if I use -w?” Run our first program with -w and Perl returns:

Useless use of numeric eq in a void context.
File 'typo1.pl'; Line 4.

This reveals yet another reason to use -w in every program: with it, Perl can warn us that while our mistake resulted in syntactically valid Perl, it was odd Perl. It can't raise a red flag stating “An extra = on line 4!” but it can point out that we've executed an essentially useless operation.
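The difference is easy to reproduce in a small harness. The sketch below compiles both versions of the program without running them and collects whatever warnings Perl emits. The helper name compile_warnings is hypothetical, and we assume a Perl recent enough to have the use warnings pragma, the lexical counterpart of -w.

```perl
use strict;
use warnings;   # lexical counterpart of the -w switch

# Compile a snippet (without running it) and collect any warnings Perl emits.
# (compile_warnings is a hypothetical helper for this illustration.)
sub compile_warnings {
    my ($code) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    # Wrapped in an anonymous sub so the snippet is compiled but never called.
    eval "sub {\n#line 1\n$code\n}";
    return @warnings;
}

# The program with the extraneous "=" on line 4.
my $typo = <<'END';
use strict;
my $x = 1;
my $y = 2;
$x == $y;
print "$x,$y\n";
END

# The program as intended: turn the comparison back into an assignment.
(my $intended = $typo) =~ s/==/=/;

print "typo:     $_" for compile_warnings($typo);
print "intended: $_" for compile_warnings($intended);
```

Only the first version should produce a "Useless use" warning; the intended version compiles silently.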

8.1.4. Alteration

Let's examine another single typo program:

use strict;
$message = "Hello World';
print "$message\n";
exit;

On running, Perl returns the following:

Scalar found where operator expected at typo.pl line 3, at end
of line (Might be a runaway multi-line "" string starting on
line 2) (Do you need to predeclare print?)
syntax error at typo.pl line 3, near "print "$message "
Global symbol "message" requires explicit package name at
typo.pl line 3.
Backslash found where operator expected at typo.pl line 3, near
"$message\" (Missing operator before \?)
Bareword "n" not allowed while "strict subs" in use at typo.pl
line 3.
String found where operator expected at typo.pl line 3, at end
of line (Missing semicolon on previous line?)
Can't find string terminator '"' anywhere before EOF at typo.pl
line 3.

Seven error messages from one typo! Let's concentrate on the first error and ignore the rest for now. The first error complains about a scalar on line 3 instead of a typo. Luckily, we get a hint about the error:

Might be a runaway multi-line "" string starting on line 2

This suggests paying special attention to the use of quote marks on line 2, where we find the string delimited by a double quote (") and a single quote ('). String delimiters must match at the beginning and end. Since there are no interpolated variables or digraphs in this string, either kind of quote will work, as long as we use the same one at each end of the string.

Handle only the first warning or error message output by Perl; don't bother reading the others, just recompile.


Error messages after the first one may be a cascade effect and therefore may be eliminated by removing the cause of the first message. It's usually not worth the time to analyze each message to determine whether this is the case.

8.1.5. Transposition

Regular expressions provide a fertile breeding ground for typos. Suppose you want to match and save the last string of letters preceded by white space on a line:

/\s([a-z])+$/i

Looks right, eh? Try it on an example, though:

while (<DATA>)
   {
   print "Match = $1\n" if /\s([a-z])+$/i;
   }
__END__
The boy stood on the burning duck

What comes out but

Match = k

Oops. We wanted the last word, not the last letter. Here's an approach to debugging this problem: We printed $1, which is the text saved between parentheses. We can ignore whatever else the regex contains and focus on what's between the parens, i.e., [a-z], which we immediately recognize as a character class that hasn't been qualified with a quantifier. Therefore it can represent only one character, which is what we got. (When saving parentheses match more than once in a regex because of a quantifier, only the last match gets saved. One might wish for a switch that would return all the matches in a list.)

As with most typos, identifying it is 90% of the goal. The fix is easy to predict: /\s([a-z]+)$/i.
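Everything hinges on whether the quantifier sits outside or inside the saving parentheses. Here is a minimal side-by-side sketch; the sub names last_letter and last_word are ours, chosen only for illustration.

```perl
use strict;
use warnings;

# Quantifier outside the parentheses: the group matches one letter at a
# time, and only its final repetition is kept in $1.
sub last_letter {
    my ($line) = @_;
    return $line =~ /\s([a-z])+$/i ? $1 : undef;
}

# Quantifier inside the parentheses: the group captures the whole run of letters.
sub last_word {
    my ($line) = @_;
    return $line =~ /\s([a-z]+)$/i ? $1 : undef;
}

my $line = "The boy stood on the burning duck";
print "Outside: ", last_letter($line), "\n";   # prints "Outside: k"
print "Inside:  ", last_word($line), "\n";     # prints "Inside:  duck"
```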

Here's a higher-level sort of transposition. Can you tell what it is before reading the explanation?

#!/usr/bin/perl -w
use strict;
my $fmt = "%10s "x5;
printf "$fmt\n", qw(Kelvin Celsius Rankine
                    Fahrenheit Reaumur);
$fmt = "%10.2f "x5;
for (my $kelvin = 0; $kelvin += 10; $kelvin < 500)
   {
   my $celsius    = $kelvin - 273.15;
   my $rankine    = $kelvin * 9 / 5;
   my $fahrenheit = $rankine - 459.67;
   my $reaumur    = $celsius * 4 / 5;
   printf "$fmt\n",$kelvin,$celsius,$rankine,$fahrenheit, $reaumur;
   }

Run this, and it does not terminate! You will be calculating temperatures beyond those experienced in the Big Bang. But the very first line of output is

Useless use of numeric lt (<) in void context at temp.pl line 13.

Weird. Line 13 is the printf statement, and it has no <. According to Perl of Wisdom #24, we should start scanning backwards until we find one. There it is at the beginning of the for statement: forgivably, we transposed the test and iterative clauses.

If you'd like to know why Perl fingered line 13 and not line 7 (the for statement), you'll find an explanation in Chapter 10.
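For comparison, here is a sketch of the loop with its clauses in the conventional order: initialization, then test, then increment. We have moved the arithmetic into a sub called temperature_rows (a name of our own, not part of the original program) so that the result is easy to inspect.

```perl
use strict;
use warnings;

# Build one row of converted temperatures per 10-kelvin step below $limit.
# (temperature_rows is a hypothetical helper for this illustration.)
sub temperature_rows {
    my ($limit) = @_;
    my @rows;
    # Clause order: initialization; test; increment.
    for (my $kelvin = 0; $kelvin < $limit; $kelvin += 10) {
        my $celsius    = $kelvin - 273.15;
        my $rankine    = $kelvin * 9 / 5;
        my $fahrenheit = $rankine - 459.67;
        my $reaumur    = $celsius * 4 / 5;
        push @rows, [$kelvin, $celsius, $rankine, $fahrenheit, $reaumur];
    }
    return @rows;
}

# Print the first few rows; pass 500 to cover the range used in the text.
my $fmt = "%10.2f " x 5;
printf "$fmt\n", @$_ for temperature_rows(50);
```

With the test in the middle position the loop terminates, and the void-context warning disappears because the comparison is now actually used.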
