Chapter 18. Testing and Debugging

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

Brian Kernighan

Most people recognize that testing and debugging are somehow related; that debugging is the natural consequence of testing, and that testing is a natural tool during debugging.

But, when used correctly, testing and debugging are actually antagonistic: the better your testing, the less you'll need to debug. Better testing habits repay themselves many times over, by reducing the effort required to diagnose, locate, and fix bugs.

Testing and debugging are huge topics, and a single chapter like this can only outline the simplest and most universal practices. For much deeper explorations of the possibilities, see Perl Testing: A Developer's Notebook (O'Reilly, 2005), Perl Debugged (Addison Wesley, 2001), and Perl Medic (Addison Wesley, 2004).

Test Cases

Write the test cases first.

Probably the single best practice in all of software development is writing your test suite first.

A test suite is an executable, self-verifying specification of the behaviour of a piece of software. If you have a test suite, you can—at any point in the development process—verify that the code works as expected. If you have a test suite, you can—after any changes during the maintenance cycle—verify that the code is still working as expected.

So write the tests first. Write them as soon as you know what your interface will be (see "Interfaces" in Chapter 17). Write them before you start coding your application or module. Because unless you have tests, you have no unequivocal specification of what the software is supposed to do, and no way of knowing whether it does it.

Modular Testing

Standardize your tests with Test::Simple or Test::More.

Writing tests always seems like a chore, and an unproductive chore at that: you don't have anything to test yet, so why write tests? And yet, most developers will—almost automatically—write driver software to test their new module in an ad hoc way:

> cat try_inflections.pl

# Test my shiny new English inflections module...
use Lingua::EN::Inflect qw( inflect );

# Try some plurals (both standard and unusual inflections)...
my %plural_of = (
    'house'         => 'houses',
    'mouse'         => 'mice',
    'box'           => 'boxes',
    'ox'            => 'oxen',
    'goose'         => 'geese',
    'mongoose'      => 'mongooses',
    'law'           => 'laws',
    'mother-in-law' => 'mothers-in-law',
);

# For each of them, print both the expected result and the actual inflection...
for my $word ( keys %plural_of ) {
    my $expected = $plural_of{$word};
    my $computed = inflect( "PL_N($word)" );

    print "For $word:\n",
          "\tExpected: $expected\n",
          "\tComputed: $computed\n";
}

A driver like that is actually harder to write than a test suite, because you have to worry about formatting the output in a way that is easy to read. And it's much harder to use the driver than it would be to use a test suite, because every time you run it you have to wade through that formatted output and verify "by eye" that everything is as it should be:

> perl try_inflections.pl
For house:
    Expected: houses
    Computed: houses
For law:
    Expected: laws
    Computed: laws
For mongoose:
    Expected: mongooses
    Computed: mongeese
For goose:
    Expected: geese
    Computed: geese
For ox:
    Expected: oxen
    Computed: oxen
For mother-in-law:
    Expected: mothers-in-law
    Computed: mothers-in-laws
For mouse:
    Expected: mice
    Computed: mice
For box:
    Expected: boxes
    Computed: boxes

That's also error-prone; eyes are not optimized for picking out small differences in the middle of large amounts of nearly identical text.

Rather than hacking together a driver program, it's easier to write a test program using the standard Test::Simple module. Instead of print statements showing what's being tested, you just write calls to the ok( ) subroutine, specifying as its first argument the condition under which things are okay, and as its second argument a description of what you're actually testing:

> cat inflections.t

use Lingua::EN::Inflect qw( inflect );
use Test::Simple qw( no_plan );

my %plural_of = (
    'mouse'         => 'mice',
    'house'         => 'houses',
    'ox'            => 'oxen',
    'box'           => 'boxes',
    'goose'         => 'geese',
    'mongoose'      => 'mongooses',
    'law'           => 'laws',
    'mother-in-law' => 'mothers-in-law',
);

for my $word ( keys %plural_of ) {
    my $expected = $plural_of{$word};
    my $computed = inflect( "PL_N($word)" );

    ok( $computed eq $expected, "$word -> $expected" );
}

Test programs like this should be kept in files with a .t suffix (inflections.t, conjunctions.t, articles.t, etc.) and stored in a directory named t/ within your development directory for the application or module. If you set up your development directory using Module::Starter or Module::Starter::PBP (see "Creating Modules" in Chapter 17), this test directory will be set up for you automatically, with some standard .t files already provided.

Note that Test::Simple is loaded with the argument qw( no_plan ). Normally that argument would be tests => count, indicating how many tests are expected, but here the tests are generated from the %plural_of table at run time, so the final count will depend on how many entries are in that table. Specifying a fixed number of tests when loading the module is useful if you happen to know that number at compile time, because then the module can also "meta-test": verify that you carried out all the tests you expected to.
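When the count is only known at run time, Test::More also exports a plan( ) subroutine that can declare it once the table has been built. The following is a minimal sketch of that idea, not the book's own listing: regular_plural( ) is a hypothetical stand-in for the module under test, so that the example needs nothing outside the core distribution.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the module under test (regular nouns only)...
sub regular_plural {
    my ($word) = @_;
    return $word =~ m/(?: s | x | ch | sh ) \z/xms ? "${word}es" : "${word}s";
}

my %plural_of = (
    'house' => 'houses',
    'box'   => 'boxes',
    'law'   => 'laws',
);

# The table is complete by now, so declare one test per entry...
plan tests => scalar keys %plural_of;

# Sorting the keys keeps the test order the same from run to run...
for my $word ( sort keys %plural_of ) {
    my $expected = $plural_of{$word};
    is( regular_plural($word), $expected, "$word -> $expected" );
}
```

Because the plan is derived from the table, the harness will complain if the loop ever exits before every entry has been tested.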

The Test::Simple program is slightly more concise and readable than the original driver code, and the output is much more compact and informative:

> perl inflections.t

ok 1 - house -> houses
ok 2 - law -> laws
not ok 3 - mongoose -> mongooses
#     Failed test (inflections.t at line 21)
ok 4 - goose -> geese
ok 5 - ox -> oxen
not ok 6 - mother-in-law -> mothers-in-law
#     Failed test (inflections.t at line 21)
ok 7 - mouse -> mice
ok 8 - box -> boxes
1..8
# Looks like you failed 2 tests of 8.

More importantly, this version requires far less effort to verify the correctness of each test. You just scan down the left margin looking for a "not" and a comment line.

You might prefer to use the Test::More module instead of Test::Simple. Then you can specify the actual and expected values separately, by using the is( ) subroutine, rather than ok( ):

use Lingua::EN::Inflect qw( inflect );
use Test::More qw( no_plan );     # Now using more advanced testing tools

my %plural_of = (
    'mouse'         => 'mice',
    'house'         => 'houses',
    'ox'            => 'oxen',
    'box'           => 'boxes',
    'goose'         => 'geese',
    'mongoose'      => 'mongooses',
    'law'           => 'laws',
    'mother-in-law' => 'mothers-in-law',
);

for my $word ( keys %plural_of ) {
    my $expected = $plural_of{$word};
    my $computed = inflect( "PL_N($word)" );

    # Test expected and computed inflections for string equality...
    is( $computed, $expected, "$word -> $expected" );
}

Apart from no longer having to type the eq yourself[117], this version also produces more detailed error messages:

> perl inflections.t

ok 1 - house -> houses
ok 2 - law -> laws
not ok 3 - mongoose -> mongooses
#     Failed test (inflections.t at line 20)
#          got: 'mongeese'
#     expected: 'mongooses'
ok 4 - goose -> geese
ok 5 - ox -> oxen
not ok 6 - mother-in-law -> mothers-in-law
#     Failed test (inflections.t at line 20)
#          got: 'mothers-in-laws'
#     expected: 'mothers-in-law'
ok 7 - mouse -> mice
ok 8 - box -> boxes
1..8
# Looks like you failed 2 tests of 8.

The Test::Tutorial documentation that comes with Perl 5.8 provides a gentle introduction to both Test::Simple and Test::More.
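Test::More offers other comparison subroutines besides is( ). As a brief illustration (the data here is invented purely for the example), like( ) checks a value against a pattern and is_deeply( ) compares entire data structures:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Invented sample data, used only to exercise the testing subroutines...
my @words = sort qw( ox goose mouse );

# Plain string/number equality...
is( scalar @words, 3, 'three words in the list' );

# Pattern matching...
like( $words[0], qr/\A goose \z/xms, 'first sorted word is goose' );

# Element-by-element comparison of nested data...
is_deeply( \@words, [ qw( goose mouse ox ) ], 'words sorted alphabetically' );
```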

Test Suites

Standardize your test suites with Test::Harness.

Once you've written your tests using one of the Test:: modules, in a series of .t files in the t/ subdirectory (as described in the previous guideline, "Modular Testing"), you can use the Test::Harness module to make it easier to run all the test files in your test suite.

The module is specifically designed to understand and summarize the output format used by Test::Simple and Test::More. It comes with an invaluable utility program named prove, which makes it trivially easy to run all the tests in your t/ directory and have the results summarized for you:

> prove -r

t/articles........ok
t/inflections.....NOK 3#     Failed test (inflections.t at line 21)
t/inflections.....NOK 6#     Failed test (inflections.t at line 21)
t/inflections.....ok 8/0# Looks like you failed 2 tests of 8.
t/inflections.....dubious
t/other/conjunctions....ok
t/verbs/participles.....ok
Failed 1/4 test scripts, 75.00% okay. 2/119 subtests failed, 98.32% okay.

The -r option tells prove to recursively search through subdirectories looking for .t files to test. You can also specify precisely where to look for tests by explicitly telling prove the directory or file:

> prove t/other

t/other/conjunctions....ok
All tests successful.

The utility has many other options that allow you to preview which tests will be run (without actually running them), change the file extension that is searched for, run tests in a random order (to catch any order dependencies), run tests in taint mode (see the perlsec manpage), or see the individual results of every test rather than just a summary.

Using a standard testing setup and a coordinating utility like this, it's trivial to regression test each modification you make to a module or application. Every time you modify the source of your module or application, you simply type prove -r. Instantly, you can see whether your modification fixed what it was supposed to, and whether that fix broke anything else.

Failure

Write test cases that fail.

Testing is not actually about ensuring correctness; it's about discovering mistakes. The only successful test is one that fails, and thereby reveals a bug.

To use testing effectively, it's vital to get into the right (i.e., slightly counterintuitive) mindset when writing tests. You need to get to the point where you're mildly disappointed if the test suite runs without reporting a problem.

The logic behind that disappointment is simple. All non-trivial software has bugs. Your test suite's job is to find those bugs. If your software passes your test suite, then your test suite isn't doing its job.

Of course, at some point in the development process you have to decide that the code is finally good enough to deploy (or ship). And, at that point, you definitely want that code to pass its test suite before you send it out. But always remember: it's passing the test suite because you decided you'd found all the bugs you cared to test for, not because there were no more bugs to find.

What to Test

Test both the likely and the unlikely.

Having a test suite that fails to fail might not be a major problem, so long as your tests cover the most common ways in which your software will actually be used. The single most important practice here is to run your tests on real-world cases.

That is, if you're building software to handle particular datasets or data streams, test it using actual samples of that data. And make sure those samples are of a similar size to the data on which the software will eventually need to operate.

Play-testing (see Chapter 17) can also come in handy here. If you (or other prospective users) have prototyped the kind of code you expect to write, then you should test the kinds of activities that your exploratory code implements, and the kinds of errors that you made when writing that code. Better yet, just write your hypothetical code as a test suite, using one of the Test:: modules. Then, when you're ready to implement, your test suite will already be in place.

Testing the most likely uses of your software is essential, but it's also vital to write tests that examine both edge-cases (i.e., one parameter with an extreme or unusual value) and corner-cases (i.e., several parameters with extreme or unusual values).

Good places to hunt for bad behaviour include:

  • The minimum and maximum possible values

  • Slightly less than the minimum possible value and slightly more than the maximum possible value

  • Negative values, positive values, and zero

  • Very small positive and negative values

  • Empty strings and multiline strings

  • Strings with control characters (including "\0")

  • Strings with non-ASCII characters (e.g., Latin-1 or Unicode)

  • undef, and lists of undef

  • '0', '0E0', '0.0', and '0 but true'

  • Empty lists, arrays, and hashes

  • Lists with duplicated and triplicated values

  • Input values that "will never be entered" (but which are)

  • Interactions with resources that "will never be missing" (but which are)

  • Non-numeric input where a number is expected, and vice versa

  • Non-references where a reference is expected, and vice versa

  • Missing arguments to a subroutine or method

  • Extra arguments to a subroutine or method

  • Positional arguments that are out of order

  • Key/value arguments that are mislabeled

  • Loading the wrong version of a module, where multiple versions are installed on your system

  • Every bug you ever actually encounter (see the following guideline, "Debugging and Testing")
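A few of the hunting grounds listed above can be turned into a table-driven test, in the same style as the inflection tests. This is only a sketch: mean_of( ) is a hypothetical subroutine invented for the illustration, and the cases shown are a small sample of the list, not a complete edge-case suite.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical subroutine under test: the arithmetic mean of its arguments
# (undef if there are no arguments)...
sub mean_of {
    my @values = @_;
    return if !@values;
    my $total = 0;
    $total += $_ for @values;
    return $total / @values;
}

# Each case: description, arguments, expected result...
my @cases = (
    [ 'empty list',            [],                        undef ],
    [ 'single zero',           [ 0 ],                     0     ],
    [ 'negative and positive', [ -5, 5 ],                 0     ],
    [ 'string zeros',          [ '0', '0E0', '0.0' ],     0     ],
    [ "'0 but true'",          [ '0 but true' ],          0     ],
    [ 'triplicated values',    [ 7, 7, 7 ],               7     ],
);

plan tests => scalar @cases;

for my $case (@cases) {
    my ($description, $args_ref, $expected) = @{$case};

    # Call in scalar context, so an empty return becomes undef...
    my $computed = mean_of( @{$args_ref} );

    is( $computed, $expected, $description );
}
```

Adding a newly discovered bug to a suite like this is then a matter of adding one more row to @cases.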

Debugging and Testing

Add new test cases before you start debugging.

The first step in any debugging process is to isolate the incorrect behaviour of the system, by producing the shortest demonstration of it that you reasonably can. If you're lucky, this may even have been done for you:

To: [email protected]
From: [email protected]
Subject: Bug in inflect module

Zdravstvuite,

I have been using your Lingua::EN::Inflect module to normalize terms in a
data-mining application I am developing, but there seems to be a bug in it,
as the following example demonstrates:

    use Lingua::EN::Inflect qw( PL_N );

    print PL_N('man'), "\n";       # Prints "men", as expected
    print PL_N('woman'), "\n";     # Incorrectly prints "womans"

Once you have distilled a short working example of the bug, convert it to a series of tests, such as:

use Lingua::EN::Inflect qw( PL_N );
use Test::More qw( no_plan );

is(PL_N('man') ,  'men',   'man -> men'     );
is(PL_N('woman'), 'women', 'woman -> women' );

Don't try to fix the problem straightaway. Instead, immediately add those tests to your test suite. If that testing has been well set up, that can often be as simple as adding a couple of entries to a table:

my %plural_of = (
    'mouse'         => 'mice',
    'house'         => 'houses',
    'ox'            => 'oxen',
    'box'           => 'boxes',
    'goose'         => 'geese',
    'mongoose'      => 'mongooses',
    'law'           => 'laws',
    'mother-in-law' => 'mothers-in-law',

    # Sascha's bug, reported 27 August 2004...
    'man'           => 'men',
    'woman'         => 'women',
);

The point is: if the original test suite didn't report this bug, then that test suite was broken. It simply didn't do its job (i.e., finding bugs) adequately. So fix the test suite first…by adding tests that cause it to fail:

> perl inflections.t

ok 1 - house -> houses
ok 2 - law -> laws
ok 3 - man -> men
ok 4 - mongoose -> mongooses
ok 5 - goose -> geese
ok 6 - ox -> oxen
not ok 7 - woman -> women
#     Failed test (inflections.t at line 20)
#          got: 'womans'
#     expected: 'women'
ok 8 - mother-in-law -> mothers-in-law
ok 9 - mouse -> mice
ok 10 - box -> boxes
1..10
# Looks like you failed 1 tests of 10.

Once the test suite is detecting the problem correctly, then you'll be able to tell when you've correctly fixed the actual bug, because the tests will once again stop failing.

This approach to debugging is most effective when the test suite covers the full range of manifestations of the problem. When adding test cases for a bug, don't just add a single test for the simplest case. Make sure you include the obvious variations as well:

my %plural_of = (
    'mouse'         => 'mice',
    'house'         => 'houses',
    'ox'            => 'oxen',
    'box'           => 'boxes',
    'goose'         => 'geese',
    'mongoose'      => 'mongooses',
    'law'           => 'laws',
    'mother-in-law' => 'mothers-in-law',

    # Sascha's bug, reported 27 August 2004...
    'man'           => 'men',
    'woman'         => 'women',
    'human'         => 'humans',
    'man-at-arms'   => 'men-at-arms',
    'lan'           => 'lans',
    'mane'          => 'manes',
    'moan'          => 'moans',
);

The more thoroughly you test the bug, the more completely you will fix it.

Strictures

Always use strict.

Making use strict your default will help perl (the interpreter) pick up a range of frequently made mistakes caused by Perl (the language) being overhelpful. For example, use strict detects and reports—at compile time—the common error of writing:

my $list = get_list(  );
# and later...
print $list[-1];             # Oops! Wrong variable

instead of:

my $list_ref = get_list(  );

# and later...
print $list_ref->[-1];

But it's also important not to rely too heavily on use strict, or to assume that it's infallible. For example, it won't pick up that incorrect array access in the following example:

my @list;

# and later in the same scope...

my $list = get_list(  );

# and later...
print $list[-1];

That's because now the problem with $list[-1] isn't just that someone forgot the arrow; it's that they're referring to the wrong (valid) variable.

Similarly, the following code contains both symbolic references and unqualified package variables, both of which use strict is supposed to prevent. Yet it compiles without even a warning:

use strict;
use warnings;
use Data::Dumper;

use Readonly;
Readonly my $DUMP => 'Data::Dumper::Dumper';
Readonly my $MAX  => 10;

# and later...

sub dump_at {
    my $dump = &{$DUMP};                  # Symbolic reference

    my @a = (0..$MAX);

    for my $i (0..$#a) {
        $a->[$MAX-$i] = $a->[$i];          # Oops! Wrong variables
        print $dump->($a[$i]);
    }
    return;
}

The uncaught symbolic reference is in &{$DUMP}, where $DUMP contains a string, not a subroutine reference. The symbolic access is ignored because dump_at( ) is never called, so use strict never gets the chance to detect the symbolic reference.

The uncaught package variable is the scalar $a in $a->[$i] and $a->[$MAX-$i]. That's almost certainly supposed to be $a[$i], as it is in the print statement. Perhaps the:

my @a = (0..$MAX);

line was originally:

my $a = [0..$MAX];

and, when it was changed, the rest of the subroutine was incompletely updated. After all, use strict will point out any uses of $a that might have been missed, won't it?

In this particular case, it doesn't. The package variables $a and $b are exempt from use strict qw( vars ), because they're frequently required in sort blocks, and no-one wants to have to write:

@ordered_results = sort { our ($a, $b); $b <=> $a } @results;

And they're not the only variables that are invulnerable to use strict. Other "stealth" package variables include $ARGV, @ARGV, @INC, %INC, %ENV, %SIG, and occasionally @F (in the main package under the -a flag). Moreover, because use strict exempts the entire symbol table entry for each of the previous variables, none of the following are caught either: %ARGV, $INC, @ENV, $ENV, @SIG, and $SIG.
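One way to see these exemptions for yourself is to compile small snippets under use strict and note which ones are rejected. The sketch below does that with a string eval; compiles_under_strict( ) is a hypothetical helper written for this illustration, and each snippet is wrapped in an anonymous subroutine so that it is compiled but never executed.

```perl
use strict;
use warnings;

# Hypothetical helper: does the snippet compile under 'use strict'?
sub compiles_under_strict {
    my ($snippet) = @_;
    my $compiled = eval "use strict; no warnings; sub { $snippet }; 1";
    return $compiled ? 1 : 0;
}

my @snippets = (
    'my $list = [1..3]; print $list[-1];',    # Undeclared @list
    '$c = $d;',                               # Undeclared $c and $d
    '$a = $b;',                               # The sort variables
    '$ENV = $INC;',                           # Relatives of %ENV and @INC
);

# Report the compiler's verdict on each snippet...
for my $snippet (@snippets) {
    my $verdict = compiles_under_strict($snippet) ? 'accepted' : 'rejected';
    print "$verdict: $snippet\n";
}
```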

This doesn't mean that using strict isn't a good practice; it most definitely is. But it's critical to think of it as a tool, not as a crutch.

Warnings

Always turn on warnings explicitly.

If you're developing under Perl 5.6 or later, always use warnings at the start of each file. Under earlier versions of Perl, always use the -w command-line flag, or set the $WARNING variable (available from use English) to a true value.

Perl's warning system is invaluable. It can detect more than 200 different questionable programming practices, including common errors like using the wrong sigil on an array access; trying to read an output stream (or vice versa); leaving parentheses off ambiguous declarations; runaway strings with no closing delimiter; dyslexic assignment operators (=-, =+, etc.); using non-numbers as numbers; using | instead of ||, or || instead of or; misspelling a package or class name; mixing up \1 and $1 in a regex; ambiguous subroutine/function calls; and improbable control flow (e.g., returning from a subroutine via a call to next).

Some of these warnings are enabled by default, but all of them are worth enabling.

Not taking advantage of these warnings can result in code like this, which compiles without complaint, even though it has (at least) nineteen distinct problems:

my $n = 9;
my $list = (1..$n);

my $n = <TTY>;

print ("\n" x lOO, keys %$list), "\n";
print $list[$i];

sub keys ($list) {
    $list ||= $_[1], \@default_list;
    push digits, @{$list} =~ m/([A-Za-\d])/g;
    return uc 1;
}

Under use warnings the awful truth can be revealed:

"my" variable $n masks earlier declaration in same scope at caveat.pl line 4.
print (...) interpreted as function at caveat.pl line 6.
Illegal character in prototype for main::keys : $list at caveat.pl line 9.
Unquoted string "digits" may clash with future reserved word at caveat.pl line 11.
False [] range "a-\d" in regex; marked by <-- HERE in m/([A-Za-\d <-- HERE ])/
at caveat.pl line 11.
Applying pattern match (m//) to @array will act on scalar(@array) at
caveat.pl line 11.
Array @digits missing the @ in argument 1 of push(  ) at caveat.pl line 11.
Useless use of reference constructor in void context at caveat.pl line 10.
Useless use of a constant in void context at caveat.pl line 6.
Name "main::list" used only once: possible typo at caveat.pl line 7.
Name "main::default_list" used only once: possible typo at caveat.pl line 10.
Name "main::TTY" used only once: possible typo at caveat.pl line 4.
Name "main::digits" used only once: possible typo at caveat.pl line 11.
Name "main::i" used only once: possible typo at caveat.pl line 7.
Use of uninitialized value in range (or flip) at caveat.pl line 2.
readline(  ) on unopened filehandle TTY at caveat.pl line 4.
Argument "lOO" isn't numeric in repeat (x) at caveat.pl line 6.
Use of uninitialized value in array element at caveat.pl line 7.
Use of uninitialized value in print at caveat.pl line 7.

Note that it may still be appropriate to comment out the use warnings line when your application or module is deployed, especially if non-technical users will interact with it, or if it will run in a CGI or other embedded environment. Issuing warnings in these contexts can needlessly alarm users, or cause server errors.

Don't remove the use warnings completely, though; when something goes wrong, you'll want to uncomment it again so the reinstated warnings can help you locate and fix the problem.
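To watch a single run-time warning being issued in isolation, you can capture it rather than let it reach the screen. The following sketch does so with a localized $SIG{__WARN__} handler; that handler is an assumption of this example, used only to collect the message for display, and is not something the guideline itself prescribes.

```perl
use strict;
use warnings;

my @captured;

{
    # Collect warnings instead of printing them, within this block only...
    local $SIG{__WARN__} = sub { push @captured, @_ };

    my $width = 'lOO';              # Letter O's, not zeros
    my $ruler = '-' x $width;       # Non-numeric repeat count
}

# Show whatever the warning system reported...
print "Caught: $_" for @captured;
```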

Correctness

Never assume that a warning-free compilation implies correctness.

use strict and use warnings are powerful development aids, whose insights into the foibles of the typical programmer sometimes border on the magical. It is a serious mistake not to use them at all times.

But, as the examples in the previous guidelines illustrate, they are neither infallible nor omniscient. It may seem counterintuitive, but Perl's extensive list of warnings and strictures can sometimes result in code that is less robust than it otherwise might have been. The comforting knowledge that "use strict will pick up any problems" often engenders a false sense of security, and promotes the illusion that a silent compilation implies a correct compilation.

But no Perl pragma will ever be able to pick out the serious bug in this subroutine:

sub is_monotonic_increasing {
    my ($data_ref) = @_;
    for my $i (1..$#{$data_ref}) {
        return 0 unless $data_ref->[$i-1] > $data_ref->[$i];
    }
    return 1;
}

It's foolish not to make use of the very real protections that use strict and use warnings provide. Just don't let those protections make you complacent[118].
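The subroutine above compiles silently under both pragmas, yet its comparison is the wrong way round: it accepts each pair only if the earlier element is greater than the later one. Running it on data whose ordering is known makes that obvious at once, as this small sketch (with sample lists chosen for the illustration) shows:

```perl
use strict;
use warnings;

# The subroutine exactly as shown above, bug included...
sub is_monotonic_increasing {
    my ($data_ref) = @_;
    for my $i (1..$#{$data_ref}) {
        return 0 unless $data_ref->[$i-1] > $data_ref->[$i];
    }
    return 1;
}

# Sample data whose correct classification is not in doubt...
my $rising_verdict  = is_monotonic_increasing( [1, 2, 3] );
my $falling_verdict = is_monotonic_increasing( [3, 2, 1] );

print "1,2,3 reported as increasing? ", ( $rising_verdict  ? 'yes' : 'no' ), "\n";
print "3,2,1 reported as increasing? ", ( $falling_verdict ? 'yes' : 'no' ), "\n";
```

The rising list is reported as not increasing and the falling list as increasing, which is exactly the kind of mistake that only a test case can reveal.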

Overriding Strictures

Turn off strictures or warnings explicitly, selectively, and in the smallest possible scope.

Sometimes you really do need to implement something arcane; something that would cause use strict or use warnings to complain. In this case, because you'll always be using both those pragmas (see the previous three guidelines, "Strictures", "Warnings", and "Correctness"), you'll need to turn them off temporarily.

The key to doing that without compromising the robustness of your code is to turn off warnings and strictures in the smallest possible scope. And to turn off only the particular warnings you intend to cause or those specific strictures that you're intentionally violating.

For example, suppose you needed a Sub::Tracking module that, when passed the name of a subroutine, would modify that subroutine so that any subsequent call to it was logged. For example:

use Digest::SHA qw( sha512_base64 );

use Sub::Tracking qw( track_sub );
track_sub('sha512_base64');

# and later...

my $text_key    = sha512_base64($original_text);  # Use of subroutine automatically logged

Such a module might be implemented as in Example 18-1.

Example 18-1. A module for tracking subroutine calls

package Sub::Tracking;

use version; our $VERSION = qv(0.0.1);

use strict;
use warnings;
use Carp;
use Perl6::Export::Attrs;
use Log::Stdlog {level => 'trace'};

# Utility to create a tracked version of an existing subroutine...
sub _make_tracker_for {
    my ($sub_name, $orig_sub_ref) = @_;

    # Return a new subroutine...
    return sub {

        # ...which first determines and logs its call context
        my ($package, $file, $line) = caller;
        print {*STDLOG} trace =>
            "Called $sub_name(@_) from package $package at '$file' line $line";

        # ...and then transforms into a call to the original subroutine
        goto &{$orig_sub_ref};
    }
}

# Replace an existing subroutine with a tracked version...
sub track_sub : Export {
    my ($sub_name) = @_;  

    # Locate the (currently untracked) subroutine in the caller's symbol table...
    my $caller = caller;
    my $full_sub_name = $caller.'::'.$sub_name;
    my $sub_ref = do { no strict 'refs'; *{$full_sub_name}{CODE} };

    # Or die trying...
    croak "Can't track nonexistent subroutine '$full_sub_name'"
        if !defined $sub_ref;

    # Then build a tracked version of it...
    my $tracker_ref = _make_tracker_for($sub_name, $sub_ref);

    # And install that version back in the caller's symbol table...
    {
        no strict 'refs';
        *{$full_sub_name} = $tracker_ref;
    }

    return;
}

1; # Magic true value required at end of module

The _make_tracker_for( ) utility subroutine creates a new anonymous subroutine that first logs the fact that it has been called:

print {*STDLOG} trace =>
    "Called $sub_name(@_) from package $package at '$file' line $line";

then turns itself into the original subroutine instead[119]:

goto &{$orig_sub_ref};

The Sub::Tracking::track_sub( ) subroutine expects to be passed the name of the subroutine to be tracked. It takes that name, prepends the caller's package name ($caller.'::'.$sub_name), and then looks up that fully qualified name to see if there is a corresponding subroutine entry in the caller's symbol table (*{$full_sub_name}{CODE}). The result of this look-up will be either a reference to the named subroutine or undef (if no such subroutine exists).

track_sub( ) then creates a new tracking version of the subroutine:

my $tracker_ref = _make_tracker_for($sub_name, $sub_ref);

and installs it back in the caller's symbol table:

*{$full_sub_name} = $tracker_ref;

The problem here is that both the symbol table look-up and the symbol table assignment use a string ($full_sub_name) as the name of the symbol table entry, rather than a hard reference to it. Using a string instead of a real reference would normally incur the wrath of use strict, but the no strict 'refs' declarations tell the compiler to turn a blind eye.

Of course, it's particularly tedious to have to set up those tiny block scopes to contain the no strict declarations, especially when you could get the same effect simply by omitting the use strict at the start of the module:

package Sub::Tracking;
# use strict  -- Disabled because symbolic references needed below
use warnings;
use Carp;
use Stdlog;
use version; our $VERSION = qv(0.0.1);

# etc.

But that's a bad practice, because it would remove the strictures not only from the two lines where they're not wanted, but from every other line as well. That could easily mask other strictness violations that you would still like to be informed of.

Nor would it have been acceptable to turn off strict references throughout the track_sub( ) subroutine:

sub track_sub : Export {
    my ($sub_name) = @_;
    no strict 'refs';

    # Locate the (currently untracked) subroutine in the caller's symbol table...
    my $caller = caller;
    my $full_sub_name = $caller.'::'.$sub_name;
    my $sub_ref = *{$full_sub_name}{CODE};

    # Or die trying...
    croak "Can't track nonexistent subroutine '$full_sub_name'"
        if !defined $sub_ref;

    # Then build a tracked version of it...
    my $tracker_ref = _make_tracker_for($sub_name, $sub_ref);

    # And install that version back in the caller's symbol table...
    *{$full_sub_name} = $tracker_ref;

    return;
}

That would still exclude far more code from strictness-checking than was (ahem) strictly necessary.

Wrapping extra do blocks or raw blocks tightly around any statement that is deliberately violating strictness is tedious, but not as tedious as spending an hour debugging some unexpected symbolic reference, unauthorized package variable, or undeclared subroutine that use strict would otherwise have caught.
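The same discipline applies to warnings. The sketch below is a cut-down, core-only variation on the tracking idea, not the module from Example 18-1: track_calls_to( ), shout( ), and the in-memory @call_log are all hypothetical names invented for the illustration. It assumes that replacing a named subroutine through its symbol table entry would otherwise provoke a "redefined" warning, so it disables that one warning, and the one stricture it violates, only around the statements that need it.

```perl
use strict;
use warnings;

my @call_log;    # Hypothetical in-memory log, standing in for a real logger

# A subroutine to be tracked...
sub shout {
    my ($text) = @_;
    return uc $text;
}

sub track_calls_to {
    my ($sub_name) = @_;
    my $full_sub_name = caller() . '::' . $sub_name;

    # Symbolic look-up: strict refs disabled for this one expression...
    my $orig_sub_ref = do { no strict 'refs'; *{$full_sub_name}{CODE} };

    die "Can't track nonexistent subroutine '$full_sub_name'\n"
        if !defined $orig_sub_ref;

    # Build a replacement that logs the call, then becomes the original...
    my $tracker_ref = sub {
        push @call_log, "Called $sub_name(@_)";
        goto &{$orig_sub_ref};
    };

    # Symbolic assignment: only the two relevant checks are disabled...
    {
        no strict 'refs';
        no warnings 'redefine';
        *{$full_sub_name} = $tracker_ref;
    }

    return;
}

track_calls_to('shout');
my $result = shout('hello');

print "$result\n";
print "$_\n" for @call_log;
```

Every other statement in track_calls_to( ) remains subject to full strictures and warnings.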

The Debugger

Learn at least a subset of the perl debugger.

Perl's integrated debugger makes it very easy to watch your program's internal state change as it executes. At the very least, you should be familiar with the basic features summarized in Table 18-1.

Table 18-1. Debugger basics

Debugging task                                              Debugger command
----------------------------------------------------------  ------------------------
To run a program under the debugger                         > perl -d program.pl
To set a breakpoint at the current line                     DB<1> b
To set a breakpoint at line 42                              DB<1> b 42
To continue executing until the next breakpoint is reached  DB<1> c
To continue executing until line 86                         DB<1> c 86
To continue executing until subroutine foo is called        DB<1> c foo
To execute the next statement                               DB<1> n
To step into any subroutine call that's part of             DB<1> s
  the next statement
To run until the current subroutine returns                 DB<1> r
To examine the contents of a variable                       DB<1> x $variable
To have the debugger watch a variable or expression,        DB<1> w $variable
  and inform you whenever it changes                        DB<1> w expr($ess)*$ion
To view where you are in the source code                    DB<1> v
To view line 99 of the source code                          DB<1> v 99
To get helpful hints on the many other features             DB<1> |h h
  of the debugger

The standard perldebug and perldebtut documentation provide much more detail on using the debugger. You can also download and print out a handy free summary of the most commonly used commands from http://www.perl.com/2004/11/24/debugger_ref.pdf.

Manual Debugging

Use serialized warnings when debugging "manually".

Many developers prefer not to use the debugger. Maybe they don't like the command-line interface, or the way the debugger slows down the execution of their code, or the fact that it actually changes the code it's debugging[120]. Perhaps they just dislike the tedium of stepping through a program statement by statement.

The most popular alternative to using the debugger is to manually insert print statements at relevant points in the code. This has the distinct advantage of altering the code being debugged in limited and predictable ways.

But, if you're going to debug manually, don't use print for your print statements:

my $results  = $scenario->project_outcomes(  );

print "\$results: $results\n";  # debugging only

Use warn instead:

my $results  = $scenario->project_outcomes(  );

warn "\$results: $results";

Because warn statements will not be used anywhere else in your code (see "Reporting Failure" in Chapter 13), using them for debugging makes it very easy to subsequently find your debugging statements. Using warn also conveniently ensures that debugging messages are printed to *STDERR, rather than *STDOUT.
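
The separation of the two output streams can be seen in a minimal sketch like the one below. The value assigned to $results is merely a stand-in (an assumption for illustration), since the $scenario object from the earlier example isn't defined here.

```perl
use strict;
use warnings;

# Stand-in value; in the earlier example this would come from
# $scenario->project_outcomes().
my $results = 'sample outcome';

print "Report: 1 outcome projected\n";   # normal output, sent to STDOUT
warn  "\$results: $results\n";           # debugging output, sent to STDERR
```

If the program's standard output is redirected to a file, only the report line ends up in that file; the debugging message still appears on the terminal.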

In addition, it's a good practice always to serialize the data structure you're reporting, using Data::Dumper:

my $results  = $scenario->project_outcomes(  );

use Data::Dumper qw( Dumper );

warn '$results: ', Dumper($results);

By printing the value you're reporting in a structured format, you maximize the information that's subsequently available to help you debug. For example, if the project_outcomes( ) method was expected to return an Achievements object, then debugging with:

warn "\$results: $results\n";

might print:

$results: Achievements=SCALAR(0x811130)

It looks like the method is working correctly, sending back an inside-out object of class Achievements. However, adding Data::Dumper serialization to the debugging statement:

warn '$results: ', Dumper($results);

reveals a subtle problem:

$results: $VAR1 = 'Achievements=SCALAR(0x811130)'

That is, instead of returning an actual Achievements object, the call to project_outcomes( ) is returning a string. The expected object reference is undergoing a spurious stringification before being returned. If the method had been behaving properly, the serialized output would have indicated that there was a real object in $results:

$results: $VAR1 = bless( do{\(my $o = undef)}, 'Achievements' )

So always serialize any data structure you're debugging[121].
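
The difference is easy to reproduce in isolation. The following is a minimal sketch, not the actual code behind the example above: the Achievements class here is a bare stand-in, and the two project_outcomes_*() subroutines are hypothetical, written only to contrast a stringified return value with a real object.

```perl
use strict;
use warnings;
use Data::Dumper qw( Dumper );

# Bare stand-in for the Achievements class: a blessed anonymous scalar.
package Achievements;

sub new {
    my ($class) = @_;
    my $anon_scalar;
    return bless \$anon_scalar, $class;
}

package main;

# Hypothetical buggy version: interpolation stringifies the object.
sub project_outcomes_buggy {
    my $outcomes = Achievements->new();
    return "$outcomes";
}

# Hypothetical correct version: returns the object itself.
sub project_outcomes_fixed {
    return Achievements->new();
}

for my $results ( project_outcomes_buggy(), project_outcomes_fixed() ) {
    # Plain interpolation: both values have the form Achievements=SCALAR(0x...)
    warn "\$results: $results\n";

    # Serialized: a quoted string in one case, a bless(...) in the other
    warn '$results: ', Dumper($results);
}
```

The unserialized messages are indistinguishable in form, whereas the Dumper output shows a quoted string for the buggy version and a bless(...) expression for the fixed one.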

If you prefer this kind of manual debugging, you may find it useful to set up a macro in your editor to insert suitably serialized print statements automatically. For example, adding the following in vim:

:iab dbg use Data::Dumper qw( Dumper );^Mwarn Dumper [];^[hi

replaces any instance of 'dbg' you might insert in your program with:

use Data::Dumper qw( Dumper );
warn Dumper [_];

The macro then repositions the insertion point between the square brackets (represented by the underscore in the previous example) and allows you to continue to insert the data structure you want to debug. You can achieve a similar effect in Emacs with an entry in the global abbreviation table of your ~/.abbrev_defs file:

(define-abbrev-table 'global-abbrev-table '(

    ("pdbg"  "use Data::Dumper qw( Dumper );
warn Dumper[];"  nil  1)  ))

Semi-Automatic Debugging

Consider using "smart comments" when debugging, rather than warn statements.

Serialized warnings work well for manual debugging, but they can be tedious to code correctly[122]. And, even with the editor macro suggested earlier, the output of a statement like:

warn 'results: ', Dumper($results);

still leaves something to be desired in terms of readability:

results: $VAR1 = bless( do{\(my $o = undef)}, 'Achievements' )

The Smart::Comments module (previously described under "Automatic Progress Indicators" in Chapter 10) supports a form of smart comment that can help your debugging. For example, instead of:

use Data::Dumper qw( Dumper );

my $results  = $scenario->project_outcomes(  );

warn '$results: ', Dumper($results);

you could just write:

use Smart::Comments;

my $results = $scenario->project_outcomes(  );

### $results

which would then output either:

### $results: <opaque Achievements object (blessed scalar)>

or:

### $results: 'Achievements=SCALAR(0x811130)'

depending on whether $results is an actual object reference or merely its stringification.

Smart::Comments also supports comment-based assertions:

### check: @candidates >= @elected

which issue warnings when the specified condition is not met. For example, the previous comment might print:

### @candidates >= @elected was not true at ch18/Ch18.049_Best line 23.
###     @candidates was: [
###                        'Smith',
###                        'Nguyen',
###                        'Ibrahim'
###                      ]
###     @elected was: [
###                     'Smith',
###                     'Nguyen',
###                     'Ibrahim',
###                     'Nixon'
###                   ]

The module also supports stronger assertions:

### require: @candidates >= @elected

which prints the same warning as the ### check:, but then immediately terminates the program.

Apart from producing more readable debugging messages, the major advantage of this approach is that you can later switch off all these comment-based debugging statements simply by removing (or commenting out) the use Smart::Comments line. When Smart::Comments isn't loaded, those smart comments become regular comments, which means you can leave the actual debugging statements in your source code[123] without incurring any performance penalty.
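
As a rough illustration of that on/off switch, consider the sketch below. The election data and the $summary variable are invented for the example, and Smart::Comments itself is a CPAN module that has to be installed separately. With the use line commented out, as shown, the program runs on any Perl installation and every ### line is an ordinary comment.

```perl
use strict;
use warnings;

# Uncomment the next line to activate the smart comments below.
# (Smart::Comments is a CPAN module and must be installed separately.)
# use Smart::Comments;

# Invented sample data.
my @candidates = qw( Smith Nguyen Ibrahim );
my @elected    = qw( Smith Nguyen );

### check: @candidates >= @elected

my $summary = scalar(@elected) . ' of ' . scalar(@candidates)
            . ' candidates elected';

### $summary

print "$summary\n";
```

Once the use Smart::Comments line is uncommented, the same source should additionally report the value of $summary, and the check: assertion would issue a warning if @elected ever grew larger than @candidates.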



[117] The ok subroutine is still available from Test::More if you do want to specify your own comparisons.

[118] By the way, that serious bug was that the condition under which a return 0 occurs is the wrong way round. The unless should be an if.

[119] This is known as a "magic goto". It replaces the current subroutine call with a call to whatever subroutine you tell it to go to. It's very useful when you're installing a wrapper around an existing subroutine. Your wrapper call can do whatever it needs to do, then silently transform itself into a call to the wrapped subroutine. After which, even caller won't be able to tell the difference. See the entry for goto in perlfunc.

[120] The subtle changes that the debugger surreptitiously makes to any code it's executing usually pass unnoticed. However, very occasionally those manipulations can actually make debugging even more difficult, by introducing arcane phenomena like heisenbugs (errors that vanish when you try to debug them), schrödinbugs (errors that manifest only when you're trying to debug something else), and mandelbugs (complex errors that seem to fluctuate more and more chaotically, the closer you look at them).

[121] For the same reasons, it's also a mistake to use the p (print) command in the debugger. Always use the x (examine) command instead; it serializes its output.

[122] Which is vital. If there's anything less enjoyable than beating your head against a bug for several hours, it's finally discovering that your debugging print statement was itself buggy, and the problem isn't anywhere near where you thought it was. This is presumably a homerbug.

[123] If you needed them once, you'll almost certainly need them again.
