There's a right time to think about everything; sometimes that time is beforehand, and sometimes it's after. Sometimes it's somewhere in the middle. Perl doesn't presume to know when it's the right time to think, so it gives the programmer a number of options for telling it when to think. Other times it knows that some sort of thinking is necessary but doesn't have any idea what it ought to think, so it needs ways of asking your program. Your program answers these kinds of questions by defining subroutines with names appropriate to what Perl is trying to find out.
Not only can the compiler call into the interpreter when it wants to be forward thinking, but the interpreter can also call back to the compiler when it wants to revise history. Your program can use several operators to call back into the compiler. Like the compiler, the interpreter can also call into named subroutines when it wants to find things out. Because of all this give and take between the compiler, the interpreter, and your program, you need to be aware of what things happen when. First we'll talk about when these named subroutines are triggered.
In Chapter 10, we saw how a package's AUTOLOAD subroutine is triggered when an undefined function in that package is called. In Chapter 12, we met the DESTROY method, which is invoked when an object's memory is about to be automatically reclaimed by Perl. And in Chapter 14, we encountered the many functions implicitly called when a tied variable is accessed.
These subroutines all follow the convention that, if a subroutine is triggered automatically by either the compiler or the interpreter, we write its name in uppercase. Associated with the different stages of your program's lifetime are four other such subroutines, named BEGIN, CHECK, INIT, and END. The sub keyword is optional before their declarations. Perhaps they are better called "blocks", because they're in some ways more like named blocks than real subroutines.
For instance, unlike regular subroutines, there's no harm in declaring these blocks multiple times, since Perl keeps track of when to call them, so you never have to call them by name. (They are also unlike regular subroutines in that shift and pop act as though you were in the main program, and so they act on @ARGV by default, not @_.)
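The point about shift is easy to check for yourself. The following is only a small sketch: the variable $first_arg, the helper first_of, and the default arguments are made up for the demonstration and are not part of Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $first_arg;

# Inside BEGIN, a bare shift operates on @ARGV, not @_,
# just as it would at the top level of the main program.
BEGIN {
    # Hypothetical defaults so the demo works with no command-line arguments.
    @ARGV = ('alpha', 'beta') unless @ARGV;
    $first_arg = shift;
}

# Inside an ordinary subroutine, a bare shift operates on @_.
sub first_of { return shift }

print "BEGIN shifted:  $first_arg\n";
print "sub shifted:    ", first_of('gamma'), "\n";
print "left in \@ARGV:  @ARGV\n";
```

Run with no arguments, the BEGIN block takes alpha from @ARGV and leaves beta behind, while the ordinary subroutine takes gamma from its own argument list.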
These four block types run in this order:
BEGIN
Runs ASAP (as soon as parsed) whenever encountered during compilation, before compiling the rest of the file.
CHECK
Runs when compilation is complete, but before the program starts. (CHECK can mean "checkpoint" or "double-check" or even just "stop".)
INIT
Runs at the beginning of execution right before the main flow of your program starts.
END
Runs at the end of execution right after the program finishes.
If you declare more than one of these by the same name, even in separate modules, the BEGINs all run before any CHECKs, which all run before any INITs, which all run before any ENDs, which all run dead last, after your main program has finished. Multiple BEGINs and INITs run in declaration order (FIFO), and the CHECKs and ENDs run in inverse declaration order (LIFO).
This is probably easiest to see in a demo:
    #!/usr/bin/perl -l
    print       "start main running here";
    die         "main now dying here\n";
    die         "XXX: not reached\n";
    END         { print "1st END: done running"     }
    CHECK       { print "1st CHECK: done compiling" }
    INIT        { print "1st INIT: started running" }
    END         { print "2nd END: done running"     }
    BEGIN       { print "1st BEGIN: still compiling" }
    INIT        { print "2nd INIT: started running" }
    BEGIN       { print "2nd BEGIN: still compiling" }
    CHECK       { print "2nd CHECK: done compiling" }
    END         { print "3rd END: done running"     }
When run, that demo program produces this output:
    1st BEGIN: still compiling
    2nd BEGIN: still compiling
    2nd CHECK: done compiling
    1st CHECK: done compiling
    1st INIT: started running
    2nd INIT: started running
    start main running here
    main now dying here
    3rd END: done running
    2nd END: done running
    1st END: done running
Because a BEGIN block executes immediately, it can pull in subroutine declarations, definitions, and importations before the rest of the file is even compiled. These can alter how the compiler parses the rest of the current file, particularly if you import subroutine definitions. At the very least, declaring a subroutine lets it be used as a list operator, making parentheses optional. If the imported subroutine is declared with a prototype, calls to it can be parsed like built-ins and can even override built-ins of the same name in order to give them different semantics. The use declaration is just a BEGIN block with an attitude.
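To make that last remark concrete, here is a minimal sketch that spells out by hand what a use declaration does, taking the standard List::Util module and its sum function as the example. The variable $total is just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Written out longhand, "use List::Util qw(sum);" amounts to this:
# load the module and import from it while still compiling.
BEGIN {
    require List::Util;
    List::Util->import('sum');
}

# Because sum was imported before this line was compiled, the compiler
# already knows it names a subroutine and parses it as a list operator,
# so the parentheses can be left off.
my $total = sum 1, 2, 3, 4;

print "total: $total\n";
```

If the require and import were moved out of the BEGIN block, they would not happen until run time, and the unparenthesized call would no longer compile under use strict.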
END blocks, by contrast, are executed as late as possible: when your program exits the Perl interpreter, even if as a result of an untrapped die or other fatal exception. There are two situations in which an END block (or a DESTROY method) is skipped. It isn't run if, instead of exiting, the current process just morphs itself from one program to another via exec. A process blown out of the water by an uncaught signal also skips its END routines. (See the use sigtrap pragma, described in Glossary, for an easy way to convert catchable signals into exceptions. For general information on signal handling, see "Signals" in Chapter 16.) To avoid all END processing, you can call POSIX::_exit, say kill -9, $$, or just exec any innocuous program, such as /bin/true on Unix systems.
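One way to watch an END block being skipped is to run a few one-line programs in child interpreters and look at what they print. This is only a sketch: the %snippet table and the output_of helper are invented for the demonstration, and it assumes a system where a list-form pipe open works (any Unix does).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three tiny programs, each with an END block, that leave in different ways.
my %snippet = (
    'exit'         => 'END { print "END ran\n" } exit 0;',
    'exec'         => 'END { print "END ran\n" } exec $^X, "-e", "1";',
    'POSIX::_exit' => 'use POSIX (); END { print "END ran\n" } POSIX::_exit(0);',
);

# Run a snippet in a fresh perl ($^X is the running interpreter)
# and return everything it wrote to standard output.
sub output_of {
    my ($code) = @_;
    open(my $fh, '-|', $^X, '-e', $code) or die "cannot run $^X: $!";
    local $/;                       # slurp mode
    my $out = <$fh>;
    close $fh;
    return defined $out ? $out : '';
}

for my $how (sort keys %snippet) {
    my $ran = output_of($snippet{$how}) =~ /END ran/;
    printf "%-13s %s\n", $how, $ran ? 'END block ran' : 'END block skipped';
}
```

Only the plain exit should report that its END block ran; the other two leave the interpreter without giving it a chance to clean up.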
Inside an END block, $? contains the status the program is going to exit with. You can modify $? from within the END block to change the exit value of the program. Beware of changing $? accidentally by running another program with system or backticks.
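The sketch below illustrates both the feature and the trap by running three small programs in child interpreters and reporting their exit statuses. The labels and the exit_status_of helper are made up for the example; localizing $? inside the END block is one common way to keep system from disturbing it, not the only one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each program asks to exit with status 5, then does something in END.
my @case = (
    # system succeeds, sets $? to 0, and the intended status is lost
    [ careless => 'END { system($^X, "-e", "1") } exit 5;' ],
    # local restores $? when the END block is left
    [ careful  => 'END { local $?; system($^X, "-e", "1") } exit 5;' ],
    # assigning to $? deliberately changes the exit value
    [ override => 'END { $? = 7 } exit 5;' ],
);

# Run a snippet in a fresh perl and return its exit status.
sub exit_status_of {
    my ($code) = @_;
    system($^X, '-e', $code);
    return $? >> 8;
}

for my $case (@case) {
    my ($label, $code) = @$case;
    printf "%-9s exits with %d\n", $label, exit_status_of($code);
}
```

The careless version should exit with 0 rather than the 5 it asked for, the careful version with 5, and the override version with 7.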
If you have several END blocks within a file, they execute in reverse order of their definition. That is, the last END block defined is the first one executed when your program finishes. This reversal enables related BEGIN and END blocks to nest the way you'd expect, if you pair them up. For example, if the main program and a module it loads both have their own paired BEGIN and END subroutines, like so:
    BEGIN { print "main begun" }
    END { print "main ended" }
    use Module;
and in that module, these declarations:
    BEGIN { print "module begun" }
    END { print "module ended" }
then the main program knows that its BEGIN will always happen first, and its END will always happen last. (Yes, BEGIN is really a compile-time block, but similar arguments apply to paired INIT and END blocks at run time.) This principle is recursively true for any file that includes another when both have declarations like these. This nesting property makes these blocks work well as package constructors and destructors. Each module can have its own set-up and tear-down functions that Perl will call automatically. This way the programmer doesn't have to remember what special initialization or clean-up code ought to be invoked, and when, just because a particular library is used. The module's declarations ensure that these events happen.
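If you'd like to see the nesting without creating files by hand, the following sketch writes a throwaway module into a temporary directory and runs a small main program against it in a child interpreter. The module name NestDemo and the extra "main running" line are hypothetical stand-ins for the Module example above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir = tempdir(CLEANUP => 1);

# A throwaway module (hypothetically named NestDemo) with its own pair.
my $module_file = File::Spec->catfile($dir, 'NestDemo.pm');
open(my $mod, '>', $module_file) or die "cannot write $module_file: $!";
print {$mod} <<'EOM';
package NestDemo;
BEGIN { print "module begun\n" }
END   { print "module ended\n" }
1;
EOM
close $mod or die "cannot close $module_file: $!";

# The main program declares its pair first, then loads the module.
my $main = <<'EOM';
BEGIN { print "main begun\n" }
END   { print "main ended\n" }
use NestDemo;
print "main running\n";
EOM

# Run it in a fresh perl that can find the module, and show what it prints.
open(my $fh, '-|', $^X, "-I$dir", '-e', $main) or die "cannot run $^X: $!";
my @lines = <$fh>;
close $fh;
print @lines;
```

The module's two lines should appear strictly inside the main program's two lines, with the main program's own output in the middle.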
If you think of an eval STRING as a call back from the interpreter to the compiler, then you might think of a BEGIN as a call forward from the compiler into the interpreter. Both temporarily put the current activity on hold and switch modes of operation. When we say that a BEGIN block is executed as early as possible, we mean it's executed just as soon as it is completely defined, even before the rest of the containing file is parsed. BEGIN blocks are therefore executed during compile time, never during run time. Once a BEGIN block has run, it is immediately undefined and any code it used is returned to Perl's memory pool. You couldn't call a BEGIN block as a subroutine even if you tried, because by the time it's there, it's already gone.
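Here is a small sketch of those two directions of travel. It records events in a package array, @log (a name chosen only for this example), so that you can see the order in which things actually happened.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our @log;

push @log, "run time: main program started";

if (0) {
    # This branch is never executed at run time, yet the BEGIN block
    # inside it still runs as soon as the compiler has finished parsing it.
    BEGIN { push @log, "compile time: BEGIN inside if (0)" }
}

# eval STRING travels the other way: the interpreter calls the compiler
# at run time, and a BEGIN inside the string runs during that compilation.
eval q{ BEGIN { push @log, "run time: BEGIN inside eval STRING" } 1 }
    or die $@;

print "$_\n" for @log;
```

The BEGIN block buried in the dead branch should be the first entry in the log, ahead of the statement that textually precedes it.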
Similar to BEGIN blocks, INIT blocks are run just before the Perl run time begins execution, in "first in, first out" (FIFO) order. For example, the code generators documented in perlcc make use of INIT blocks to initialize and resolve pointers to XSUBs. INIT blocks are really just like BEGIN blocks, except they let the programmer distinguish construction that must happen at compile phase from construction that must happen at run phase. When you're running a script directly, that's not terribly important because the compiler gets invoked every time anyway; but when compilation is separate from execution, the distinction can be crucial. The compiler may only be invoked once, and the resulting executable may be invoked many times.
Similar to END blocks, CHECK blocks are run just after the Perl compile phase ends but before run phase begins, in LIFO order. CHECK blocks are useful for "winding down" the compiler just as END blocks are useful for winding down your program. In particular, the backends all use CHECK blocks as the hook from which to invoke their respective code generators. All they need to do is put a CHECK block into their own module, and it will run at the right time, so you don't have to install a CHECK into your program. For this reason, you'll rarely write a CHECK block yourself, unless you're writing such a module.
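A simple way to see which blocks belong to the compile phase and which to the run phase is to compare a normal run with perl's -c switch, which compiles a program without running it. The sketch below does that in child interpreters; the $program text and the phases_seen helper are invented for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A tiny program with one block of each kind.
my $program = <<'EOP';
BEGIN { print "BEGIN\n" }
CHECK { print "CHECK\n" }
INIT  { print "INIT\n"  }
END   { print "END\n"   }
print "main\n";
EOP

# Run the program in a fresh perl with the given switches and
# return the names it printed, in order, as one string.
sub phases_seen {
    my (@switches) = @_;
    open(my $fh, '-|', $^X, @switches, '-e', $program)
        or die "cannot run $^X: $!";
    chomp(my @seen = <$fh>);
    close $fh;
    return "@seen";
}

print "normal run: ", phases_seen(),     "\n";
# Under -c the child also reports "syntax OK" on standard error.
print "perl -c:    ", phases_seen('-c'), "\n";
```

Under -c only the compile-phase blocks, BEGIN and CHECK, should show up; INIT, the main program, and END belong to the run phase, which never starts.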
Putting it all together, Table 18-1 lists various constructs with details on when they compile and when they run the code represented by "…". In the phase columns, C stands for the compile phase and R for the run phase.
Table 18-1. What Happens When
Block or Expression | Compiles During Phase | Traps Compile Errors | Runs During Phase | Traps Run Errors | Call Trigger Policy |
---|---|---|---|---|---|
use … | C | No | C | No | Now |
no … | C | No | C | No | Now |
BEGIN {…} | C | No | C | No | Now |
CHECK {…} | C | No | C | No | Late |
INIT {…} | C | No | R | No | Early |
END {…} | C | No | R | No | Late |
eval {…} | C | No | R | Yes | Inline |
eval "…" | R | Yes | R | Yes | Inline |
foo(…) | C | No | R | No | Inline |
sub foo {…} | C | No | R | No | Call anytime |
eval "sub {…}" | R | Yes | R | No | Call later |
s/pat/…/e | C | No | R | No | Inline |
s/pat/"…"/ee | R | Yes | R | Yes | Inline |
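A few rows of the table can be checked with a short sketch like the one below. The variable names are only for illustration, and the deliberately broken statement in the eval string is just one convenient way to provoke a compile error.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# eval BLOCK is compiled along with the rest of the file,
# so all it can trap is a run-time error.
my $ok = eval { die "run-time failure\n"; 1 };
my $block_error = $@;

# eval STRING is compiled at run time, so it can trap a compile error too.
my $compiled = eval q{ my $x = ; };
my $syntax_error = $@;

# With /e the replacement is code compiled with the program; with /ee
# it is a string that is compiled and evaluated again at run time.
(my $e  = 'n') =~ s/n/6 * 7/e;
(my $ee = 'n') =~ s/n/"6 * 7"/ee;

print "eval BLOCK trapped:  $block_error";
print "eval STRING trapped: $syntax_error";
print "s///e gave $e, s///ee gave $ee\n";
```

Had the broken statement appeared in the eval BLOCK instead, the whole program would have failed to compile and nothing would have been trapped at all.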
Now that you know the score, we hope you'll be able to compose and perform your Perl pieces with greater confidence.