This chapter is about aiming Perl in the right direction before you fire it off. There are various ways to aim Perl, but the two primary ways are through switches on the command line and through environment variables. Switches are the more immediate and precise way to aim a particular command. Environment variables are more often used to set general policy.
It is fortunate that Perl grew up in the Unix world, because that means its invocation syntax works pretty well under the command interpreters of other operating systems, too. Most command interpreters know how to deal with a list of words as arguments and don't care if an argument starts with a minus sign. There are, of course, some sticky spots where you'll get fouled up if you move from one system to another. You can't use single quotes under MS-DOS as you do under Unix, for instance. And on systems like VMS, some wrapper code has to jump through hoops to emulate Unix I/O redirection. Wildcard interpretation is a wildcard. Once you get past those issues, however, Perl treats its switches and arguments much the same on any operating system.
Even when you don't have a command interpreter per se, it's easy to execute a Perl program from another program written in any language. Not only can the calling program pass arguments in the ordinary way, it can also pass information via environment variables and, if your operating system supports them, inherited file descriptors (see "Passing Filehandles" in Chapter 16). Even exotic argument-passing mechanisms can easily be encapsulated in a module, then brought into your Perl program via a simple use directive.
Perl parses command-line switches in the standard fashion.[1] That is, it expects any switches (words beginning with a minus) to come first on the command line. After that usually comes the name of the script, followed by any additional arguments to be passed into the script. Some of these additional arguments may themselves look like switches, but if so, they must be processed by the script, because Perl quits parsing switches as soon as it sees a nonswitch, or the special "--" switch that says, "I am the last switch."
Perl gives you some flexibility in where you place the source code for your program. For small, quick-and-dirty jobs, you can program Perl entirely from the command line. For larger, more permanent jobs, you can supply a Perl script as a separate file. Perl looks for a script to compile and run in any one of these three ways:
1. Specified line by line via -e switches on the command line. For example:
% perl -e "print 'Hello, World.'"
Hello, World.
2. Contained in the file specified by the first filename on the command line. Systems supporting the #! notation on the first line of an executable script invoke interpreters this way on your behalf.
3. Passed in implicitly via standard input. This method works only when there are no filename arguments; to pass arguments to a standard-input script you must use method 2, explicitly specifying a "-" for the script name. For example:
% echo "print qq(Hello, @ARGV.)" | perl - World
Hello, World.
With methods 2 and 3, Perl starts parsing the input file from the beginning--unless you've specified a -x switch, in which case it scans for the first line starting with #! and containing the word "perl", and starts there instead. This is useful for running a script embedded in a larger message. If so, you might indicate the end of the script using the __END__ token.
Whether or not you use -x, the #! line is always examined for switches when the line is parsed. That way, if you're on a platform that allows only one argument with the #! line, or worse, doesn't even recognize the #! line as special, you can still get consistent switch behavior regardless of how Perl was invoked, even if -x was used to find the beginning of the script.
Warning: because older versions of Unix silently chop off kernel interpretation of the #! line after 32 characters, some switches may end up getting to your program intact, and others not; you could even get a "-" without its letter, if you're not careful. You probably want to make sure that all your switches fall either before or after that 32-character boundary. Most switches don't care whether they're processed redundantly, but getting a "-" instead of a complete switch would cause Perl to try to read its source code from the standard input instead of from your script. And a partial -I switch could also cause odd results.
However, some switches do care if they are processed twice, like combinations of -l and -0. Either put all the switches after the 32-character boundary (if applicable), or replace the use of -0DIGITS with BEGIN{ $/ = "\0DIGITS"; }. Of course, if you're not on a Unix system, you're guaranteed not to have this particular problem.
Parsing of #! switches starts from where "perl" is first mentioned in the line. The sequences "-*" and "- " are specifically ignored for the benefit of emacs users, so that, if you're so inclined, you can say:
#!/bin/sh -- # -*- perl -*- -p
eval 'exec perl -S $0 ${1+"$@"}'
    if 0;
and Perl will see only the -p switch. The fancy "-*- perl -*-" gizmo tells emacs to start up in Perl mode; you don't need it if you don't use emacs. The -S mess is explained later under the description of that switch.
A similar trick involves the env(1) program, if you have it:
#!/usr/bin/env perl
The previous examples use a relative path to the Perl interpreter, getting whatever version is first in the user's path. If you want a specific version of Perl, say, perl5.6.1, place it directly in the #! line's path, whether with the env program, with the -S mess, or with regular #! processing.
If the #! line does not contain the word "perl", the program named after the #! is executed instead of the Perl interpreter. For example, suppose you have an ordinary Bourne shell script out there that says:
#!/bin/sh
echo "I am a shell script"
If you feed that file to Perl, then Perl will run /bin/sh for you. This is slightly bizarre, but it helps people on machines that don't recognize #!, because--by setting their SHELL environment variable--they can tell a program (such as a mailer) that their shell is /usr/bin/perl, and Perl will then dispatch the program to the correct interpreter for them, even though their kernel is too stupid to do so.
But back to Perl scripts that are really Perl scripts. After locating your script, Perl compiles the entire program into an internal form (see Chapter 18). If any compilation errors arise, execution does not even begin. (This is unlike the typical shell script or command file, which might run part-way through before finding a syntax error.) If the script is syntactically correct, it is executed. If the script runs off the end without hitting an exit or die operator, an implicit exit(0) is supplied by Perl to indicate successful completion to your caller. (This is unlike the typical C program, where you're likely to get a random exit status if your program just terminates in the normal way.)
Unix's #! technique can be simulated on other systems:
A Perl program on a Macintosh will have the appropriate Creator and Type, so that double-clicking it will invoke the Perl application.
On MS-DOS, create a batch file to run your program, and codify it in ALTERNATIVE_SHEBANG. See the dosish.h file in the top level of the Perl source distribution for more information about this.
On OS/2, put this line:
extproc perl -S -your_switches
as the first line in your *.cmd file (-S works around a bug in cmd.exe's "extproc" handling).
On VMS, put these lines:
$ perl -mysw 'f$env("procedure")' 'p1' 'p2' 'p3' 'p4' 'p5' 'p6' 'p7' 'p8' !
$ exit++ + ++$status != 0 and $exit = $status = undef;
at the top of your program, where -mysw are any command-line switches you want to pass to Perl. You can now invoke the program directly by typing perl program, as a DCL procedure by saying @program, or implicitly via DCL$PATH by using just the name of the program. This incantation is a bit much to remember, but Perl will display it for you if you type in perl "-V:startperl". If you can't remember that--well, that's why you bought this book.
When using the ActiveState distribution of Perl under some variant of Microsoft's Windows suite of operating systems (that is, Win95, Win98, Win00,[2] WinNT, but not Win3.1), the installation procedure for Perl modifies the Windows Registry to associate the .pl extension with the Perl interpreter.
If you install another port of Perl, including the one in the Win32 directory of the Perl distribution, then you'll have to modify the Windows Registry yourself.
Note that using a .pl extension means you can no longer tell the difference between an executable Perl program and a "perl library" file. You could use .plx for a Perl program instead to avoid this. This is much less of an issue these days, since most Perl modules are now in .pm files.
Command interpreters on non-Unix systems often have extraordinarily different ideas about quoting than Unix shells have. You'll need to learn the special characters in your command interpreter (*, \, and " are common) and how to protect whitespace and these special characters to run one-liners via the -e switch. You might also have to change a single % to a %%, or otherwise escape it, if that's a special character for your shell.
On some systems, you may have to change single quotes to double quotes. But don't do that on Unix or Plan 9 systems, or anything running a Unix-style shell, such as systems from the MKS Toolkit or from the Cygwin package produced by the Cygnus folks, now at Red Hat. Microsoft's new Unix emulator called Interix is also starting to look, ahem, interixing.
For example, on Unix and Mac OS X, use:
% perl -e 'print "Hello world\n"'
On Macintosh (pre Mac OS X), use:
print "Hello world\n"
then run "Myscript" or Shift-Command-R.
On VMS, use:
$ perl -e "print ""Hello world\n"""
or again with qq//:
$ perl -e "print qq(Hello world\n)"
And on MS-DOS et al., use:
A:> perl -e "print \"Hello world\n\""
or use qq// to pick your own quotes:
A:> perl -e "print qq(Hello world\n)"
The problem is that neither of those is reliable: it depends on the command interpreter you're using there. If 4DOS were the command shell, this would probably work better:
perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""
The CMD.EXE program seen on Windows NT seems to have slipped a lot of standard Unix shell functionality in when nobody was looking, but just try to find documentation for its quoting rules.
On the Macintosh,[3] all this depends on which environment you are using. The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Macintosh's non-ASCII characters as control characters.
There is no general solution to all of this. It's just a mess. If you aren't on a Unix system but want to do command-line things, your best bet is to acquire a better command interpreter than the one your vendor supplied you, which shouldn't be too hard.
Or just write it all in Perl, and forget the one-liners.
Although this may seem obvious, Perl is useful only when users can easily find it. When possible, it's good for both /usr/bin/perl and /usr/local/bin/perl to be symlinks to the actual binary. If that can't be done, system administrators are strongly encouraged to put Perl and its accompanying utilities into a directory typically found along a user's standard PATH, or in some other obvious and convenient place.
In this book, we use the standard #!/usr/bin/perl notation on the first line of the program to mean whatever particular mechanism works on your system. If you care about running a specific version of Perl, use a specific path:
#!/usr/local/bin/perl5.6.0
If you just want to be running at least some version number, but don't mind higher ones, place a statement like this near the top of your program:
use v5.6.0;
(Note: earlier versions of Perl use numbers like 5.005 or 5.004_05. Nowadays we would think of those as 5.5.0 and 5.4.5, but versions of Perl older than 5.6.0 won't understand that notation.)
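A quick way to see the minimum-version check in action is to compare a requirement that any modern perl satisfies with one that none does. The version v5.999.0 below is a deliberately impossible number made up for this sketch.

```shell
# Succeeds on any perl at or above 5.6.0.
if perl -e 'use v5.6.0; print "new enough\n"' >/dev/null 2>&1; then
    min_ok=yes
else
    min_ok=no
fi

# v5.999.0 is a hypothetical, unsatisfiable requirement; the use
# statement fails at compile time and perl exits with a nonzero status.
if perl -e 'use v5.999.0; print "unreachable\n"' >/dev/null 2>&1; then
    too_new_ok=yes
else
    too_new_ok=no
fi

printf 'v5.6.0: %s, v5.999.0: %s\n' "$min_ok" "$too_new_ok"
```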