Computer processes have almost as many ways of communicating as people do. The difficulties of interprocess communication should not be underestimated. It doesn't do you any good to listen for verbal cues when your friend is using only body language. Likewise, two processes can communicate only when they agree on the means of communication, and on the conventions built on top of that. As with any kind of communication, the conventions to be agreed upon range from lexical to pragmatic: everything from which lingo you'll use, up to whose turn it is to talk. These conventions are necessary because it's very difficult to communicate bare semantics in the absence of context.
In our lingo, interprocess communication is usually pronounced IPC. The IPC facilities of Perl range from the very simple to the very complex. Which facility you should use depends on the complexity of the information to be communicated. The simplest kind of information is almost no information at all: just the awareness that a particular event has happened at a particular point in time. In Perl, these events are communicated via a signal mechanism modeled on the Unix signal system.
At the other extreme, the socket facilities of Perl allow you to communicate with any other process on the Internet using any mutually supported protocol you like. Naturally, this freedom comes at a price: you have to go through a number of steps to set up the connections and make sure you're talking the same language as the process on the other end. This may in turn require you to adhere to any number of other strange customs, depending on local conventions. To be protocoligorically correct, you might even be required to speak a language like XML, or Java, or Perl. Horrors.
Sandwiched in between are some facilities intended primarily for communicating with processes on the same machine. These include good old-fashioned files, pipes, FIFOs, and the various System V IPC syscalls. Support for these facilities varies across platforms; modern Unix systems (including Apple's Mac OS X) should support all of them, and, except for signals and SysV IPC, most of the rest (including pipes, forking, file locking, and sockets) are supported on any recent Microsoft operating system.[1]
More information about porting in general can be found in the standard Perl documentation set (in whatever format your system displays it) under perlport. Microsoft-specific information can be found under perlwin32 and perlfork, which are installed even on non-Microsoft systems. For textbooks, we suggest the following:
The Perl Cookbook, by Tom Christiansen and Nathan Torkington (O'Reilly and Associates, 1998), chapters 16 through 18.
Advanced Programming in the UNIX Environment, by W. Richard Stevens (Addison-Wesley, 1992).
TCP/IP Illustrated, by W. Richard Stevens, Volumes I-III (Addison-Wesley, 1992-1996).
Perl uses a simple signal-handling model: the %SIG hash contains references (either symbolic or hard) to user-defined signal handlers. Certain events cause the operating system to deliver a signal to the affected process. The handler corresponding to that event is called with one argument containing the name of the signal that triggered it. To send a signal to another process, you use the kill function. Think of it as sending a one-bit piece of information to the other process.[2] If that process has installed a signal handler for that signal, it can execute code when it receives the signal. But there's no way for the sending process to get any sort of return value, other than knowing that the signal was legally sent. The sender receives no feedback saying what, if anything, the receiving process did with the signal.
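As a minimal sketch of that model, the following sends a signal to the current process and records what the handler was told. The names signal_self and @seen are ours, invented for illustration, and the example assumes a Unix-like system where USR1 exists and is not blocked.

```perl
use strict;
use warnings;

# Hypothetical helper: install a temporary handler, signal ourselves,
# and report what kill returned plus the names the handler received.
sub signal_self {
    my ($signame) = @_;
    my @seen;

    # The handler's single argument is the signal name, without "SIG".
    local $SIG{$signame} = sub { push @seen, $_[0] };

    # kill returns the number of processes successfully signaled.
    # That count is all the sender ever learns.
    my $delivered = kill $signame, $$;

    return ($delivered, @seen);
}

my ($delivered, @seen) = signal_self('USR1');
print "processes signaled: $delivered; handler saw: @seen\n";
```

Note that the return value of kill says only that the signal was legally sent; the @seen list is available here only because sender and receiver happen to be the same process.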
We've classified this facility as a form of IPC, but in fact, signals can come from various sources, not just other processes. A signal might also come from your own process, or it might be generated when the user at the keyboard types a particular sequence like Control-C or Control-Z, or it might be manufactured by the kernel when a special event transpires, such as when a child process exits, or when your process runs out of stack space or hits a file size or memory limit. But your own process can't easily distinguish among these cases. A signal is like a package that arrives mysteriously on your doorstep with no return address. You'd best open it carefully.
Since entries in the %SIG hash can be hard references, it's common practice to use anonymous functions for simple signal handlers:
    $SIG{INT}  = sub { die "\nOutta here!\n" };
    $SIG{ALRM} = sub { die "Your alarm clock went off" };
Or you could create a named function and assign its name or reference to the appropriate slot in the hash. For example, to intercept interrupt and quit signals (often bound to Control-C and Control-\ on your keyboard), set up a handler like this:
    sub catch_zap {
        my $signame = shift;
        our $shucks++;
        die "Somebody sent me a SIG$signame!";
    }
    $shucks = 0;
    $SIG{INT}  = 'catch_zap';   # always means &main::catch_zap
    $SIG{INT}  = \&catch_zap;   # best strategy
    $SIG{QUIT} = \&catch_zap;   # catch another, too
Notice how all we do in the signal handler is set a global variable and then raise an exception with die. Whenever possible, try to avoid anything more complicated than that, because on most systems the C library is not re-entrant. Signals are delivered asynchronously,[3] so calling any print functions (or even anything that needs to malloc(3) more memory) could in theory trigger a memory fault and subsequent core dump if you were already in a related C library routine when the signal was delivered. (Even the die routine is a bit unsafe unless the process is executing within an eval, which suppresses the I/O from die, which keeps it from calling the C library. Probably.)
An even easier way to trap signals is to use the sigtrap pragma to install simple, default signal handlers:
    use sigtrap qw(die INT QUIT);
    use sigtrap qw(die untrapped normal-signals
                   stack-trace any error-signals);
The pragma is useful when you don't want to bother writing your own handler, but you still want to catch dangerous signals and perform an orderly shutdown. By default, some of these signals are so fatal to your process that your program will just stop in its tracks when it receives one. Unfortunately, that means that any END functions for at-exit handling and DESTROY methods for object finalization are not called. But they are called on ordinary Perl exceptions (such as when you call die), so you can use this pragma to painlessly convert the signals into exceptions. Even though you aren't dealing with the signals yourself, your program still behaves correctly. See the documentation for use sigtrap for many more features of this pragma.
You may also set the %SIG handler to either of the strings "IGNORE" or "DEFAULT", in which case Perl will try to discard the signal or allow the default action for that signal to occur (though some signals can be neither trapped nor ignored, such as the KILL and STOP signals; see signal(3), if you have it, for a list of signals available on your system and their default behaviors).
The operating system thinks of signals as numbers rather than names, but Perl, like most people, prefers symbolic names to magic numbers. To find the names of the signals, list out the keys of the %SIG hash, or use the kill -l command if you have one on your system. You can also use Perl's standard Config module to determine your operating system's mapping between signal names and signal numbers. See Config(3) for an example of this.
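Here is a short sketch of the Config approach, along the lines of the example in the Config documentation. The sig_name and sig_num entries are real Config keys; the helper name signal_tables and the variables it returns are ours.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: build name->number and number->name tables
# from the space-separated lists that Config records at build time.
sub signal_tables {
    defined $Config{sig_name} && defined $Config{sig_num}
        or die "this perl has no signal tables in Config\n";

    my @names = split ' ', $Config{sig_name};
    my @nums  = split ' ', $Config{sig_num};

    my (%num_of, @name_of);
    @num_of{@names} = @nums;

    # Several names may share one number (aliases); keep the first.
    for my $name (@names) {
        $name_of[ $num_of{$name} ] //= $name;
    }
    return (\%num_of, \@name_of);
}

my ($num_of, $name_of) = signal_tables();
printf "SIGINT is signal %d; signal 9 is SIG%s\n",
    $num_of->{INT}, $name_of->[9];
```

The numbers differ between operating systems for many signals, which is exactly why it pays to look them up instead of hardcoding them.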
Because %SIG is a global hash, assignments to it affect your entire program. It's often more considerate to the rest of your program to confine your signal catching to a restricted scope. Do this with a local signal handler assignment, which goes out of effect once the enclosing block is exited. (But remember that local values are visible in functions called from within that block.)
    {
        local $SIG{INT} = 'IGNORE';
        …       # Do whatever you want here, ignoring all SIGINTs.
        fn();   # SIGINTs ignored inside fn() too!
        …       # And here.
    }           # Block exit restores previous $SIG{INT} value.

    fn();       # SIGINTs not ignored inside fn() (presumably).
Processes (under Unix, at least) are organized into process groups, generally corresponding to an entire job. For example, when you fire off a single shell command that consists of a series of filter commands that pipe data from one to the other, those processes (and their child processes) all belong to the same process group. That process group has a number corresponding to the process number of the process group leader. If you send a signal to a positive process number, it just sends the signal to the process, but if you send a signal to a negative number, it sends that signal to every process whose process group number is the corresponding positive number, that is, the process number of the process group leader. (Conveniently for the process group leader, the process group ID is just $$.)
Suppose your program wants to send a hang-up signal to all child processes it started directly, plus any grandchildren started by those children, plus any great-grandchildren started by those grandchildren, and so on. To do this, your program first calls setpgrp(0,0) to become the leader of a new process group, and any processes it creates will be part of the new group. It doesn't matter whether these processes were started manually via fork, automatically via piped opens, or as backgrounded jobs with system("cmd &"). Even if those processes had children of their own, sending a hang-up signal to your entire process group will find them all (except for processes that have set their own process group or changed their UID to give themselves diplomatic immunity to your signals).
    {
        local $SIG{HUP} = 'IGNORE';   # exempt myself
        kill('HUP', -$$);             # signal my own process group
    }
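To watch this happen without hanging up the shell you launched it from, you can stage the whole thing inside a throwaway child. The sketch below is ours, not a canonical recipe: hup_group_demo is a made-up name, and it assumes a Unix-like system with fork and setpgrp. The child becomes a group leader, starts some sleeping grandchildren, hangs up its own group, and reports through its exit status how many grandchildren died of SIGHUP.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical demo: returns how many grandchildren were killed by HUP.
sub hup_group_demo {
    my $kids = shift // 2;

    my $leader = fork();
    die "can't fork: $!" unless defined $leader;

    if ($leader == 0) {
        # In the child: lead a new process group so the group-wide
        # signal below cannot reach whoever called us.
        setpgrp(0, 0);
        $SIG{HUP} = 'DEFAULT';          # grandchildren inherit this

        my @pids;
        for (1 .. $kids) {
            my $pid = fork();
            POSIX::_exit(255) unless defined $pid;
            if ($pid == 0) { sleep 60; POSIX::_exit(0) }
            push @pids, $pid;
        }

        local $SIG{HUP} = 'IGNORE';     # exempt myself
        kill 'HUP', -$$;                # signal my own process group

        my $hupped = 0;
        for my $pid (@pids) {
            waitpid($pid, 0);
            # Low seven bits of $? hold the fatal signal number.
            $hupped++ if ($? & 127) == POSIX::SIGHUP();
        }
        POSIX::_exit($hupped);          # skip END blocks and flushing
    }

    waitpid($leader, 0);
    return $? >> 8;
}

print "grandchildren stopped by HUP: ", hup_group_demo(2), "\n";
```

Resetting HUP to DEFAULT before forking is a precaution in case the program itself was started with hang-ups ignored, since signal dispositions are inherited across fork.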
Another interesting signal is signal number 0. This doesn't actually affect the target process, but instead checks that it's alive and hasn't changed its UID. That is, it checks whether it's legal to send a signal, without actually sending one.
    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }
Signal number 0 is the only signal that works the same under Microsoft ports of Perl as it does in Unix. On Microsoft systems, kill does not actually deliver a signal. Instead, it forces the target process to exit with the status indicated by the signal number. This may be fixed someday. The magic 0 signal, however, still behaves in the standard, nondestructive fashion.
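The following sketch shows the signal 0 probe before and after a child goes away. It assumes a Unix-like system with fork; the names is_alive and liveness_demo are invented for the example.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: true if we could legally signal $pid right now.
sub is_alive {
    my ($pid) = @_;
    return kill(0 => $pid) ? 1 : 0;
}

# Hypothetical demo: probe a child while it runs and after it is reaped.
sub liveness_demo {
    my $pid = fork();
    die "can't fork: $!" unless defined $pid;
    if ($pid == 0) { sleep 30; POSIX::_exit(0) }

    my $before = is_alive($pid);

    kill 'KILL', $pid;      # KILL can be neither trapped nor ignored
    waitpid($pid, 0);       # reap it, or the zombie still answers

    my $after = is_alive($pid);
    return ($before, $after);
}

my ($before, $after) = liveness_demo();
print "alive before: $before, alive after: $after\n";
```

The waitpid matters: an unreaped zombie still occupies its process ID, so signal 0 would keep reporting success until the parent collects it.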
When a process exits, its parent is sent a CHLD signal by the kernel and the process becomes a zombie[4] until the parent calls wait or waitpid. If you start another process in Perl using anything except fork, Perl takes care of reaping your zombied children, but if you use a raw fork, you're expected to clean up after yourself. On many but not all kernels, a simple hack for autoreaping zombies is to set $SIG{CHLD} to 'IGNORE'. A more flexible (but tedious) approach is to reap them yourself. Because more than one child may have died before you get around to dealing with them, you must gather your zombies in a loop until there aren't any more:
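    use POSIX ":sys_wait_h";
    # waitpid returns 0 if children are still running and -1 if there
    # are none at all, so keep going only while it returns a real PID.
    sub REAPER { 1 while waitpid(-1, WNOHANG) > 0 }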
To run this code as needed, you can either set a CHLD signal handler for it:
    $SIG{CHLD} = \&REAPER;
or, if you're running in a loop, just arrange to call the reaper every so often. This is the best approach because it isn't subject to the occasional core dump that signals can sometimes trigger in the C library. However, it's expensive if called in a tight loop, so a reasonable compromise is to use a hybrid strategy where you minimize the risk within the handler by doing as little as possible and waiting until outside to reap zombies:
    use POSIX ":sys_wait_h";    # for WNOHANG
    our $zombies = 0;
    $SIG{CHLD} = sub { $zombies++ };
    sub reaper {
        my $zombie;
        our %Kid_Status;  # store each exit status
        $zombies = 0;
        # Stop on 0 (children still running) as well as -1 (none left).
        while (($zombie = waitpid(-1, WNOHANG)) > 0) {
            $Kid_Status{$zombie} = $?;
        }
    }
    while (1) {
        reaper() if $zombies;
        …
    }
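For something you can actually run, here is a self-contained variation on the hybrid strategy. It is a sketch under a few assumptions: a Unix-like system with fork, no other child processes running at the time, and a polling interval of 0.05 seconds that we picked arbitrarily. The name run_and_reap is ours.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

# Hypothetical demo: fork one child per exit code, then reap them all
# from the main loop; the signal handler only bumps a counter.
sub run_and_reap {
    my @exit_codes = @_;
    my $zombies = 0;
    my %kid_status;                     # pid => exit code

    local $SIG{CHLD} = sub { $zombies++ };

    my $reaper = sub {
        $zombies = 0;
        # Several children may have died for one CHLD, so loop.
        while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
            $kid_status{$pid} = $? >> 8;
        }
    };

    my $pending = 0;
    for my $code (@exit_codes) {
        my $pid = fork();
        die "can't fork: $!" unless defined $pid;
        POSIX::_exit($code) if $pid == 0;   # child leaves at once
        $pending++;
    }

    while (keys(%kid_status) < $pending) {
        $reaper->() if $zombies;
        select(undef, undef, undef, 0.05);  # stand-in for real work
    }
    return \%kid_status;
}

my $status = run_and_reap(0, 3, 7);
print "child $_ exited with $status->{$_}\n" for sort keys %$status;
```

The handler touches nothing but a counter, and all the waitpid calls happen in ordinary code, which is the whole point of the compromise.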
This code assumes your kernel supports reliable signals. Old SysV traditionally didn't, which made it impossible to write correct signal handlers there. Ever since way back in the 5.003 release, Perl has used the sigaction (2) syscall where available, which is a lot more dependable. This means that unless you're running on an ancient operating system or with an ancient Perl, you won't have to reinstall your handlers and risk missing signals. Fortunately, all BSD-flavored systems (including Linux, Solaris, and Mac OS X) plus all POSIX-compliant systems provide reliable signals, so the old broken SysV behavior is more a matter of historical note than of current concern.
With these newer kernels, many other things will work better, too. For example, "slow" syscalls (those that can block, like read, wait, and accept) will restart automatically if interrupted by a signal. In the bad old days, user code had to remember to check explicitly whether each slow syscall failed with $! ($ERRNO) set to EINTR and, if so, restart. This wouldn't happen just from INT signals; even innocuous signals like TSTP (from a Control-Z) or CONT (from foregrounding the job) would abort the syscall. Perl now restarts the syscall for you automatically if the operating system allows it to. This is generally construed to be a feature.
You can check whether you have the more rigorous POSIX-style signal behavior by loading the Config module and checking whether $Config{d_sigaction} has a true value. To find out whether slow syscalls are restartable, check your system documentation on sigaction(2) or sigvec(3), or scrounge around your C sys/signal.h file for SV_INTERRUPT or SA_RESTART. If one or both symbols are found, you probably have restartable syscalls.
A common use for signals is to impose time limits on long-running operations. If you're on a Unix system (or any other POSIX-conforming system that supports the ALRM signal), you can ask the kernel to send your process an ALRM at some point in the future:
    use Fcntl ':flock';
    eval {
        local $SIG{ALRM} = sub { die "alarm clock restart" };
        alarm 10;               # schedule alarm in 10 seconds
        eval {
            flock(FH, LOCK_EX)  # a blocking, exclusive lock
                or die "can't flock: $!";
        };
        alarm 0;                # cancel the alarm
    };
    alarm 0;                    # race condition protection
    die if $@ && $@ !~ /alarm clock restart/;   # reraise
If the alarm hits while you're waiting for the lock, and you simply catch the signal and return, you'll go right back into the flock because Perl automatically restarts syscalls where it can. The only way out is to raise an exception through die and then let eval catch it. (This works because the exception winds up calling the C library's longjmp(3) function, which is what really gets you out of the restarting syscall.)
The nested exception trap is included because calling flock would raise an exception if flock is not implemented on your platform, and you need to make sure to clear the alarm anyway. The second alarm 0 is provided in case the signal comes in after running the flock but before getting to the first alarm 0. Without the second alarm, you would risk a tiny race condition, but size doesn't matter in race conditions; they either exist or they don't. And we prefer that they don't.
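The same pattern can be wrapped up for reuse. What follows is one possible sketch, not the only way to do it: with_timeout is a name we made up, the "timed out\n" string is an arbitrary marker, and a five-second select stands in for the slow operation (we avoid sleep here because on some systems it is implemented with alarm).

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns ('ok', $result) or ('timeout', undef); rethrows other errors.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;

    my $ok = eval {
        local $SIG{ALRM} = sub { die "timed out\n" };
        alarm $seconds;             # schedule the alarm
        $result = $code->();
        alarm 0;                    # cancel the alarm
        1;
    };
    my $err = $@;
    alarm 0;                        # race condition protection

    return ('ok', $result)    if $ok;
    return ('timeout', undef) if $err eq "timed out\n";
    die $err;                       # reraise anything else
}

my ($status) = with_timeout(1, sub {
    select(undef, undef, undef, 5); # pretend to be slow
    return 'finished';
});
print "slow operation: $status\n";
```

As in the flock example, the second alarm 0 sits outside the eval so that the alarm is cleared no matter how the block was exited.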
Now and then, you'd like to delay receipt of a signal during some critical section of code. You don't want to blindly ignore the signal, but what you're doing is too important to interrupt. Perl's %SIG hash doesn't implement signal blocking, but the POSIX module does, through its interface to the sigprocmask(2) syscall:
    use POSIX qw(:signal_h);

    $sigset   = POSIX::SigSet->new;
    $blockset = POSIX::SigSet->new(SIGINT, SIGQUIT, SIGCHLD);
    sigprocmask(SIG_BLOCK, $blockset, $sigset)
        or die "Could not block INT,QUIT,CHLD signals: $!\n";
Once the three signals are all blocked, you can do whatever you want without fear of being bothered. When you're done with your critical section, unblock the signals by restoring the old signal mask:
    sigprocmask(SIG_SETMASK, $sigset)
        or die "Could not restore INT,QUIT,CHLD signals: $!\n";
If any of the three signals came in while blocked, they are delivered immediately. If two or more different signals are pending, the order of delivery is not defined. Additionally, no distinction is made between having received a particular signal once while blocked and having received it many times.[5] For example, if nine child processes exited while you were blocking CHLD signals, your handler (if you had one) would still be called only once after you unblocked. That's why, when you reap zombies, you should always loop until they're all gone.
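You can see the collapsing of repeated signals for yourself with a sketch like the one below. It assumes a Unix-like system with traditional (non-queued) USR1 delivery; count_blocked_deliveries is a name we invented, while sigprocmask, POSIX::SigSet, and the SIG_* constants come from the POSIX module as above.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

# Hypothetical demo: send USR1 to ourselves several times while it is
# blocked, then unblock and see how often the handler actually ran.
sub count_blocked_deliveries {
    my $sends = shift // 3;
    my $calls = 0;
    local $SIG{USR1} = sub { $calls++ };

    my $old   = POSIX::SigSet->new;
    my $block = POSIX::SigSet->new(SIGUSR1);
    sigprocmask(SIG_BLOCK, $block, $old)
        or die "Could not block USR1: $!\n";

    kill 'USR1', $$ for 1 .. $sends;    # all arrive while blocked
    my $during = $calls;                # handler has not run yet

    sigprocmask(SIG_SETMASK, $old)      # restore the old mask
        or die "Could not restore signal mask: $!\n";
    my $after = $calls;

    return ($during, $after);
}

my ($during, $after) = count_blocked_deliveries(3);
print "handler calls while blocked: $during, after unblocking: $after\n";
```

However many times the signal was sent while blocked, the kernel remembers only that it is pending, so the handler has just one delivery to respond to.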
[1] Well, except for AF_UNIX sockets.
[2] Actually, it's more like five or six bits, depending on how many signals your OS defines and on whether the other process makes use of the fact that you didn't send a different signal.
[3] Synchronizing signal delivery with Perl-level opcodes is scheduled for a future release of Perl, which should solve the matter of signals and core dumps.
[4] Yes, that really is the technical term.
[5] Traditionally, that is. Countable signals may be implemented on some real-time systems according to the latest specs, but we haven't seen these yet.