13.5. Security

Not long ago, security was still considered optional by many people. Today that attitude is widely recognized as dangerous to others as well as oneself. The issue of security on the Internet has garnered universal attention, and one of the ways a host can be broken into is through a poorly written CGI program.

Don't let yours be one of them.

13.5.1. Taint Mode

Perl provides a powerful mechanism for securing your CGI programs. It's called taint mode, and no program you put on the Web should be without it. You invoke it with the -T flag, making the first line of your scripts

#!/usr/bin/perl -wT

(Of course, the path to your perl may differ.)

Taint mode doesn't actually do anything by itself to secure your program. What it does is force you to address every place where a security hole could occur. You see, the chief cause of security holes in CGI programs is malicious user inputs being used to affect things outside of your program. If you've never seen how easy this is, you're in for a shock.

Let's say that your e-commerce Web site contains a feedback form for a user to input their e-mail address and a message of praise. Unfortunately, let's say that your shipping department sent a customer a Barbie doll instead of the Sony PlayStation he ordered, and rather than praise, this soon-to-be-former customer has a different message for you. In the e-mail address field, he types `rm *`, backquotes and all. Your script, after reading the user input into a variable $email, reasonably enough sends a response:

open MAIL, "|$SENDMAIL $email" or die $!;

(We'll talk about better ways of signaling errors shortly.) Net result: chaos, because the shell that runs the pipe command interprets whatever was typed into the address field. If you're running on DOS instead, don't think you're safe; that user could have entered "; erase *.*". What you need to do is check the e-mail address that's been entered and make sure it can't cause that kind of problem. Now, suppose we assume for a moment that valid e-mail addresses match the pattern

\w[\w*%.:-]*@\w[\w.-]*\w

(That's not quite true, but it'll do for our example.) After setting $email from the user input, we could massage it:

($email) = $email =~ /(\w[\w*%.:-]*@\w[\w.-]*\w)/;
unless ($email)
   {
   # Code to handle no valid email being entered
   }

From the user's input, we extracted an e-mail address that won't cause any nasty side effects when passed to our mail program. If we don't find anything in the input matching that, we can treat it as if nothing at all was entered.
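The extraction step can be wrapped in a small helper, sketched here. The name extract_email is an assumption made for illustration and is not part of the original script. The pattern is the simplified one from the text, written with the \w character class, so it shares that pattern's limitations.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first substring of $input that matches the
# simplified e-mail pattern, or undef when nothing matches.
sub extract_email {
    my ($input) = @_;
    return unless defined $input;
    my ($email) = $input =~ /(\w[\w*%.:-]*@\w[\w.-]*\w)/;
    return $email;
}

# Try a plain address, an address with a command tacked on, and a bare command.
for my $input ('joe@example.com', 'joe@example.com; rm *', '`rm *`') {
    my $email = extract_email($input);
    if (!defined $email) {
        print "[$input] -> no valid address; treat as if nothing was entered\n";
    }
    elsif ($email ne $input) {
        print "[$input] -> using [$email]; the input held extra characters\n";
    }
    else {
        print "[$input] -> accepted as entered\n";
    }
}
```

Comparing the extracted value with the raw input, as the middle branch does, is one way to notice that something other than an address was typed.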

You might also choose to compare the result of the match with what they entered, and if they differ, grumble about a “nonstandard” address being entered. Before you use that language, however, consider this caveat: the preceding pattern is a grossly simplified version of what it really takes to match a valid e-mail address. The RFC 822 standard (http://www.ietf.org/rfc/rfc0822.txt) specifies a complicated syntax that includes several ways of embedding comments that are arguably not part of the address at all and unlikely to be entered by a user in a Web form. A regular expression to match it is more than 6K in length and is the tour de force conclusion of Jeffrey Friedl's seminal book, Mastering Regular Expressions (O'Reilly, 1997). A slightly shorter and somewhat more practical approach is contained in Chapter 9 of the second edition of CGI Programming with Perl by Scott Guelich, Shishir Gundavaram, and Gunther Birznieks (O'Reilly, 2000), but even this book acknowledges that its algorithm's value is principally instructional. (It doesn't confirm that the address can receive e-mail; only an attempt to send mail there can do that.)

What taint mode does is force you to launder input data as we just described. Any data that comes from outside your program—including even environment variables or directory listings—has associated with it a special flag that marks it as tainted. Any attempt to use tainted data to affect something outside of your program—such as input to an external program—results in a run-time exception before that can happen. (The error will mention an “insecure dependency.”) And if you use a tainted variable in an expression that is assigned to another variable, that variable becomes tainted too.
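One way to watch the taint flag propagate is with the tainted function from the core Scalar::Util module. This is a minimal sketch: the variable names are illustrative, and an environment variable stands in for form input as the outside data.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# ${^TAINT} is nonzero only when perl was started with -T (or -t).
print "Taint mode is ", (${^TAINT} ? "on" : "off"), "\n";

# Data from outside the program; an environment variable stands in for form input.
my $from_outside = defined $ENV{PATH} ? $ENV{PATH} : '';

# Data that originates inside the program is never tainted.
my $literal = 'Feedback from: ';

# An expression that uses tainted data produces a tainted result.
my $combined = $literal . $from_outside;

for my $pair ([literal => $literal], [from_outside => $from_outside], [combined => $combined]) {
    my ($name, $value) = @$pair;
    printf "%-12s %s\n", $name, tainted($value) ? 'tainted' : 'clean';
}
```

Run without -T, nothing is ever marked tainted. Run as perl -T with PATH set, the outside value and anything built from it should be reported as tainted, while the literal stays clean.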

The only way to derive untainted data from tainted data is to perform a regular expression match with capturing parentheses on it and use the resulting $1, $2, etc. It is assumed that if you have gone to this much trouble, you have constructed a regular expression that will result in safe data. Perl does not—and cannot—check that you have really done so. If you simply do

($untainted) = $tainted =~ /(.*)/s;

then this is the equivalent of putting a gun to your head and pulling the trigger without looking in the chamber first.
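By contrast, a laundering pattern is only worth something when it is narrow and anchored at both ends. The sketch below assumes a hypothetical order-number field that should hold 1 to 10 digits. The field, the helper name launder_order_id, and the length limit are all illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: accept the value only if the WHOLE string is 1 to 10
# ASCII digits. \A and \z anchor the match to the very start and very end,
# so nothing extra can ride along with the captured text.
sub launder_order_id {
    my ($raw) = @_;
    return unless defined $raw;
    return $raw =~ /\A([0-9]{1,10})\z/ ? $1 : undef;
}

for my $raw ('48213', '48213; rm *', '') {
    my $id = launder_order_id($raw);
    print defined($id) ? "accepted [$id]\n" : "rejected [$raw]\n";
}
```

Rejecting the whole value when anything unexpected appears is a stricter policy than extracting the part that matches. Which one suits a given field is a design decision.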

Create CGI programs with taint checking from the beginning. Retrofitting -T onto existing programs is tedious and frustrating.


Taint mode is utterly paranoid. One consequence of environment variables being tainted is that your path is tainted too, so any attempt to run an external program fails with an error such as "Insecure $ENV{PATH} while running with -T switch" until you untaint $ENV{PATH}. That usually means setting it explicitly to a (colon- or semicolon-separated) list of directories you trust.
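In practice this amounts to a couple of lines near the top of the script. The sketch below assumes a Unix-like host where /bin and /usr/bin are the directories you trust; substitute your own list. Removing IFS, CDPATH, ENV, and BASH_ENV as well follows the recommendation in the perlsec manual page, because some shells consult those variables.

```perl
use strict;
use warnings;

# Assumed trusted directories; adjust for the host the script runs on.
$ENV{PATH} = '/bin:/usr/bin';

# Remove other variables that can change how a shell behaves (see perlsec).
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

print "PATH is now $ENV{PATH}\n";
```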

13.5.2. Debugging in Taint Mode

If you syntax check a program that has -T in its #! line, you'll see something like this:

$ perl -c foo.cgi
Too late for "-T" option at foo.cgi line 1.

This is caused by an obscure feature of Perl's implementation that requires taint mode to be turned on really early in its startup process. (If you remembered Perl of Wisdom #10—Use use diagnostics to explain error messages—you would have seen the explanation.) Just add -T:

$ perl -cT foo.cgi
					
