CHAPTER 6

Form Design and Management

Web-based forms are a common source of new data for an application. Generally, this data is unstructured and will often need to be massaged, formatted, or otherwise conditioned before being stored. Data may also be coming from a potentially unreliable source. This chapter will demonstrate methods of capturing data from web forms, validating the input fields using JavaScript, passing data to PHP via an AJAX request, and maintaining data integrity within your data-storage services. We shall also give you advice about manipulating images, integrating multiple languages, and using regular expressions.

Data Validation

There are two main areas where data validation is performed in web-based forms, and these serve two distinct functions. The first type of validation occurs in the form itself on the client side using JavaScript, and the second type occurs when PHP receives the data on the server side through a GET or POST request.

The role of JavaScript validation is twofold, and both actions occur client side. It can be used to notify the client (web site user) of suggestions and warnings about the data entered, and to put the data into a consistent pattern that the receiving PHP script is looking for. PHP validation concentrates more on maintaining the integrity of the received data while manipulating it to be consistent and compliant with data that has already been stored.

We will define a form and two form elements for the JavaScript validation—one accepting a required first and last name, the second accepting a phone number with an optional area code. A submit button will not be included in the form for this example; instead, it will be handled by JavaScript in the search function and will be activated when an onblur event occurs. This will be discussed later in the chapter.

In Listing 6-1, we use the GET method so that we can have the submitted values in the address bar of a browser. This will allow for quick testing with other data and bypass the JavaScript validation.

Listing 6-1. Validation Example Using the GET Method

<form method='GET' action='script7_1.php' name='script'>
<input type='text' id='name' name='name' onkeyup='validate(this);' onfocus='this.select();'
    onblur='search();' value='First and Last Name' />
<input type='text' id='phone' name='phone' onkeyup='validate(this);' onfocus='this.select();'
    onblur='search();' value='Phone Number' />
</form>

After testing is completed, the form can be switched to the POST method. POST gives a clean-looking URL, because the form parameters are not visible in the address bar (for example, http://domain.com/dir/ or http://domain.com/dir/script.php). Alternatively, if the client may bookmark the submitted page, it can be better to use the GET method, as the client won't be required to “re-post” the form data when revisiting the web page. When using the GET method, the URL of the browser after form submission will resemble http://domain.com/?var1=exa&var2=mple.

Note: You can use either the POST or GET method, depending on project requirements. It is important to remember that neither method is secure on its own; both can be exploited. However, using POST may be considered slightly more secure, as regular users cannot obtain different results simply by manipulating the query string. Be sure to read Chapter 11 on security for more details.

As the client types in the text boxes in Listing 6-1, the JavaScript validation function is called by the onkeyup event and passes this as a reference. This will allow JavaScript to have access to all the element properties that are needed to perform the validation. The onblur event will attempt to submit the form using AJAX if both fields pass required validation. In Chapter 15, we will be examining how to perform an AJAX request. The onfocus event selects the text already entered into the text box so that the client does not have to delete the data that has been entered already. This is achieved by calling the select() method in JavaScript and using the current element's properties (onfocus='this.select();').

We will define the validate JavaScript function, which accepts one parameter (see Listing 6-2). The validation will consist of regular expression pattern matching. We will explain how to build regular expressions later in the chapter. For now, we will concentrate on basic structure.

Listing 6-2. The validate JavaScript Function

function validate(a){
    if(a.value.length>3){
        switch(a.name){
            case 'name':
                if(a.value.match(/^[a-zA-Z-]+ [a-zA-Z-]+$/)){
                    /* ... successful match code for name ... */
                    return true;
                }else{
                    /* ... no match code for name ... */
                }
            break;
            case 'phone':
                if(a.value.match(/^((\(|\[)?\d{3}?(\]|\))?(\s|-)?)?\d{3}(\s|-)?\d{4}$/)){
                    /* ... successful match code for phone ... */
                    return true;
                }else{
                    /* ... no match code for phone ... */
                }
            break;
        }
    }
    return false;
}//validate function

The validate function only begins to perform checks on the form input field if the length of the string is greater than three. The validation threshold must be adjusted if the data set being validated doesn't contain enough characters. If the code executed in validation makes AJAX requests, limiting the length of the string also helps to prevent unnecessary stress on server resources that may be performing database lookups or running through algorithms.

Note: In Listing 6-2, we are assuming the data being collected is greater than three characters in length. Often though, data being gathered from forms contains strings and numbers that are three or fewer characters. In this case, the a.value.length>3 if statement could be moved inside the case statement and adjusted to match each of the patterns for the fields being validated.
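
The adjustment described in the note can be sketched as follows. This is only one possible arrangement, not the chapter's own code: the rules table, the minLength values, and the function name validateField are assumptions made for illustration. Each field carries its own minimum length, so short values are not rejected by a single global threshold.

```javascript
// Hypothetical per-field rules: the minimum lengths are illustrative guesses,
// the patterns are the ones used in Listing 6-2.
var rules = {
    name:  { minLength: 3, pattern: /^[a-zA-Z-]+ [a-zA-Z-]+$/ },
    phone: { minLength: 7, pattern: /^((\(|\[)?\d{3}?(\]|\))?(\s|-)?)?\d{3}(\s|-)?\d{4}$/ }
};

// Returns true only when the field is known, long enough, and matches its pattern.
function validateField(name, value) {
    var known = Object.prototype.hasOwnProperty.call(rules, name);
    if (!known || value.length < rules[name].minLength) {
        return false; // unknown field, or too short to be worth checking
    }
    return rules[name].pattern.test(value);
}

console.log(validateField('name', 'Al Ng'));           // true
console.log(validateField('phone', '443-3221'));       // true
console.log(validateField('phone', '(201) 443-3221')); // true
console.log(validateField('phone', '443'));            // false
```

Keeping the threshold next to the pattern makes it easier to add a new field without revisiting the length check for the existing ones.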

Next, we use the name of the form element to determine which expression the value is required to match. By returning either true or false, we can also reuse this function in our search function. We match the text box value against the expression by calling the string's match method. If the expression matches, an array containing the matched text is returned. If there is no match, null is returned.
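
The return values of match can be seen in a small sketch. The variable names below are ours, and the test strings are taken from the URLs used later in this section. The example also shows the RegExp test method, which returns a plain boolean and can be convenient when the matched text itself is not needed.

```javascript
var namePattern = /^[a-zA-Z-]+ [a-zA-Z-]+$/;

// A successful match returns an array; element 0 holds the matched text.
var hit = 'john smith'.match(namePattern);
console.log(Array.isArray(hit)); // true
console.log(hit[0]);             // john smith

// A failed match returns null, which is falsy inside an if statement.
var miss = 'john (*#%_0'.match(namePattern);
console.log(miss);               // null

// test() answers the same question with true or false.
console.log(namePattern.test('john smith')); // true
```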

We will define the search function in Listing 6-3, which will validate the two form fields and then initiate an AJAX request to pass the form values to a PHP script, which we introduce in Listing 6-4.

Listing 6-3. Defining the search Function

function search(){
    if(validate(document.getElementById('name'))
        &&validate(document.getElementById('phone'))){
        //build and execute AJAX request
    }
}//search function

With the search function, the required parameters are validated first by calling the validate function and passing in the element properties. The actual form submission is performed through an AJAX request.

If the name and phone text box values are validated, the AJAX request will submit a URL similar to http://localhost/script7_1.php?name=john+smith&phone=(201) 443-3221. We can now begin to build the PHP validation component. Since the attributes are in the URL, we can initially test different values by manually adjusting the URL for known exceptions and accepted formats. For example, we test the name validation performed by PHP using the following URLs:
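
When the request URL is built in JavaScript rather than typed by hand, the field values should be encoded so that spaces and parentheses survive the trip. A minimal sketch, assuming a browser or runtime that provides URLSearchParams; the helper name buildSearchUrl is hypothetical:

```javascript
// Hypothetical helper: appends form-encoded fields to a base URL.
function buildSearchUrl(baseUrl, fields) {
    // URLSearchParams applies form encoding: spaces become "+",
    // and characters such as "(" and ")" are percent-encoded.
    var query = new URLSearchParams(fields).toString();
    return baseUrl + '?' + query;
}

var url = buildSearchUrl('http://localhost/script7_1.php', {
    name: 'john smith',
    phone: '(201) 443-3221'
});
console.log(url);
// http://localhost/script7_1.php?name=john+smith&phone=%28201%29+443-3221
```

PHP decodes these values automatically before placing them in $_GET, so the receiving script sees the original text.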

http://localhost/script7_1.php?name=Shérri+smith&phone=(201) 443-3221
http://localhost/script7_1.php?name=john+o'neil&phone=(201) 443-3221
http://localhost/script7_1.php?name=john+(*#%_0&phone=(201) 443-3221

We could then test the phone validation performed by PHP using the next set of URLs:

http://localhost/script7_1.php?name=john+smith&phone=2014433221
http://localhost/script7_1.php?name=john+smith&phone=john+smith
http://localhost/script7_1.php?name=john+smith&phone=201 443-3221 ext 21

We can now introduce the PHP validation code in Listing 6-4.

Listing 6-4. PHP Validation

<?php
$formData=array();//an array to house the submitted form data
foreach($_GET as $key => $val){
    $formData[$key]=htmlentities($val,ENT_QUOTES,'UTF-8');
}
if(isset($formData['name'])&&isset($formData['phone'])){
    $expressions=array(
        'name'=>'/^[a-zA-Z-]+ [a-zA-Z-]+$/',
        'phone'=>'/^((\(|\[)?\d{3}?(\]|\))?(\s|-)?)?\d{3}(\s|-)?\d{4}$/'
    );
    $matches=array();//filled by preg_match with the matched text
    if(preg_match($expressions['name'],$formData['name'],$matches['name'])===1 &&
       preg_match($expressions['phone'],$formData['phone'],$matches['phone'])===1){
        /*      code, do something with name and phone  */
    }
}
?>

The function preg_match accepts a regular expression, followed by a string to match against the expression, and then followed by an array that is filled with the results of the match.

There are several handy PHP extensions that also help with data validation, sanitizing, and expression matching. Data filtering can be a nice and consistent way to clean and validate data. The Filter extension includes several common functions for validating client-entered data. Listing 6-5 uses URL and e-mail validate filters with the filter_var function of the Filter library. The filter_var function is used by passing a string to the function along with either a sanitize filter or a validate filter. A sanitize filter removes unsupported characters from the string, whereas a validate filter ensures the string is formed correctly and contains the proper data type.

Listing 6-5. The PHP filter_var Function

<?php

//string(15) "[email protected]"
var_dump(filter_var('[email protected]', FILTER_VALIDATE_EMAIL));

//string(18) "[email protected]"
var_dump(filter_var('[email protected]', FILTER_VALIDATE_EMAIL));

//string(20) "[email protected]"
var_dump(filter_var('[email protected]', FILTER_VALIDATE_EMAIL));

//bool(false)
var_dump(filter_var('[email protected]', FILTER_VALIDATE_EMAIL));

//bool(false)
var_dump(filter_var('email@domain', FILTER_VALIDATE_EMAIL));

//bool(false)
var_dump(filter_var('example.com', FILTER_VALIDATE_URL));

//bool(false)
var_dump(filter_var('www.example.com', FILTER_VALIDATE_URL));

//string(22) "http://www.example.com"
var_dump(filter_var('http://www.example.com', FILTER_VALIDATE_URL));

//string(23) "http://www.e#xample.com"
var_dump(filter_var('http://www.e#xample.com', FILTER_VALIDATE_URL));

//bool(false)
var_dump(filter_var('www example com', FILTER_VALIDATE_URL));

//bool(false)
var_dump(filter_var('www.ex#ample.com', FILTER_VALIDATE_URL));

?>

Sanitize filters are useful for preparing data to be consistent. They also use the filter_var function. Listing 6-6 uses URL and e-mail sanitize filters.

Listing 6-6. Making Data Consistent with URL and E-mail Sanitize Filters

<?php

//string(15) "[email protected]"
var_dump(filter_var('[email protected]', FILTER_SANITIZE_EMAIL));

//string(17) "[email protected]"
var_dump(filter_var('[email protected]é.ab', FILTER_SANITIZE_EMAIL));

//string(20) "em-ail@example.co.uk"
var_dump(filter_var('em-ail@examp"le.co.uk', FILTER_SANITIZE_EMAIL));

//string(16) "[email protected]"
var_dump(filter_var('[email protected]', FILTER_SANITIZE_EMAIL));

//string(13) "email@do^main"
var_dump(filter_var('email@do^main', FILTER_SANITIZE_EMAIL));

//string(11) "example.com"
var_dump(filter_var('example.com', FILTER_SANITIZE_URL));

//string(15) "www.example.com"
var_dump(filter_var(" www.example.com", FILTER_SANITIZE_URL));

//string(22) "http://www.example.com"
var_dump(filter_var('http://www.example.com', FILTER_SANITIZE_URL));

//string(23) "http://www.e#xample.com"
var_dump(filter_var('http://www.e#xample.com', FILTER_SANITIZE_URL));

//string(13) "wwwexamplecom"
var_dump(filter_var('www example com', FILTER_SANITIZE_URL));

?>

The Perl Compatible Regular Expression (PCRE) library includes some useful functions as well, such as regular expression find and replace, grep, match, and match all, and also a handy find and replace using a callback function.

In Listing 6-7, we will work with the preg_match_all function to identify all strings that start with an uppercase character followed by lowercase characters. We will use the var_export function to dump the results matched in the $matches array to the screen.

Listing 6-7. Using the preg_match_all Function

<?php

$str='A Ford car was seen at Super Clean car wash.';
preg_match_all('/[A-Z][a-z]+/',$str,$matches);
var_export($matches);

/*
array (
  0 =>
  array (
    0 => 'Ford',
    1 => 'Super',
    2 => 'Clean',
  ),
)
*/

?>

In Listing 6-7, preg_match_all is passed three parameters: the regular expression, the string to match against, and an array that houses the matches. We can also pass a flag to adjust the $matches array and an offset to start at a certain character position in the string, as shown in Listing 6-8.

Listing 6-8. Adjusting the $matches Array

<?php

$str='A Ford car was seen at Super Clean car wash.';

preg_match_all('/[A-Z][a-z]+/',$str,$matches,PREG_PATTERN_ORDER,5);
var_export($matches);

preg_match_all('/[A-Z][a-z]+/',$str,$matches,PREG_SET_ORDER);
var_export($matches);

preg_match_all('/[A-Z][a-z]+/',$str,$matches,PREG_OFFSET_CAPTURE);
var_export($matches);

/*
// PREG_PATTERN_ORDER,5
array (
  0 =>
  array (
    0 => 'Super',
    1 => 'Clean',
  ),
)

// PREG_SET_ORDER
array (
  0 =>
  array (
    0 => 'Ford',
  ),
  1 =>
  array (
    0 => 'Super',
  ),
  2 =>
  array (
    0 => 'Clean',
  ),
)

// PREG_OFFSET_CAPTURE
array (
  0 =>
  array (
    0 =>
    array (
      0 => 'Ford',
      1 => 2,
    ),
    1 =>
    array (
      0 => 'Super',
      1 => 23,
    ),
    2 =>
    array (
      0 => 'Clean',
      1 => 29,
    ),
  ),
)
*/
?>

The other functions in the PCRE library work in a very similar fashion and are useful for data validation, extraction, and manipulation-related tasks.
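
For comparison, a similar result to the PREG_OFFSET_CAPTURE output can be produced on the client side. This is a JavaScript sketch rather than part of the PCRE library; the function name findCapitalized is ours, and it assumes a runtime that supports String.prototype.matchAll.

```javascript
var str = 'A Ford car was seen at Super Clean car wash.';

// Returns [matched text, offset] pairs, similar in spirit to PREG_OFFSET_CAPTURE.
function findCapitalized(text) {
    // matchAll requires the g flag; each result carries the match and its index.
    return Array.from(text.matchAll(/[A-Z][a-z]+/g), function (m) {
        return [m[0], m.index];
    });
}

console.log(findCapitalized(str));
// Ford at 2, Super at 23, Clean at 29
```

The offsets agree with Listing 6-8 because the sample string is plain ASCII. With multibyte text, PHP reports byte offsets while JavaScript counts UTF-16 code units, so the numbers can differ.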

Uploading Files / Images

Adding documents to a server is a common request. The documents generally reside on a remote computer or server and have to be moved to the hosting server. This can be done using form elements.

Before adding files to the server through web-based forms, it is a good idea to check out the PHP configuration file. This contains many settings that directly affect how a form works, what it can do, how much it can do, and for how long. Being comfortable and aware of these configuration settings is important when troubleshooting uploading and form related issues. Another common issue is having the proper permissions to access a directory on the receiving server.

Consider the form in Listing 6-9, which will allow for a document to be added either via a browse / upload process or via a URL from a server. In the form, a new form attribute, enctype, is defined, which defines how the form data is encoded. This is required when the form uses binary data, in this case the input type file.

Listing 6-9. Defining the enctype Attribute

<form action='script7_9.php' method='post' enctype='multipart/form-data'>
<input type='file' name='localfile' />
<input type='text' name='remoteurl' />
<input type='Submit' value='Add Document' name='Submit' />
</form>

The default value of enctype, if not defined within the form tag, is application/x-www-form-urlencoded, which will handle most input types with the exception of file upload and non-ASCII data. When the web site user submits this form, two different PHP superglobals will contain data: $_FILES for the localfile input field and $_POST for the remoteurl input field. $_FILES will contain metadata about the localfile. The metadata contained in $_FILES is shown in Table 6-1.

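Table 6-1. Metadata Contained in $_FILES

Key        Description
name       The original filename on the client machine
type       The MIME type reported by the browser, for example image/jpeg
size       The size of the uploaded file in bytes
tmp_name   The temporary filename under which the upload is stored on the server
error      The error code associated with the upload (0 means no error)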

Before the file from localfile in Listing 6-9 is moved to its permanent storage location, it is good practice to ensure the file came from an HTTP POST. This can be done using the function is_uploaded_file(), which accepts one parameter, the temporary filename ['tmp_name']. To move the file to a directory after it has been uploaded, we can use the move_uploaded_file() function, which accepts two parameters: the temporary name ['tmp_name'] and the destination name.

For the remote URL input, there are several options we can use to get the file. A convenient method for downloading files via HTTP, HTTPS, and FTP is using wget (a command line utility) through the shell_exec function, which executes a shell command and returns the output (if any). Other methods of downloading documents include socket-related tools like fsockopen or curl. A file could be downloaded using the following syntax:

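shell_exec('wget '. escapeshellarg($_POST['remoteurl']));

Notice the use of the escapeshellarg function. It wraps the submitted value in quotes and escapes any quotes inside it, so the value reaches wget as a single argument. This helps to prevent the execution of arbitrary commands on the server. The related escapeshellcmd function escapes shell metacharacters in a whole command string, but it still allows extra arguments to be passed to the command.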

Image Conversion and Thumbnails

Images are a common file type in web-based applications. They can be used in photo galleries, screenshots, and slideshows. In the previous section, you learned how to add documents to a server either from a browser upload form or through the command line utility wget combined with shell_exec. Now that the files reside on the server, we can begin to manipulate them to fit into the structure necessary for other applications.

PHP has an image library called GD. It contains a list of functions that can create and manipulate images. We will only touch on a small subset of those, covering resizing and conversion. The phpinfo() function can be used to check the version of GD installed on the server.

When creating a thumbnail, we will be creating a PNG copy of the original image, resized to a width of 200px and a height that will be varied depending on the original image. To get the size of the image, we will use the function getimagesize(), which accepts the image name as a parameter and returns an array of metadata including width, height, and mime. Let's consider Listing 6-10.

Listing 6-10. Using the getimagesize() Function

<?php
$imgName='image.jpg';
$thumbName='thumb.png';
$metaData=getimagesize($imgName);//[0]=width, [1]=height, ['mime']=MIME type
$img='';

$newWidth=200;
//scale the height by the same ratio as the width
$newHeight=(int)round($metaData[1]/($metaData[0]/$newWidth));

switch($metaData['mime']){
    case 'image/jpeg':
        $img=imagecreatefromjpeg($imgName);
    break;
    case 'image/png':
        $img=imagecreatefrompng($imgName);
    break;
    case 'image/gif':
        $img=imagecreatefromgif($imgName);
    break;
    case 'image/vnd.wap.wbmp'://MIME type reported by getimagesize() for WBMP
        $img=imagecreatefromwbmp($imgName);
    break;
}

if($img){
    $imgThumb=imagecreatetruecolor($newWidth,$newHeight);
    imagecopyresampled($imgThumb,$img,0,0,0,0,$newWidth,$newHeight,$metaData[0],$metaData[1]);
    imagepng($imgThumb, $thumbName);
    imagedestroy($imgThumb);
}
?>

The array named $metaData contains the MIME type, and the height and width of the original image. This information will be used to open the original image in GD. We can now determine the value of $newHeight for the thumbnail image. The MIME type is sent through a switch statement so that we can handle different types of image files. The imagecreatefromjpeg and similar functions will open the original image and return a resource handle for that image. The thumbnail image is created with a defined width and height using imagecreatetruecolor. When creating the copy of the image, the imagecopyresampled function accepts several parameters: the resource handle of the thumbnail, the resource handle for the original file, the x and y values for destination and source points, the new width and height, and the original width and height. The thumbnail is created with imagepng by providing the thumb resource and a new filename. The resource is finally destroyed using the imagedestroy function and passing the thumb resource.
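
The height calculation is the only arithmetic in Listing 6-10, and it can be checked in isolation. The sketch below repeats it in JavaScript purely as a worked example; the function name thumbnailSize is hypothetical, and rounding to the nearest pixel is our assumption.

```javascript
// Hypothetical helper mirroring the proportional-height arithmetic of Listing 6-10.
function thumbnailSize(origWidth, origHeight, newWidth) {
    var scale = origWidth / newWidth;               // e.g. 1600 / 200 = 8
    var newHeight = Math.round(origHeight / scale); // e.g. 1200 / 8 = 150
    return { width: newWidth, height: newHeight };
}

console.log(thumbnailSize(1600, 1200, 200)); // width 200, height 150
console.log(thumbnailSize(1000, 667, 200));  // width 200, height 133
```

Dividing both dimensions by the same scale factor is what keeps the aspect ratio of the original image.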

The end result is a PNG image in our desired thumbnail format. In Listing 6-11, the outputs for both image.jpg and thumb.png from getimagesize() are shown.

Listing 6-11. Outputs of image.jpg and thumb.png from getimagesize()

//image.jpg
array (
  0 => 1600,
  1 => 1200,
  2 => 2,
  3 => 'width="1600" height="1200"',
  'bits' => 8,
  'channels' => 3,
  'mime' => 'image/jpeg',
)
//thumb.png
array (
  0 => 200,
  1 => 150,
  2 => 3,
  3 => 'width="200" height="150"',
  'bits' => 8,
  'mime' => 'image/png',
)

The thumbnail shown in Figure 6-1 was created by the script in Listing 6-10. As the output in Listing 6-11 shows, the original image was a 1600px wide by 1200px high JPEG, and the output image is a PNG with a width of 200px and a height of 150px.

Figure 6-1. The thumbnail created by Listing 6-10

Regular Expressions

Regular expressions are useful for describing patterns in text, and can be used in JavaScript, MySQL, and PHP. Before we can start building regular expressions (regex), we should first get a regular expression editor. On Windows, Regex Tester by antix.co.uk is free and simple to use. There are many other regular expression editors available, some with many more features. To get started, however, Regex Tester is a nice tool.

Table 6-2 lists the characters and their respective matches available for use in regex.

In Table 6-2, we see commonly used characters and what they match. In Listings 6-7 and 6-8, the regex [A-Z][a-z]+ is used. Based on Table 6-2, the expression can now be read as: match an uppercase alpha character once [A-Z], followed by a lowercase alpha character [a-z] repeated one or more times +. The hyphen used between the upper- and lowercase A and Z is interpreted as any character between and including A and Z. Another example of the hyphen is [a-e], which would match a, b, c, d, and e, but not f or g, or using digits [1-3], which would match 1, 2, or 3, but not 4 or 5. The expression [A-Z][a-z]{4} would match all five-letter words that begin with an uppercase character.
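
These readings can be verified with a few short checks. The sample sentence is invented for illustration, and the anchors ^ and $ and the word boundary \b are additions of ours, used so that each pattern is tested against a whole character or a whole word.

```javascript
// Ranges: [a-e] accepts c but not f; [1-3] does not accept 4.
console.log(/^[a-e]$/.test('c')); // true
console.log(/^[a-e]$/.test('f')); // false
console.log(/^[1-3]$/.test('4')); // false

var sentence = 'Maria met Jo and Peter in Paris';

// One uppercase letter followed by one or more lowercase letters.
var capitalized = sentence.match(/[A-Z][a-z]+/g);
console.log(capitalized); // Maria, Jo, Peter, Paris

// Exactly five letters; \b keeps the match aligned to whole words.
var fiveLetter = sentence.match(/\b[A-Z][a-z]{4}\b/g);
console.log(fiveLetter); // Maria, Peter, Paris

// Without a boundary, the same pattern also matches the start of a longer word.
var partial = 'Montreal'.match(/[A-Z][a-z]{4}/)[0];
console.log(partial); // Montr
```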

In JavaScript, we can apply expressions to strings by using the String methods match, search, replace, and split as defined in Table 6-3. The RegExp object can also be used and has compile, exec, and test methods, all defined in Table 6-4.
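
A brief sketch of these methods in use follows. The sample strings are reused from earlier listings; compile is left out because it is not needed when a pattern is written as a literal.

```javascript
var text = 'A Ford car was seen at Super Clean car wash.';
var pattern = /[A-Z][a-z]+/;

// String methods
var firstIndex = text.search(pattern);        // position of the first match
var replaced = text.replace(/car/g, 'truck'); // replace every occurrence
var parts = '201 443-3221'.split(/[\s-]/);    // split on a space or a hyphen

// RegExp methods
var found = pattern.test(text);               // true or false
var result = pattern.exec(text);              // match details, or null

console.log(firstIndex);              // 2
console.log(replaced);                // A Ford truck was seen at Super Clean truck wash.
console.log(parts);                   // 201, 443, 3221
console.log(found);                   // true
console.log(result[0], result.index); // Ford 2
```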


In PHP, we can apply regular expressions to strings using the PCRE functions, which perform a variety of operations (Table 6-5).

As these examples in both PHP and JavaScript show, regular expressions can be very useful, and are a nice tool for both simple and complicated data manipulation. Expressions can also be used in MySQL to perform complex searches. For example, a query could be "select * from contacts where civics regexp '[0-9]{5}'", which would return all the records that contain a 5-digit ZIP code in the civics column (or any other run of 5 digits).
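
The caveat in parentheses comes from the pattern being unanchored. The effect is easy to reproduce in JavaScript; the sample values below are invented, and the anchored variant is shown only as one way to tighten the match. MySQL's regexp operator also understands ^ and $.

```javascript
var loose = /[0-9]{5}/;    // five digits anywhere in the value
var strict = /^[0-9]{5}$/; // the whole value must be exactly five digits

console.log(loose.test('90210'));          // true
console.log(loose.test('Unit 1234567'));   // true, matches 12345 inside a longer number
console.log(strict.test('90210'));         // true
console.log(strict.test('Unit 1234567'));  // false
```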

Multi-Language Integration

It is always a surprise when data that is expected to display in a certain way contains those weird-looking A's, question marks, or other strange characters nested inside what should be clean data. Most of the time, these are the result of some sort of encoding error. Whether it is from a form, a CSV file, or text extracted from a document, the script that entered the data wasn't expecting some character outside of a certain character set, probably either ISO-8859-1 or UTF-8.

Note: The ISO-8859-1 (Latin-1) character set is an 8-bit, single-byte, ASCII-based set of coded graphic characters, and includes both the standard and extended ASCII characters (0-255). UTF-8 is a multibyte character encoding for Unicode and covers the characters from ISO-8859-1, including those with diacritics.
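
The difference between the two encodings explains where the strange characters come from. In the sketch below, a single accented letter is encoded as UTF-8 and then deliberately misread one byte at a time, as a Latin-1 reader would. The helper name readAsLatin1 is ours, and the example assumes a runtime that provides TextEncoder.

```javascript
// "é" (U+00E9) is one byte in ISO-8859-1 but two bytes in UTF-8.
var utf8Bytes = new TextEncoder().encode('\u00e9');
console.log(Array.from(utf8Bytes)); // 195, 169

// Latin-1 maps each byte value 0-255 straight to the code point with the same number,
// so reading the bytes one at a time shows what a Latin-1 reader would display.
function readAsLatin1(bytes) {
    return Array.from(bytes, function (b) {
        return String.fromCharCode(b);
    }).join('');
}

console.log(readAsLatin1(utf8Bytes)); // Ã©
```

One character has become two, and the first of them is the accented A that so often appears in badly decoded text.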

When you come across encoding issues, there are a few things to observe and identify. The source of the data is important, and if it is from a web page, it is a good idea to confirm that the proper Content-Type is defined. This can be found in the HEAD section of the XHTML. If we were to have the content type set to UTF-8, it would appear as follows:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

The database can be another source of encoding problems. When creating a MySQL database, a collation is set. This is also the case for tables and columns. If a UTF-8 character is provided when the database isn't expecting it, the character may be misrepresented when stored. If we were to accept UTF-8 characters for MySQL, the collation might be utf8_general_ci (case insensitive) or perhaps utf8_unicode_ci. We can also request MySQL to assume UTF-8 character sets by executing the query 'set names utf8' after making the database connection.

Finally, the PHP code may not be properly identifying the correct encoding of a string. There are several functions and libraries that can help with encoding issues, and flags that can assume a certain character set. The function utf8_encode() will encode an ISO-8859-1 string to UTF-8. Another useful function for converting from one character encoding to another is mb_convert_encoding(), which accepts a string, the encoding to convert to, and the encoding to convert from. The 'mb_*' functions are from the Multibyte String Function library and contain many different multibyte character functions, such as mb_substr (get part of a string), mb_strlen (get length of string), or mb_eregi_replace (regex find replace). Certain PHP functions also accept charset flags. For example, when using htmlentities(), we can pass a flag to specify the UTF-8 character set: htmlentities($str,ENT_QUOTES,'UTF-8').

In addition to the MB functions, there is the iconv module, part of the human language and character encoding support provided by PHP. The iconv functions, and specifically the iconv() function, convert a string to a defined character encoding. By using the function iconv("UTF-8","ISO-8859-1//IGNORE//TRANSLIT",$str), we can convert a UTF-8 string to its ISO-8859-1 equivalent. The //IGNORE and //TRANSLIT flags specify how to handle non-convertible characters. The IGNORE flag silently discards characters that cannot be represented in the target character set, whereas TRANSLIT attempts to replace them with similar-looking characters.

Summary

When designing and coding web-based forms, there are a number of issues to keep in mind, such as data validation, maintaining data integrity, and manipulating files (including images). In this chapter, we showed you a range of techniques for addressing those concerns, supported by suitable code examples and reference information. You can validate input data both on the client, using JavaScript, and on the server, using PHP. These approaches are complementary, as client-side validation focuses on making sure the user enters acceptable data, while server-side validation looks more toward maintaining database integrity and consistency.

No serious developer can afford to dispense with regular expressions (regex), so we offer a brief introduction in the context of this book. Regex can be used in many situations to find, match, split, and replace strings or parts of them; we provide simple examples of this type in both JavaScript and PHP. Last but not least, the final brief section on Multi-Language Integration stresses the importance of making quite sure that data is transferred in expected and acceptable formats.

In the next chapter, we shall look into integrating PHP with databases other than classic relational databases; for example SQLite3, MongoDB, and CouchDB.
