Like the majority of open source software, Apache gives you two choices when it comes to installation: you can download the source and compile it yourself, or you can install a binary package. The binary packages are precompiled versions of Apache that come either from your software vendor (as is the case with most Linux distributions) or directly from the Apache Foundation itself (see http://www.apache.org/dist/httpd/binaries). Your first decision is therefore whether to choose a binary or a source distribution.
A source distribution has a number of advantages. First and foremost, it allows you to configure exactly what you want compiled into Apache and what you don't want. The configuration options before the actual compilation give you the ability to add or remove dozens of modules distributed with Apache. Another good reason for going with a source installation is that you may have third-party modules in source that must be compiled against the Apache source tree. If this is the case, you have to use the Apache source code in order to compile these modules.
The biggest disadvantage of a source installation (other than the potential complexity of the installation itself, if you are not familiar with such things) is package management. The easiest way to maintain a Unix-based system is to use the built-in package management tools to install, upgrade, and remove software for your operating system. If you install any package from source, you're bypassing the package management system. The onus then lies on you to ensure that the package you've installed is upgraded and documented correctly. Too often, a system is migrated and users then discover that a locally compiled package was left behind because the sysadmin never documented its existence.
So although you may be in specific situations in which you need to maintain a source install of Apache, in general it is recommended to go with your vendor's binary package, for maintainability's sake. Because binary installation is usually trivial—especially when covered by package management—this chapter documents only the source installation.
The procedure for installing Apache from source follows the standard ./configure, make, make install routine that should be familiar to most Linux users. The first decision you must make when compiling Apache is which version you will use. At the time of this writing, there are two major supported versions of Apache: Version 2.2.0 and Version 1.3.34. Version 1.3.34 is the latest version of the Apache source that started life as a fork of the original web server, NCSA's httpd. The 2.0 Apache source was a major rewrite of Apache from the ground up, sharing little code with the 1.3 branch, and 2.2 continues that line. The Apache Foundation recommends that all users use the latest version in the 2.x line, although the 1.3 branch is still maintained for legacy purposes. This section will focus on compiling Apache 2.2.0. The same configure, make, make install sequence applies to Apache 1.3 as well, although some of its configure option names differ.
The Apache source code can be downloaded from http://httpd.apache.org. Select the Download from a mirror link and you will see a link to the latest source code. The file will be either httpd-2.2.0.tar.gz or httpd-2.2.0.tar.bz2, representing the two popular compression methods used in the Unix world, gzip and bzip2. If you're unsure of which to get, download httpd-2.2.0.tar.gz, as gunzip has been around longer and is more likely to be installed. Once you have downloaded the file, go to a shell prompt as the root user and type:
# tar -xvzf httpd-2.2.0.tar.gz
This will extract the contents of the archive into the directory httpd-2.2.0. Change to this directory and run the configure --help command to view the compile-time options:
# cd httpd-2.2.0
# ./configure --help | more
'configure' configures this package to adapt to many kinds of systems.

Usage: ./configure [OPTION]... [VAR=VALUE]...

To assign environment variables (e.g., CC, CFLAGS...), specify them as
VAR=VALUE. See below for descriptions of some of the useful variables.

Defaults for the options are specified in brackets.

Configuration:
  -h, --help              display this help and exit
      --help=short        display options specific to this package
      --help=recursive    display the short help of all the included packages
  -V, --version           display version information and exit
...
Some influential environment variables:
  CC          C compiler command
  CFLAGS      C compiler flags
  LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a
              nonstandard directory <lib dir>
  CPPFLAGS    C/C++ preprocessor flags, e.g. -I<include dir> if you have
              headers in a nonstandard directory <include dir>
  CPP         C preprocessor

Use these variables to override the choices made by 'configure' or to help
it to find libraries and programs with nonstandard names/locations.
The output was drastically shortened here; there are many compile-time options for Apache. For complete documentation on all of the options, visit http://httpd.apache.org/docs/2.2/programs/configure.html. We will touch here on some of the more important options that you probably want to enable.
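Because the full help text runs to several screens, it can be easier to filter it than to page through it. The following is a minimal sketch, assuming a grep that supports the -o option (GNU and BSD grep both do). The function name list_enable_options is made up for this illustration and is not part of Apache. It reads help text on standard input and prints each distinct --enable-... option name once.

```shell
# Hypothetical helper (not shipped with Apache): print every distinct
# --enable-* option name found in the help text supplied on stdin.
# Assumes grep supports -o, which prints only the matching part of a line.
list_enable_options() {
    grep -o -e '--enable-[A-Za-z0-9_-]*' | sort -u
}
```

From the httpd-2.2.0 directory, it could be used as ./configure --help | list_enable_options | more.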
The majority of the Apache compile-time options deal with modules. Modules are parts of the Apache source code that can be either compiled directly into the httpd binary or compiled as separate shared object files. Apache refers to these as DSOs, or Dynamic Shared Objects. Compiling modules as DSOs has two advantages: they can be loaded into memory only when needed and left out when they are not, and third-party modules compiled as DSOs do not necessarily have to be rebuilt when Apache itself is upgraded. Modules can be enabled individually, or several can be enabled at once with a single configure option, as either built-ins or DSOs. Here are some common configure options:
--prefix
Specifies where you want Apache installed. Common options are /opt/apache and /var/opt/apache. If not specified, defaults to /usr/local/apache2.
--enable-so
Enables the loading of DSOs in Apache. Without this option enabled, all your modules will have to be compiled directly into the binary.
--enable-headers
Allows you to manipulate HTTP headers in your Apache configuration.
--enable-proxy
Turns Apache into a proxy server. It is not a full-featured proxy server and lacks the sophistication of a package like Squid (discussed later in this chapter), but it does work.
--enable-ssl
Allows Apache to handle connections encrypted with SSL, the Secure Sockets Layer. Recommended if you will serve web pages with sensitive information or if you require authentication (username/password).
--enable-http
Required if you want Apache to speak the HTTP protocol (as you probably do).
--enable-dav
Enables the WebDAV module, which allows you to use special client software to edit web pages directly through Apache. For example, with the WebDAV client built into Microsoft Windows, you can connect to a WebDAV-enabled Apache server and map a drive to the web server, editing HTML pages as if they were on a local drive.
--enable-cgi
Enables the execution of Common Gateway Interface scripts. These are often written in scripting languages such as Perl or a shell.
--enable-rewrite
Enables the Apache rewrite engine. This gives you the ability to apply regular-expression-based rules to redirect HTTP requests. This is a very powerful module and is quite useful.
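Most of the module options take the form --enable-NAME, where NAME is the module's short name (so, headers, ssl, and so on). As a small illustration of that pattern, the sketch below builds the flags from a list of names. The helper enable_flags is hypothetical and exists only for this example; it does not check that a given name is a real Apache module.

```shell
# Hypothetical helper: turn module short names into --enable-NAME flags,
# printed on one line and separated by single spaces.
# It does no validation; a misspelled name simply yields a bogus flag.
enable_flags() {
    flags=""
    for name in "$@"; do
        # Add a separating space only when flags is already non-empty.
        flags="${flags:+$flags }--enable-$name"
    done
    printf '%s\n' "$flags"
}
```

For instance, enable_flags so headers ssl prints the three flags --enable-so, --enable-headers, and --enable-ssl on one line, which could then be passed to ./configure through command substitution.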
Now that we've discussed some common options, we're ready to run the configure script:
# ./configure --prefix=/opt/apache --enable-so --enable-headers --enable-proxy --enable-ssl --enable-http --enable-dav --enable-cgi --enable-rewrite
This configure line will enable all of the modules previously listed and, when we run the make command, will cause all the modules to be compiled directly into the Apache binary. If we want to compile them all as DSOs instead, our configure line looks like this:
# ./configure --prefix=/opt/apache --enable-so --enable-http \
  --enable-mods-shared="headers proxy ssl dav cgi rewrite"
Note that so and http cannot be compiled as DSOs. Now that our configure command has completed without errors (we hope), we are ready to compile Apache. The configure command has created a Makefile in the current directory. This file contains all of the instructions that are necessary to build Apache from source. The make command reads this file and follows the instructions. Note that the make command is not shipped with Apache; it comes with your Linux distribution as part of the program development tools.
# make
Depending upon the speed of your system, this process can take some time. Be sure to pay attention to any warnings or errors that are output. If the make command generates a fatal error, it is probably because a module you enabled requires certain software that is not available on your system. If this is the case, either install that software and run make again, or rerun the configure command, removing the offending option.
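Build output scrolls past quickly, so it can help to save it to a file and look at it once the command has finished. The following is a rough sketch of one way to do that in a POSIX shell. The run_logged function, the LOG_DIR variable, and the LABEL.log naming are all assumptions of this example, not anything provided by Apache or make.

```shell
# Hypothetical helper: run a command, save its combined stdout and stderr
# to a log file, and report whether it succeeded.
# LOG_DIR is an assumption of this sketch; it defaults to the current directory.
run_logged() {
    label=$1
    shift
    log="${LOG_DIR:-.}/$label.log"
    if "$@" >"$log" 2>&1; then
        echo "$label: ok"
    else
        status=$?   # exit status of the command that just failed
        echo "$label: FAILED with exit status $status (see $log)"
        return "$status"
    fi
}
```

A build could then be run as run_logged make make, leaving the full output in make.log for inspection if the step fails.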
Assuming the make command completed with no errors, you are now ready to complete the final step and install Apache onto your system:
# make install
In our case, this will create the directory /opt/apache and copy everything necessary to it.
Once the make install command is complete, change to the /opt/apache/bin directory and verify that you have a binary file called httpd there:
# cd /opt/apache/bin
# file httpd
httpd: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped
This is the actual Apache binary. It has some command-line options that give you more information:
# ./httpd -h
Usage: ./httpd [-D name] [-d directory] [-f file]
[-C "directive"] [-c "directive"]
[-k start|restart|graceful|graceful-stop|stop]
[-v] [-V] [-h] [-l] [-L] [-t] [-S]
Options:
-D name : define a name for use in <IfDefine name> directives
-d directory : specify an alternate initial ServerRoot
-f file : specify an alternate ServerConfigFile
-C "directive" : process directive before reading config files
-c "directive" : process directive after reading config files
-e level : show startup errors of level (see LogLevel)
-E file : log startup errors to file
-v : show version number
-V : show compile settings
-h : list available command line options (this page)
-l : list compiled in modules
-L : list available configuration directives
-t -D DUMP_VHOSTS : show parsed settings (currently only vhost settings)
-S : a synonym for -t -D DUMP_VHOSTS
-t -D DUMP_MODULES : show all loaded modules
-M : a synonym for -t -D DUMP_MODULES
-t : run syntax check for config files
These options are particularly useful if you find yourself maintaining an Apache installation that you yourself did not compile.
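As an example of putting these options to work on an unfamiliar installation, the sketch below checks whether a particular module was compiled into the binary. It assumes that httpd -l prints a heading line followed by one indented source file name per line (core.c, mod_so.c, and so on). The function name has_builtin_module is hypothetical.

```shell
# Hypothetical helper: succeed if the module source file named in $1
# (for example mod_so.c) appears in "httpd -l" output read from stdin.
# Assumes one file name per line, possibly with surrounding whitespace.
has_builtin_module() {
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -q -x -F -e "$1"
}
```

It could be used as /opt/apache/bin/httpd -l | has_builtin_module mod_so.c && echo "DSO support is built in".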
Now change to the /opt/apache/modules directory to see the DSOs that we compiled:
# cd /opt/apache/modules
# ls -1
httpd.exp
mod_cgi.so
mod_dav.so
mod_headers.so
mod_proxy_ajp.so
mod_proxy_balancer.so
mod_proxy_connect.so
mod_proxy_ftp.so
mod_proxy_http.so
mod_proxy.so
mod_rewrite.so
mod_ssl.so
Here we see all of the options that we compiled as modules. We see from the file command that these are shared objects, ready to be loaded into memory:
# file mod_cgi.so
mod_cgi.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), not stripped
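A DSO is loaded only when the server configuration names it in a LoadModule directive. The sketch below shows what such a line looks like for a given file. It assumes the common naming pattern in which mod_NAME.so provides a module called NAME_module, and that the file lives in the modules directory under the server root. Third-party modules do not always follow this pattern, so treat the output as a starting point. The function loadmodule_line is hypothetical.

```shell
# Hypothetical helper: print a LoadModule line for a DSO file name.
# Assumes the naming pattern mod_NAME.so -> NAME_module, which holds for
# the modules built in this section but is not guaranteed for others.
loadmodule_line() {
    file=${1##*/}          # drop any leading directory
    name=${file#mod_}      # drop the mod_ prefix
    name=${name%.so}       # drop the .so suffix
    printf 'LoadModule %s_module modules/%s\n' "$name" "$file"
}
```

For mod_cgi.so this prints LoadModule cgi_module modules/mod_cgi.so.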
Now that we've verified that we've compiled Apache the way we want, it's time to do a test run. As noted before, the file /opt/apache/bin/httpd is the actual Apache binary, so you can start the web server simply by running this command:
# /opt/apache/bin/httpd
The process will fork to the background and return you to a shell prompt. However, this is not the recommended way to manage the Apache process. Most Linux systems will provide you with a standard set of shell scripts that you can use to start and stop processes on your system. These shell scripts are usually called from the various runlevels, which means they start processes when the system boots and stop them when it shuts down. The advantage to this approach is that although you may have software on your system from many different sources, written in many different ways, the controlling shell scripts allow you to manage each process as if it were written in the same format. The shell scripts all take the same arguments, hiding the various command-line intricacies of the individual processes. While these scripts are usually provided by your Linux distribution, a very similar script ships with the Apache source code. It is installed as /opt/apache/bin/apachectl, and its most commonly used arguments are start, stop, and restart. This is the preferred way to manage your Apache process.
# /opt/apache/bin/apachectl start
If all goes well, you will be immediately returned to a prompt. How do you know whether the web server is running? There are three easy ways to figure that out: verify the process exists, check the log file, and look for open network ports.
To verify the Apache process exists, we use the ps command. The following set of options is valid on most Linux systems:
# ps aux | grep http
root 21640 0.3 0.6 7756 3232 ? Ss 21:58 0:00 /opt/apache/bin/httpd -k start
daemon 21641 0.0 0.6 7756 3264 ? S 21:58 0:00 /opt/apache/bin/httpd -k start
daemon 21642 0.0 0.6 7756 3260 ? S 21:58 0:00 /opt/apache/bin/httpd -k start
daemon 21643 0.0 0.6 7756 3260 ? S 21:58 0:00 /opt/apache/bin/httpd -k start
daemon 21644 0.0 0.6 7756 3260 ? S 21:58 0:00 /opt/apache/bin/httpd -k start
daemon 21645 0.0 0.6 7756 3260 ? S 21:58 0:00 /opt/apache/bin/httpd -k start
We see from this output that there are six copies of the httpd binary currently running.
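The listing also shows that one copy runs as root while the others run as the daemon user. A quick way to summarize that is sketched below. It assumes the column layout shown above, in which the user is the first field and the command path is the eleventh. The function name count_httpd_by_user is made up for this example.

```shell
# Hypothetical helper: read "ps aux" style lines on stdin and print
# "USER COUNT" for every user that owns an httpd process.
# Assumes field 1 is the user and field 11 is the command path.
count_httpd_by_user() {
    awk '$11 ~ /(^|\/)httpd$/ { n[$1]++ }
         END { for (u in n) print u, n[u] }' | sort
}
```

Run as ps aux | count_httpd_by_user, it prints one line per user and ignores the grep process itself, because only commands named httpd are counted.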
To check the log files, change to the directory /opt/apache/logs and run the tail command on error_log, which is where Apache not only logs its errors, but also records every time it is stopped, started, or restarted.
# cd /opt/apache/logs
# tail error_log
[Tue Jan 03 21:58:18 2006] [notice] Apache/2.2.0 (Unix) mod_ssl/2.2.0 OpenSSL/0.9.7f DAV/2 configured -- resuming normal operations
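On a busy server the startup notice may have scrolled out of the last few lines of the log. The sketch below searches the whole file for the most recent one. It assumes the notice contains the phrase "resuming normal operations", as in the line above. The function name last_startup_notice is hypothetical.

```shell
# Hypothetical helper: print the most recent startup notice found in the
# error log named by $1, and fail (non-zero status) if there is none.
# Assumes startup notices contain the text "resuming normal operations".
last_startup_notice() {
    # The final "grep ." passes the line through and fails on empty input.
    grep -e 'resuming normal operations' "$1" | tail -n 1 | grep .
}
```

For example: last_startup_notice /opt/apache/logs/error_log.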
Finally, we can verify that Apache is listening on a network port. By default, Apache listens on TCP port 80, which is the default port used for HTTP communication. To verify this, we use the netstat command:
# netstat -anp | grep -i httpd
tcp 0 0 :::80 :::* LISTEN 21640/httpd
This tells us that the httpd binary (process ID # 21640) is listening on TCP port 80.
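The same check can be scripted so that it gives a yes-or-no answer. The sketch below assumes netstat output in the layout shown above, with the local address in the fourth field and the state in the sixth. The function name port_is_listening is made up for this example.

```shell
# Hypothetical helper: succeed if netstat-style lines on stdin include a
# socket in the LISTEN state whose local address ends in ":PORT".
# Assumes field 4 is the local address and field 6 is the state.
port_is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

It could be used as netstat -an | port_is_listening 80 && echo "something is listening on port 80". Note that this version only confirms that the port is open; it does not check which program owns it.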
Success! We have successfully installed and started the Apache web server! Now, how do we configure it to do our bidding?