Credit: Jürgen Hermann
To build a URL within a script, you need information such as the
hostname and script name. According to the CGI standard, the web
server sets up a lot of useful information in the process environment
of a script before it runs the script itself. In a Python script, we
can access the process environment as os.environ, an attribute of the os module:
```python
import os

def isSSL():
    """ Return true if we are on an SSL (https) connection. """
    return os.environ.get('SSL_PROTOCOL', '') != ''

def getScriptname():
    """ Return the scriptname part of the URL ("/path/to/my.cgi"). """
    return os.environ.get('SCRIPT_NAME', '')

def getPathinfo():
    """ Return the remaining part of the URL. """
    pathinfo = os.environ.get('PATH_INFO', '')

    # Fix for a well-known bug in IIS/4.0, which prefixes PATH_INFO
    # with the script name: strip that prefix if it is present
    if os.name == 'nt':
        scriptname = getScriptname()
        if pathinfo.startswith(scriptname):
            pathinfo = pathinfo[len(scriptname):]

    return pathinfo

def getQualifiedURL(uri=None):
    """ Return a full URL starting with schema, servername, and port.
        Specifying uri causes it to be appended to the server root URL
        (uri must start with a slash).
    """
    # Indexing the tuple with a boolean picks http (False) or https (True)
    schema, stdport = (('http', '80'), ('https', '443'))[isSSL()]
    host = os.environ.get('HTTP_HOST', '')
    if not host:
        host = os.environ.get('SERVER_NAME', 'localhost')
        port = os.environ.get('SERVER_PORT', '80')
        # Spell out the port only when it is not the schema's standard one
        if port != stdport:
            host = host + ":" + port

    result = "%s://%s" % (schema, host)
    if uri:
        result = result + uri

    return result

def getBaseURL():
    """ Return a fully qualified URL to this script. """
    return getQualifiedURL(getScriptname())
```
There are many ways to manipulate URLs, but many CGI scripts have common needs. This recipe collects a few high-level functions that cover typical needs for URL synthesis from within CGI scripts. You should never hardcode hostnames or absolute paths in your scripts, because that would make it difficult to port the scripts elsewhere or rename a virtual host. The CGI environment has sufficient information available to avoid such hardcoding, and, by importing this recipe’s code as a module, you can avoid duplicating code in your scripts to collect and use that information in typical ways.
The recipe works by accessing information in os.environ, the attribute of Python’s standard os module that collects the process environment of the current process and lets your script access it as if it were a normal Python dictionary. In particular, os.environ has a get method, just like a normal dictionary does, that returns either the value mapped to a given key or, if that key is missing, a default value that you supply in the call to get. This recipe performs all accesses through os.environ.get, thus ensuring sensible behavior even if the relevant environment variables have been left undefined by your web server (this should never happen, but, of course, not all web servers are bug-free).
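As a minimal sketch of that lookup behavior, the snippet below uses a plain dictionary as a stand-in for os.environ, so that the result does not depend on any particular web server. The name fake_environ and the values in it are made up for illustration and are not part of the recipe:

```python
import os

# Hypothetical stand-in for the CGI environment; a real script would
# read os.environ, which offers the same get method.
fake_environ = {'SERVER_NAME': 'www.example.com'}

# Key present: get returns the stored value.
server_name = fake_environ.get('SERVER_NAME', 'localhost')

# Key missing: get returns the supplied default instead of raising KeyError.
script_name = fake_environ.get('SCRIPT_NAME', '')

# The same call works against the real process environment.
real_script_name = os.environ.get('SCRIPT_NAME', '')
```

Outside a web server, SCRIPT_NAME is normally unset, so the last call simply yields the empty-string default rather than failing.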
Among the functions presented in this recipe, getQualifiedURL is the one you’ll use most often. It transforms a URI into a URL on the same host (and with the same schema) used by the CGI script that calls it. It gets the information from the environment variables HTTP_HOST, SERVER_NAME, and SERVER_PORT. Furthermore, it can handle secure (https) as well as normal (http) connections, and it selects between the two by using the isSSL function, which is also part of this recipe.
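To see how the schema, host, and port combine, here is a rough sketch that follows the same logic as getQualifiedURL but reads a mapping passed in by the caller, which makes it easy to try outside a web server. The function name qualified_url, its environ parameter, and the sample values are assumptions made for this illustration only:

```python
def qualified_url(environ, uri=None):
    """Hypothetical variant of getQualifiedURL that reads a mapping
    supplied by the caller rather than os.environ."""
    is_ssl = environ.get('SSL_PROTOCOL', '') != ''
    # Indexing with a boolean selects http (False) or https (True).
    schema, stdport = (('http', '80'), ('https', '443'))[is_ssl]
    host = environ.get('HTTP_HOST', '')
    if not host:
        host = environ.get('SERVER_NAME', 'localhost')
    port = environ.get('SERVER_PORT', '80')
    # The port is spelled out only when it differs from the standard one.
    if port != stdport:
        host = host + ':' + port
    result = '%s://%s' % (schema, host)
    if uri:
        result = result + uri
    return result

# Made-up environments standing in for what a web server would set up.
plain = qualified_url(
    {'SERVER_NAME': 'www.example.com', 'SERVER_PORT': '80'}, '/go/here')
secure = qualified_url(
    {'SSL_PROTOCOL': 'TLSv1.2', 'SERVER_NAME': 'www.example.com',
     'SERVER_PORT': '443'}, '/go/here')
odd_port = qualified_url(
    {'SERVER_NAME': 'www.example.com', 'SERVER_PORT': '8080'})
```

With these inputs, plain is http://www.example.com/go/here, secure is https://www.example.com/go/here, and odd_port is http://www.example.com:8080.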
Suppose you need to redirect a visiting browser to another location on this same host. Here’s how you can use the functions in this recipe, hardcoding only the redirect location on the host itself, but not the hostname, port, and normal or secure schema:
```python
# an example redirect header:
print("Location: " + getQualifiedURL("/go/here"))
```
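A Location header on its own is not yet a complete CGI response, because the header section must end with a blank line. The sketch below shows one way to emit a minimal redirect. The helper name redirect_response is hypothetical, and the literal URL merely stands in for whatever getQualifiedURL("/go/here") would return on your server:

```python
import sys

def redirect_response(location):
    """Hypothetical helper: build a minimal CGI redirect response."""
    # Header line, then the blank line that terminates the CGI headers.
    return "Location: %s\n\n" % location

# Placeholder URL; a real script would pass getQualifiedURL("/go/here").
sys.stdout.write(redirect_response("http://www.example.com/go/here"))
```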
Documentation of the standard library module os in the Library Reference; a basic introduction to the CGI protocol is available at http://hoohoo.ncsa.uiuc.edu/cgi/overview.html.