Credit: Andy McKay
Cookies that your browser has downloaded contain potentially useful information, so it’s important to know how to get at them. With IE, you need to access the registry to find where the cookies are, then read them as files:
from string import lower, find
import re, os, glob
import win32api, win32con

def _getLocation():
    """ Examines the registry to find the cookie folder IE uses """
    key = r'Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders'
    regkey = win32api.RegOpenKey(win32con.HKEY_CURRENT_USER, key, 0,
        win32con.KEY_ALL_ACCESS)
    num = win32api.RegQueryInfoKey(regkey)[1]
    for x in range(num):
        k = win32api.RegEnumValue(regkey, x)
        if k[0] == 'Cookies':
            return k[1]

def _getCookieFiles(location, name):
    """ Rummages through all the filenames in the cookie folder and returns
    only the filenames that include the substring 'name'. name can be the
    domain; for example 'activestate' will return all cookies for activestate.
    Unfortunately, it will also return cookies for domains such as
    activestate.foo.com, but that's unlikely to happen, and you can
    double-check later to see if it has occurred. """
    filemask = os.path.join(location, '*%s*' % name)
    filenames = glob.glob(filemask)
    return filenames

def _findCookie(files, cookie_re):
    """ Look through a group of files for a cookie that satisfies a
    given compiled RE, returning the first such cookie found. """
    for file in files:
        data = open(file, 'r').read()
        m = cookie_re.search(data)
        if m: return m.group(1)

def findIECookie(domain, cookie):
    """ Finds the cookie for a given domain from IE cookie files """
    try:
        l = _getLocation()
    except:
        # Print a debug message
        print "Error pulling registry key"
        return None

    # Found the key; now find the files and look through them
    f = _getCookieFiles(l, domain)
    if f:
        # In an IE cookie file, the value is on the line after the name
        cookie_re = re.compile('%s\n(.*?)\n' % cookie)
        return _findCookie(f, cookie_re)
    else:
        print "No cookies for domain (%s) found" % domain
        return None

if __name__ == '__main__':
    print findIECookie(domain='kuro5hin', cookie='k5-new_session')
While Netscape cookies are in a text file, which you can access as shown in Recipe 11.9, IE keeps cookies as files in a directory, and you need to access the registry to find which directory that is. This recipe uses the win32all Windows-specific extensions to Python for registry access; as an alternative, the _winreg module that is part of Python’s standard distribution for Windows can be used. The code has been tested and works on IE 5 and 6.
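The standard-library alternative mentioned above can be sketched as follows. This is a minimal sketch, not the recipe's own code: it is written for Python 3, where _winreg is named winreg, and it reads the Cookies value directly instead of enumerating every value under the key. The function name get_cookie_folder and its registry parameter are hypothetical, introduced only so that the lookup can be exercised with a stand-in object on a machine without Windows.

```python
SHELL_FOLDERS = r'Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders'


def get_cookie_folder(registry=None):
    """Return the path stored in the 'Cookies' value, or None if it is absent.

    `registry` is a hypothetical injection point (not part of the recipe);
    when omitted, the standard winreg module is used, which exists only on
    Windows.
    """
    if registry is None:
        import winreg as registry  # named _winreg in Python 2
    # OpenKey defaults to read-only access, which is all a lookup needs.
    key = registry.OpenKey(registry.HKEY_CURRENT_USER, SHELL_FOLDERS)
    try:
        # QueryValueEx returns a (value, type) pair for the named value.
        value, _value_type = registry.QueryValueEx(key, 'Cookies')
    except OSError:
        # Raised when the key holds no value called 'Cookies'.
        return None
    finally:
        registry.CloseKey(key)
    return value
```

On Windows, calling get_cookie_folder() with no argument should return the same directory that the recipe's _getLocation finds, assuming the Cookies value is present for the current user.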
In the recipe, the _getLocation function accesses the registry and finds and returns the directory IE is using for cookie files. The _getCookieFiles function receives the directory as an argument and uses the standard module glob to return all filenames in the directory whose names include a particular requested domain name. The _findCookie function opens and reads each such file in turn until it finds one that matches a compiled regular expression, which the function receives as an argument. It then returns the substring of the file’s contents corresponding to the first parenthesized group in the RE, or None if no file matches. As the leading underscore in each of these functions’ names indicates, they are internal functions, meant only as implementation details of the one function this module is meant to expose, findIECookie, which uses the others to locate and return a specific cookie’s value for a given domain.
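The file-matching and regular-expression steps described above can be tried without IE or the registry. The following is a rough Python 3 sketch under stated assumptions: the names find_cookie_value and demo are hypothetical, the sample filename and cookie record are invented for illustration, and the record layout (name on one line, value on the next) is assumed to be the one the recipe's regular expression relies on. Unlike the recipe, the sketch passes the cookie name through re.escape so that any special characters in it are matched literally.

```python
import glob
import os
import re
import tempfile


def find_cookie_value(folder, domain, cookie_name):
    """Return the value of cookie_name from the first matching file, else None."""
    # The value is assumed to sit on the line right after the cookie's name.
    cookie_re = re.compile(r'%s\n(.*?)\n' % re.escape(cookie_name))
    # Select only the files whose names contain the requested domain.
    pattern = os.path.join(folder, '*%s*' % domain)
    for filename in sorted(glob.glob(pattern)):
        with open(filename, 'r') as handle:
            match = cookie_re.search(handle.read())
        if match:
            return match.group(1)
    return None


def demo():
    """Build a throwaway cookie folder and look up one cookie in it."""
    with tempfile.TemporaryDirectory() as folder:
        # Invented sample record: name, value, domain/path, flags, terminator.
        sample = 'k5-new_session\nabc123\nkuro5hin.org/\n1536\n*\n'
        with open(os.path.join(folder, 'user@kuro5hin[1].txt'), 'w') as handle:
            handle.write(sample)
        return find_cookie_value(folder, 'kuro5hin', 'k5-new_session')
```

Calling demo() exercises the same path as findIECookie, minus the registry lookup: the glob pattern narrows the candidates by filename, and the first parenthesized group of the regular expression yields the cookie's value.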
An alternative to this recipe could be to write a Python extension, or use calldll, to access the InternetGetCookie API function in Wininet.DLL, as documented on MSDN. However, the added value of the alternative does not seem worth the effort of dropping down from a pure Python module to a C-coded extension.
See also: Recipe 11.9; the Unofficial Cookie FAQ (http://www.cookiecentral.com/faq/), which is chock-full of information on cookies; documentation for win32api and win32con in win32all (http://starship.python.net/crew/mhammond/win32/Downloads.html) or ActivePython (http://www.activestate.com/ActivePython/); Windows API documentation available from Microsoft (http://msdn.microsoft.com); Python Programming on Win32, by Mark Hammond and Andy Robinson (O’Reilly); calldll, available at Sam Rushing’s page (http://www.nightmare.com/~rushing/dynwin/).