Internet Artifacts
Information in this Chapter
It can be argued that nothing demonstrates the concept of evidence dynamics better than Internet artifacts. On a modern end-user computer system, the bulk of the user’s interaction with the system will likely be related to Internet communication of some sort. Every click of a link, every bookmark, and every search query can leave telltale traces on the user’s system. This chapter examines the application-specific artifacts created by Web browsers and then moves on to delve into analysis of the contents of local mailbox formats.
If the bulk of a computer user’s time is spent on the Internet, then it’s like that nearly all (or at least a great deal) of that time is spent interacting with a Web browser. The modern Web browsing experience is much richer than the creators of the World Wide Web had envisioned. As an example, at the time of this writing Google has begun distributing netbook computer systems, which are nothing more than minimal laptop systems with a Linux kernel running the Chrome Browser. This device’s utility is based on the assumption that everything the user does—create and edit documents, etc.—will all occur via the Web. Even in seemingly unexpected cases, the analysis of Web browser artifacts can be a key factor of digital forensic analysis. For example, the authors have examined various compromised servers where the built-in Web browser was used to load additional tools onto the compromised server or to submit stolen data to a file sharing site. Going forward, knowledge of the forensic analysis of Web browsers will be crucial.
Microsoft ships its operating systems with the Internet Explorer (IE) Web browser as part of the base installation. IE has two primary areas where data of primary interest to forensic analysts are stored: in the index.dat “database” used by the Web browser and in the browser cache. These index.dat files are structured in a manner that has become known as the “MS IE Cache File” (MSIECF) format. The index.dat file contains a record of accessed URLs, including search queries, Web mail accesses, and so on, and is often considered the primary source of forensic information when it comes to IE Web browser analysis.
Various open source tools can be used to access and parse the contents of the index.dat file into a readable format. Perhaps one of the most well-known open source tools for parsing index.dat files is pasco from FoundStone (pasco can be downloaded from http://sourceforge.net/projects/fast/files/Pasco/). Note that Pasco has not been updated since 2004, but it is still widely used in many forensic live CD distributions. Joachim Metz has developed an updated library based on further reverse engineering of the MSIECF format, which is available at http://sourceforge.net/projects/libmsiecf/. The libmsiecf library contains two programs. Msiecfinfo displays basic information about parsed MSIECF files, and msiecfexport, extracts the entries contained within the MSIECF files. This software is currently in an alpha state and is only available for Unix-like systems.
In addition, the Win32::URLCache, written by Kenichi Ishigaki, can also be used to parse index.dat files. If you’re using ActiveState’s ActivePerl, the Perl Package Manager (ppm) command to install the module on an Internet-connected system is
C:perl>ppm install win32-urlcache
This ppm command will install the module, as well as all dependencies. This same module can be installed on other platforms using Perl’s CPAN functionality:
perl -MCPAN -e "install Win32::UrlCache"
Based on documentation provided along with the module, code used to parse an index.dat file might look like the following:
my $index = Win32::UrlCache->new($config{file});
foreach my $url ($index->urls) {
my $epoch = getEpoch($url->last_accessed);
my $hdr = $url->headers;
$hdr =~ s/ / /g;
my $descr = $url->url.":".$url->filename.":".$hdr;
print $epoch."|URL|".$config{system}."|".$config{user}."|".$descr." ";
}
Note that the getEpoch() function mentioned in the aforementioned code is a user-defined function that converts the time value from the index.dat file into a 32-bit Unix-formatted time value so that it can be included in a timeline.
The module is also capable of parsing out LEAK records, which are created when a history item is marked for deletion but the actual cached file is locked [1].
IE Favorites can also contain information that may be interesting or essential to a forensic analyst. “Favorites” are the IE version of bookmarks, providing an indication of a user’s movements across the Internet. A user’s favorites can be found (on Windows XP) in the “C:Documents and SettingsuserFavorites” directory. The user’s Favorites appear in the Internet Explorer version 8 browser as illustrated in Figure 7.1.
Figure 7.1 User’s IE 8 Favorites.
When a user profile is created (i.e., the account is created and the user logs in for the first time), the profiles Favorite’s folder is populated with certain defaults. As seen in Figure 7.1, the user has chosen to add the Google.com Web site as a Favorite site. These Favorites appear as URL shortcut files (filename.url); the Google URL shortcut contains the following text. Users can create folders in order to organize their Favorites into common groups or simply add Favorites to the default folder.
Contents of the Google.url Favorite appear as follow:
[DEFAULT]
BASEURL=http://www.google.com/
[InternetShortcut]
IDList=
IconFile=http://www.google.com/favicon.ico
IconIndex=1
[{000214A0-0000-0000-C000-000000000046}]
Prop3=19,2
Contents of the URL shortcut can be viewed easily in a text editor or output to the console using the “type” (Windows) or “cat” (Linux) commands.
In addition to the content of the Favorites file, an analyst may find value in the file MAC times, which will illustrate when the file was created and when the file was last accessed or modified. Depending on the type of examination being performed, this information may prove to be valuable.
Internet Explorer cookies can be found in Documents and Settings\%username%Cookies on Windows XP systems and in Users\%username%AppDataRoamingMicrosoftWindowsCookies on Vista and Windows 7 systems. Because Internet Explorer stores user cookies as discrete, plain text files per issuing host, these can be inspected directly. See the following for an example:
SaneID
3A345581BB019948
1536
3378255872
30795568
4048194256
30118489
*
While the content is plain text, some of the fields need to be deciphered to be of value, in particular lines 5 and 6 (the cookie’s expiration time) and 7 and 8 (the creation time). The open source tool galleta was developed for this task. Here is the same cookie, processed with galleta:
SITE VARIABLE VALUE CREATION TIME EXPIRE TIME FLAGS
geico.com/ SaneID 3A345581BB019948 12/02/2010 11:48:50 02/19/2020 06:28:00 1536
The browser’s cache contains files that are cached locally on the system as a result of a user’s Web browsing activity. On XP systems, these files are located in Documents and Settings\%username%Local SettingsTemporary Internet FilesContent.IE5. On Vista and Windows 7 systems they can be found in Users\%username%AppDataLocalMicrosoftWindowsTemporary Internet FilesContent.IE5.
Files cached locally are stored in one of four randomly named subdirectories. The MSIE Cache File located in this directory has all the information needed to map any files of interest located in the cache subdirectories with the URL the file was retrieved from. For example, in the following output from msiecfexport, we can see that the file “favicon[1].ico” in the O2XMPJ7 directory was retrieved from “login.live.com.”
Record type : URL
Offset range : 80000 - 80384 (384)
Location : https://login.live.com/favicon.ico
Primary filetime : Dec 04, 2010 04:12:53
Secondary filetime : Jun 15, 2010 22:12:26
Filename : favicon[1].ico
Cache directory index : 0 (0x00) (O2XM9PJ7)
For comparison’s sake, here is the same item as viewed by pasco, which produces tabbed-separated output.
URL https://login.live.com/favicon.ico 06/15/2010 15:12:26 12/03/2010 20:12:53 favicon[1].ico O2XM9PJ7 HTTP/1.1 200 OK Content-Length: 1150 Content-Type: image/x-icon ETag: "0411ed6d7ccb1:46b" PPServer: PPV: 30 H: BAYIDSLGN1J28 V: 0 ~U:user
Mozilla’s Firefox browser is the second most widely used browser in the world, after Internet Explorer. Like the tools discussed in this book, it is open source software, so it is used commonly on Linux desktops, but is used on OS X and Windows as well.
Firefox 3 stores history data in SQLite 3 database files, which are quite easy to process using open source tools. Firefox stores these along with a few other items of interest in a user-specific profile directory. Please reference Table 7.1 for a listing of the default location of the profile directory on different operating systems.
Table 7.1
Firefox Profile Locations
Operating System | Location |
Windows XP | C:Documents and Settings\%usernameLocal SettingsApplication DataMozillaFirefoxProfiles |
Windows Vista/7 | C:Users\%username%AppDataRoamingMozillaFirefoxProfiles |
Linux | /home/$username/.mozilla/firefox/Profiles |
OS X | /Users/$username/Library/Application Support/Firefox/Profiles/ |
In this directory you will find one or more folders and a file named profiles.ini.
The content of this file will be similar to the following:
[General]
StartWithLastProfile=1
[Profile0]
Name=default
IsRelative=1
Path=Profiles/fenkfs6z.default
When Firefox is started it will use this file to determine which profile directory to read from. In a multiple profile environment, StartWithLastProfile=1 directs Firefox to skip asking the user which profile to select and use the last-used profile by default. The next section describes the first Firefox profile, which on this system is also the only profile. In most cases a single profile named «default» will be the only profile present, as shown earlier. In a multiple-profile Firefox environment, additional named profiles will be present, and the last-used profile will be indicated by a «Default=1» variable. The Path variable points to the directory where this profile’s data are stored.
Inside of each profile directory you will find numerous files and subdirectories.
The most important files here will be.sqlite files, which are the previously mentioned SQLite 3 databases. We will be examining four of these databases in detail.
• Formhistory.sqlite: stores data about form submission inputs—search boxes, usernames, etc.
• Downloads.sqlite: stores data about downloaded files
There are numerous open source utilities for interacting with SQLite databases. We will use two in this chapter: the command line sqlite3 tool and the graphical sqliteman program. The sqliteman program can be installed on Ubuntu using the following command:
sudo apt-get install sqliteman
Additional prebuilt packages are available for OS X and Windows.
After backing up the user’s Firefox profile, choose Open from the File menu and browse to the copy of the profile directory. Next, select the database you would like to examine and click Open. From here, you can browse the database structure, execute SQL queries, and export findings. Most of these databases have simple schemas with one table of interest. For example, to view data held in the formhistory.sqlite database, you would execute the following command:
SELECT * FROM moz_formhistory;
An example of results from this query is shown in Figure 7.2.
Figure 7.2 SQLite query for moz_formhistory.
The formhistory.sqlite database contains data that the user entered into forum submission fields. This includes items such as names, addresses, email addresses, phone numbers, Web mail subject lines, search queries, and usernames or “handlers” entered into forums.
The downloads.sqlite database contains records of the files downloaded by the user. Be aware that the files that show up in this database are those that are handled by the Firefox Download Manager. Multimedia files handled by browser plug-ins and other items that end up in the browser cache will not show up in this database. An important aspect of this particular database is that it allows the investigator to correlate items found on the file system to the URLs where they originated.
The cookies.sqlite database can produce information such as the last time the user visited a site that set or requested a specific cookie, whether or not the user was registered or logged in at a particular site, and other browser state information.
The places.sqlite database contains the most data related to user activity in the browser. In contrast to each of the previous databases examined, places.sqlite has a more complex multitable schema, which has been mapped in detail by Chris Cohen [2].
The two items of primary interest in most Web history examinations are the URL visited and the time of that visit. These two items are found in the url field in the moz_places table and in the visit_date in the moz_historyvisits table, respectively. The id field in the moz_places table corresponds to the places_id in the moz_historyvisits table. The visit_date is stored in «PRTime», which is a 64-bit integer counting the number of microseconds since the UNIX Epoch (January 1st, 1970 UTC). The following sqlite statement will retrieve these two values from their respective tables and convert the visit_date to a human-consumable format:
SELECT
datetime(moz_historyvisits.visit_date/1000000,’unixepoch’),
moz_places.url
FROM moz_places, moz_historyvisits
WHERE moz_places.id = moz_historyvisits.place_id
We will use the console sqlite3 client to perform this query.
forensics:~ forensics$ sqlite3 ~/Library/Application Support/Firefox/Profiles/fffffs6z.default/places.sqlite
sqlite> SELECT datetime(moz_historyvisits.visit_date/1000000,’unixepoch’), moz_places.url FROM moz_places, moz_historyvisits WHERE moz_places.id = moz_historyvisits.place_id;
...
2010-06-08 05:35:34|http://code.google.com/p/revealertoolkit/
2010-06-08 05:35:54|http://code.google.com/p/revealertoolkit/downloads/list
2010-06-08 05:35:58|http://code.google.com/p/revealertoolkit/downloads/detail?name=RVT_v0.2.1.zip&can=2&q=
2010-06-08 05:36:42|http://code.google.com/p/poorcase/
2010-06-08 05:36:46|http://code.google.com/p/poorcase/downloads/list
2010-06-08 05:36:46|http://code.google.com/p/poorcase/downloads/detail?name=poorcase.odp&can=2&q=
2010-06-08 05:36:50|http://code.google.com/p/poorcase/downloads/detail?name=poorcase_1.1.pl&can=2&q=
2010-06-08 05:37:12|http://liveview.sourceforge.net/
2010-06-08 05:37:12|http://sourceforge.net/project/showfiles.php?group_id=175252
2010-06-08 05:37:11|http://sourceforge.net/projects/liveview/files/
2010-06-08 05:37:35|http://www.google.com/search?q=system+combo+timeline&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
2010-06-08 05:37:35|http://www.cutawaysecurity.com/blog/system-combo-timeline
2010-06-08 05:39:52|http://log2timeline.net/INSTALL.txt
2010-06-08 05:39:59|http://cdnetworks-us-2.dl.sourceforge.net/project/revit/revit07-alpha/revit07-alpha-20070804/revit07-alpha-20070804.tar.gz
...
In addition to browser history files, a user’s browser cache may be of investigative importance. Table 7.2 contains the location of the Firefox cache on different operating systems. Examining this directory directly for viewing will usually yield a stream of numbered unidentifiable files along with one cache map file «_CACHE_MAP_» and three cache block files (_CACHE_001_ through _CACHE_003_). These are binary files that contain information regarding the URLs and filenames associated with cached data, as well as time stamp data.
Table 7.2
Firefox Cache Locations
Operating System | Location |
Windows XP | C:Documents and Settings\%usernameLocal SettingsApplication DataMozillaFirefoxProfiles |
Windows Vista/7 | C:Users\%username%AppDataRoamingMozillaFirefoxProfiles |
Linux | /home/$username/.mozilla/firefox/Profiles |
OS X | /Users/$username/Library/Caches/Firefox/Profiles |
Although there are free forensic applications for parsing these data, none of these tools are open source. These free tools are discussed in the Appendix.
If a Firefox session is not terminated properly, a file named sessionstore.js will be present in the user’s profile directory. This file is used by Firefox to recover the browser session in case of a crash or other unexpected shutdown. If this file is present upon start up, Firefox will use the contents to restore the browser windows and tabs the user last had open. The content is stored as a series of JavaScript Object Notation (JSON) objects and can be viewed using any text editor. JSON is structured data, however, and can be parsed and displayed in a more logical format using a JSON viewer or “pretty printer.” One such open source application is the Adobe AIR-based JSON Viewer available at http://code.google.com/p/jsonviewer/. Figure 7.3 is a screenshot of an example sessionstore.js with structure viewed in JSON Viewer.
Figure 7.3 JSON Viewer.
Items of note in a sessionstore.js file include closed tabs and windows, saved form data, and temporary cookies.
Firefox stores Bookmarks in the places.sqlite database via combination of data in “moz_bookmarks,” “moz_places,” and “moz_items_annos” tables. Extraction of data from these tables has been documented thoroughly by Kristinn Gudjonsson [3]. Briefly, the SQL query Kristinn wrote to generate a simple list of stored bookmarks and associated dates is
SELECT moz_bookmarks.type,moz_bookmarks.title,moz_bookmarks.dateAdded,
moz_bookmarks.lastModified,moz_places.url,moz_places.title,
moz_places.rev_host,moz_places.visit_count
FROM moz_places, moz_bookmarks
WHERE
moz_bookmarks.fk = moz_places.id
AND moz_bookmarks.type <> 3
The moz_items_annos table may contain additional information relating to annotations the user has made to bookmarks, including the time the annotation was created and modified. In addition to direct SQLite queries, this time information can also be extracted using Kristinn’s log2timeline tool, which is discussed in Chapter 9.
Firefox Bookmark Backups are found in the user’s profile under the “bookmarkbackups” directory. These are stored as a series of JSON objects and can be parsed using any number of JSON viewers, such as the previously mentioned JSON Viewer. Recorded artifacts include the date the bookmark was added, the title, and the URL of the bookmarked site.
Firefox supports the installation of extensions, which can enhance or modify the behavior of the browser. A manifest of installed extensions can be found in the user’s profile directory in the “extensions.rdf” file. This XML document describes the extensions installed for this user. Simply grepping for the strings “NS1:name” can provide a list of installed extensions:
NS1:name="Evernote Web Clipper"
NS1:name="XSS Me"
NS1:name="Google Feedback"
NS1:name="SQL Inject Me"
NS1:name="Redirect Remover"
NS1:name="1Password"
NS1:name="Tamper Data"
NS1:name="Access Me"
NS1:name="Google Toolbar for Firefox"
NS1:name="Default"
The code and supporting files that make up the extensions can be found in subdirectories of the extensions directory under the user’s profile directory.
Chrome is the open source Web browser developed by Google. In the two short years since its release Chrome has become the third most popular browser in the world and is the centerpiece of Chrome OS. Chrome is available for Windows, OS X, and Linux.
Like Firefox, Chrome utilizes a variety of SQLite databases to store user data. We can access these data using any SQLite client, but will use the base command line sqlite3 program for most cases. Please reference Table 7.3 for a list of the storage locations for Chrome history on different operating systems.
Table 7.3
Chrome History Locations
Operating System | Location |
Windows XP | C:Documents and Settings\%usernameApplication DataGoogleChromedefault |
Windows Vista/7 | C:Users\%username%AppDataLocalGoogleChromedefault |
Linux | /home/$username/.config/google-chrome/Default |
OS X | /Users/$username/Library/Application Support/Google/Chrome/Default/ |
“Cookies” is the SQLite database Chrome uses to store all cookies. Information stored in this database includes the creation time of the cookie, the last access time of the cookie, and the host the cookie is issued for.
The “History” SQLite database contains the majority of user activity data of interest, divided among numerous tables. Three tables are of particular interest:
The downloads table tracks downloaded files, in much the same manner as the Downloads.sqlite database does for Firefox. Items of interest include the local path of the saved file, the remote URL, and the time the download was initiated.
Together, urls and visits tables can be used to construct a good view of user browsing activity. Because the id field of the urls table maps to the url field of the visits table, the following SQL query will produce a report of browsing activity [4]:
SELECT urls.url, urls.title, urls.visit_count, urls.typed_count, urls.last_visit_time, urls.hidden, visits.visit_time, visits.from_visit
FROM urls, visits
WHERE
urls.id = visits.url
The following section is an excerpt of the results produced by this query:
http://digitalcorpora.org/corpora/disk-images|Digital Corpora» Disk Images|1|0|12935304129883466|0|12935304129883466|76149
http://digitalcorpora.org/corp/images/nps/nps-2009-casper-rw|Index of /corp/images/nps/nps-2009-casper-rw|1|0|12935304152594759|0|12935304152594759|76150
http://digitalcorpora.org/corp/images/nps/nps-2009-casper-rw/|Index of /corp/images/nps/nps-2009-casper-rw|2|0|12935304190343005|0|12935304152594759|76151
http://digitalcorpora.org/corp/images/nps/nps-2009-casper-rw/|Index of /corp/images/nps/nps-2009-casper-rw|2|0|12935304190343005|0|12935304190343005|76150
http://digitalcorpora.org/corp/images/nps/nps-2009-casper-rw/narrative.txt||1|0|12935304181158875|0|12935304181158875|76152
Note that the visit_time value is stored in the “seconds since January 1, 1601 UTC” format used in many Chrome date fields
The “Login Data” SQLite database is used by Chrome to store saved login data. On Linux systems, this can include password data. On an OS X systems, native password storage systems are used.
“Web Data” is a SQLite database that contains data the user has opted to save for form auto-fill capabilities. This can include names, addresses, credit data, and more.
The “Thumbnails” SQLite database stores thumbnail images of visited sites. This can be useful for determining the content of sites of interest. Figure 7.4 shows a stored thumbnail binary blob as an image using SQLiteman’s image preview functionality to view a particular thumbnail.
Figure 7.4 Site image embedded in “thumbnails” table.
The “url_id” field in this table maps to the “id” field in the “urls” table in the History database. This can be used to map a generated thumbnail to a particular visit at a specific time and date.
sqlite> select * from urls where id is 36368;
36368|http://blogs.sans.org/computer-forensics/|SANS Computer Forensic Investigations and Incident Response|3|0|12930180528625238|0|1413
Chrome bookmarks are stored in “Bookmarks” file under the user’s profile directory. This file contains a series of JSON objects and can be viewed with any JSON viewer or examined trivially as plain text. See the following section for an example of a bookmark entry:
{
"date_added": "12924673772022388",
"id": "108",
"name": "Digital Corpora",
"type": "url",
"url": "http://digitalcorpora.org/"
},
Note that this date is also in the “seconds since January 1, 1601 UTC” format. A copy of the Bookmarks file named “Bookmarks.bak” will also be found in this directory.
The “Local State” file is used by Chrome to restore state after an unexpected shutdown. It is similar in function to the sessionstate.js file in Firefox and, like sessionstate.js, contains JSON objects. It can be viewed with any text editor or with the JSON Viewer we used to examine the sessionstate.js file in the previous section.
The Chrome cache consists of an index file, four numbered data files (data_0 through data_3), and many numbered files starting with f_ followed by six hex digits. There are currently no open source tools to process these files in a meaningful way, but the creation time of the f_ files can be correlated with data extracted from the History database. The f_ files can also be analyzed according to content. See Chapter 8 for a discussion of file artifact analysis.
Safari is the default browser included on Mac OS X. It is used almost exclusively by Mac OS X users, but is also available for Windows. Any examination of a Mac OS X system will likely require analysis of Safari artifacts. Please reference Table 7.4 for the location of Safari History files on Windows and OS X systems.
Table 7.4
Safari History Locations
Operating System | Location |
Windows XP | C:Documents and Settings\%usernameApplication DataApple ComputerSafari |
Windows Vista/7 | C:Users\%username%AppDataRoamingApple ComputerSafari |
OS X | /Users/$username/Library/Safari |
The main Safari history file is History.plist, which records the URL visited, the time and date of the previous visit, and the total number of times the site has been visited. Because this file is a plist, it can be processed using the plutil.pl script discussed in Chapter 6. The output from running this tool on a sample Safari History.plist can be seen here:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>WebHistoryDates</key>
<array>
<dict>
<key></key>
<string>http://www.amazon.com/Digital-Forensics-Open-Source-Tools/dp/1597495867</string>
<key>D</key>
<array>
<integer>1</integer>
</array>
<key>lastVisitedDate</key>
<string>310529012.7</string>
<key>title</key>
<string>Amazon.com: Digital Forensics with Open Source Tools (9781597495868): Cory Altheide, Harlan Carvey: Books</string>
<key>visitCount</key>
<integer>1</integer>
</dict>
...
The time stamp just displayed is stored as a CFAbsoluteTime value (also known as Mac Absolute Time). This value is an offset representing the number of seconds since midnight January 1, 2001 GMT. Instead of converting these values manually, we can use Safari Forensic Tools (SFT), a set of command line tools designed for processing Safari’s plist files.
Safari maintains four main plist files of interest:
The Safari Forensics Tools suite has individual utilities to parse and display the content of each plist.
Downloads.plist stores all files downloaded to the system. This does not include cached media, images, or any items handled by browser plugins. The Downloads.plist file can be processed by SFT tool safari_downloads. See the following for an example entry from this file:
DownloadEntryProgressBytesSoFar: 494185
DownloadEntryIdentifier: 6438149F-D8A0-4677-9D00-C46DFFEE96C2
DownloadEntryPath: ~/Downloads/gp_lin32_rc4_2.tar.bz2
DownloadEntryURL: http://cache.greenpois0n.com/dl/gp_lin32_rc4_2.tar.bz2
DownloadEntryProgressTotalToLoad: 494185
Status: Completed
This can be used to correlate web browsing history to files on disk.
Bookmarks.plist stores user bookmarks. Safari bookmarks are not as interesting artifact-wise as bookmarks for other browsers, as it does not store any time stamps related to the bookmark entry. This file can be processed using the SFT tool safari_bm.
Title: BookmarksMenu
Windows NT Contains File System Tunneling Capabilities
http://support.microsoft.com/kb/172190
As the name implies, Cookies.plist holds entries related to cookies. Artifacts of interest include the domain and path the cookie is issued for, the creation and expiration times, and the cookie content. The SFT tool safari_cookies can parse this file.
Path: /
Expires: 2015-10-18 14:27:02 -0700
Domain: .howstuffworks.com
Value: [CS]v1|265F0693051D3BB9-40000104602169BD[CE]
Created: 2010-10-19 14:27:02 -0700 (309216422.496477)
Name: s_vi
Path: /
Expires: 2015-06-23 09:06:06 -0700
Domain: .southwest.com
Value: [CS]v1|261159919-40000173803F1B41[CE]
Created: 2010-06-24 09:06:06 -0700 (299088366.870542)
Name: s_vi
The record of a user’s visits to Web sites is stored in the History.plist file. The SFT tool safari_hist can be used to process this file into a tab-delimited format, listing the URL visited, the last visit date and time, the number of visits, and the title of the page in question. See the following excerpt for a sample.
URL Last Visit Date/Time Number of visits Page Title
http://developer.apple.com/library/mac/#technotes/tn2006/tn2166.html 2010-11-08 11:15:11 -0800 1 Technical Note TN2166: Secrets of the GPT
http://developer.apple.com/library/mac/navigation/index.html#topic=Guides§ion=Resource+Types 2010-11-06 22:25:33 -0700 2 Mac OS X Developer Library
http://developer.apple.com/documentation/mac/files/Files-2.html 2010-11-06 22:25:33 -0700 1 Guides
http://developer.apple.com/documentation/mac/files/Files-72.html 2010-11-06 22:25:20 -0700 1 Guides
The Safari cache is stored in the Cache.db SQLite3 database. Cached data are stored primarily in two tables: cfurl_cache_response, which stores the URL and request metadata, and cfurl_cache_blob_data, which stores actual cached data [5]. In many cases, direct examination of the live database using sqlite queries will yield no results because the cache has been “emptied.” However, unless the database has been vacuumed, the actual cached content will still be present in the database file and can be recovered using file carving techniques. This segment shows an excerpt of results of running hachoir-subfile against an “empty” Cache.db database.
forensics:~ $ hachoir-subfile Library/Caches/com.apple.Safari/Cache.db
[+] Start search on 78134272 bytes (74.5 MB)
[+] File at 32304 size=29913 (29.2 KB): JPEG picture
[err!] Unable to compute GifFile content size: Can’t get field "image[0]" from /
[+] File at 73682: GIF picture
[+] File at 74468: JPEG picture
[+] File at 74498: TIFF picture
[+] File at 75280: JPEG picture
[+] File at 81604 size=1344 (1344 bytes): GIF picture
[+] File at 88754 size=16472 (16.1 KB): GIF picture
[+] File at 102814 size=93773 (91.6 KB): JPEG picture
[+] File at 203702: JPEG picture
[+] File at 204574: JPEG picture
[+] File at 209803: JPEG picture
[+] File at 215181 size=3709 (3709 bytes): JPEG picture
[+] File at 221369 size=3665 (3665 bytes): JPEG picture
[+] File at 226953 size=3201 (3201 bytes): JPEG picture
[+] File at 232104 size=2146 (2146 bytes): JPEG picture
[+] File at 236244 size=35133 (34.3 KB): GIF picture
[+] File at 237376: JPEG picture
[+] File at 249450: JPEG picture
[+] File at 252313 size=4365 (4365 bytes): JPEG picture
[+] File at 284855 size=1619 (1619 bytes): JPEG picture
[+] File at 288346 size=4272 (4272 bytes): JPEG picture
[+] File at 294697: JPEG picture
[+] File at 313240 size=596850 (582.9 KB): PNG picture: 800x171x24
[+] File at 313779 size=3366 (3366 bytes): JPEG picture
[+] File at 319757 size=67069 (65.5 KB): JPEG picture
[+] File at 389267 size=4727 (4727 bytes): JPEG picture
[+] File at 399393: Macromedia Flash data: version 10, compressed
...
The LastSession.plist file is used by Safari to restore the browser state in case of an unexpected shutdown. This file can be parsed using the plutil.pl utility. Artifacts that can be extracted from this file are limited but, at a minimum, URLs and page titles can be recovered.
<key>BackForwardList</key>
<array>
<dict>
<key>Title</key>
<string>Top Sites</string>
<key>URL</key>
<string>topsites://</string>
</dict>
<dict>
<key>Title</key>
<string>Technical Note TN2166: Secrets of the GPT</string>
<key>URL</key>
<string>http://developer.apple.com/library/mac/#technotes/tn2006/tn2166.html</string>
</dict>
</array>
For home users, local email storage may be falling by the wayside in favor of Web mail, but there are still many businesses using locally stored mail. This section covers extraction of content from the binary Microsoft Outlook format, as well as some methods to speed up analysis of plain text email formats used commonly on Linux systems.
PST is the mail storage format used by Microsoft’s Outlook email client. A user’s PST file is not just for storage of email from their MSExchange server, but can also store email from POP3, IMAP, and even HTTP (such as Windows Live Hotmail) accounts. The PST file provides a data storage format for storing emails on the user’s computer system. Users of OutLook email clients may also have an OST file, which is for offline storage of email. This file allows the user to continue reviewing the email that they do have, even while they are offline and cannot connect to their MSExchange server.
A user’s PST file may be found in the “Local SettingsApplication DataMicrosoftOutlook” subfolder within their profile on Windows XP and 2003 systems; on Vista and Windows 7 systems, the user’s PST file may be located in the “AppDataRoamingMicrosoftOutlook” folder. However, PST files can be moved to any location within the file system, and an analyst may find several PST files on a system. PST files may contain considerable artifacts of user communications (as well as sharing files as attachments), and the value of the PST files will depend on the analyst’s goals and the type of examination.
One of the first open source libraries for accessing the PST file format was libpst [6]. This library converts 32-bit, pre-OutLook 2003 PST files, as well as 64-bit OutLook 2003 files, and is available as source RPMs, as well as .tar.gz files for download and installation. The library is also utilized by several of the utilities available at the referenced Web site, including readpst and lspst. In addition to the libpst library, the libpff library is also available [7] (the libpff library is available as a SourceForge.net project). As of November 11, 2010, the library is in alpha and is available as a .tar.gz file.
There may also be options available if you’re interested in more of a cross-platform approach. In January 2010, Richard Johnson posted to his blog [8] that he’d developed an open-source Java library for accessing PST files using documentation available as part of the libpff project. According to his blog post, Richard had done this in order to be able to convert PST files to Gmail format in order to take advantage of the search capabilities afforded by Gmail; clearly, this may also be a capability of interest to forensic analysts. Richard made the open-source java-pst library available on Google Code [9]; in addition to the library, there is an alpha version of the pst2gmail conversion utility available on the Web at http://code.google.com/p/pst2gmail/.
We can use the utilities provided by libpff to examine a sample PST file. The pffinfo tool will provide some basic information about the internals of a given PST.
user@ubuntu:~/pst$ pffinfo Outlook.pst
pffinfo 20101203
Personal Folder File information:
File size: 7382016 bytes
File content type: Personal Storage Tables (PST)
File type: 64-bit
Encryption type: compressible
Message store:
Folders: Subtree, Inbox, Outbox, Wastbox, Sentmail, Views, Common views, Finder
Password checksum: N/A
To extract the email content, we will use pffexport. This tool has a variety of options that can be used to configure what is extracted and how it is represented. The most important of these is the -m option, which defines the export mode. By default only “allocated” messages are exported. Note that this includes items in the “Deleted Items” directory that have not been purged by the user. The -m all option tells pffexport to attempt to export messages recovered from the unallocated space of the PST structure.
user@ubuntu:~/pst$ pffexport -m all -t outlook-export Outlook.pst
Pffexport creates two directories when using these flags: one for exported allocated items and one for recovered deleted items. In this instance we did not encounter any recovered items. Inside the outlook-export.allocated/ directory is a directory named “Top of Personal Folders.” This contains a directory structure that will be familiar to anyone that has used Outlook:
Calendar Deleted Items Inbox Junk E-mail Outbox Tasks
Contacts Drafts Journal Notes Sent Items
Messages stored in these directories are extracted into component pieces:
/home/user/pst/outlook-export.export/Top of Personal Folders/Sent Items/Message00066:
Attachments
Attachments/sample.xls
ConversationIndex.txt
Message.txt
OutlookHeaders.txt
Recipients.txt
As you can see, attachments are exported into an “Attachments” subdirectory. The “Message.txt” file is the actual mail content—the rest of the files are Outlook metadata.
For more information about internals of the PST format, please see Joachim Metz’s extensive documentation at the libpff project page on SourceForge.net: http://sourceforge.net/projects/libpff/files/documentation/.
mbox and maildir are the two primary local mail storage formats used by Linux email clients. These formats are also supported by cross-platform mail clients, especially those with a Unix pedigree. Examples include Eudora, Thunderbird, and Apple’s Mail.app. The older mbox format consists of a single flat file, containing numerous email entries, whereas the maildir format stores each email as a discreet file in a set of subdirectories.
Because both of these formats are plain text, searching for specific key words quickly can be performed without the need for a dedicated email forensics utility. We will demonstrate examination techniques using item 317398 from Digital Corpora, which is a large mail archive in mbox format. The file begins with the following lines:
From [email protected] Wed Feb 20 16:33:22 2002
Received: from localhost (newville@localhost)
by millenia.cars.aps.anl.gov (8.11.6/8.11.2) with ESMTP id g1KMXMY05595
for <[email protected]>; Wed, 20 Feb 2002 16:33:22 -0600
X-Authentication-Warning: millenia.cars.aps.anl.gov: newville owned process doing -bs
Date: Wed, 20 Feb 2002 16:33:22 -0600 (CST)
From: Matt Newville <newville@cars.uchicago.edu>
Message-ID: <Pine.LNX.4.43.0202201626470.5566-100000@millenia.cars.aps.anl.gov>
This mail entry continues with additional headers followed by the mail body. A new mail begins with another “From ” line, which is the defined message delineator for the mbox format. Note that capitalization and the trailing space are intentional and required for new mail—this is known as the “From_” line.
We will use two different tools to examine this mailbox: grepmail and mairix. Both can be installed on Ubuntu systems using apt-get.
Grepmail is a utility designed to search for individual mail entries that match a supplied set of criteria. Grepmail has knowledge of the mbox mail format and will return an entire message rather than a single matching line as is the case when using standard “grep.” Although grepmail can only process mbox format mailboxes, it can parse compressed mailboxes and can search through a number of mailboxes at once. Selected grepmail options that may be of particular interest to examiners are listed here:
-b Search must match body
-d Specify a required date range
-h Search must match header
-H Print headers but not bodies of matching emails
-j Search must match status (A=answered, R=read, D=deleted,
O=old, F=flagged)
-Y Specify a header to search (implies -h)
In addition to header-specific searches, another feature of grepmail an examiner may find of value is its date searching capabilities. Grepmail understands dates entered in a number of nonstandard formats: “10/31/10,” “4:08pm June eleventh,” and “yesterday” are all valid date entries, for example. Additionally, date searches can be constrained by keywords such as “before,” “since,” or “between.” Further discussion of extended time analysis in forensic examinations can be found in Chapter 9.
While grepmail certainly has interesting search capabilities, it does tend to slow down quite a bit dealing with very large (multiple gigabyte) mbox files. The grepmail program is better suited for queries against relatively small mailboxes and queries where a specific set of keywords, dates, and other search criteria are known in advance of the start of the examination and are unlikely to change. Many legal discovery examinations would fall into this category. For investigations that don’t have a fixed set of examination criteria from the beginning or that involve large mailboxes, mairix may be a better utility.
Mairix is a powerful mail searching utility that supports both maildir and mbox formats. The key difference between mairix and grepmail is that mairix first builds an index, which is subsequently queried as the examiner performs searches. Prior to searching, we need to provide a configuration file (mairixrc) that will tell mairix the location of our content to be indexed, where the index should go, and where any mail items that are returned in response to a query should be exported to.
We can build a minimal mairixrc file containing the following information:
base=.
mbox=input/mail.mbox
database=.database
mfolder=output
“Base” defines the base path that mairix will treat as its root. “mbox” points to the mbox file we will be examining. Note that this can be a colon-delimited set of mbox files if you need to index and examine multiple mailboxes. “Database” tells mairix where to store the index it will build. Finally, “mfolder “defines a directory where mairix will write the output from any subsequent queries. Search results are stored in maildir format by default.
Once the .mairixrc is written we can generate the index using:
mairix -v -f .mairixrc
...
Wrote 5283 messages (105660 bytes tables, 0 bytes text)
Wrote 1 mbox headers (16 bytes tables, 18 bytes paths)
Wrote 84528 bytes of mbox message checksums
To: Wrote 803 tokens (6424 bytes tables, 8158 bytes of text, 53244 bytes of hit encoding)
Cc: Wrote 430 tokens (3440 bytes tables, 4187 bytes of text, 4171 bytes of hit encoding)
From: Wrote 2074 tokens (16592 bytes tables, 22544 bytes of text, 38970 bytes of hit encoding)
Subject: Wrote 1875 tokens (15000 bytes tables, 13413 bytes of text, 39366 bytes of hit encoding)
Body: Wrote 165118 tokens (1320944 bytes tables, 1619831 bytes of text, 1488382 bytes of hit encoding)
Attachment Name: Wrote 385 tokens (3080 bytes tables, 6288 bytes of text, 1256 bytes of hit encoding)
(Threading): Wrote 5742 tokens (45936 bytes tables, 278816 bytes of text, 39685 bytes of hit encoding)
Note that adding the -v flag forces mairix to write status information to the console while indexing—omitting this flag will not harm anything but may give the indication of a hung program when indexing a very large mailbox.
Once the index has been generated we can begin issuing queries. Mairix supports a broad range of search operators, which we will not duplicate in their entirety here. Please review the mairix man page for a full list of search operators. The following search will return all messages with the word “vacation” in the body or subject.
user@ubuntu:~/mail$ mairix -f rcfile bs:vacation
Created directory ./output
Created directory ./output/cur
Created directory ./output/new
Created directory ./output/tmp
Matched 19 messages
The resulting mail files can be found in the “new” subdirectory under the folder defined as the “mfolder” in our .mairixrc file.
user@ubuntu:~/mail$ cd output/new/
user@ubuntu:~/mail/output/new$ ls
123456789.121.mairix 123456789.1663.mairix 123456789.1688.mairix 123456789.2686.mairix 123456789.589.mairix
123456789.1616.mairix 123456789.1674.mairix 123456789.1691.mairix 123456789.2986.mairix 123456789.593.mairix
123456789.1618.mairix 123456789.1675.mairix 123456789.1692.mairix 123456789.579.mairix 123456789.619.mairix
123456789.1622.mairix 123456789.1677.mairix 123456789.2685.mairix 123456789.581.mairix
This chapter identified and analyzed numerous artifacts generated by the top four browsers in use today. As deciphering a user’s browser activity is becoming more and more relevant to a wider variety of investigations, being able to locate and process these data effectively is crucial. This chapter also extracted mail content and metadata from Outlook’s binary format and discussed how to analyze locally stored mail in formats used commonly by Linux and OS X mail clients.
1. The Meaning of LEAK Records. http://www.forensicblog.org/2009/09/10/the-meaning-of-leak-records/; (accessed 29.12.10).
2. Firefox 3 Forensics—Research—Firefox Places Schema. In: http://www.firefoxforensics.com/research/firefox_places_schema.shtml;.
3. IR and forensic talk» Version 0.41 of log2timeline published. In: http://blog.kiddaland.net/2010/01/version-0-41-of-log2timeline-published/;.
4. SANS—Computer Forensics and Incident Response with Rob Lee. In: http://computer-forensics.sans.org/blog/2010/01/21/google-chrome-forensics/;.
5. Forensics from the sausage factory: Safari browser cache—Examination of Cache.db. In: http://forensicsfromthesausagefactory.blogspot.com/2010/06/safari-browser-cache-examination-of.html;.
6. libpst Utilities–Version 0.6.49. In: http://www.five-ten-sg.com/libpst/;.
7. libpff. In: http://sourceforge.net/projects/libpff/;.
8. java-libpst and pst2gmail. http://www.rjohnson.id.au/wordpress/2010/01/26/java-libpst-pst2gmail/; (accessed 26.01.10).
9. java-libpst. In: http://code.google.com/p/java-libpst/;.