5.5 FAT: An Example File System

The essentials of file system formatting are best illustrated with an example: the FAT file system. While the format dates back to the earliest personal computers, it has evolved since then. The FAT 12 format was superseded by FAT 16, which supported gigabytes of storage. Today’s FAT 32 supports terabytes of disk space.

Today, we often use FAT formatting on removable hard drives and flash memory. Most commercial digital cameras use FAT format on their storage cards. FAT provides hierarchical directories and flexible file naming. The major shortcoming is that individual FAT files must contain less than 4 gigabytes.

Volume Layout

FIGURE 5.11 shows the layout of a FAT-formatted volume. Sector 0, the first sector of the volume, marks the beginning of the boot blocks. These first sectors may contain a program to bootstrap the operating system, but they also contain variables that describe the volume’s layout. Next comes the FAT. This table keeps track of every cluster on the volume.

An illustration depicts the layout of a FAT-formatted volume.

FIGURE 5.11 Layout of a FAT-formatted volume.

Following the FAT is the root directory, which contains entries for files and subdirectories on the volume. Each entry contains the name and the number of the first sector in the file or subdirectory. FAT directories are hierarchical, so any directory may contain entries for subdirectories.

The rest of the volume is divided into clusters to store files and subdirectories. The FAT contains one entry per cluster.

On older FAT 12 and FAT 16 volumes, the root directory had its own, dedicated set of sectors on the volume. On FAT 32, the root directory is the first file stored in the volume’s clusters; it appears in roughly the same place, but there is a little more flexibility in its location.

5.5.1 Boot Blocks

The boot block resides on the first sectors of the hard drive; they are the blocks read by the BIOS when we boot from this drive. Every file system stores special information in the boot block, as well as providing room for a bootstrap program.

If we partition a hard drive, the first block contains the MBR, which itself contains a boot program. The MBR’s boot program automatically redirects the boot operation to the boot block of the first “bootable” partition.

The first item stored in the FAT boot block is a “jump” instruction. If the BIOS tries to start an operating system stored on the FAT volume, it first reads a boot program from these boot blocks and then jumps to the first instruction read into RAM. This first instruction jumps over the variables FAT provides to describe the volume’s format. This block of variables is called the BIOS parameter block (BPB). TABLE 5.3 describes the contents of this block in the FAT 32 format.

TABLE 5.3 Contents of the FAT 32 Boot Block

Offset in Bytes Size in Bytes Contents
0 3 A “jump” instruction to skip over the BPB data
3 8 System name, usually “MSWIN 4.1”
11 2 Number of bytes in a hard drive sector, usually 512
13 1 Number of sectors in a cluster; depends on the volume size
14 2 Number of sectors contained in the boot blocks
16 1 Number of FATs on the volume, usually 2
17 15 Variables that aren’t used in FAT 32, left as spares, or used by the device driver
32 4 Total number of sectors in the volume
36 4 Total number of sectors in one FAT
40 50 Variables for other purposes, like error recovery, device driver support, and volume identification.

We use the information in the BPB to find the boundaries between the boot blocks, the FAT, and the cluster storage area. To find the start of the FAT, we look at offset 14, which indicates the number of sectors in the boot blocks.

The FAT

The FAT is an array of index variables. It contains one entry for every cluster on the volume. The number associated with different FAT formats (12, 16, and 32) indicates the size of the address variables in bits. Each array entry corresponds to one cluster on the volume. The array entry’s contents indicate one of the following:

  • ■   The cluster is part of a file.

  • ■   The cluster is free and may be assigned to a file that needs more space.

  • ■   The cluster contains a bad disk sector and should not be used.

Volumes often contain two FATs, one following right after the other. This is supposed to increase reliability. If the operating system keeps both FATs up to date and one is damaged by a disk error, then the system still can retrieve files by reading the undamaged FAT.

FAT Format Variations

The file format establishes the rules for finding and storing information on the hard drive, flash drive, or other storage device. We can think of it as a map. First, it leads us to the directories, then to a file in a directory, and then through the clusters of sectors that make up the file. TABLE 5.4 summarizes the FAT file formats.

TABLE 5.4 Microsoft’s FAT File System Formats

images

The maximum sizes in Table 5.4 represent typical implementations that worked over multiple operating systems. Occasionally, someone would modify one operating system or another to support larger sizes, usually to satisfy a particularly important customer or vendor.

The format names or sizes (12, 16, and 32) indicate the size in bits of the address variables used to track individual clusters on the hard drive. Thus, a FAT 12 drive could only handle a few thousand clusters, because every cluster address had to fit in a 12-bit variable. By the same token, FAT 32 can only handle a quarter billion clusters, because 4 bits of the 32-bit cluster variables are used for other purposes within the file system. In theory, the maximum drive size should be the product of the maximum cluster size and the maximum number of clusters. Operating system and software restrictions may reduce the practical maximum size.

5.5.2 Building Files from Clusters

Files in most file systems, including FAT, consist of a series of clusters. The clusters themselves may be scattered across the volume. When the file system retrieves that file, however, it retrieves the clusters in the order the data appears in the file. In a file containing three clusters, the beginning of the file appears in the first cluster, the middle in the second cluster, and the end in the third cluster.

The challenge for the file system is to find a file’s clusters and present them in the right order. FAT stores each file as a cluster chain. In other words, each FAT entry provides a “link” to connect one cluster in a file to the next cluster in that file.

Cluster Storage

Cluster storage begins with the first sector following the end of the FAT area. To find the start of cluster storage, we find the size of the FAT area and add it to the area’s starting point. We do that by extracting data from the BPB as follows:

  • ■   Find the number of FATs: BPB offset 16.

  • ■   Find the size of each FAT in sectors: BPB offset 36.

  • ■   Multiply the number of FATs and the size per FAT.

  • ■   Add the start of the FAT area: BPB offset 14.

For historical reasons, the first cluster in the cluster storage area is numbered 2. This first cluster almost always contains the beginning of the root directory. If the root directory contains more than one cluster, we must look in the cluster’s FAT entry to find the root’s next cluster.

An Example FAT File

FIGURE 5.12 displays a set of clusters beginning at cluster 300. The first file begins at 300 and contains the phrase “The quick brown fox jumps over the lazy dog.” The text is spread across three separate clusters. A second file, containing Lincoln’s Gettysburg Address, begins at cluster 302. The cluster at 305 is not being used in a file.

An illustration depicts the set of clusters in a FAT file containing parts of a file.

FIGURE 5.12 Clusters containing parts of files.

FIGURE 5.13 displays the corresponding FAT entries. The first file contains three clusters (300, 301, and 304), and their FAT entries link them together. The FAT entry for 304, the file’s final cluster, is too large to be a legal cluster number; this marks the end of the file. The second file contains at least two clusters (302 and 303) and links to additional clusters outside our example. Cluster 305 is unused and its FAT entry contains a 0.

An illustration depicts the fat entries for the corresponding clusters in a FAT file.

FIGURE 5.13 The FAT points to the clusters in a file.

If a cluster is part of a file, then its FAT entry contains the cluster number of the next cluster in that file. If the cluster is the last one in the file, then the FAT entry contains a special value to mark the end of the file. Files or directories may contain two or more clusters, and both use this linking method when their contents don’t fit in a single cluster.

FAT and Flash Storage

Successful low-cost flash drives first appeared in the mid-1990s, and they soon became popular components of electronic devices like digital cameras. Early flash drives ranged in size from a few megabytes to 2 gigabytes. By 2005, however, 4 gigabyte flash drives were commercially available, but they would not work in older digital products.

The problem arose from the difference between FAT 16 and FAT 32 formats. The older products used FAT 16, because it was simpler to implement and worked with the smaller memory sizes. The larger flash drives required FAT 32 to handle their full memory size. Unfortunately, the older products did not recognize the FAT 32 file format. The only solution is to use smaller devices, though it may be possible to format a newer device with FAT 16 and simply ignore storage beyond its 2 gigabyte limit.

One type of flash memory product has adopted names to reflect the different formatting standards: the secure digital card, an offshoot of the multimedia card (MMC), a flash memory in a 24 mm x 32 mm format. The card exists in these formats:

  • ■   Secure digital (SD)—up to 2 GB using FAT 16

  • ■   SD high capacity (SDHC)—4 GB to 32 GB using FAT 32

  • ■   SD extra capacity (SDXC)—32 GB to 2 TB using FAT 32

Extended “exFAT” Format Microsoft has developed a proprietary extended version of the FAT format called “exFAT.” The file format is covered by Microsoft patents, so its use is restricted to Microsoft licensees. This has restricted the format’s use in many products. Here are key features in the format:

  • ■   Disk sizes in the petabyte range (1015)

  • ■   File size limit in the exabyte range (1018)

  • ■   Higher maximums for the number of files in a volume, directory, and subdirectory

  • ■   More customization and optional support of better timestamps, access controls, and transactions

5.5.3 FAT Directories

The directories, starting with the root directory, tell us what files exist and where to find their starting clusters. Each directory in FAT 32 consists of an array of 32-byte entries. In the simplest case, each entry represents a single file or subdirectory. A typical entry contains the following:

  • ■   The name of the file or directory

  • ■   Attributes of this directory entry

  • ■   Date and time created, last examined, and last modified

  • ■   An address variable pointing to the first cluster in the file

  • ■   The total number of bytes of data in the file

The FAT directory begins with the root directory, which itself begins at cluster 2. We look in the root directory entries to find files and other directories. The “attributes” indicate whether a directory entry points to a file or to another directory. We follow links in the FAT to find additional clusters belonging to the root directory or to the other files and directories.

Long File Names

FAT directory entries become more complicated as we look at long file names and other advanced features. The traditional FAT directory only supports file names of eight upper case characters or less, with a three-letter extension identifier. Modern FAT devices support much longer names that contain both upper- and lowercase characters.

To do this, the FAT directory sets aside a series of directory entries, one for the actual file, and additional ones to hold segments of the long name. Each “long name” entry may contain up to 13 characters. The “actual” directory entry holds a short, 11-character version of the long name plus the normal contents of the file’s directory entry. The short version of the name allows older Windows and DOS systems to handle directories containing long names.

Deleting Files

To delete a FAT file, the system takes two steps. First, it locates the file’s directory entry and sets it to “empty.” Some files have multiple directory entries, for example, files with long file names, and those entries also are marked as “empty.” Next, the system “frees” all clusters used in the file, allowing the clusters to be used in other files.

The system marks a directory entry as empty by storing a special value (decimal 229) in the first character of its file name. The system does not make any other changes to the directory entry. When it is time to create a new file and give it a new entry, the system scans all entries in the directory to ensure that the new name is unique, and to find an existing empty entry to use. The contents of a deleted directory entry remain unchanged until the entry is reused for a new file.

To free the clusters in the file, the system follows the file’s cluster chain in the FAT. For each cluster in the file, the system sets the FAT entry to 0. This marks the cluster as being free for reuse.

Undeleting a File The most common undelete programs are designed to recover FAT files. To undelete a FAT file, the program starts by simply changing the first character of the deleted directory entry to a legal filename character. This retrieves the file’s directory entry and its first cluster of data.

To recover the entire file, its clusters must not be fragmented. In other words, they must appear one after another on the drive, in contiguous clusters following the first cluster. In addition, the clusters must not have been allocated to another file in the meantime.

The undelete operation recovers the file’s data by using information in its directory entry and in the FAT. The directory entry tells us the starting cluster number and the number of bytes in the file. We calculate the number of clusters by dividing the file’s length by the cluster size, then we check the sequence of clusters in the FAT that start with the first cluster in the file. If all clusters are still free (i.e., their FAT entries are 0), then they most likely still contain the file’s data.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset