Chapter 11
Handling Storage

  • Objective 1.4: Given a scenario, manage storage in a Linux environment.

The world runs on data. Whether it’s an employee database, a folder with all of your family pictures, or just your weekly bowling scores, the ability to save and retrieve data is a must for every application. Linux provides lots of different ways to store and manage files for applications. This chapter first discusses the basics of how Linux handles storage devices, and then it walks through how you use those methods to manage data within a Linux environment.

Storage Basics

The most common way to persistently store data on computer systems is to use a hard disk drive (HDD). Hard disk drives are physical devices that store data using a set of disk platters that spin around, storing data magnetically on the platters with a moveable read/write head that writes and retrieves magnetic images on the platters.

These days, another popular type of persistent storage is called a solid-state drive (SSD). These drives use integrated circuits to store data electronically. There are no moving parts contained in SSDs, making them faster and more resilient than HDDs. While currently SSDs are more expensive than HDDs, technology is quickly changing that, and it may not be long before HDDs are a thing of the past.

Linux handles both HDD and SSD storage devices the same way. It mostly depends on the connection method used to connect the drives to the Linux system. The following sections describe the different methods that Linux uses in connecting and using both HDD and SSD devices.

Drive Connections

While HDDs and SSDs differ in how they store data, they both interface with the Linux system using the same methods. There are three main types of drive connections that you’ll run into with Linux systems:

  • Parallel Advanced Technology Attachment (PATA) connects drives using a parallel interface, which requires a wide cable. PATA supports two devices per adapter.
  • Serial Advanced Technology Attachment (SATA) connects drives using a serial interface, but at a much faster speed than PATA. SATA supports up to four devices per adapter.
  • Small Computer System Interface (SCSI) connects drives using a parallel interface, but with the speed of SATA. SCSI supports up to eight devices per adapter.

When you connect a drive to a Linux system, the Linux kernel assigns the drive device a file in the /dev folder. That file is called a raw device, as it provides a path directly to the drive from the Linux system. Any data written to the file is written to the drive, and reading the file reads data directly from the drive.

For PATA devices, this file is named /dev/hdx, where x is a letter representing the individual drive, starting with a. For SATA and SCSI devices, Linux uses /dev/sdx, where x is a letter representing the individual drive, again starting with a. Thus, to reference the first SATA device on the system, you’d use /dev/sda, then for the second device /dev/sdb, and so on.
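
The naming convention can be sketched as a small shell function. This is only an illustration: the function name drive_family is made up for this example, and it covers just the /dev/hdx and /dev/sdx patterns described here, so any other device name is reported as unknown.

```shell
# Guess the connection family from a raw device file name,
# using only the /dev/hdX and /dev/sdX conventions.
# drive_family is a hypothetical helper name, not a standard command.
drive_family() {
    case "$1" in
        /dev/hd[a-z]*) echo "PATA" ;;
        /dev/sd[a-z]*) echo "SATA or SCSI" ;;
        *)             echo "unknown" ;;
    esac
}

drive_family /dev/hda    # first PATA drive
drive_family /dev/sdb    # second SATA or SCSI drive
```

The function only inspects the name; it never opens the device, so it is safe to run as a normal user.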

Partitioning Drives

Most operating systems, including Linux, allow you to partition a drive into multiple sections. A partition is a self-contained section within the drive that the operating system treats as a separate storage space.

Partitioning drives can help you better organize your data, such as segmenting operating system data from user data. If a rogue user fills up the disk space with data, the operating system will still have room to operate on the separate partition.

Partitions must be tracked by some type of indexing system on the drive. Systems that use the old BIOS boot loader method (see Chapter 5) use the Master Boot Record (MBR) method for managing disk partitions. This method supports only up to four primary partitions on a drive. One of those primary partitions, however, can be designated as an extended partition, which can in turn be split into multiple logical partitions.

Systems that use the UEFI boot loader method (see Chapter 5) use the more advanced GUID Partition Table (GPT) method for managing partitions, which supports up to 128 partitions on a drive. Linux assigns the partition numbers in the order that the partition appears on the drive, starting with number 1.

Linux creates /dev files for each separate disk partition. It attaches the partition number to the end of the device name and numbers the primary partitions starting at 1, so the first primary partition on the first SATA drive would be /dev/sda1. MBR logical partitions (those created inside the extended partition) are numbered starting at 5, so the first logical partition is assigned the file /dev/sda5.
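
As a rough sketch of the MBR numbering rule, the following function pulls the trailing number off a device name and classifies it. The name mbr_partition_kind is invented for this example, and the function assumes an MBR drive with /dev/sdx-style names. Numbers 1 through 4 are the primary slots (one of which may be the extended container), and 5 and above are the partitions carved out inside the extended partition, often called logical partitions.

```shell
# Classify a partition device name under the MBR numbering scheme.
# mbr_partition_kind is a hypothetical helper, not a standard command.
mbr_partition_kind() {
    num=${1##*[!0-9]}            # strip everything up to the last non-digit
    if [ -z "$num" ]; then
        echo "whole drive"       # no trailing number, e.g. /dev/sda
    elif [ "$num" -le 4 ]; then
        echo "primary slot"      # 1-4: primary (or the extended container)
    else
        echo "logical"           # 5 and up: inside the extended partition
    fi
}

mbr_partition_kind /dev/sda
mbr_partition_kind /dev/sda1
mbr_partition_kind /dev/sda5
```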

Automatic Drive Detection

Linux systems detect drives and partitions at boot time and assign each one a unique device file name. However, with the invention of removable USB drives (such as memory sticks), which can be added and removed at will while the system is running, that method needed to be modified.

Most Linux systems now use the udev application. The udev program runs in the background at all times and automatically detects new hardware connected to the running Linux system. As you connect new drives, USB devices, or optical drives (such as CD and DVD devices), udev will detect them and assign each one a unique device file name in the /dev folder.

Another feature of the udev application is that it also creates persistent device files for storage devices. When you add or remove a removable storage device, the /dev name assigned to it may change, depending on what devices are connected at any given time. That can make it difficult for applications to find the same storage device each time.

To solve that problem, the udev application uses the /dev/disk folder to create links to the /dev storage device files based on unique attributes of the drive. There are four separate folders udev creates for storing links:

  • /dev/disk/by-id links storage devices by their manufacturer make, model, and serial number.
  • /dev/disk/by-label links storage devices by the label assigned to them.
  • /dev/disk/by-path links storage devices by the physical hardware port they are connected to.
  • /dev/disk/by-uuid links storage devices by the 128-bit universally unique identifier (UUID) assigned to the device.

With the udev device links, you can specifically reference a storage device by a permanent identifier rather than where or when it was plugged into the Linux system.
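
To see which raw device file each persistent link currently points to, you could use something like the following sketch. The function name list_disk_links is made up for this example; it relies only on the standard readlink command and defaults to the /dev/disk/by-uuid folder if you don't name one.

```shell
# Print each link in a /dev/disk/by-* folder along with the device
# file it resolves to. list_disk_links is a hypothetical helper name.
list_disk_links() {
    dir=${1:-/dev/disk/by-uuid}
    [ -d "$dir" ] || { echo "no such folder: $dir" >&2; return 1; }
    for link in "$dir"/*; do
        [ -L "$link" ] || continue      # skip anything that is not a symbolic link
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Running list_disk_links /dev/disk/by-label, for example, would show one line per labeled filesystem. The folders that exist depend on the drives attached to the system, so the function reports an error rather than failing silently if a folder is missing.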

Partitioning Tools

After you connect a drive to your Linux system, you’ll need to create partitions on it (even if there’s only one partition). Linux provides several tools for working with raw storage devices to create partitions. The following sections cover the most popular partitioning tools you’ll run across in Linux.

Working with fdisk

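The most common command-line partitioning tool is the fdisk utility. The fdisk program allows you to create, view, delete, and modify partitions on any drive that uses the MBR method of indexing partitions. (Recent versions of fdisk from the util-linux package can also work with GPT partition tables, which is why the g command appears in Table 11.1.)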

To use the fdisk program, you must specify the drive device name (not the partition name) of the device you want to work with:

$ sudo fdisk /dev/sda
[sudo] password for rich:
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help):

The fdisk program uses its own command line that allows you to submit commands to work with the drive partitions. Table 11.1 shows the common commands you have available to work with.

Table 11.1 Common fdisk commands

Command Description
a Toggle a bootable flag.
b Edit bsd disk label.
c Toggle the DOS compatibility flag.
d Delete a partition.
g Create a new empty GPT partition table.
G Create an IRIX (SGI) partition table.
l List known partition types.
m Print this menu.
n Add a new partition.
o Create a new empty DOS partition table.
p Print the partition table.
q Quit without saving changes.
s Create a new empty Sun disklabel.
t Change a partition’s system ID.
u Change display/entry units.
v Verify the partition table.
w Write table to disk and exit.
x Extra functionality (experts only).

The p command displays the current partition scheme on the drive:

Command (m for help): p

Disk /dev/sda: 10.7 GB, 10737418240 bytes, 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000528e6

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     2099199     1048576   83  Linux
/dev/sda2         2099200    20971519     9436160   83  Linux

Command (m for help):

In this example, the /dev/sda drive is sectioned into two partitions, sda1 and sda2. The Id and System columns refer to the type of filesystem the partition is formatted to handle. We cover that in the section “Understanding Filesystems” later in this chapter. Both partitions are formatted to support a Linux filesystem. The first partition is allocated about 1GB of space, while the second is allocated a little over 9GB of space.
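
The sizes quoted above can be checked from the Start and End columns with a little shell arithmetic. This is a minimal sketch: partition_bytes is a name invented for this example, and it assumes the 512-byte sector size reported in the listing unless you pass a different one.

```shell
# Compute a partition's size in bytes from its first and last sector.
# partition_bytes is a hypothetical helper; sector size defaults to 512.
partition_bytes() {
    start=$1
    end=$2
    sector_size=${3:-512}
    # Both the start and end sectors belong to the partition, hence the +1.
    echo $(( (end - start + 1) * sector_size ))
}

partition_bytes 2048 2099199        # /dev/sda1: prints 1073741824
partition_bytes 2099200 20971519    # /dev/sda2: prints 9662627840
```

Dividing either result by 1,024 gives the value in the Blocks column (1048576 and 9436160), which suggests that column counts 1,024-byte units.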

The fdisk command is somewhat rudimentary in that it doesn’t allow you to alter the size of an existing partition; all you can do is delete the existing partition and rebuild it from scratch.

To be able to boot the system from a partition, you must set the boot flag for the partition. You do that with the a command. The bootable partitions are indicated in the output listing with an asterisk.

If you make any changes to the drive partitions, you must exit using the w command to write the changes to the drive.

Working with gdisk

If you’re working with drives that use the GPT indexing method, you’ll need to use the gdisk program:

$ sudo gdisk /dev/sda
[sudo] password for rich:
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help):

The gdisk program identifies the type of formatting used on the drive. If the drive doesn’t currently use the GPT method, gdisk offers you the option to convert it to a GPT drive.

Be careful with converting the drive method specified for your drive. The method you select must be compatible with the system firmware (BIOS or UEFI). If not, your drive will not be able to boot.

The gdisk program also uses its own command prompt, allowing you to enter commands to manipulate the drive layout, as shown in Table 11.2.

Table 11.2 Common gdisk commands

Command Description
b Back up GPT data to a file.
c Change a partition’s name.
d Delete a partition.
i Show detailed information on a partition.
l List known partition types.
n Add a new partition.
o Create a new empty GUID partition table (GPT).
p Print the partition table.
q Quit without saving changes.
r Recovery and transformation options (experts only).
s Sort partitions.
t Change a partition’s type code.
v Verify disk.
w Write table to disk and exit.
x Extra functionality (experts only).
? Print this menu.

You’ll notice that many of the gdisk commands are similar to those in the fdisk program, making it easier to switch between the two programs. One of the added options that can come in handy is the i option that displays more detailed information about a partition:

Command (? for help): i
Partition number (1-3): 2
Partition GUID code: 0FC63DAF-8483-4772-8E79-3D69D8477DE4 (Linux filesystem)
Partition unique GUID: 5E4213F9-9566-4898-8B4E-FB8888ADDE78
First sector: 1953792 (at 954.0 MiB)
Last sector: 26623999 (at 12.7 GiB)
Partition size: 24670208 sectors (11.8 GiB)
Attribute flags: 0000000000000000
Partition name: ''

Command (? for help):

The GNU parted Command

The GNU parted program provides yet another command-line interface for working with drive partitions:

$ sudo parted
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: ATA VBOX HARDDISK (scsi)
Disk /dev/sda: 15.6GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system     Name  Flags
 1      1049kB  1000MB  999MB   fat32                 boot, esp
 2      1000MB  13.6GB  12.6GB  ext4
 3      13.6GB  15.6GB  2000MB  linux-swap(v1)

(parted)

One of the selling features of the parted program is that it allows you to modify existing partition sizes, so you can easily shrink or grow partitions on the drive.

Graphical Tools

There are also some graphical tools available to use if you’re working from a graphical desktop environment. The most common of these is the GNOME Partition Editor, called GParted. Figure 11.1 shows an example of running the gparted command in an Ubuntu desktop environment.

The figure shows a screenshot illustrating an example of running the gparted command in an Ubuntu desktop environment.

Figure 11.1 The GParted interface

The gparted window displays each of the drives on a system one at a time, showing all of the partitions contained in the drive in a graphical layout. You right-click a partition to select options for mounting or unmounting, formatting, deleting, or resizing the partition.

While it’s certainly possible to interact with a drive as a raw device, that’s not usually how Linux applications work. There’s a lot of work trying to read and write data to a raw device. Instead, Linux provides a method for handling all of the dirty work for us, which is covered in the next section.

Understanding Filesystems

Just like storing stuff in a closet, storing data in a Linux system requires some method of organization for it to be efficient. Linux utilizes filesystems to manage data stored on storage devices. A filesystem utilizes a method of maintaining a map to locate each file placed in the storage device. This and the following sections describe the Linux filesystem and show how you can locate files and folders contained within it.

The Linux filesystem can be one of the most confusing aspects of working with Linux. Locating files on drives, CDs, and USB memory sticks can be a challenge at first.

If you’re familiar with how Windows manages files and folders, you know that Windows assigns drive letters to each storage device you connect to the system. For example, Windows uses C: for the main drive on the system or E: for a USB memory stick plugged into the system.

In Windows, you’re used to seeing file paths such as

C:\Users\rich\Documents\test.docx

This path indicates the file is located in the Documents folder for the rich user account, which is stored on the disk partition assigned the letter C (usually the first drive on the system).

The Windows path tells you exactly what physical device the file is stored on. Unfortunately, Linux doesn’t use this method to reference files. It uses a virtual directory structure. The virtual directory contains file paths from all the storage devices installed on the system consolidated into a single directory structure.

The Virtual Directory

The Linux virtual directory structure contains a single base directory, called the root directory. The root directory lists files and folders beneath it based on the folder path used to get to them, similar to the way Windows does it.

Be careful with the terminology here. While the main admin user account in Linux is called root, that’s not related to the root virtual directory folder. The two are separate, which can be confusing.

For example, a Linux file path could look like this:

/home/rich/Documents/test.doc

First, note that the Linux path uses forward slashes instead of the backward slashes that Windows uses. That’s an important difference that trips up many novice Linux administrators. As for the path itself, also notice that there’s no drive letter. The path only indicates that the file test.doc is stored in the Documents folder for the user rich; it doesn’t give you any clues as to which physical device contains the file.

Linux places physical devices in the virtual directory using mount points. A mount point is a folder placeholder within the virtual directory that points to a specific physical device. Figure 11.2 demonstrates how this works.

The figure shows the Linux virtual directory structure, divided between two drives: “Hard Drive 1” and “Hard Drive 2.”

Figure 11.2 The Linux virtual directory structure divided between two drives

In Figure 11.2, there are two drives used on the Linux system. The first drive on the left is associated with the root of the virtual directory. The second drive is mounted at the location /home, which is where the user folders are located. Once the second drive is mounted to the virtual directory, files and folders stored on the drive are available under the /home folder.
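
Because the path alone doesn’t reveal the device, it can be handy to ask the system which mount point a given file or folder lives under. One possible approach, sketched here, uses the POSIX output format of the df command. The function name mount_point_of is invented for this example, and the simple field parsing assumes the mount point contains no spaces.

```shell
# Report the mount point that holds a given path.
# mount_point_of is a hypothetical helper name.
mount_point_of() {
    # df -P prints a header line, then one line for the filesystem
    # holding the path; the last field on that line is the mount point.
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

mount_point_of /
```

On a system laid out like Figure 11.2, asking about a file under /home would report /home, while most other paths would report /.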

Since Linux stores everything within the virtual directory, it can get somewhat complicated. Fortunately, there’s a standard format defined for the Linux virtual directory, called the Linux filesystem hierarchy standard (FHS). The FHS defines core folder names and locations that should be present on every Linux system and what type of data they should contain. Table 11.3 shows just a few of the more common folders defined in the FHS.

Table 11.3 Common Linux FHS folders

Folder Description
/boot Contains bootloader files used to boot the system
/home Contains user data files
/media Used as a mount point for removable devices
/mnt Also used as a mount point for removable devices
/opt Contains data for optional third-party programs
/tmp Contains temporary files created by system users
/usr Contains data for standard Linux programs
/usr/bin Contains most user command programs
/usr/local Contains data for programs unique to the local installation
/usr/sbin Contains system administration programs

While the FHS helps standardize the Linux virtual filesystem, not all Linux distributions follow it completely. It’s best to consult with your specific Linux distribution’s documentation on how it manages files within the virtual directory structure.

Maneuvering Around the Filesystem

Using the virtual directory makes it a breeze to move files from one physical device to another. You don’t need to worry about drive letters, just the locations within the virtual directory:

$ cp /home/rich/Documents/myfile.txt /media/usb

In copying the file from the Documents folder to a USB memory stick, we used the full path within the virtual directory to both the file and the USB memory stick. This format is called an absolute path. The absolute path to a file always starts at the root folder (/) and includes every folder along the virtual directory tree to the file.

Alternatively, you can use a relative path to specify a file location. The relative path to a file denotes the location of a file relative to your current location within the virtual directory tree structure. If you were already in the Documents folder, you’d just need to type

$ cp myfile.txt /media/usb

When Linux sees that the path doesn’t start with a forward slash, it assumes the path is relative to the current directory.
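
That rule is simple enough to mimic in a few lines of shell. The sketch below is only an illustration of the idea: the function name to_absolute is made up, and it does not tidy up components such as .. or handle every edge case (from the root folder it produces a harmless double slash).

```shell
# Turn a path into an absolute path the same way the shell interprets it:
# a leading slash means absolute, anything else is relative to $PWD.
# to_absolute is a hypothetical helper name.
to_absolute() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;             # already absolute
        *)  printf '%s/%s\n' "$PWD" "$1" ;;   # relative to the current directory
    esac
}

to_absolute /media/usb
to_absolute myfile.txt
```

If your current directory were /home/rich/Documents, the second call would print /home/rich/Documents/myfile.txt.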

Formatting Filesystems

Before you can assign a drive partition to a mount point in the virtual directory, you must format it using a filesystem. There are myriad different filesystem types that Linux supports, with each having different features and capabilities. The following sections discuss the different filesystems that Linux supports and how to format a drive partition for the filesystems.

Common Filesystem Types

Each operating system utilizes its own filesystem type for storing data on drives. Linux not only supports several of its own filesystem types, it also supports filesystems of other operating systems. The following sections cover the most common Linux and non-Linux filesystems that you can use in your Linux partitions.

Linux Filesystems

When you create a filesystem specifically for use on a Linux system, there are six main filesystems that you can choose from:

  • btrfs: A newer, high-performance filesystem that supports files up to 16 exbibytes (EiB) in size and a total filesystem size of 16EiB. It also can perform its own form of Redundant Array of Inexpensive Disks (RAID) as well as logical volume management (LVM). It includes additional advanced features such as built-in snapshots for backup, improved fault tolerance, and data compression on the fly.
  • eCryptfs: The Enterprise Cryptographic File System (eCryptfs) applies a POSIX-compliant encryption protocol to data before storing it on the device. This provides a layer of protection for data stored on the device. Only the operating system that created the filesystem can read data from it.
  • ext3: Also called ext3fs, this is a descendant of the original Linux ext filesystem. It supports files up to 2 tebibytes (TiB), with a total filesystem size of 16TiB. It supports journaling as well as faster startup and recovery.
  • ext4: Also called ext4fs, it’s the current version of the original Linux filesystem. It supports files up to 16TiB, with a total filesystem size of 1EiB. It also supports journaling and utilizes improved performance features.
  • reiserFS: Created before the Linux ext3fs filesystem and commonly used on older Linux systems, it provides features now found in ext3fs and ext4fs. Linux has dropped support for the most recent version, reiser4fs.
  • swap: The swap filesystem allows you to create virtual memory for your system using space on a physical drive. The system can then swap data out of normal memory into the swap space, providing a method of adding additional memory to your system. This is not intended for storing persistent data.

The default filesystem used by most Linux distributions these days is ext4fs. The ext4fs filesystem provides journaling, which is a method of tracking data not yet written to the drive in a log file, called the journal. If the system fails before the data can be written to the drive, the journal data can be recovered and stored upon the next system boot.

Non-Linux Filesystems

One of the great features of Linux that makes it so versatile is its ability to read data stored on devices formatted for other operating systems, such as Apple and Microsoft. This feature makes it a breeze to share data between different systems running different operating systems.

Here’s a list of the more common non-Linux filesystems that Linux can handle:

  • CIFS: The Common Internet File System (CIFS) is a filesystem protocol created by Microsoft for reading and writing data across a network using a network storage device. It was released to the public for use on all operating systems.
  • HFS: The Hierarchical File System (HFS) was developed by Apple for its Mac OS systems. Linux can also interact with the more advanced HFS+ filesystem.
  • ISO-9660: The ISO-9660 standard is used for creating filesystems on CD-ROM devices.
  • NFS: The Network File System (NFS) is an open-source standard for reading and writing data across a network using a network storage device.
  • NTFS: The New Technology File System (NTFS) is the filesystem used by the Microsoft NT operating system and subsequent versions of Windows. Linux can read and write data on an NTFS partition as of kernel 2.6.x.
  • SMB: The Server Message Block (SMB) filesystem was created by Microsoft as a proprietary filesystem used for network storage and interacting with other network devices (such as printers). Support for SMB allows Linux clients and servers to interact with Microsoft clients and servers on a network.
  • UDF: The Universal Disc Format (UDF) is commonly used on DVD-ROM devices for storing data. Linux can both read data from a DVD and write data to a DVD using this filesystem.
  • VFAT: The Virtual File Allocation Table (VFAT) is an extension of the original Microsoft File Allocation Table (FAT) filesystem. It’s not commonly used on drives but is commonly used for removable storage devices such as USB memory sticks.
  • XFS: The X File System (XFS) was created by Silicon Graphics for its (now defunct) advanced graphical workstations. The filesystem provided some advanced high-performance features that make it still popular in Linux.
  • ZFS: The Zettabyte File System (ZFS) was created by Sun Microsystems (now part of Oracle) for its Unix workstations and servers. Another high-performance filesystem, it has features similar to the btrfs Linux filesystem.

It’s generally not recommended to format a partition using a non-Linux filesystem if you plan on using the drive for only Linux systems. Linux supports these filesystems mainly as a method for sharing data with other operating systems.

Creating Filesystems

The Swiss Army knife for creating filesystems in Linux is the mkfs program. The mkfs program is actually a front end to several individual tools for creating specific filesystems, such as the mkfs.ext4 program for creating ext4 filesystems.

The beauty of the mkfs program is that you only need to remember one program name to create any type of filesystem on your Linux system. Just use the -t option to specify the filesystem type:

$ sudo mkfs -t ext4 /dev/sdb1
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 2621440 4k blocks and 655360 inodes
Filesystem UUID: f9137b26-0caf-4a8a-afd0-392002424ee8
Superblock backups stored on blocks:

32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
$

After you specify the -t option, just specify the partition device file name for the partition you want to format on the command line. Notice that the mkfs program does a lot of things behind the scenes when formatting the filesystem. Each filesystem has its own method for indexing files and folders and tracking file access. The mkfs program creates all of the index files and tables necessary for the specific filesystem.

Be very careful when specifying the partition device file name. When you format a partition, any existing data on the partition is lost. If you specify the wrong partition name, you could lose important data or make your Linux system not able to boot.
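
One modest safeguard is to check whether a partition is currently mounted before formatting it. The following is a minimal sketch, not a complete safety net: the function names is_mounted and safe_to_format are invented for this example, and the check compares the device name exactly as it appears in the first column of /proc/mounts, so a partition mounted under a different name (such as a UUID link) would not be caught.

```shell
# Succeed if the device appears in the first column of a mounts table.
# The table defaults to /proc/mounts. Both function names are hypothetical.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v dev="$dev" '$1 == dev { found = 1 } END { exit !found }' "$table"
}

# Refuse to continue if the partition is still mounted.
safe_to_format() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "refusing: $1 is mounted" >&2
        return 1
    fi
    echo "ok: $1 is not mounted"
}
```

You might then run something like safe_to_format /dev/sdb1 && sudo mkfs -t ext4 /dev/sdb1 so that the format step is skipped whenever the check fails.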

Mounting Filesystems

Once you’ve formatted a drive partition with a filesystem, you can add it to the virtual directory on your Linux system. This process is called mounting the filesystem.

You can either manually mount the partition within the virtual directory structure from the command line or allow Linux to automatically mount the partition at boot time. The following sections walk through both of these methods.

Manually Mounting Devices

To temporarily mount a filesystem to the Linux virtual directory, use the mount command. The basic format for the mount command is

mount -t fstype device mountpoint

Use the -t command-line option to specify the filesystem type of the device:

$ sudo mount -t ext4 /dev/sdb1 /media/usb1
$

If you specify the mount command with no parameters, it displays all of the devices currently mounted on the Linux system. Be prepared for a long output though, as most Linux distributions mount lots of virtual devices in the virtual directory to provide information about system resources. Listing 11.1 shows a partial output from a mount command.

Listing 11.1: Output from the mount command

$ mount
...
/dev/sda2 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
...
/dev/sdb1 on /media/usb1 type ext4 (rw,relatime,data=ordered)
/dev/sdb2 on /media/usb2 type ext4 (rw,relatime,data=ordered)
$

To save space, we trimmed down the output from the mount command to show only the physical devices on the system. The main hard drive device (/dev/sda) contains two partitions, and the USB memory stick device (/dev/sdb) also contains two partitions.

The mount command uses the -o option to specify additional features of the filesystem, such as mounting it in read-only mode, user permissions assigned to the mount point, and how data is stored on the device. These options are shown in the output of the mount command. Usually you can omit the -o option to use the system defaults for the new mount point.

The downside to the mount command is that it only temporarily mounts the device in the virtual directory. When you reboot the system, you have to manually mount the devices again. This is usually fine for removable devices, such as USB memory sticks, but for more permanent devices it would be nice if Linux could mount them for us automatically. Fortunately for us, Linux can do just that.

To remove a mounted drive from the virtual directory, use the umount command (note the missing n). You can remove the mounted drive by specifying either the device file name or the mount point directory.

Automatically Mounting Devices

For permanent storage devices, Linux maintains the /etc/fstab file to indicate which drive devices should be mounted to the virtual directory at boot time. The /etc/fstab file is a table that indicates the drive device file (either the raw file or one of its permanent udev file names), the mount point location, the filesystem type, and any additional options required to mount the drive. Listing 11.2 shows the /etc/fstab file from an Ubuntu workstation.

Listing 11.2: The /etc/fstab file

$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda2 during installation
UUID=46a8473c-8437-4d5f-a6a1-6596c492c3ce /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/sda1 during installation
UUID=864B-62F5  /boot/efi       vfat    umask=0077      0       1
# swap was on /dev/sda3 during installation
UUID=8673447a-0227-47d7-a67a-e6b837bd7188 none            swap    sw              0       0
$

This /etc/fstab file references the devices by their udev UUID value, ensuring that the correct drive partition is accessed no matter the order in which it appears in the raw device table. The first partition is mounted at the /boot/efi mount point in the virtual directory. The second partition is mounted at the root (/) of the virtual directory, and the third partition is mounted as a swap area for virtual memory.

You can manually add devices to the /etc/fstab file so that they are mounted automatically when the Linux system boots. However, if they don’t exist at boot time, that will generate a boot error.
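
Since a mistake in this file can interfere with booting, it’s worth reviewing the active entries after you edit it. The sketch below is one possible check, not an official validation tool: the function name check_fstab is made up, and it only verifies that each active line supplies at least the device, mount point, type, and options fields.

```shell
# List the active entries in an fstab-style file and flag short lines.
# The file defaults to /etc/fstab. check_fstab is a hypothetical helper.
check_fstab() {
    awk '
        /^[ \t]*(#|$)/ { next }     # skip comments and blank lines
        NF < 4 {
            printf "line %d: only %d fields\n", NR, NF
            bad = 1
            next
        }
        { printf "%s on %s type %s\n", $1, $2, $3 }
        END { exit bad + 0 }        # nonzero exit status if any line was short
    ' "${1:-/etc/fstab}"
}
```

Running check_fstab with no arguments reads /etc/fstab, which is world-readable on most systems, so no sudo is needed. It makes no attempt to confirm that the listed devices actually exist.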

If you use the eCryptfs filesystem type on any partitions, they will appear in the /etc/crypttab file and will be mounted automatically at boot time. While the system is running, you can also view all of the currently mounted devices, whether they were mounted automatically by the system or manually by users, by viewing the /etc/mtab file.

Managing Filesystems

Once you’ve created a filesystem and mounted it to the virtual directory, you may have to manage and maintain it to keep things running smoothly. The following sections walk through some of the Linux utilities available for managing the filesystems on your Linux system.

Retrieving Filesystem Stats

As you use your Linux system, there’s no doubt that at some point you’ll need to monitor disk performance and usage. There are a few different tools available to help you do that:

  • df displays disk usage by partition.
  • du displays disk usage by directory, good for finding users or applications that are taking up the most disk space.
  • iostat displays a real-time chart of disk statistics by partition.
  • lsblk displays current partition sizes and mount points.

In addition to these tools, the /proc and /sys folders are special filesystems that the kernel uses for recording system statistics. Two files that can be useful when working with filesystems are /proc/partitions and /proc/mounts, which provide information on system partitions and mount points, respectively. Additionally, the /sys/block folder contains a separate folder for each block device (such as a drive), showing its partitions and kernel-level stats.
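
The /proc/partitions file is plain text, so it’s easy to summarize with standard tools. As a small sketch, the following function converts the block counts (which the kernel reports in 1,024-byte units) into mebibytes. The function name partition_sizes is invented for this example.

```shell
# Summarize a /proc/partitions-style table as "name size MiB" lines.
# partition_sizes is a hypothetical helper; the file defaults to /proc/partitions.
partition_sizes() {
    # The first two lines are a header and a blank line; each data line
    # holds four fields: major, minor, #blocks (1 KiB units), and name.
    awk 'NR > 2 && NF == 4 { printf "%s %.1f MiB\n", $4, $3 / 1024 }' \
        "${1:-/proc/partitions}"
}
```

On the 10GB drive from the fdisk example earlier in the chapter, partition_sizes would be expected to report roughly 10240.0 MiB for sda and 1024.0 MiB for sda1.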

Some filesystems, such as ext3 and ext4, allocate a specific number of inodes when created. An inode is an entry in the index table that tracks files stored on the filesystem. If the filesystem runs out of inode entries in the table, you can’t create any more files, even if there’s available space on the drive. Using the -i option with the df command will show you the percentage of inodes used on a filesystem and can be a lifesaver.
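A quick way to try this is to check the filesystem that holds the root directory. The sketch below assumes the percentage column sits just before the mount point on the last line of output, which is how df -i lays out its report for / on common systems; some filesystems report a dash instead of a percentage.

```shell
# Inode totals and usage for the filesystem mounted at /.
inode_report=$(df -i /)
echo "$inode_report"

# Pull out the inode-use percentage: the field just before the mount point.
inode_pct=$(printf '%s\n' "$inode_report" | awk 'END {print $(NF-1)}')
echo "inode use on /: $inode_pct"
```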

Filesystem Tools

Linux uses the e2fsprogs package of tools to provide utilities for working with ext filesystems (such as ext3 and ext4). The most popular tools in the e2fsprogs package are as follows:

  • blkid displays information about block devices, such as storage drives. (On many current distributions, blkid ships in the util-linux package rather than in e2fsprogs.)
  • chattr changes file attributes on the filesystem.
  • debugfs manually views and modifies the filesystem structure, such as undeleting a file or extracting a corrupt file.
  • dumpe2fs displays superblock and block group information.
  • e2label changes the label on the filesystem.
  • resize2fs expands or shrinks a filesystem.
  • tune2fs modifies filesystem parameters.

These tools help you fine-tune parameters on an ext filesystem, but if corruption occurs on the filesystem, you’ll need the fsck program.
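If you’d like to try a few of these tools without risking a real partition, one option is to practice on an ordinary file formatted as ext4. The sketch below assumes the e2fsprogs tools are installed and on your PATH; the file name comes from mktemp and the label demo is arbitrary.

```shell
# Scratch image file standing in for a partition; no real device is touched.
img=$(mktemp)
fs_label=skipped
if command -v mkfs.ext4 >/dev/null 2>&1 && command -v e2label >/dev/null 2>&1 \
   && command -v tune2fs >/dev/null 2>&1; then
  truncate -s 16M "$img"
  mkfs.ext4 -q -F "$img"        # -F: proceed even though this is a regular file
  e2label "$img" demo           # set the filesystem label
  fs_label=$(e2label "$img")    # read the label back
  # Show a few of the parameters that tune2fs can list from the superblock.
  tune2fs -l "$img" | grep -E 'Filesystem volume name|Inode count|Block count'
fi
echo "label: $fs_label"
rm -f "$img"
```

On a real system you would replace the image file with a partition device file, such as /dev/sdb1, and run the commands with sudo.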

The XFS filesystem also has a set of tools available for tuning the filesystem. Here are the two that you’ll most likely run across:

  • xfs_admin displays or changes filesystem parameters such as the label or UUID assigned.
  • xfs_info displays information about a mounted filesystem, including the block sizes and sector sizes as well as label and UUID information.

While these ext and XFS tools are useful, they can’t help fix things if the filesystem itself has errors. For that, the fsck program is the tool to use:

$ sudo fsck -f /dev/sdb1
fsck from util-linux 2.31.1
e2fsck 1.44.1 (24-Mar-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sdb1: 11/655360 files (0.0% non-contiguous), 66753/2621440 blocks
$

At the time of this writing, the XFS module for fsck does not repair XFS filesystems. For now, you’ll need to use the xfs_repair tool.

The fsck program is a front end to several different programs that check the various filesystems to match the index against the actual files stored in the filesystem. If any discrepancies occur, run the fsck program in repair mode, and it will attempt to reconcile the discrepancies and fix the filesystem.
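Before repairing anything, it’s often worth running a check that only reports problems. The following sketch does that against a scratch ext4 image file instead of a real partition, assuming the e2fsprogs tools are installed. A filesystem should be unmounted before you check it, which is trivially true for the image used here.

```shell
# Build a small ext4 filesystem inside a scratch file, then check it.
img=$(mktemp)
fsck_status=skipped
if command -v mkfs.ext4 >/dev/null 2>&1 && command -v e2fsck >/dev/null 2>&1; then
  truncate -s 16M "$img"
  mkfs.ext4 -q -F "$img"
  # -f forces a full check even if the filesystem looks clean;
  # -n opens it read-only and answers "no" to every repair question.
  if e2fsck -f -n "$img"; then
    fsck_status=clean
  else
    fsck_status=errors
  fi
fi
echo "check result: $fsck_status"
rm -f "$img"
```

If a read-only check like this reports errors on a real partition, rerunning without -n lets the program prompt you before each repair.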

Storage Alternatives

Standard partition layouts on storage devices do have their limitations. Once you create and format a partition, it’s not easy making it larger or smaller. Individual partitions are also susceptible to disk failures, in which case all of the data stored in the partition will be lost.

To accommodate more dynamic storage options, as well as fault-tolerance features, Linux has incorporated a few advanced storage management techniques. The following sections cover three of the more popular techniques you’ll run into.

Multipath

The Linux kernel now supports Device Mapper Multipathing (DM-multipathing), which allows you to configure multiple paths between the Linux system and network storage devices. Multipathing aggregates the paths, providing increased throughput while all of the paths are active or fault tolerance if one of the paths becomes inactive.

Linux DM-multipathing includes the following tools:

  • dm-multipath: The kernel module that provides multipath support
  • multipath: A command-line command for viewing multipath devices
  • multipathd: A background process for monitoring paths and activating/deactivating paths
  • kpartx: A command-line tool for creating device entries for multipath storage devices

The DM-multipath feature uses the dynamic /dev/mapper device file folder in Linux. Linux creates a /dev/mapper device file named mpathN for each new multipath storage device you add to the system, where N is the number of the multipath drive. That file acts as a normal device file to the Linux system, allowing you to create partitions and filesystems on the multipath device just as you would a normal drive partition.

Logical Volume Manager

The Linux Logical Volume Manager (LVM) also utilizes the /dev/mapper dynamic device folder to allow you to create virtual drive devices. You can aggregate multiple physical drive partitions into virtual volumes, which you then treat as a single partition on your system.

The benefit of LVM is that you can add and remove physical partitions as needed to a logical volume, expanding and shrinking the logical volume as needed.

Using LVM is somewhat complicated. Figure 11.3 demonstrates the layout for an LVM environment.

Figure 11.3 The Linux LVM layout

In the example shown in Figure 11.3, three physical drives each contain three partitions. The first logical volume consists of the first two partitions of the first drive. The second logical volume spans drives, combining the third partition of the first drive with the first and second partitions of the second drive to create one volume. The third logical volume consists of the third partition of the second drive and the first two partitions of the third drive. The third partition of the third drive is left unassigned and can be added later to any of the logical volumes when needed.

For each physical partition, you must mark the partition type as the Linux LVM filesystem type in fdisk or gdisk. Then, you must use several LVM tools to create and manage the logical volumes:

  • pvcreate creates a physical volume.
  • vgcreate groups physical volumes into a volume group.
  • lvcreate creates a logical volume from the space available in a volume group.

The logical volumes create entries in the /dev/mapper folder that represent the LVM device you can format with a filesystem and use like a normal partition. Listing 11.3 shows the steps you’d take to create a new LVM logical volume and mount it to your virtual directory.

Listing 11.3: Creating, formatting, and mounting a logical volume

$ sudo gdisk /dev/sdb

Command (? for help): n
Partition number (1-128, default 1): 1
First sector (34-10485726, default = 2048) or {+-}size{KMGTP}:
Last sector (2048-10485726, default = 10485726) or {+-}size{KMGTP}:
Current type is 'Linux filesystem'
Hex code or GUID (L to show codes, Enter = 8300): 8e00
Changed type of partition to 'Linux LVM'

Command (? for help): w

Final checks complete. About to write GPT data.
THIS WILL OVERWRITE EXISTING PARTITIONS!!

Do you want to proceed? (Y/N): Y
OK; writing new GUID partition table (GPT) to /dev/sdb.
The operation has completed successfully.

$ sudo pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created.

$ sudo vgcreate newvol /dev/sdb1
  Volume group "newvol" successfully created

$ sudo lvcreate -l 100%FREE -n lvdisk newvol
  Logical volume "lvdisk" created.

$ sudo mkfs -t ext4 /dev/mapper/newvol-lvdisk
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 1309696 4k blocks and 327680 inodes
Filesystem UUID: 06c871bc-2eb6-4696-896f-240313e5d4fe
Superblock backups stored on blocks:
       32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

$ sudo mkdir /media/newdisk
$ sudo mount /dev/mapper/newvol-lvdisk /media/newdisk
$ cd /media/newdisk
$ ls -al
total 24
drwxr-xr-x 3 root root  4096 Jan 10 10:17 .
drwxr-xr-x 4 root root  4096 Jan 10 10:18 ..
drwx------ 2 root root 16384 Jan 10 10:17 lost+found
$

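While the initial setup of an LVM is complicated, it does provide great benefits. If you run out of space in a logical volume, just add a new disk partition to the volume group as a physical volume, and then extend the logical volume and the filesystem on it into the new space.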
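Growing the volume from Listing 11.3 might look like the following sketch. It assumes a hypothetical second partition, /dev/sdc1, that is already marked as the Linux LVM type. Because these commands change disks, the script only prints them by default; the DRY_RUN variable and the run helper are illustrative conveniences, not part of LVM.

```shell
# Print the commands unless DRY_RUN is set to 0.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

new_part=/dev/sdc1    # hypothetical new partition

grow_volume() {
  run sudo pvcreate "$new_part"                   # make it a physical volume
  run sudo vgextend newvol "$new_part"            # add it to the volume group
  run sudo lvextend -l +100%FREE newvol/lvdisk    # grow the logical volume
  run sudo resize2fs /dev/mapper/newvol-lvdisk    # grow the ext4 filesystem
}

plan=$(grow_volume)
echo "$plan"
```

The order matters: the filesystem can only grow into space that the logical volume already owns, so resize2fs comes last.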

Using RAID Technology

Redundant Array of Inexpensive Disks (RAID) technology has changed the data storage environment for most data centers. RAID technology allows you to improve data access performance and reliability as well as implement data redundancy for fault tolerance by combining multiple drives into one virtual drive. There are several versions of RAID commonly used:

  • RAID-0: Disk striping spreads data across multiple disks for faster access but provides no redundancy.
  • RAID-1: Disk mirroring duplicates data across two drives.
  • RAID-10: Disk mirroring and striping provides striping for performance and mirroring for fault tolerance.
  • RAID-4: Disk striping with parity stores parity information on a dedicated disk so that data on a failed data disk can be recovered.
  • RAID-5: Disk striping with distributed parity spreads the parity information across all of the disks so that the data on any one failed disk can be recovered.
  • RAID-6: Disk striping with double parity stores two independent sets of parity information across the disks so that two failed drives can be recovered.

The downside is that hardware RAID storage devices can be somewhat expensive (despite what the I stands for) and are often impractical for most home uses. Because of that, Linux has implemented a software RAID system that can implement RAID features on any disk system.
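When comparing the RAID levels listed earlier, it helps to work out how much usable space each one leaves. The sketch below uses the standard capacity rules for equal-sized disks; the raid_usable function name and the sample disk counts are made up for illustration.

```shell
# raid_usable LEVEL DISKS SIZE
# Prints the usable capacity for DISKS equal drives of SIZE units each.
raid_usable() {
  lvl=$1; n=$2; sz=$3
  case "$lvl" in
    0)   echo $(( n * sz )) ;;          # striping only: all space is usable
    1)   echo "$sz" ;;                  # mirror: one drive's worth
    10)  echo $(( n / 2 * sz )) ;;      # striped mirrors: half the drives
    4|5) echo $(( (n - 1) * sz )) ;;    # one drive's worth of parity
    6)   echo $(( (n - 2) * sz )) ;;    # two drives' worth of parity
    *)   echo "unknown RAID level: $lvl" >&2; return 1 ;;
  esac
}

for level in 0 10 5 6; do
  echo "RAID-$level with 4 x 2 TB drives: $(raid_usable "$level" 4 2) TB usable"
done
```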

The mdadm utility allows you to specify multiple partitions to be used in any type of RAID environment. The RAID device appears as a single multiple-device (md) file in the /dev folder, such as /dev/md0, which you can then partition and format to a specific filesystem.
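As a rough sketch, mirroring two partitions with mdadm could look like the following. The member partitions /dev/sdb1 and /dev/sdc1 and the array name /dev/md0 are assumptions for illustration, so the script only prints the command that creates the array instead of running it.

```shell
md_dev=/dev/md0                    # hypothetical array device
members="/dev/sdb1 /dev/sdc1"      # hypothetical member partitions

# RAID-1 (mirroring) across two member partitions.
create_cmd="mdadm --create $md_dev --level=1 --raid-devices=2 $members"
echo "To build the mirror, run as root: $create_cmd"

# The kernel lists active software RAID arrays in /proc/mdstat
# once the md driver is loaded.
if [ -r /proc/mdstat ]; then
  cat /proc/mdstat
else
  echo "no md driver loaded on this system"
fi
```

After the array exists, you would format /dev/md0 with mkfs and mount it just like any other partition.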

Exercise 11.1 Experimenting with filesystems

This exercise will demonstrate how to partition, format, and mount a drive for use on a Linux system using a USB memory stick. You’ll need to have an empty USB memory stick available for this exercise. All data will be deleted from the USB memory stick.

  1. Log into your Linux system and open a new command prompt.
  2. Insert a USB memory stick into your system. If you’re using a virtual machine (VM) environment, you may need to configure the VM to recognize the new USB device. For VirtualBox, click the Devices menu bar item, then select USB, and then the USB device name.
  3. The Linux system should mount the device automatically. Type dmesg | tail to display the last few lines from the system console output. This should show if the USB device was mounted and, if so, the device name assigned to it, such as /dev/sdb1.
  4. Unmount the device using the command sudo umount /dev/xxxx, where xxxx is the device name shown from the dmesg output.
  5. Type sudo fdisk /dev/xxx to partition the disk, where xxx is the device name, without the partition number (such as /dev/sdb). At the command prompt, type p to display the current partitions.
  6. Remove the existing partition by typing d.
  7. Create a new partition. Type n to create a new partition. Type p to create a primary partition. Type 1 to assign it as the first partition. Press the Enter key to accept the default starting location and then press the Enter key again to accept the default ending location. Type y to remove the original VFAT signature if prompted.
  8. Save the new partition layout. Type w to save the partition layout and exit the fdisk program.
  9. Create a new filesystem on the new partition. Type sudo mkfs -t ext4 /dev/xxx1, where xxx is the device name for the USB memory stick.
  10. Create a new mount point in your home folder. Type mkdir mediatest1.
  11. Mount the new filesystem to the mount point. Type sudo mount -t ext4 /dev/xxx1 mediatest1, where xxx is the device name. Type ls mediatest1 to list any files currently in the filesystem.
  12. Unmount the filesystem by typing sudo umount /dev/xxx1, where xxx is the device name. You can then safely remove the USB stick.
  13. If you want to return the USB memory stick to a Windows format, you can change the filesystem type of the USB memory stick to VFAT, or you can reformat it using the Windows format tool in File Manager.

Summary

The ability to permanently store data on a Linux system is a must. The Linux kernel supports both hard disk drive (HDD) and solid-state drive (SSD) technologies for persistently storing data. It also supports the three main types of drive connections: PATA, SATA, and SCSI. For each storage device you connect to the system, Linux creates a raw device file in the /dev folder. The raw device is hdx for PATA drives and sdx for SATA and SCSI drives, where x is the drive letter assigned to the drive.

Once you connect a drive to the Linux system, you’ll need to create partitions on the drive. For MBR disks, you can use the fdisk or parted command-line tool or the gparted graphical tool. For GPT disks, you can use the gdisk or gparted tool. When you partition a drive, you must assign it a size and a filesystem type.

After you partition the storage device, you must format it using a filesystem that Linux recognizes. The mkfs program is a front-end utility that can format drives using most of the filesystems that Linux supports. The ext4 filesystem is currently the most popular Linux filesystem. It supports journaling and provides good performance. Linux also supports other filesystems, such as btrfs, xfs, and zfs, as well as the Windows vfat and ntfs filesystems.

After creating a filesystem on the partition, you’ll need to mount the filesystem into the Linux virtual directory using a mount point and the mount command. The data contained in the partition’s filesystem appears under the mount point folder within the virtual directory. To automatically mount partitions at boot time, make an entry for each partition in the /etc/fstab file.

There are a host of tools available to help you manage and maintain filesystems. The df and du commands are useful for checking disk space for partitions and the virtual directory, respectively. The fsck utility is a vital tool for repairing corrupt filesystems and is run automatically at boot time against all partitions automatically mounted in the virtual directory.

Linux also supports alternative solutions for storage, such as multipath IO for fault tolerance, logical volumes (within which you can add and remove physical partitions), and software RAID technology.

Exam Essentials

Describe how Linux works with storage devices. Linux creates raw device files in the /dev folder for each storage device you connect to the system. Linux also assigns a raw device file for each partition contained in the storage device.

Explain how to prepare a partition to be used in the Linux virtual directory. To use a storage device partition in the virtual directory, it must be formatted with a filesystem that Linux recognizes. Use the mkfs command to format the partition. Linux recognizes several different filesystem types, including ext3, ext4, btrfs, xfs, and zfs.

Describe how Linux can implement a fault-tolerance storage configuration. Linux supports two types of fault-tolerance storage methods. The multipath method uses the DM-multipath kernel module and the multipathd background process to manage two or more paths to the same storage device. If both paths are active, Linux aggregates the path speed to increase performance to the storage device. If one path fails, Linux automatically routes traffic through the active path. Linux can also use software RAID, managed with the mdadm utility, to support RAID levels 0, 1, 10, 4, 5, or 6 for fault tolerance and high-performance storage.

Describe how Linux uses virtual storage devices. Linux uses the logical volume manager (LVM) to create a virtual storage device from one or more physical devices. The pvcreate command defines a physical volume from a physical partition, and the vgcreate command creates a volume group from one or more physical volumes. The lvcreate command then creates a logical volume in the /dev/mapper folder from space in the volume group. This method allows you to add or remove drives within a volume group to grow or shrink the filesystem area as needed.

List some of the filesystem tools available in Linux. The df tool allows you to analyze the available and used space in drive partitions, while the du tool allows you to analyze space within the virtual directory structure. The e2fsprogs package provides a wealth of tools for tuning ext filesystems, such as debugfs, dumpe2fs, tune2fs, and blkid. Linux also provides the xfs_admin and xfs_info tools for working with xfs filesystems. The fsck tool is available for repairing corrupt filesystems and can repair most cases of filesystem corruption.

Review Questions

  1. Which type of storage device uses integrated circuits to store data with no moving parts?

    1. SSD
    2. SATA
    3. SCSI
    4. HDD
    5. PATA
  2. What raw device file would Linux create for the second SCSI drive connected to the system?

    1. /dev/hdb
    2. /dev/sdb
    3. /dev/sdb1
    4. /dev/hdb1
    5. /dev/sda
  3. What program runs in the background to automatically detect and mount new storage devices?

    1. mkfs
    2. fsck
    3. umount
    4. mount
    5. udev
  4. What folder does the udev program use to create a permanent link to a storage device based on its serial number?

    1. /dev/disk/by-path
    2. /dev/sdb
    3. /dev/disk/by-id
    4. /dev/disk/by-uuid
    5. /dev/mapper
  5. Which partitioning tool provides a graphical interface?

    1. gdisk
    2. gparted
    3. fdisk
    4. parted
    5. fsck
  6. Linux uses ________ to add the filesystem on a new storage device to the virtual directory.

    1. Mount points
    2. Drive letters
    3. /dev files
    4. /proc folder
    5. /sys folder
  7. What filesystem is the latest version of the original Linux filesystem?

    1. reiserFS
    2. btrfs
    3. ext3
    4. ext4
    5. nfs
  8. What tool do you use to create a new filesystem on a partition?

    1. fdisk
    2. mkfs
    3. fsck
    4. gdisk
    5. parted
  9. What tool do you use to manually add a filesystem to the virtual directory?

    1. fsck
    2. mount
    3. umount
    4. fdisk
    5. mkfs
  10. The ________ program is a handy tool for repairing corrupt filesystems.

    1. fsck
    2. mount
    3. umount
    4. fdisk
    5. mkfs