10 Directory tree

Of all the programming tasks, I’m embarrassed to admit that I enjoy coding file utilities the most. The casual user is unaware of the mountain of information about files provided by the operating system. It’s highly detailed low-hanging fruit, eager for plucking. Plus, exploring files and directories opens your understanding of how computer storage works. Exploring this field may inspire you to write your own interesting file utilities. If it doesn’t, you can keep reading this chapter—your introduction to filesystems and storage.

The goal here is to create a directory tree program. The output shows subdirectories as they sit in the hierarchical filesystem. In addition to being exposed to the word hierarchical (which I can amazingly both spell and type), in this chapter you learn how to:

Examine information about a file
Decipher file modes and permissions
Read a directory entry
Use recursion to explore the directory structure
Extract a directory name from a full pathname
Output a directory tree
Avoid confusing the word hierarchical with hieroglyphical

Before diving into the details, be aware that GUI nomenclature prefers the term folder over directory. As a C programmer, you must use the term directory, not folder. All C functions that deal with files and directories use directory or contain the abbreviation “dir.” Don’t wimp out and use the term folder.

The point of the directory tree utility is to output a map of the directory structure. The map details which directories are parents and children of each other. Unlike years ago, today’s directory structures are busy with lots of organization. Users are more attentive when it comes to saving files. Programs are geared toward this type of organization and provide hints to help users employ the subdirectory concept.

Even if a directory map seems trivial, the process of exploring the directory tree lends itself well to other handy disk utilities. For example, chapter 11 covers a file-finding utility, which relies heavily upon the information presented in this chapter to make the utility truly useful.

10.1 The filesystem

At the core of all media storage lies the filesystem. The filesystem describes the way data is stored on media, how files are accessed, and various nerdy tidbits about the files themselves.

The only time most users deal with the filesystem concept is when formatting media. Choosing a filesystem is part of the formatting process, because it determines how the media is formatted and which protocols to follow. This step is necessary for compatibility: not every filesystem is compatible with every computer operating system. Therefore, the user is allowed to select a filesystem for the media’s format to allow for sharing between operating systems, such as Linux and PC or Macintosh.

The filesystem’s duty is to organize storage. It takes a file’s data and writes it to one or more locations on the media. This information is recorded along with other file details, such as the file’s name, size, dates (created, modified, accessed), permissions, and so on.

Some of the file details are readily obtainable through existing utilities or from various C library functions. But most of the mechanics of the filesystem are geared toward saving, retrieving, and updating the file’s data lurking on the media. All this action takes place automatically under the supervision of the operating system.

The good news for most coders is that it isn’t necessary to know the minutiae of how files are stored on media. Even if you go full nerd and understand the subtle differences between the various filesystems and can tout the benefits of the High Performance File System (HPFS) at nerd cocktail parties, the level of media access required to manipulate a filesystem requires privileges above where typical C programs operate. Functions are available for exploring a file’s details. These functions are introduced in the next section.

If you’re curious, beyond knowing the names and perhaps a few details on how filesystems work, you can use common tools on your computer to see which filesystems are in use. In a Linux terminal window, use the man fs command to review details on how Linux uses a filesystem and the different filesystems available. The /proc/filesystems file lists the filesystems available to your Linux installation.

Windows keeps its filesystem information tucked away in the Disk Management console. To access this window, follow these steps:

Tap the Windows key on the keyboard to open the Start menu.
Type Disk Management.
From the list of search results, choose Create and Format Hard Disk Partitions.

Figure 10.1 shows the Disk Management console from one of my Windows computers. Available media is presented in the table, with the File System column listing the filesystems used; only NTFS is shown in the figure.

Figure 10.1 The Disk Management console reveals the filesystem used to format media available to the PC.

On the Macintosh, you can use the Disk Utility to browse available media to learn which filesystem is in use. This app is found in the Utilities directory: in the Finder, click Go > Utilities to view the directory and access the Disk Utility app.

If it were easy or necessary to program a filesystem, I’d explore the topic further. For now, understand that the filesystem is the host for data stored on media in a computer. A program such as a directory tree uses the filesystem, but in C, such a utility doesn’t need to know details about the filesystem type to do its job.

10.2 File and directory details

To gather directory details at the command prompt, use the ls command. It’s available in all shells, dating back to the first, prehistoric version of Unix used by the ancient Greeks, when the command was known as λσ. The output is a list of filenames in the current directory:

$ ls
changecwd.c  dirtree04.c   fileinfo03.c  readdir01.c  subdir01.c  subdir05.c
dirtree01.c  extractor.c   fileinfo04.c  readdir02.c  subdir02.c  subdir06.c
dirtree02.c  fileinfo01.c  fileinfo05.c  readdir03.c  subdir03.c
dirtree03.c  fileinfo02.c  getcwd.c  readdir04.c  subdir04.c

For more detail, the -l (long) switch is specified:

$ ls -l
total 68
-rwxrwxrwx 1 dang dang  292 Oct 31 16:26 changecwd.c
-rwxrwxrwx 1 dang dang 1561 Nov  4 21:14 dirtree01.c
-rwxrwxrwx 1 dang dang 1633 Nov  5 10:39 dirtree02.c
...

This output shows details about each file: its permissions, ownership, size, date, and other trivia you can use to intimidate your computer-illiterate pals. It’s not secret stuff; the details output by the ls -l command are stored in the directory like a database. In fact, directories on storage media are really databases. Their records aren’t specifically files, but rather inodes.

An inode is not an Apple product. No, it’s a collection of data that describes a file. Although your C programs can’t readily access low-level filesystem details, you can easily examine a file’s inode data. You refer to an inode by the file’s name, which the directory pairs with an inode number. But beyond the name, the inode contains oodles of details about the file.

10.2.1 Gathering file info

To obtain details about a file, as well as to read a directory, you need to access inode data. The command-line program that does so is called stat. Here’s some sample output from running stat on the program file fileinfo:

  File: fileinfo
  Size: 8464        Blocks: 24         IO Block: 4096   regular file
Device: eh/14d  Inode: 11258999068563657  Links: 1
Access: (0777/-rwxrwxrwx)  Uid: ( 1000/    dang)   Gid: ( 1000/    dang)
Access: 2021-10-23 21:11:17.457919300 -0700
Modify: 2021-10-23 21:11:00.071527400 -0700
Change: 2021-10-23 21:11:00.071527400 -0700

These details are stored in the directory database. In fact, part of the output shows the file’s inode number: 11258999068563657. Of course, the name fileinfo is far easier to use as a reference.

To read this same information in your C programs, you use the stat() function. It’s prototyped in the sys/stat.h header file. Here is the man page format:

int stat(const char *pathname, struct stat *statbuf);

The pathname is a filename or a full pathname. Argument statbuf is the address of a stat structure. Here’s a typical stat() function statement, with the filename char pointer containing the filename, fs as a stat structure, and int variable r capturing the return value:

r = stat(filename,&fs);

Upon failure, value -1 is returned. Otherwise, 0 is returned and the stat structure fs is joyously filled with details about the file—inode data. Table 10.1 lists the common members of the stat structure, though different filesystems and operating systems add or change specific members.

Table 10.1 Members in the stat() function’s statbuf structure

Member	Data type (placeholder)	Detail
st_dev	dev_t (%lu)	ID of the media (device) containing the file
st_ino	ino_t (%lu)	Inode number
st_mode	mode_t (%u)	File type, mode, permissions
st_nlink	nlink_t (%lu)	Number of links
st_uid	uid_t (%u)	Owner’s user ID
st_gid	gid_t (%u)	Group’s user ID
st_rdev	dev_t (%lu)	Special file type’s device ID
st_size	off_t (%lu)	File size in bytes
st_blksize	blksize_t (%lu)	Filesystem’s block size
st_blocks	blkcnt_t (%lu)	File blocks allocated (512-byte blocks)
st_atime	time_t	Time file last accessed
st_mtime	time_t	Time file last modified
st_ctime	time_t	Time file status last changed

Most of the stat structure members are integers; I’ve specified the printf() placeholder type in table 10.1. The placeholders shown fit a typical 64-bit Linux system; widths vary among platforms, and a few types, such as off_t for the file size, are signed rather than unsigned. Watch out for the long values because the compiler bemoans using the incorrect placeholder to represent these values.

On current Linux systems, each time value is stored in a timespec structure, which contains two members: tv_sec and tv_nsec for seconds and nanoseconds, respectively. The names st_atime, st_mtime, and st_ctime refer to the tv_sec member, a time_t value, which is why their addresses can be passed to the ctime() function. An example is shown later.

The following listing shows a sample program, fileinfo01.c, that outputs file (or inode) details. Each of the stat structure members is accessed for a file supplied as a command-line argument. Most of the code consists of error-checking—for example, to confirm that a filename argument is supplied and to check on the return status of the stat() function.

Listing 10.1 Source code for fileinfo01.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
 
int main(int argc, char *argv[])                         ❶
{
    char *filename;
    struct stat fs;
    int r;
 
    if( argc<2 )                                         ❷
    {
        fprintf(stderr,"Specify a filename\n");
        exit(1);
    }
 
    filename = argv[1];                                  ❸
    printf("Info for file '%s'\n",filename);
    r = stat(filename,&fs);                              ❹
    if( r==-1 )                                          ❺
    {
        fprintf(stderr,"Error reading '%s'\n",filename);
        exit(1);
    }
 
    printf("Media ID: %lu\n",fs.st_dev);                 ❻
    printf("Inode #%lu\n",fs.st_ino);                    ❻
    printf("Type and mode: %u\n",fs.st_mode);            ❻
    printf("Hard links = %lu\n",fs.st_nlink);            ❻
    printf("Owner ID: %u\n",fs.st_uid);                  ❻
    printf("Group ID: %u\n",fs.st_gid);                  ❻
    printf("Device ID: %lu\n",fs.st_rdev);               ❻
    printf("File size %lu bytes\n",fs.st_size);          ❻
    printf("Block size = %lu\n",fs.st_blksize);          ❻
    printf("Allocated blocks = %lu\n",fs.st_blocks);     ❻
    printf("Access: %s",ctime(&fs.st_atime));            ❼
    printf("Modification: %s",ctime(&fs.st_mtime));      ❼
    printf("Changed: %s",ctime(&fs.st_ctime));           ❼
    return(0);
}

❶ The filename is supplied as a program argument.

❷ Confirms the first argument

❸ Referring to the argument using char pointer filename aids readability.

❹ Calls the stat() function

❺ Checks for an error

❻ Outputs the members of the stat structure fs

❼ The time structures use the ctime() function to output their values.

The information output by the fileinfo01.c program mirrors what the command-line stat utility coughs up. Here’s a sample run on the same file, fileinfo, which is the program built from this code:

Info for file 'fileinfo'
Media ID: 14
Inode #7318349394555950
Type and mode: 33279
Hard links = 1
Owner ID: 1000
Group ID: 1000
Device ID: 0
File size 8464 bytes
Block size = 4096
Allocated blocks = 24
Access: Tue Oct 26 15:55:10 2021
Modification: Tue Oct 26 15:55:10 2021
Changed: Tue Oct 26 15:55:10 2021

The details are the same as for the stat command’s output shown earlier in this section. The stat command does look up the Device ID, Owner ID, and Group ID details, which your code could do as well. But one curious item is structure member st_mode, the type and mode value. The value shown in the output above is 33279. This integer value contains a lot of details—bit fields—which you see interpreted in the stat command’s output. Your code can also examine this value to determine the file type and its permissions.

10.2.2 Exploring file type and permissions

Examining a file’s (or inode’s) st_mode value is how you determine whether a file is a regular old file, a directory, or some other special type of file. Remember that in the Linux environment, everything is a file. Using the stat() function is how your code can determine which type of file the inode represents.

The bit fields in the st_mode member of the stat structure also describe the file’s permissions. Though you could code a series of complex bitwise logical operations to ferret out the specific details contained in the st_mode value’s bits, I recommend that you use instead the handy macros available in the sys/stat.h header file.

For example, the S_ISREG() macro returns TRUE for regular files. To update the fileinfo01.c code to test for regular files, add the following statements:

printf("Type and mode: %X\n",fs.st_mode);
if( S_ISREG(fs.st_mode) )
    printf("%s is a regular file\n",filename);
else
    printf("%s is not a regular file\n",filename);

If the S_ISREG() test on the fs.st_mode variable returns TRUE, the printf() statement belonging to the if statement outputs text confirming that the file is regular. The else condition handles other types of files, such as directories.

In my update to the code, fileinfo02.c (available in the online archive), I removed all the printf() statements from the original code. The five statements shown earlier replace the original printf() statements, because the focus of this update is to determine file type. Here’s sample output on the fileinfo02.c source code file itself:

Info for file 'fileinfo02.c'
Type and mode: 81FF
fileinfo02.c is a regular file

If I instead specify the single dot (.), representing the current directory, I see this output:

Info for file '.'
Type and mode: 41FF
. is not a regular file

In the output above, the st_mode value changes as well as the return value from the S_ISREG() macro; a directory isn’t a regular file. In fact, you can test for directories specifically by using the S_ISDIR() macro:

printf("Type and mode: %X\n",fs.st_mode);
if( S_ISREG(fs.st_mode) )
    printf("%s is a regular file\n",filename);
else if( S_ISDIR(fs.st_mode) )
    printf("%s is a directory\n",filename);
else
    printf("%s is some other type of file\n",filename);

I’ve made these modifications and additions to the code in fileinfo02.c, with the improvements saved in fileinfo03.c, available in this book’s online repository.

Further modifications to the code are possible by using the full slate of file mode macros, listed in table 10.2. These are the common macros, though your C compiler and operating system may offer more. Use these macros to identify files by their type.

Table 10.2 Macros defined in sys/stat.h to help determine file type

Macro	True for this type of file
S_ISBLK()	Block special, such as mass storage in the /dev directory
S_ISCHR()	Character special, such as a terminal or the /dev/null device
S_ISDIR()	Directories
S_ISFIFO()	A FIFO (named pipe)
S_ISREG()	Regular files
S_ISLNK()	Symbolic link
S_ISSOCK()	Socket

File type details aren’t the only information contained in the st_mode member of the stat structure. This value also reveals the file’s permissions. File permissions refer to access bits that determine who-can-do-what to a file. Three access bits, called an octet, are available:

Read (r)
Write (w)
Execute (x)

Read permission means that the file’s data can be read. Write permission allows the file’s data to be modified. Execute permission is set for program files, such as your C programs (set automatically by the compiler or linker), shell scripts (set manually), and directories, where it grants the ability to enter and search the directory. This is all standard Linux stuff, so if you desire more information, hunt down a grim, poorly written book on Linux for specifics.

In Linux, the chmod command sets and resets file permissions. These permissions can be seen in the long listing of a file when using the ls command with the -l (little L) switch:

$ ls -l fileinfo
-rwxrwxrwx 1 dang dang 8464 Oct 26 15:55 fileinfo

The first chunk of info, -rwxrwxrwx, indicates the file type and permissions, which are detailed in figure 10.2. Next is the number of hard links (1), the owner (dang), and the group (dang). The value 8,464 is the file size in bytes, and then comes the date and time stamp, and finally the filename.

Figure 10.2 Deciphering file permission bits in a long directory listing

Three sets of file permissions octets are used for a file. These sets are based on user classification:

Owner
Group
Other

You are the owner of the files you create. As a user on the computer, you are also a member of a group. Use the id command to view your username and ID number, as well as the groups you belong to (names and IDs). View the /etc/group file to see the full list of groups on the system.

File owners grant themselves full access to their files. Setting group permissions is one way to grant access to a bunch of system users at once. The third field, other, applies to anyone who is not the owner or in the named group.

In the long directory listing, a file’s owner and group appear as shown earlier, preceded by the permissions that apply to each classification. The permissions are interpreted from the st_mode member of the file’s stat structure. As with obtaining the file’s type, you can use defined constants and macros available in the sys/stat.h header file to test for the permissions for each user classification.

I count nine permission-defined constants available in sys/stat.h, which accounts for each permission octet (three) and the three permission types: read, write, and execute. These are shown in table 10.3.

Table 10.3 Defined constants used for permissions, available from the sys/stat.h header file

Defined constant	Permission octet
S_IRUSR	Owner read permission
S_IWUSR	Owner write permission
S_IXUSR	Owner execute permission
S_IRGRP	Group read permission
S_IWGRP	Group write permission
S_IXGRP	Group execute permission
S_IROTH	Other read permission
S_IWOTH	Other write permission
S_IXOTH	Other execute permission

The good news is that these defined constants follow a naming pattern: each defined constant starts with S_I. The I is followed by R, W, or X for read, write, or execute, respectively. This letter is followed by USR, GRP, or OTH for Owner (user), Group, and Other. This naming convention is summarized in figure 10.3.

Figure 10.3 The naming convention used for permission defined constants in sys/stat.h

For example, if you want to test the read permission for a group user, you use the S_IRGRP defined constant: S_I plus R for read and GRP for group. This defined constant is used in an if test with a bitwise AND operator to test the permission bit on the st_mode member:

if( fs.st_mode & S_IRGRP )

The value in fs.st_mode (the file’s mode, including type and permissions) is tested against the bit in the S_IRGRP defined constant. If the test is true, meaning the bit is set, the file has read permission set for members of the file’s group.

Listing 10.2 puts the testing macros and defined constants to work for a file supplied as a command-line argument. This update to the fileinfo series of programs outputs the file type and permissions for the named file. An if else-if else structure handles the different file types as listed in table 10.2. Three sets of if tests output permissions for the three user classifications. You see all the macros and defined constants discussed in this section used in the code. The code appears lengthy, but it contains a lot of copied and pasted information.

Listing 10.2 Source code for fileinfo04.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
 
int main(int argc, char *argv[])
{
    char *filename;
    struct stat fs;
    int r;
 
    if( argc<2 )
    {
        fprintf(stderr,"Specify a filename\n");
        exit(1);
    }
 
    filename = argv[1];
    r = stat(filename,&fs);
    if( r==-1 )
    {
        fprintf(stderr,"Error reading '%s'\n",filename);
        exit(1);
    }
                                         ❶
    printf("File '%s' is a ",filename);
    if( S_ISBLK(fs.st_mode) )            ❷
        printf("block special\n");
    else if( S_ISCHR(fs.st_mode) )
        printf("character special\n");
    else if( S_ISDIR(fs.st_mode) )
        printf("directory\n");
    else if( S_ISFIFO(fs.st_mode) )
        printf("named pipe\n");
    else if( S_ISREG(fs.st_mode) )
        printf("regular file\n");
    else if( S_ISLNK(fs.st_mode) )
        printf("symbolic link\n");
    else if( S_ISSOCK(fs.st_mode) )
        printf("socket\n");
    else
        printf("type unknown\n");
 
    printf("Owner permissions: ");       ❸
    if( fs.st_mode & S_IRUSR )
        printf("read ");
    if( fs.st_mode & S_IWUSR )
        printf("write ");
    if( fs.st_mode & S_IXUSR )
        printf("execute");
    putchar('\n');
 
    printf("Group permissions: ");       ❹
    if( fs.st_mode & S_IRGRP )
        printf("read ");
    if( fs.st_mode & S_IWGRP )
        printf("write ");
    if( fs.st_mode & S_IXGRP )
        printf("execute");
    putchar('\n');
 
    printf("Other permissions: ");       ❺
    if( fs.st_mode & S_IROTH )
        printf("read ");
    if( fs.st_mode & S_IWOTH )
        printf("write ");
    if( fs.st_mode & S_IXOTH )
        printf("execute");
    putchar('\n');
 
    return(0);
}

❶ New stuff starts here.

❷ Determines the file type, a long if-else structure

❸ Tests owner permission bits

❹ Tests group permission bits

❺ Tests other permission bits

The program I created from the source code shown in listing 10.2 is named a.out, the default. Here is a sample run on the original fileinfo program:

$ ./a.out fileinfo
File 'fileinfo' is a regular file
Owner permissions: read write execute
Group permissions: read write execute
Other permissions: read write execute

The information shown here corresponds to an ls -l listing output of -rwxrwxrwx.

Here is the output for system directory /etc:

$ ./a.out /etc
File '/etc' is a directory
Owner permissions: read write execute
Group permissions: read execute
Other permissions: read execute

From this output, the file type is correctly identified as a directory. The owner permissions are rwx (the owner is root). The group and other permissions are r-x, which means anyone on the computer can read and access (execute) the directory.

Exercise 10.1

The if-else structures in listing 10.2 (fileinfo04.c) contain a lot of repetition. Seeing repetitive statements in code cries out to me for a function. Your task for this exercise is to write a function that outputs a file’s permissions.

Call the function permissions_out(). It takes a mode_t argument of the st_mode member in a stat structure. Here is the prototype:

void permissions_out(mode_t stm);

Use the function to output a string of permissions for each of the three access levels: owner, group, other. Use characters r, w, x, for read, write, and execute access if a bit is set; use a dash (-) for unset items. This output is the same as shown in the ls -l listing, but without the leading character identifying the file type.

A simple approach exists for this function, and I hope you find it. If not, you can view my solution in the source code file fileinfo05.c, available in the online repository. Please try this exercise on your own before peeking at my solution; comments in my code explain my philosophy. Use the fileinfo series of programs to perform the basic operations for the stat() function, if you prefer.

10.2.3 Reading a directory

A directory is a database of files, but call them inodes if you want to have a nerd find you attractive. Just like a file, a directory database is stored on media. But you can’t use the fopen() function to open and read the contents of a directory. No, instead you use the opendir() function. Here is its man page format:

DIR *opendir(const char *filename);

The opendir() function accepts a single argument, a string representing the pathname of the directory to examine. Specifying the shortcuts . and .. for the current and parent directory is also valid.

The function returns a pointer to a DIR handle, similar to the FILE handle used by the fopen() function. As the FILE handle represents a file stream, the DIR handle represents a directory stream.

Upon an error, the NULL pointer is returned. The global errno value is set, indicating the specific booboo the function encountered.

The opendir() function features a companion closedir() function, similar to the fclose() function as a companion to fopen(). The closedir() function requires a single argument, the DIR handle of an open directory stream, humorously called “dirp” in the man page format example:

int closedir(DIR *dirp);

Yes, I know that the internet spells it “derp.”

Upon success, the closedir() function returns 0. Otherwise, the value -1 is returned and the global errno variable is set, yadda-yadda.

Both the opendir() and closedir() functions are prototyped in the dirent.h header file.

In the following listing, you see both the opendir() and closedir() functions put to work. The current directory "." is opened because it’s always valid.

Listing 10.3 Source code for readdir01.c

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
 
int main()
{
    DIR *dp;                         ❶
 
    dp = opendir(".");               ❷
    if(dp == NULL)                   ❸
    {
        puts("Unable to read directory");
        exit(1);
    }
 
    puts("Directory is opened!");
 
    closedir(dp);                    ❹
    puts("Directory is closed!");
 
    return(0);
}

❶ Directory handle

❷ Opens the current directory, whatever it may be

❸ Exits the program upon failure to open

❹ And just closes it back up

The code in listing 10.3 merely opens and closes the current directory. Boring! To access the files stored in the directory, you use another function, readdir(). This function is also prototyped in the dirent.h header file. Here is the man page format:

struct dirent *readdir(DIR *dirp);

The function consumes an open DIR handle as its only argument. The return value is the address of a dirent structure, which contains details about a directory entry. This function is called repeatedly to read file entries (inodes) from the directory stream. The value NULL is returned after the final entry in the directory has been read.

Sadly, the dirent structure isn’t as rich as I’d like it to be. Table 10.4 lists the common structure members, though some C libraries offer more. Any extra members are specific to the compiler or operating system and shouldn’t be relied on for code you plan to release into the wild. The only two members required by the POSIX.1 standard are d_ino for the entry’s inode and d_name for the entry’s filename.

Table 10.4 Common members of the dirent structure

Member	Data type (placeholder)	Description
d_ino	ino_t (%lu)	Inode number
d_reclen	unsigned short (%u)	Record length (not required by POSIX.1)
d_name	char array (%s)	Entry’s filename

The best structure member to use, and one that’s consistently available across all compilers and platforms, is d_name. This member is used in the source code for readdir02.c, shown in the next listing. This update to readdir01.c removes two silly puts() statements. Added is a readdir() statement, along with a printf() function to output the name of the first file found in the current directory.

Listing 10.4 Source code for readdir02.c

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
 
int main()
{
    DIR *dp;
    struct dirent *entry;                   ❶
 
    dp = opendir(".");
    if(dp == NULL)
    {
        puts("Unable to read directory");
        exit(1);
    }
 
    entry = readdir(dp);                    ❷
 
    printf("File %s\n",entry->d_name);      ❸
 
    closedir(dp);
 
    return(0);
}

❶ The dirent structure is created as a pointer, a memory address.

❷ The entry is read and stored in the dirent structure entry.

❸ The d_name member is output.

The program generated from the readdir02.c source code outputs only one file—most likely, the entry for the current directory itself, the single dot. Obviously, if you want a real directory-reading program, you must modify the code.

As with using the fread() function to read data from a regular file, the readdir() function is called repeatedly. When the function returns a pointer to a dirent structure, another entry is available in the directory. Only when the function returns NULL has the full directory been read.

To update the code from readdir02.c to readdir03.c, you must change the readdir() statement into a while loop condition. The printf() statement is then set inside the while loop. Here are the changed lines:

while( (entry = readdir(dp)) != NULL )
{
    printf("File %s\n",entry->d_name);
}

The while loop repeats as long as the value returned from readdir() isn’t NULL. With this update, the program now outputs all files in the current directory.

To gather more information about files in a directory, use the stat() function, covered earlier in this chapter. The readdir() function’s dirent structure contains the file’s name in the d_name member. When this detail is known, you use the stat() function to gather details on the file’s type as well as other information.

The final rendition of the readdir series of programs is shown next. It combines code previously covered in this chapter to create a crude directory listing program. Entries are read one at a time, with the stat() function returning specific values for file type, size, and access date.

Listing 10.5 Source code for readdir04.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <dirent.h>
#include <time.h>
 
int main()
{
    DIR *dp;
    struct dirent *entry;
    struct stat fs;
    int r;
    char *filename;
 
    dp = opendir(".");
    if(dp == NULL)
    {
        puts("Unable to read directory");
        exit(1);
    }
 
    while( (entry = readdir(dp)) != NULL )
    {
        filename = entry->d_name;              ❶
        r = stat( filename,&fs );              ❷
        if( r==-1 )
        {
            fprintf(stderr,"Error reading '%s'\n",filename);
            exit(1);
        }
        if( S_ISDIR(fs.st_mode) )              ❸
            printf(" Dir %-16s ",filename);    ❹
        else
            printf("File %-16s ",filename);    ❺
 
        printf("%8lu bytes ",fs.st_size);      ❻
 
        printf("%s",ctime(&fs.st_atime));      ❼
    }
 
    closedir(dp);
 
    return(0);
}

❶ Saves the directory entry’s name for readability and easy access

❷ Fills the stat structure for the current filename/directory entry

❸ Calls out directories from other file types

❹ Outputs the directory filename left-justified in a 16-character width

❺ Lines up a standard filename just like the directory filename

❻ Outputs the file size in an 8-character width

❼ Outputs the access time, which automatically adds a newline

This code shows that to truly read a directory, you need both the readdir() and stat() functions. Together, they pull in details about files in the directory—useful information if you plan on exploring directories or writing similar file utilities, such as a directory tree.

Here is sample output from the program generated by the readdir04.c source code:

 Dir .                    4096 bytes Sat Oct 30 16:44:34 2021
 Dir ..                   4096 bytes Fri Oct 29 21:55:05 2021
File a.out                8672 bytes Sat Oct 30 16:44:34 2021
File fileinfo             8464 bytes Tue Oct 26 15:55:22 2021
File fileinfo01.c          966 bytes Sat Oct 30 16:24:49 2021
File readdir01.c           268 bytes Fri Oct 29 19:30:10 2021

Incidentally, the order in which directory entries appear is dependent on the operating system. Some operating systems sort the entries alphabetically, so the readdir() function fetches filenames in that order. This behavior isn’t consistent, so don’t rely upon it for the output of your directory-reading programs.

10.3 Subdirectory exploration

Directories are referenced in three ways:

As a named path
As the .. shortcut to the parent directory
As a directory entry in the current directory, a subdirectory

Whatever the approach, pathnames are either direct or relative. A direct path is a fully named path, starting at the root directory, your home directory, or the current directory. A relative pathname uses the .. shortcut for the parent directory—sometimes, a lot of them.

As an example, a full pathname could be:

/home/dang/documents/finances/bank/statements

This direct pathname shows the directories as they branch from the root, through my home directory, down to the statements directory.

If I have another directory, /home/dang/documents/vacations, but I’m using the statements directory (shown earlier), the relative path from statements to vacations is:

../../../vacations

The first .. represents the bank directory. The second .. represents the finances directory. The third .. represents the documents directory, where vacations exists as a subdirectory. This construction demonstrates a relative path.

These details about the path are a basic part of using Linux at the command prompt. Understanding these items is vital when it comes to your C programs and how they explore and access directories.

10.3.1 Using directory exploration tools

Along with using the opendir() function to read a directory and readdir() to examine directory entries, your code may need to change directories. Further, the program may want to know in which directory it’s currently running. Two C library functions exist to sate these desires: chdir() and getcwd(). I cover getcwd() first because it can be used to confirm that the chdir() function did its job.

The getcwd() function obtains the directory in which the program is operating. Think of the name as Get the Current Working Directory. It works like the pwd command in the terminal window. This function is prototyped in the unistd.h header file. Here is the man page format:

char *getcwd(char *buf, size_t size);

Buffer buf is a character array or buffer of size characters. It’s where the current directory string is saved, an absolute path from the root. Here’s a tip: you can use the BUFSIZ defined constant for the size of the buffer as well as the second argument to getcwd(). Some C libraries have a PATH_MAX defined constant, which is available from the limits.h header file. Because its availability is inconsistent, I recommend using BUFSIZ instead. (The PATH_MAX defined constant is covered in chapter 11.)

The return value from getcwd() is the same character string saved in buf, or NULL upon an error. For the specific error, check the global errno variable.

The following listing shows a tiny demo program, getcwd.c, that outputs the current working directory. I use the BUFSIZ defined constant to set the size for char array cwd[]. The function is called and then the string output.

Listing 10.6 Source code for getcwd.c

#include <stdio.h>
#include <unistd.h>
 
int main()
{
    char cwd[BUFSIZ];                                      ❶
    getcwd(cwd,BUFSIZ);
    printf("The current working directory is %s\n",cwd);   ❷
 
    return(0);
}

❶ The defined constant BUFSIZ is defined in the stdio.h header file.

❷ Outputs the current working directory

When run, the program outputs the current working directory as a full pathname. The buffer is filled with the same text you’d see output from the pwd command.
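
Listing 10.6 ignores the value returned from getcwd(). As a minimal sketch of checking it, the following function tests for NULL and uses perror() to report whatever reason is stored in errno. The function name show_cwd() is hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* show_cwd() is a hypothetical wrapper around getcwd(). It outputs the
   current working directory and returns 0, or returns -1 when getcwd()
   fails (for example, when the buffer is too small for the path). */
int show_cwd(char *buf, size_t size)
{
    if( getcwd(buf,size)==NULL )
    {
        perror("getcwd");           /* reports the reason stored in errno */
        return(-1);
    }

    printf("The current working directory is %s\n",buf);
    return(0);
}
```

Passing a char array declared with BUFSIZ elements, and BUFSIZ as the size, matches the approach in the listing. A buffer of only one character can’t hold even the root directory’s path plus its null terminator, so the call fails.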

The second useful directory function is chdir(). This function works like the cd command in Linux. If you pay the senior price to see a movie, you may have used the chdir command in MS-DOS, though cd was also available and quicker to type.

Like getcwd(), the chdir() function is prototyped in the unistd.h header file. Here is the man page format:

int chdir(const char *path);

The sole argument is a string representing the directory (path) to change to. The return value is 0 upon success, with -1 indicating an error. As you may suspect by now, the global variable errno is set to indicate exactly what went afoul.

I use both directory exploration functions in the changecwd.c source code shown in the next listing. The chdir() function changes to the parent directory, indicated by the double dots. The getcwd() function obtains the full pathname to the new directory, outputting the results.

Listing 10.7 Source code for changecwd.c

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
 
int main()
{
    char cwd[BUFSIZ];
    int r;
 
    r = chdir("..");                                        ❶
    if( r==-1 )
    {
        fprintf(stderr,"Unable to change directories\n");
        exit(1);
    }
 
    getcwd(cwd,BUFSIZ);                                     ❷
    printf("The current working directory is %s\n",cwd);    ❸
 
    return(0);
}

❶ Changes to the parent directory

❷ Obtains the parent directory’s path

❸ Outputs the parent directory’s path

The resulting program outputs the pathname to the parent directory of the directory in which the program is run.

You may notice in the source code for changecwd.c that I don’t bother returning to the original directory. Such coding isn’t necessary. An important thing to remember about using the chdir() function is that the directory change affects only the running program’s own process. The program may change to directories all over the media, but when it exits, the shell’s current directory is the same as it was when the program started.
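
The error message in listing 10.7 doesn’t say why the change failed. As a minimal sketch of using errno, the following function passes it to strerror(), declared in string.h, to obtain a readable reason. The function name change_to() is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

/* change_to() is a hypothetical wrapper around chdir() that reports
   the reason for a failure. Returns 0 on success, -1 on error. */
int change_to(const char *path)
{
    if( chdir(path)==-1 )
    {
        fprintf(stderr,"Unable to change to '%s': %s\n",
                path,
                strerror(errno)     /* text describing the errno value */
               );
        return(-1);
    }

    return(0);
}
```

The exact wording of the reason depends on the C library, so the sketch makes no promise about the text that appears after the colon.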

10.3.2 Diving into a subdirectory

It’s easy to change to a subdirectory when you know its full path. An absolute path can be supplied by the user or it can be hardcoded into the program. But what happens when the program isn’t aware of its directory’s location?

The parent directory is always known; you can use the double-dot abbreviation (..) to access the parent of every directory except the top level. Going up is easy. Going down requires a bit more work.

Subdirectories are found by using the tools presented so far in this chapter: scan the current directory for subdirectory entries. Once known, plug the subdirectory name into the chdir() function to visit that subdirectory.

The code for subdir01.c in the next listing builds a program that lists potential subdirectories in a named directory. Portions of the code are pulled from other examples listed earlier in this chapter: a directory argument is required and tested for. The named directory is then opened and its entries read. If any subdirectories are found, they’re listed.

Listing 10.8 Source code for subdir01.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <dirent.h>
 
int main(int argc, char *argv[])
{
    DIR *dp;
    struct dirent *entry;
    struct stat fs;
    int r;
    char *dirname,*filename;
 
    if( argc<2 )                                       ❶
    {
        fprintf(stderr,"Missing directory name\n");
        exit(1);
    }
    dirname = argv[1];                                 ❷
 
    dp = opendir(dirname);                             ❸
    if(dp == NULL)
    {
        fprintf(stderr,"Unable to read directory '%s'\n",
                dirname
               );
        exit(1);
    }
 
    while( (entry = readdir(dp)) != NULL )             ❹
    {
        filename = entry->d_name;                      ❺
        r = stat( filename,&fs );                      ❻
        if( r==-1 )                                    ❼
        {
            fprintf(stderr,"Error on '%s'\n",filename);
            exit(1);
        }
 
        if( S_ISDIR(fs.st_mode) )                      ❽
            printf("Found directory: %s\n",filename);  ❾
    }
 
    closedir(dp);
 
    return(0);
}

❶ Confirms that a command-line argument (directory name) is available

❷ Assigns a pointer dirname to the first argument for readability

❸ Opens the directory and tests for an error

❹ Reads entries in the directory

❺ Assigns a pointer filename to each entry for readability

❻ Obtains inode details

❼ Tests for an error

❽ Tests to see whether the file is a directory (subdirectory)

❾ Outputs the directory’s name

The program generated from the source code subdir01.c reads the directory supplied as a command-line argument and then outputs any subdirectories found in that directory. Here is output from a sample run, using my home directory:

$ ./subdir /home/dang
Found directory: .
Found directory: ..
Error on '.bash_history'

Here is output from the root directory:

$ ./subdir /
Found directory: .
Found directory: ..
Error on 'bin'

In both examples, the stat() function fails. Your code could examine the errno variable, set when the function returns -1, but I can tell you right away what the error is: the first argument passed to the stat() function must be a pathname. In the program, only the entry’s name is supplied, not a pathname, so stat() looks for that name in the current directory. For example, the .bash_history file found in the first sample run shown earlier, and the bin directory found in the second, don’t exist in the current directory.

The solution is for the program to change to the named directory. Only when you change to a directory can the code properly read the files—unless you make the effort to build full pathnames. I’m too lazy to do that, so to modify the code, I add the following statements after the statement dirname = argv[1]:

r = chdir(dirname);
if( r==-1 )
{
    fprintf(stderr,"Unable to change to %s\n",dirname);
    exit(1);
}

Further, you must include the unistd.h header file so that the compiler doesn’t complain about the chdir() function.

With these updates to the code, available in the online repository as subdir02.c, the program now runs properly:

$ ./subdir /home/dang
Found directory: .
Found directory: ..
Found directory: .cache
Found directory: .config
Found directory: .ddd
Found directory: .lldb
Found directory: .ssh
Found directory: Dan
Found directory: bin
Found directory: prog
Found directory: sto

Remember: to read files from a directory, you must either change to the directory (easy) or manually construct full pathnames for the files (not so easy).
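
For comparison, here is a minimal sketch of the not-so-easy approach: building a full pathname from the directory and the entry’s name, and then passing the result to stat(). This code isn’t part of the subdir series. The function names join_path() and is_subdir() are hypothetical, and the sketch assumes the / separator and a non-empty directory string.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* join_path() is a hypothetical helper: it writes "dir/name" into out,
   adding the separator only when dir doesn't already end with one.
   Returns 0 on success, -1 if the result doesn't fit in size bytes. */
int join_path(char *out, size_t size, const char *dir, const char *name)
{
    size_t len = strlen(dir);
    const char *sep = (len>0 && dir[len-1]=='/') ? "" : "/";
    int n;

    n = snprintf(out,size,"%s%s%s",dir,sep,name);
    if( n<0 || (size_t)n>=size )    /* error, or the output was truncated */
        return(-1);
    return(0);
}

/* is_subdir() is also hypothetical: it returns 1 when dir/name is a
   directory and 0 otherwise, without changing the current directory. */
int is_subdir(const char *dir, const char *name)
{
    char path[BUFSIZ];
    struct stat fs;

    if( join_path(path,BUFSIZ,dir,name)==-1 )
        return(0);
    if( stat(path,&fs)==-1 )
        return(0);
    return( S_ISDIR(fs.st_mode) ? 1 : 0 );
}
```

In the subdir program’s while loop, a call such as is_subdir(dirname,entry->d_name) would stand in for the stat() and S_ISDIR() pair, and no call to chdir() would be needed.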

Exercise 10.2

Every directory has the dot and dot-dot entries. Plus, many directories host hidden subdirectories. All hidden files in Linux start with a single dot. Your task for this exercise is to modify the source code from subdir02.c to have the program not output any file that starts with a single dot. My solution is available in the online repository as subdir03.c.

10.3.3 Mining deeper with recursion

It wasn’t until I wrote my first directory tree exploration program that I fully understood and appreciated the concept of recursion. In fact, directory spelunking is a great way to teach any coder the mechanics behind recursion and how it can be beneficial.

As a review, recursion is the amazing capability of a function to call itself. It seems dumb, like an endless loop. Yet within the function exists an escape hatch, which allows the recursive function to unwind. Providing that the unwinding mechanism works, recursion is used in programming to solve all sorts of wonderful problems beyond just confusing beginners.

When the subdir program encounters a subdirectory, it can change to that directory to continue mining for even more directories. To do so, the same function that read the current directory is called again but with the subdirectory’s path. The process is illustrated in figure 10.4. Once the number of entries in a directory is exhausted, the process ends with a return to the parent directory. Eventually the functions return, backtracking to the original directory, and the program is done.

Figure 10.4 The process of recursively discovering directories

My issue with recursion is always how to unwind it. Plumbing the depths of subdirectories showed me that once all the directories are processed, control returns to the parent directory. Even then, as a seasoned Assembly language programmer accustomed to working where memory is tight, I fear blowing up the stack. It hasn’t happened yet—well, not when I code things properly.

To modify the subdir series of programs into a recursive directory spelunker, you must remove the program’s core, which explores subdirectories, and set it into a function. I call such a function dir(). Its argument is a directory name:

void dir(const char *dirname);

The dir() function uses a while loop to process directory entries, looking for subdirectories. When found, the function is called again (within itself) to continue processing directory entries, looking for another subdirectory. When the entries are exhausted, the function returns, eventually ending in the original directory.

The following listing implements the program flow from figure 10.4, as well as earlier versions of the subdir programs, to create a separate dir() function. It’s called recursively (within the function’s while loop) when a subdirectory is found. The main() function is also modified so that the current directory (".") is assumed when a command line argument isn’t supplied.

Listing 10.9 Source code for subdir04.c

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <string.h>
 
void dir(const char *dirname)                     ❶
{
    DIR *dp;
    struct dirent *entry;
    struct stat fs;
    char *filename;
    char directory[BUFSIZ];
 
    if( chdir(dirname)==-1 )                      ❷
    {
        fprintf(stderr,"Unable to change to %s\n",dirname);
        exit(1);
    }
 
    getcwd(directory,BUFSIZ);                     ❸
 
    dp = opendir(directory);                      ❹
    if( dp==NULL )
    {
        fprintf(stderr,"Unable to read directory '%s'\n",
                directory
               );
        exit(1);
    }
 
    printf("%s\n",directory);                     ❺
    while( (entry=readdir(dp)) != NULL )          ❻
    {
        filename = entry->d_name;                 ❼
        if( strncmp( filename,".",1)==0 )         ❽
            continue;
 
        stat(filename,&fs);                       ❾
        if( S_ISDIR(fs.st_mode) )                 ❿
            dir(filename);                        ⓫
    }
 
    closedir(dp);
}
 
int main(int argc, char *argv[])
{
    if( argc<2 )
    {
        dir(".");                                 ⓬
    }
    else
        dir(argv[1]);                             ⓭
 
    return(0);
}

❶ The function’s sole argument is a directory name, dirname.

❷ Confirms that the program can change to the named directory

❸ Gets the full pathname

❹ Confirms that the directory can be opened

❺ Outputs the directory’s name

❻ Loops through the directory’s entries, looking for subdirectories

❼ Saves the found filename for readability

❽ Ignores the dot and dot-dot entries as well as hidden files

❾ Obtains details on the found directory entry (inode)

❿ Checks for a subdirectory

⓫ Recursively calls the dir() function again

⓬ If no argument is supplied, assumes the current directory

⓭ Uses the argument as the named directory

Don’t bother typing in the code for subdir04.c. (Does anyone type in code from a book anymore?) Don’t even bother obtaining the source code from the online repository. The program won’t blow up your computer, but it contains several flaws.

For example, here is a sample run on my home directory:

$ ./subdir ~
/home/dang
/mnt/c/Users/Dan
/mnt/c/Users/Dan/3D Objects
Unable to change to AppData

You see the starting directory output correctly, /home/dang. Next, the program jumps on a symbolic link to my user profile directory in Windows (from the Linux command line). So far, so good; it followed the symbolic link to /mnt/c/Users/Dan. It successfully goes to the 3D Objects directory, but then it gets lost. The directory AppData exists, but it’s not the next proper subdirectory to which the code should branch.

What’s wrong?

The flaw is present in figure 10.4 as well as in the source code shown in listing 10.9: when the dir() function starts, it issues the chdir() function to change to the named directory, dirname. But the dir() function doesn’t change back to the parent/original directory when it has finished processing a subdirectory.

To update the code and make the program return to the parent directory, add the following statements at the end of the dir() function:

if( chdir("..")==-1 )
{
    fprintf(stderr,"Parent directory lost\n");
    exit(1);
}

The updated code is found in the online repository as subdir05.c. A sample run on my home directory outputs pages and pages of directories, almost properly.

Almost.

Turns out, the program created from subdir05.c can get lost, specifically with symbolic links. The code follows the symbolic link, but when it tries to return to the parent, it either loses its location or goes to the wrong parent. The problem lies with the chdir() chunk of statements just added to the code at the end of the dir() function. The parent directory isn’t specific:

chdir("..");

This statement changes to the parent directory, but it’s far better to use the parent directory’s full path. In fact, as I was playing with the code, I discovered that it’s just best to work with full pathnames throughout the dir() function. Some changes are required.

My final update redefines the dir() function as follows:

void dir(const char *dirpath, const char *parentpath);

For readability, I changed the argument names to reflect that both are full pathnames. The first is the full pathname to the directory to scan. The second is the full pathname to the parent directory. Both are const char types because neither string is modified within the function.

Listing 10.10 shows the updated dir() function. Most of the changes involve removing char variable directory and replacing it with argument dirpath. It’s also no longer necessary to change to the named directory in the function, which now assumes that the dirpath argument represents the current directory. Further comments are found in the code.

Listing 10.10 The updated dir() function from subdir06.c

void dir(const char *dirpath,const char *parentpath)
{
    DIR *dp;
    struct dirent *entry;
    struct stat fs;
    char subdirpath[BUFSIZ];                         ❶
 
    dp = opendir(dirpath);                           ❷
    if( dp==NULL )
    {
        fprintf(stderr,"Unable to read directory '%s'\n",
                dirpath
               );
        exit(1);
    }
 
    printf("%s\n",dirpath);                          ❸
    while( (entry=readdir(dp)) != NULL )             ❹
    {
        if( strncmp( entry->d_name,".",1)==0 )       ❺
            continue;
 
        stat(entry->d_name,&fs);                     ❻
        if( S_ISDIR(fs.st_mode) )                    ❼
        {
            if( chdir(entry->d_name)==-1 )           ❽
            {
                fprintf(stderr,"Unable to change to %s\n",
                        entry->d_name
                       );
                exit(1);
            }
 
            getcwd(subdirpath,BUFSIZ);               ❾
            dir(subdirpath,dirpath);                 ❿
        }
    }
 
    closedir(dp);                                    ⓫
 
    if( chdir(parentpath)==-1 )                      ⓬
    {
        if( parentpath==NULL )                       ⓭
            return;
        fprintf(stderr,"Parent directory lost\n");
        exit(1);
    }
}

❶ Storage for the new directory to change to, storing the full pathname

❷ The program is already in the desired directory, so rather than change to it, the code attempts to open the directory and read entries.

❸ Outputs the current directory path

❹ Reads all entries in the directory

❺ Avoids any dot entries

❻ Gets info for each directory entry (inode)

❼ Checks for a subdirectory entry

❽ Changes to the subdirectory

❾ Gets the subdirectory’s full pathname for the recursive call

❿ Recursively calls the function with the subdirectory and current directory as arguments

⓫ Closes the current directory after all entries are read

⓬ Changes back to the parent directory—full pathname

⓭ Checks for NULL, in which case, just returns

Updating the dir() function requires that the main() function be updated as well. It has more work to do: the main() function must obtain the full pathname to the current directory or the argv[1] value, as well as the directory’s parent. This update to the main() function is shown here.

Listing 10.11 The updated main() function for subdir06.c

int main(int argc, char *argv[])
{
    char current[BUFSIZ];
 
    if( argc<2 )
    {
        getcwd(current,BUFSIZ);      ❶
    }
    else
    {
        strcpy(current,argv[1]);     ❷
        if( chdir(current)==-1 )     ❸
        {
            fprintf(stderr,"Unable to access directory %s\n",
                    current
                   );
            exit(1);
        }
        getcwd(current,BUFSIZ);      ❹
    }
 
    dir(current,NULL);               ❺
 
    return(0);
}

❶ For no arguments, obtains and stores the full path to the current directory

❷ Copies the first argument; hopefully, a directory

❸ Changes to the directory and checks for errors

❹ Gets the directory’s full pathname

❺ Calls the function; NULL is checked in dir().

The full source code file is available in the online repository as subdir06.c. It accepts a directory argument or no argument, in which case the current directory is plumbed.

Even though the program uses full pathnames, it may still get lost. Specifically, for symbolic links, the code may wander away from where you intend. Some types of links, such as aliases in Mac OS X, aren’t recognized as directories, so they’re skipped. And when processing system directories, specifically those that contain block or character files, the program’s stack may overflow and generate a segmentation fault.
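
One way to keep the program from wandering through symbolic links is to avoid following them. The stat() function reports on the target of a link, whereas the POSIX lstat() function reports on the link itself. The following is a minimal sketch of a test built on lstat(); it isn’t the approach used in subdir06.c, and the function name is_real_dir() is hypothetical.

```c
#include <sys/stat.h>

/* is_real_dir() is a hypothetical helper. lstat() examines a symbolic
   link itself rather than its target, so a link that points to a
   directory fails the S_ISDIR() test here.
   Returns 1 for a true directory, 0 for anything else or on error. */
int is_real_dir(const char *path)
{
    struct stat fs;

    if( lstat(path,&fs)==-1 )
        return(0);
    return( S_ISDIR(fs.st_mode) ? 1 : 0 );
}
```

Using such a test in place of the stat() and S_ISDIR() pair in the dir() function would make the program skip linked directories entirely. That’s a design choice: the output no longer includes directories reachable only through a link.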

10.4 A directory tree

The ancient MS-DOS operating system featured the TREE utility. It dumped a map of the current directory structure in a festive, graphical (for a text screen) manner. This command is still available in Windows. In the CMD (command prompt) program in Windows, type TREE and you see output like that shown in figure 10.5: directories appear in a hierarchical structure, with lines connecting parent directories and subdirectories, along with indentation showing directory depth.

Figure 10.5 Output from the TREE command

The mechanics behind creating a directory tree program are already known to you. The source code for subdir06.c processes directories and subdirectories in the same manner as the output shown in figure 10.5. What’s missing are the shortened directory names, text mode graphics, and indentation. You can add these items, creating your own directory tree utility.

10.4.1 Pulling out the directory name

To mimic the old TREE utility, the dir() function must extract the directory name from the full pathname. Because full pathnames are used, and the string doesn’t end with a trailing slash, everything from the last slash in the string to the null character qualifies as the directory’s name.

The easy way to extract the current directory name from a full pathname is to save the name when it’s found in the parent directory: the entry->d_name structure member contains the directory’s name as it appears in the parent’s directory listing. To make this modification, the dir() function requires another argument, the short directory name. This modification is simple to code, which is why this approach is the easy way.

The problem with the easy way is that the main() function obtains a full directory path when the program is started without an argument. So, even if you choose the easy way, you still must extract the directory name from the full pathname in the main() function. Therefore, my approach is to code a new function that pulls a directory name (or filename) from the end of a path.

When I add new features to a program, such as when extracting a directory name from the butt end of a pathname, I write test code. In the next listing, you see the test code for the extract() function. Its job is to plow through a pathname to pull out the last part, assuming the last part of the string (after the final / separator character) is a directory name. Oh, and the function also assumes the environment is Linux; if you’re using Windows, you specify the backslash (two of them: \\) as the path separator, though Windows 10 may also recognize the forward slash.

Listing 10.12 Source code for extractor.c

#include <stdio.h>
#include <string.h>
 
const char *extract(const char *path)
{
    const char *p;
    int len;
 
    len = strlen(path);
    if( len==0 )                      ❶
        return(NULL);
    if( len==1 && *(path+0)=='/' )    ❷
        return(path);
 
    p = path+len;                     ❸
    while( *p != '/' )                ❹
    {
        if( p==path )                 ❺
            return(NULL);
        p--;
    }
    p++;                              ❻
 
    if( *p == '\0' )                  ❼
        return(NULL);
    else
        return(p);                    ❽
}
 
int main()
{
    enum { count=4 };   /* a constant expression, so pathname[] can take an initializer */
    const char *pathname[count] =  {  ❾
        "/home/dang",
        "/usr/local/this/that",
        "/",
        "nothing here"
    };
    int x;
 
    for(x=0; x<count; x++)
    {
        printf("%s -> %s\n",
                pathname[x],
                extract(pathname[x])
              );
    }
 
    return(0);
}

❶ If the string is empty, returns NULL

❷ Performs a special test for the root directory

❸ Positions pointer p at the end of string path

❹ Backs up p to find the separator; for Windows, uses \\ as the separator

❺ If p backs up too far, returns NULL

❻ Increments p over the separator character

❼ Tests to see if the string is empty or malformed and returns NULL

❽ Returns the address where the final directory name starts

❾ Tests strings for a variety of configurations

The extract() function backs up through the pathname string passed. Pointer p scans for the / separator. It leaves the function referencing the position in the string path where the final directory name starts. Upon an error, NULL is returned. A series of test strings in the main() function puts the extract() function to work. Here is the output:

/home/dang -> dang
/usr/local/this/that -> that
/ -> /
nothing here -> (null)

The extract() function successfully processes each string, returning the last part, the directory name. It even catches the malformed string, properly returning NULL.
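
The C library’s strrchr() function, declared in string.h, finds the last occurrence of a character in a string, which is most of what extract() does by hand. The following is a minimal sketch of that alternative, not the code used in the dirtree series. The function name last_component() is hypothetical, and the sketch assumes the / separator and a pathname without a trailing slash.

```c
#include <stdio.h>
#include <string.h>

/* last_component() is a hypothetical alternative to extract().
   It returns a pointer to the text after the final '/' in path,
   path itself for the root directory "/", or NULL when the string
   is empty, has no separator, or ends with a separator. */
const char *last_component(const char *path)
{
    const char *slash;

    if( path==NULL || *path=='\0' )
        return(NULL);
    if( strcmp(path,"/")==0 )       /* special case: the root directory */
        return(path);

    slash = strrchr(path,'/');      /* locate the final separator */
    if( slash==NULL )               /* no separator: malformed pathname */
        return(NULL);
    if( *(slash+1)=='\0' )          /* nothing follows the separator */
        return(NULL);

    return(slash+1);
}
```

For the four test strings in listing 10.12, this sketch returns dang, that, /, and NULL, respectively. It also returns home for a single-level pathname such as /home.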

For my first rendition of the directory tree program, I added the extract() function to the final update to the subdir series of programs, subdir06.c. The extract() function is called from within the dir() function, just before the main while loop that reads directory entries, replacing the existing printf() statement at that line:

printf("%s\n",extract(dirpath));

This update is saved as dirtree01.c. The resulting program, dirtree, outputs the directories but only their names and not the full pathnames. The output is almost a directory tree program, but without proper indenting for each subdirectory level.

10.4.2 Monitoring directory depth

Programming the fancy output from the old TREE command, shown in figure 10.5, is more complicated than it looks. Emulating it exactly requires that the code use wide character output (covered in chapter 8). Further, the directory’s depth must be monitored as well as when the last subdirectory in a directory is output. Indeed, to fully emulate the TREE command requires massively restructuring the dirtree program, primarily to save directory entries for output later.

Yeah, so I’m not going there—not all the way.

Rather than restructure the entire code, I thought I’d add some indentation to make the directory output of my dirtree series a bit more “tree”-like. This addition requires that the directory depth be monitored so that each subdirectory is indented a notch. To monitor the directory depth, the definition of the dir() function is updated:

void dir(const char *dirpath,const char *parentpath, int depth);

I consider three arguments to be the maximum for a function. Any more arguments, and it becomes obvious to me that what should really be passed to the function is a structure. In fact, I wrote a version of the dirtree program that held directory entries in an array of structures. That code became overly complex, however, so I decided to just modify the dir() function as shown earlier.

To complete the modification in the code, three more changes are required. First, in the main() function, the dir() function is originally called with zero as its third argument:

dir(current,NULL,0);

The zero sets the indent depth as the program starts; the first directory is the top level.

Second, the recursive call within the dir() function must be modified, adding the third argument depth:

dir(subdirpath,dirpath,depth+1);

For the recursive call, which means the program is diving down one directory level, the indent level depth is increased by one.

Finally, something must be done with the depth variable within the dir() function. I opted to add a loop that outputs a chunk of three spaces for every depth level. This loop requires a new variable to be declared for function dir(), integer i (for indent):

for( i=0; i<depth; i++ )
    printf("   ");

This loop appears before the printf() statement that outputs the directory’s name, just before the while loop. The result is that each subdirectory is indented three spaces as the directory tree is output.
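
The same indentation can be produced without a loop by using the * width specifier, which reads the field width from the argument list. The following is a minimal sketch, not the code in dirtree02.c; it builds the indented line in a buffer so the result can be examined before it’s output. The function name format_entry() is hypothetical, and depth is assumed to be zero or greater.

```c
#include <stdio.h>

/* format_entry() is a hypothetical helper: it writes name into out,
   preceded by three spaces for every level of depth (depth >= 0).
   Returns 0 on success, -1 if the result doesn't fit in size bytes. */
int format_entry(char *out, size_t size, const char *name, int depth)
{
    int n;

    /* %*s takes its field width from the argument list; padding an
       empty string to that width produces the indent */
    n = snprintf(out,size,"%*s%s",depth*3,"",name);
    if( n<0 || (size_t)n>=size )    /* error, or the output was truncated */
        return(-1);
    return(0);
}
```

At depth 2, the name blog is stored with six leading spaces, matching the indentation the loop generates for a sub-subdirectory.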

The source code for dirtree02.c is available in the online repository. Here is the program’s output for my prog (programming) directory:

prog
   asm
   c
      blog
      clock
      debug
      jpeg
      opengl
      wchar
      xmljson
      zlib
   python

Each subdirectory is indented three spaces. The sub-subdirectories of the c directory are further indented.

Exercise 10.3

Modify the source code for dirtree02.c so that instead of indenting with blanks, the subdirectories appear with text mode graphics. For example:

prog
+--asm
+--c
|  +--blog
|  +--clock
|  +--debug
|  +--jpeg
|  +--opengl
|  +--wchar
|  +--xmljson
|  +--zlib
+--python

These graphics aren’t as fancy (or precise) as those from the MS-DOS TREE command, but they are an improvement. This modification requires only a few lines of code. My solution is available in the online repository as dirtree03.c.
