List Differences

The first program checks two lists of words and prints out the difference between the two. The method used is simple. The words in the first list are used to populate a hash (%list). The key is the word from the file, and the value is 1, as shown here:

76    while (<IN_FILE>) {
77        chomp($_); 
78        $list{$_} = 1; 
79    }

(Line numbers refer to the listings that follow.)
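
The same population step can be sketched as a small reusable routine. This is only an illustration, not part of the book's listing: the name `read_word_set`, the lexical filehandle, and the rule that blank lines are skipped are all assumptions made here, and the input comes from an in-memory string so that the sketch runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the book's listing): read one word per line
# from an already-open filehandle and return a hash reference whose keys
# are the words and whose values are all 1.
sub read_word_set {
    my ($fh) = @_;
    my %set;
    while (my $word = <$fh>) {
        chomp($word);
        next if $word eq '';    # assumption: blank lines are not words
        $set{$word} = 1;
    }
    return \%set;
}

# Demonstration with an in-memory "file" so the sketch is self-contained.
my $data = "alpha\nbeta\ngamma\nfred\n";
open(my $in, '<', \$data) or die("Could not open in-memory data: $!");
my $first = read_word_set($in);
close($in);

print join(' ', sort keys %$first), "\n";    # alpha beta fred gamma
```

Because every value is 1, the hash acts as a set: looking up a word later tells you whether it was in the first file.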

Now the words in the second file are added to the same hash. This step is slightly more involved because the hash already contains entries, but the rule is simple: if the word is already in the hash, change its value from 1 to B. If it is not, insert it with a value of 2. For example:

85    while (<IN_FILE>) {
86        chomp($_); 
87        if (defined($list{$_})) {
88            $list{$_} = 'B'; 
89        } else {
90            $list{$_} = 2; 
91        } 
92    }

The result is that %list contains all the words in both files. The key is the word. The value depends on which file the word is in. Words that appear only in the first file have a value of 1. Words that appear only in the second file have a value of 2. Words that appear in both files have the value B.
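
The tagging scheme can be tried out on in-memory lists. The sketch below is a variation, not the book's code: the name `classify_words` is hypothetical, it tests with `exists` rather than `defined`, and it adds one guard the listing does not have. In the listing, a word that is repeated in the second file but absent from the first is set to 2 on its first occurrence and then changed to B on its second, because the hash entry is already defined by then. The guard below promotes a word to B only when its current value is 1.

```perl
use strict;
use warnings;

# Hypothetical helper: tag every word with 1 (first list only),
# 2 (second list only), or 'B' (both lists).
sub classify_words {
    my ($first, $second) = @_;    # two array references
    my %list;
    $list{$_} = 1 for @$first;
    for my $word (@$second) {
        if (!exists $list{$word}) {
            $list{$word} = 2;      # new word: second list only
        } elsif ($list{$word} eq '1') {
            $list{$word} = 'B';    # seen in the first list: now in both
        }
        # A value of 2 or 'B' is left alone, so repeats are harmless.
    }
    return \%list;
}

my $list = classify_words(
    [qw(alpha beta gamma fred)],
    [qw(joe alpha beta gamma joe)],    # "joe" is repeated on purpose
);
foreach my $word (sort keys %$list) {
    print "$word => $list->{$word}\n";
}
```

With the sample lists above, alpha, beta, and gamma are tagged B, fred is tagged 1, and joe stays at 2 despite appearing twice.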

From this point, it’s a simple process of printing out each of these categories of words. For example, to print the words that appear only in the first file, use the following code:

95    print "Words only in $ARGV[0]\n"; 
96    foreach my $cur_key (sort keys %list) {
97        if ($list{$cur_key} eq 1) {
98            print "\t$cur_key\n"; 
99        } 
100    }
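
The three printing loops in the listing differ only in the tag they test and the heading they print, so one way to picture them is as a single filter applied three times. The following is a sketch under that reading, not the book's method: `words_tagged`, the `@sections` table, and the generic headings are names and text invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted words whose tag equals $tag.
sub words_tagged {
    my ($list, $tag) = @_;    # hash reference, then 1, 2, or 'B'
    return grep { $list->{$_} eq $tag } sort keys %$list;
}

# Sample data that follows the tagging scheme described above.
my %list = (alpha => 'B', beta => 'B', gamma => 'B', fred => 1, joe => 2);

# Each entry pairs a tag with the heading printed above its words.
my @sections = (
    [1,   'Words only in the first file'],
    [2,   'Words only in the second file'],
    ['B', 'Words in both files'],
);
foreach my $section (@sections) {
    my ($tag, $title) = @$section;
    print "$title\n";
    print "\t$_\n" for words_tagged(\%list, $tag);
}
```

The comparison uses `eq` throughout, as the listing does, so the numeric tags 1 and 2 and the string tag B are all compared as strings.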

The full difference program appears in Listing 16.1.

Listing 16.1. diff.pl
  1    =pod 
  2 
  3    =head1 NAME 
  4 
  5    diff.pl - Check two word lists for differences 
  6 
  7    =head1 SYNOPSIS 
  8 
  9        diff.pl <file1> <file2> 
 10 
 11    =head1 DESCRIPTION 
 12 
 13    The I<diff.pl> program checks the two word lists and prints out a list 
 14    of words that: 
 15 
 16    =over 4 
 17 
 18    =item 1. 
 19 
 20    Appear in the first file, but not the second. 
 21 
 22    =item 2. 
 23 
 24    Appear in the second file, but not the first. 
 25 
 26    =item 3. 
 27 
 28    Appear in both files. 
 29 
 30    =back 
 31 
 32    =head1 EXAMPLES 
 33 
 34    =head2 File: list1.txt 
 35 
 36        alpha 
 37        beta 
 38        gamma 
 39        fred 
 40 
 41    =head2 File: list2.txt 
 42 
 43        joe 
 44        alpha 
 45        beta 
 46        gamma 
 47 
 48    =head3 Sample run 
 49 
 50        $ perl diff.pl list1.txt list2.txt 
 51 
 52        Words only in list1.txt 
 53                fred 
 54        Words only in list2.txt 
 55                joe 
 56        Words in both list1.txt and list2.txt 
 57                alpha 
 58                beta 
 59                gamma 
 60 
 61    =cut 
 62    use strict; 
 63    use warnings; 
 64 
 65    if ($#ARGV != 1) {
 66        print STDERR "Usage is $0 <list1> <list2>\n"; 
 67        exit (8); 
 68    } 
 69 
 70    my %list = ();  # Key = word, value = 1,2,B depending on 
 71                    # which file the word occurs in 
 72 
 73    open IN_FILE, "<$ARGV[0]" or 
 74        die("Could not open $ARGV[0]"); 
 75 
 76    while (<IN_FILE>) {
 77        chomp($_); 
 78        $list{$_} = 1; 
 79    } 
 80    close (IN_FILE); 
 81    #--------------------------------------------
 82    open IN_FILE, "<$ARGV[1]" or 
 83        die("Could not open $ARGV[1]"); 
 84 
 85    while (<IN_FILE>) {
 86        chomp($_); 
 87        if (defined($list{$_})) {
 88            $list{$_} = 'B'; 
 89        } else {
 90            $list{$_} = 2; 
 91        } 
 92    } 
 93    close (IN_FILE); 
 94    #--------------------------------------------
 95    print "Words only in $ARGV[0]\n"; 
 96    foreach my $cur_key (sort keys %list) {
 97        if ($list{$cur_key} eq 1) {
 98            print "\t$cur_key\n"; 
 99        } 
100    } 
101    #--------------------------------------------
102    print "Words only in $ARGV[1]\n"; 
103    foreach my $cur_key (sort keys %list) {
104        if ($list{$cur_key} eq 2) {
105            print "\t$cur_key\n"; 
106        } 
107    } 
108    #--------------------------------------------
109    print "Words in both $ARGV[0] and $ARGV[1]\n"; 
110    foreach my $cur_key (sort keys %list) {
111        if ($list{$cur_key} eq 'B') {
112            print "\t$cur_key\n"; 
113        } 
114    }
