The first program compares two lists of words and prints the differences between them. The method is simple. Each word read from the first file becomes a key in a hash (%list), and its value is set to 1, as shown here:
     76 while (<IN_FILE>) {
     77     chomp($_);
     78     $list{$_} = 1;
     79 }
(Line numbers refer to the listings that follow.)
Now the words in the second file are added to the same hash. This step is more complex because the hash already contains entries. The rule for adding a word is simple: if the word is already in the hash, change its value to B; if it is not, insert it with a value of 2. For example:
     85 while (<IN_FILE>) {
     86     chomp($_);
     87     if (defined($list{$_})) {
     88         $list{$_} = 'B';
     89     } else {
     90         $list{$_} = 2;
     91     }
     92 }
The result is that %list contains every word from both files. The key is the word, and the value records which file the word came from. Words that appear only in the first file have the value 1, words that appear only in the second file have the value 2, and words that appear in both files have the value B.
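To make the tagging rule concrete, the sketch below applies the same two passes to the sample word lists from the EXAMPLES section of the listing, held in memory rather than read from files. The array names @first and @second are assumptions introduced for illustration; the hash name %list and the values 1, 2, and B follow the text.

```perl
use strict;
use warnings;

# Sample words taken from list1.txt and list2.txt in the EXAMPLES section.
# @first and @second are illustrative names, not part of the original program.
my @first  = qw(alpha beta gamma fred);
my @second = qw(joe alpha beta gamma);

my %list = ();    # Key = word, value = 1, 2, or B

# Pass 1: every word in the first list is tagged 1.
foreach my $word (@first) {
    $list{$word} = 1;
}

# Pass 2: a word already present is in both lists (B); otherwise it is
# only in the second list (2).
foreach my $word (@second) {
    if (defined($list{$word})) {
        $list{$word} = 'B';
    } else {
        $list{$word} = 2;
    }
}

# Show the tag assigned to each word, in sorted order.
foreach my $word (sort keys %list) {
    print "$word => $list{$word}\n";
}
```

With these inputs, alpha, beta, and gamma end up tagged B, fred is tagged 1, and joe is tagged 2. One consequence of the rule is worth keeping in mind: a word that occurs twice in the second list but never in the first would be tagged 2 on its first occurrence and B on its second, so the approach assumes each list has no repeated words.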
From this point, it’s a simple process of printing out each of these categories of words. For example, to print the words that appear only in the first file, use the following code:
     95 print "Words only in $ARGV[0]\n";
     96 foreach my $cur_key (sort keys %list) {
     97     if ($list{$cur_key} eq 1) {
     98         print "\t$cur_key\n";
     99     }
    100 }
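The printing loops for the three categories differ only in the tag they test, so the selection step can be factored out. The following is a minimal sketch of that idea, not the book's code: the helper name words_with_tag is hypothetical, and the hash is filled in by hand with the sample data so that the fragment runs on its own.

```perl
use strict;
use warnings;

# Hand-built hash matching the sample data
# (1 = first file only, 2 = second file only, B = both files).
my %list = (
    alpha => 'B',
    beta  => 'B',
    gamma => 'B',
    fred  => 1,
    joe   => 2,
);

# Hypothetical helper: return the sorted words whose tag equals $tag.
# The comparison uses eq because the tags mix numbers and the letter B.
sub words_with_tag {
    my ($tag, $hash_ref) = @_;
    return grep { $hash_ref->{$_} eq $tag } sort keys %{$hash_ref};
}

# Print each category in turn.
foreach my $tag (1, 2, 'B') {
    print "Tag $tag:\n";
    print "\t$_\n" foreach words_with_tag($tag, \%list);
}
```

The string comparison eq is a deliberate choice here, as it is in the original loop: comparing the value B numerically with == would trigger an "isn't numeric" warning under use warnings.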
The full difference program appears in Listing 16.1.
      1 =pod
      2
      3 =head1 NAME
      4
      5 diff.pl - Check two word lists for differences
      6
      7 =head1 SYNOPSIS
      8
      9     diff.pl <file1> <file2>
     10
     11 =head1 DESCRIPTION
     12
     13 The I<diff.pl> checks the two word lists and prints out a list
     14 of words that:
     15
     16 =over 4
     17
     18 =item 1.
     19
     20 Appear in the first file, but not the second.
     21
     22 =item 2.
     23
     24 Appear in the second file, but not the first.
     25
     26 =item 3.
     27
     28 Appear in both files.
     29
     30 =back
     31
     32 =head1 EXAMPLES
     33
     34 =head2 File: list1.txt
     35
     36     alpha
     37     beta
     38     gamma
     39     fred
     40
     41 =head2 File: list2.txt
     42
     43     joe
     44     alpha
     45     beta
     46     gamma
     47
     48 =head3 Sample run
     49
     50     $ perl diff.pl list1.txt list2.txt
     51
     52     Words only in list1.txt
     53         fred
     54     Words only in list2.txt
     55         joe
     56     Words in both list1.txt and list2.txt
     57         alpha
     58         beta
     59         gamma
     60
     61 =cut
     62 use strict;
     63 use warnings;
     64
     65 if ($#ARGV != 1) {
     66     print STDERR "Usage is $0 <list1> <list2>\n";
     67     exit (8);
     68 }
     69
     70 my %list = ();  # Key = word, value = 1,2,B depending on
     71                 # which file the word occurs in
     72
     73 open IN_FILE, "<$ARGV[0]" or
     74     die("Could not open $ARGV[0]");
     75
     76 while (<IN_FILE>) {
     77     chomp($_);
     78     $list{$_} = 1;
     79 }
     80 close (IN_FILE);
     81 #--------------------------------------------
     82 open IN_FILE, "<$ARGV[1]" or
     83     die("Could not open $ARGV[1]");
     84
     85 while (<IN_FILE>) {
     86     chomp($_);
     87     if (defined($list{$_})) {
     88         $list{$_} = 'B';
     89     } else {
     90         $list{$_} = 2;
     91     }
     92 }
     93 close (IN_FILE);
     94 #--------------------------------------------
     95 print "Words only in $ARGV[0]\n";
     96 foreach my $cur_key (sort keys %list) {
     97     if ($list{$cur_key} eq 1) {
     98         print "\t$cur_key\n";
     99     }
    100 }
    101 #--------------------------------------------
    102 print "Words only in $ARGV[1]\n";
    103 foreach my $cur_key (sort keys %list) {
    104     if ($list{$cur_key} eq 2) {
    105         print "\t$cur_key\n";
    106     }
    107 }
    108 #--------------------------------------------
    109 print "Words in both $ARGV[0] and $ARGV[1]\n";
    110 foreach my $cur_key (sort keys %list) {
    111     if ($list{$cur_key} eq 'B') {
    112         print "\t$cur_key\n";
    113     }
    114 }
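Listing 16.1 reads each file through a bareword filehandle opened with the two-argument form of open. As an alternative sketch (not the listing's approach), the read loop can also be written with a lexical filehandle, the three-argument open, and $! in the error message. The subroutine name read_words is an assumption made for this example, and a temporary file created with the core File::Temp module stands in for a real word list.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the words in a file, one per line.
sub read_words {
    my ($file_name) = @_;

    # Three-argument open keeps the mode separate from the file name.
    open(my $in_fh, '<', $file_name)
        or die("Could not open $file_name: $!");

    my @words;
    while (my $line = <$in_fh>) {
        chomp($line);
        push(@words, $line);
    }
    close($in_fh);
    return @words;
}

# Write a small temporary word list so the sketch is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\nfred\n";
close($tmp_fh);

my @words = read_words($tmp_name);
print "Read ", scalar(@words), " words\n";
```

Keeping the mode in its own argument means a file name that happens to begin with a character such as < or > is not misread as part of the open mode, and including $! reports why the open failed.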