Because different operating systems use different end-of-line (EOL) conventions, you must perform an EOL conversion when moving text files from one system to another. This script shows you one way of doing this.
    use strict;
    use warnings;

    sub usage()
    {
        print STDERR "Usage $0 <unix|linux|dos|mac|apple>\n";
        exit(8);
    }

    binmode(STDIN);
    binmode(STDOUT);

    my $eol = "\n";

    if ($#ARGV != 0) {
        usage();
    }
    if ($ARGV[0] eq "linux") {
        $eol = "\n";
    } elsif ($ARGV[0] eq "unix") {
        $eol = "\n";
    } elsif ($ARGV[0] eq "dos") {
        $eol = "\r\n";
    } elsif ($ARGV[0] eq "apple") {
        $eol = "\r";
    } elsif ($ARGV[0] eq "mac") {
        $eol = "\r";
    } else {
        usage();
    }

    while (1) {
        my $ch;    # Character from the input

        # Read a character
        my $status = sysread(STDIN, $ch, 1);
        if ($status <= 0) {
            last;
        }

        if ($ch eq "\n") {
            syswrite(STDOUT, $eol);
            next;
        }

        if ($ch eq "\r") {
            my $next_ch;    # Check for \r\n
            $status = sysread(STDIN, $next_ch, 1);
            if ($status <= 0) {
                syswrite(STDOUT, $eol);
                last;
            }

            # Check for \r\n
            if ($next_ch eq "\n") {
                syswrite(STDOUT, $eol);
                next;
            }

            syswrite(STDOUT, $eol);
            $ch = $next_ch;
        }
        syswrite(STDOUT, $ch);
    }
The script takes one parameter: the type of EOL you wish to end up with. This can be apple, mac, linux, unix, or dos. The script reads the standard input and writes out the converted file to the standard output. For example, to convert a file to Linux format, use this command:
    $ eol-change.pl linux <in-file.txt >out_file.txt
The result is a file with the lines in the correct format. Note that it doesn't matter what format the input is in; the program handles all types of text files as input.
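The mapping from parameter name to EOL sequence can also be written as a lookup table. The following is a minimal sketch, not part of the script itself; the names %EOL_FOR and eol_for are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical lookup table: command-line name => EOL byte sequence.
my %EOL_FOR = (
    linux => "\n",
    unix  => "\n",
    dos   => "\r\n",
    apple => "\r",
    mac   => "\r",
);

# Return the EOL string for a name, or undef if the name is not recognized.
sub eol_for {
    my ($name) = @_;
    return undef unless defined $name;
    return $EOL_FOR{$name};
}

# Show each recognized name with its EOL bytes in hexadecimal.
for my $name (sort keys %EOL_FOR) {
    printf "%-5s => %s\n", $name, unpack("H*", eol_for($name));
}
```

With a table like this, an unrecognized name shows up as an undefined result, which is the point at which a script would print its usage message.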
Perl is a great language for dealing with strings, but it was not designed to work on individual characters. Still, the job gets done, even if the program is a little inefficient.
The first thing the program does is to set binmode on the input and output. This prevents Perl's internal EOL logic from playing games with your file:
    binmode(STDIN);
    binmode(STDOUT);
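To see what kind of translation binmode switches off, you can read the same bytes through two different I/O layers. This is only a sketch: it writes a small DOS-style sample to a temporary file and names the :crlf layer explicitly so that it behaves the same on every platform. The helper slurp_with_layer is a made-up name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small DOS-style sample to a temporary file, byte for byte.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "one\r\ntwo\r\n";
close($tmp) or die "close failed: $!";

# Hypothetical helper: read the whole file through the given layer.
sub slurp_with_layer {
    my ($file, $layer) = @_;
    open(my $in, "<$layer", $file) or die "Cannot open $file: $!";
    local $/;    # slurp mode: read everything in one go
    my $data = <$in>;
    close($in);
    return $data;
}

my $raw  = slurp_with_layer($path, ':raw');     # untranslated bytes
my $crlf = slurp_with_layer($path, ':crlf');    # CR/LF pairs become "\n"

printf "raw : %s\n", unpack("H*", $raw);
printf "crlf: %s\n", unpack("H*", $crlf);
```

The raw read keeps every carriage return, while the :crlf read has already dropped them before your code sees the data. An EOL converter needs the first behavior.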
You then read the file one character at a time using the sysread function:
    my $status = sysread(STDIN, $ch, 1);
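sysread returns the number of bytes read, 0 at end of file, and undef on error. The sketch below wraps that in a small helper that tells the cases apart; read_byte is a hypothetical name, and the demonstration reads from a temporary file rather than standard input.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read one byte from $fh.
# Returns the byte, or undef at end of file; dies on a read error.
sub read_byte {
    my ($fh) = @_;
    my $ch;
    my $status = sysread($fh, $ch, 1);
    die "Read error: $!" unless defined $status;    # undef means error
    return undef if $status == 0;                   # 0 means end of file
    return $ch;
}

# Demonstrate on a temporary file holding three bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "a\r\n";
close($tmp) or die "close failed: $!";

open(my $in, '<', $path) or die "Cannot open $path: $!";
binmode($in);
my @bytes;
while (defined(my $ch = read_byte($in))) {
    push @bytes, $ch;
}
close($in);

printf "read %d bytes: %s\n", scalar(@bytes), unpack("H*", join('', @bytes));
```

Checking for undef separately also avoids the "uninitialized value" warning that a numeric comparison on a failed read would trigger under use warnings.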
Each character is checked to see if it looks like an EOL (of any type). For example, a line feed is one type of EOL:
    if ($ch eq "\n") {
        syswrite(STDOUT, $eol);
        next;
    }
Carriage return is a little trickier. A carriage return can be an end-of-line indicator, or it can be the first character in a carriage return/line feed pair. You need to check for both possibilities:
    if ($ch eq "\r") {
        my $next_ch;    # Check for \r\n
        $status = sysread(STDIN, $next_ch, 1);
        if ($status <= 0) {
            syswrite(STDOUT, $eol);
            last;
        }

        # Check for \r\n
        if ($next_ch eq "\n") {
            syswrite(STDOUT, $eol);
            next;
        }

        syswrite(STDOUT, $eol);
        $ch = $next_ch;
    }
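One input worth testing is two carriage returns in a row (an empty line in old Mac format). As the look-ahead code reads, the character after a lone carriage return is written without being examined again, so the second carriage return appears to pass through unconverted. The sketch below shows one alternative, assuming the text is already in memory: it remembers whether the previous character was a carriage return instead of reading ahead. The name convert_chars is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: convert every EOL in $text to $eol, one character
# at a time, using a flag instead of a look-ahead read.
sub convert_chars {
    my ($text, $eol) = @_;
    my $out      = '';
    my $after_cr = 0;    # true when the previous character was a CR
    for my $ch (split //, $text) {
        if ($ch eq "\r") {
            $out .= $eol;    # a CR always ends a line
            $after_cr = 1;
            next;
        }
        if ($ch eq "\n") {
            # An LF right after a CR belongs to the pair already handled.
            $out .= $eol unless $after_cr;
            $after_cr = 0;
            next;
        }
        $after_cr = 0;
        $out .= $ch;
    }
    return $out;
}

# Mixed DOS, Mac, and Unix line endings, converted to Unix format.
my $sample = "a\r\nb\rc\nd";
print unpack("H*", convert_chars($sample, "\n")), "\n";
```

Because every carriage return is handled the moment it is seen, consecutive carriage returns each produce their own EOL.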
Any other character is just passed from standard input to standard output:
    syswrite(STDOUT, $ch);
The script as written is simple yet inefficient. It can be made more efficient at the expense of simplicity. But for small-to-medium files, it does the job well enough. And that's what Perl is good for: providing a simple way to get the job done well enough.
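As an illustration of that trade-off, here is one possible block-oriented version. It is a sketch under stated assumptions, not the script's own method: it swaps the per-character loop for a regular expression applied to large chunks, and the name make_converter and the 64 KB block size are made up. The extra complexity comes from a carriage return at the very end of a chunk, which has to be held back because its line feed may arrive in the next chunk.

```perl
use strict;
use warnings;

# Hypothetical helper: build a converter that accepts arbitrary chunks.
# Call the returned closure with each chunk, then once with undef to flush.
sub make_converter {
    my ($eol) = @_;
    my $pending_cr = 0;    # true if the previous chunk ended with a CR
    return sub {
        my ($chunk) = @_;
        if (!defined $chunk) {    # flush at end of input
            my $tail = $pending_cr ? $eol : '';
            $pending_cr = 0;
            return $tail;
        }
        $chunk = "\r" . $chunk if $pending_cr;
        # Hold back a trailing CR: the next chunk may start with its LF.
        $pending_cr = ($chunk =~ s/\r\z//) ? 1 : 0;
        # Match CR/LF pairs first so they count as a single EOL.
        $chunk =~ s/\r\n|\r|\n/$eol/g;
        return $chunk;
    };
}

# Usage sketch (reads standard input in 64 KB blocks):
#   binmode(STDIN); binmode(STDOUT);
#   my $convert = make_converter("\n");
#   while (sysread(STDIN, my $buf, 65536)) {
#       syswrite(STDOUT, $convert->($buf));
#   }
#   syswrite(STDOUT, $convert->(undef));

# Demonstration: a CR/LF pair split across two chunks.
my $convert = make_converter("\n");
my $result  = '';
for my $chunk ("a\r", "\nb\r", "c\r\n", undef) {
    $result .= $convert->($chunk);
}
print unpack("H*", $result), "\n";
```

This does far fewer system calls than reading one character at a time, but you now have to reason about chunk boundaries, which the simple version never needed to do.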