In this chapter, we’ll show you how to upgrade software on your system, including rebuilding and installing a new operating system kernel. Although most Linux distributions provide some automated means to install, remove, and upgrade specific software packages on your system, it is often necessary to install software by hand.
Non-expert users will find it easiest to install and upgrade software by using a package system, which most distributions provide. If you don’t use a package system, installations and upgrades are more complicated than with most commercial operating systems. Even though precompiled binaries are available, you may have to uncompress them and unpack them from an archive file. You may also have to create symbolic links or set environment variables so that the binaries know where to look for the resources they use. In other cases, you’ll need to compile the software yourself from sources.
Another common Linux activity is building the kernel. This is an important task for several reasons. First of all, you may find yourself in a position where you need to upgrade your current kernel to a newer version, to pick up new features or hardware support. Second, building the kernel yourself allows you to select which features you do (and do not) want included in the compiled kernel.
Why is the ability to select features a win for you? All kernel code and data are “locked down” in memory; that is, they cannot be swapped out to disk. For example, if you use a kernel image with support for hardware you do not have or use, the memory consumed by the support for that hardware cannot be reclaimed for use by user applications. Customizing the kernel allows you to trim it down for your needs.
It should be noted here that most distributions today ship with modularized kernels. This means that the kernel they install by default contains only the minimum functionality needed to bring up the system; everything else is then contained in modules that add any additionally needed functionality on demand. We will talk about modules in much greater detail later.
When installing or upgrading software on Unix systems, the first things you need to be familiar with are the tools used for compressing and archiving files. Dozens of such utilities are available. Some of these (such as tar and compress) date back to the earliest days of Unix; others (such as gzip and bzip2) are relative newcomers. The main goal of these utilities is to archive files (that is, to pack many files together into a single file for easy transportation or backup) and to compress files (to reduce the amount of disk space required to store a particular file or set of files).
In this section, we’re going to discuss the most common file formats and utilities you’re likely to run into. For instance, a near-universal convention in the Unix world is to transport files or software as a tar archive, compressed using compress or gzip. In order to create or unpack these files yourself, you’ll need to know the tools of the trade. The tools are most often used when installing new software or creating backups — the subject of the following two sections in this chapter.
gzip is a fast and efficient compression program distributed by the GNU project. The basic function of gzip is to take a file, compress it, save the compressed version as filename.gz, and remove the original, uncompressed file. The original file is removed only if gzip is successful; it is very difficult to accidentally delete a file in this manner. Of course, being GNU software, gzip has more options than you want to think about, and many aspects of its behavior can be modified using command-line options.
First, let’s say that we have a large file named garbage.txt:
rutabaga% ls -l garbage.txt
-rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt
To compress this file using gzip, we simply use the command:
gzip garbage.txt
This replaces garbage.txt with the compressed file garbage.txt.gz. What we end up with is the following:
rutabaga% gzip garbage.txt
rutabaga% ls -l garbage.txt.gz
-rw-r--r-- 1 mdw hack 103441 Nov 17 21:44 garbage.txt.gz
Note that garbage.txt is removed when gzip completes.
You can give gzip a list of filenames; it compresses each file in the list, storing each with a .gz extension. (Unlike the zip program for Unix and MS-DOS systems, gzip will not, by default, compress several files into a single .gz archive. That’s what tar is for; see the next section.)
How efficiently a file is compressed depends upon its format and contents. For example, many graphics file formats (such as PNG and JPEG) are already well compressed, and gzip will have little or no effect upon such files. Files that compress well usually include plain-text files, and binary files, such as executables and libraries. You can get information on a gzipped file using gzip -l. For example:
rutabaga% gzip -l garbage.txt.gz
compressed uncompr. ratio uncompressed_name
103115 312996 67.0% garbage.txt
To get our original file back from the compressed version, we use gunzip, as in:
gunzip garbage.txt.gz
After doing this, we get:
rutabaga% gunzip garbage.txt.gz
rutabaga% ls -l garbage.txt
-rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt
which is identical to the original file. Note that when you gunzip a file, the compressed version is removed once the uncompression is complete. Instead of using gunzip, you can also use gzip -d (e.g., if gunzip happens not to be installed).
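The round trip described above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a POSIX shell with gzip, gunzip, and cmp available; the file names sample.txt and sample.orig and their contents are invented for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a repetitive (and therefore highly compressible) sample file.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > sample.txt
cp sample.txt sample.orig        # reference copy for the final comparison

gzip sample.txt                  # replaces sample.txt with sample.txt.gz
gzip -l sample.txt.gz            # report compressed and uncompressed sizes
gunzip sample.txt.gz             # restores sample.txt and removes the .gz file

# cmp prints nothing and returns 0 when the two files are byte-identical.
cmp sample.txt sample.orig && echo "round trip OK"
```

Keeping a reference copy is only needed for the comparison; in everyday use gzip and gunzip are run directly on the file.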
gzip stores the name of the original, uncompressed file in the compressed version. This way, if the compressed filename (including the .gz extension) is too long for the filesystem type (say, you’re compressing a file on an MS-DOS filesystem with 8.3 filenames), the original filename can be restored using gunzip even if the compressed file had a truncated name.
To uncompress a file to its original filename, use the -N option with gunzip. To see the value of this option, consider the following sequence of commands:
rutabaga% gzip garbage.txt
rutabaga% mv garbage.txt.gz rubbish.txt.gz
If we were to gunzip rubbish.txt.gz at this point, the uncompressed file would be named rubbish.txt, after the new (compressed) filename. However, with the -N option, we get:
rutabaga% gunzip -N rubbish.txt.gz
rutabaga% ls -l garbage.txt
-rw-r--r-- 1 mdw hack 312996 Nov 17 21:44 garbage.txt
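The same sequence can be replayed in a scratch directory. This is a small sketch that assumes GNU gzip, which stores the original name in the compressed file by default; other gzip implementations may not record it. The file contents are made up.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "some sample text" > garbage.txt
gzip garbage.txt                    # the name garbage.txt is stored in the .gz header
mv garbage.txt.gz rubbish.txt.gz    # rename the compressed file

gunzip -N rubbish.txt.gz            # restore under the stored name, not rubbish.txt
ls -l
```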
gzip and gunzip can also compress or uncompress data from standard input and output. If gzip is given no filenames to compress, it attempts to compress data read from standard input. Likewise, if you use the -c option with gunzip, it writes uncompressed data to standard output. For example, you could pipe the output of a command to gzip to compress the output stream and save it to a file in one step, as in:
rutabaga% ls -laR $HOME | gzip > filelist.gz
This will produce a recursive directory listing of your home directory and save it in the compressed file filelist.gz. You can display the contents of this file with the command:
rutabaga% gunzip -c filelist.gz | more
This will uncompress filelist.gz and pipe the output to the more command. When you use gunzip -c, the file on disk remains compressed.
The zcat command is identical to gunzip -c. You can think of this as a version of cat for compressed files. Linux even has a version of the pager less for compressed files, called zless.
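As a brief illustration of working with compressed streams, the sketch below compresses a command's output on the fly and then reads it back without ever writing the uncompressed text to disk. It assumes a POSIX shell with gzip and gunzip; the file name listing.gz is arbitrary.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Compress a command's output as it is produced.
ls -la / | gzip > listing.gz

# Decompress to standard output; listing.gz itself is left untouched.
gunzip -c listing.gz | head -n 5

# gzip -dc is equivalent to gunzip -c; count lines straight from the stream.
lines=$(gzip -dc listing.gz | wc -l | tr -d ' ')
echo "listing has $lines lines"
```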
When compressing files, you can use one of the options -1, -2, through -9 to specify the speed and quality of the compression used. -1 (also --fast) specifies the fastest method, which compresses the files less compactly, while -9 (also --best) uses the slowest, but best compression method. If you don’t specify one of these options, the default is -6. None of these options has any bearing on how you use gunzip; gunzip will be able to uncompress the file no matter what speed option you use.
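To get a feel for the trade-off, you can compress the same data at both extremes and compare the results. The sketch below assumes a POSIX shell with gzip, gunzip, and cmp; the generated file data.txt is synthetic, so the sizes it reports say little about real-world files.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate some sample text to compress.
i=0
while [ "$i" -lt 5000 ]; do
    echo "record $i: $((i * 7919 % 1000)) alpha beta gamma"
    i=$((i + 1))
done > data.txt

# -c writes to standard output, so data.txt is kept for both runs.
gzip -1 -c data.txt > fast.gz
gzip -9 -c data.txt > best.gz

orig_size=$(wc -c < data.txt | tr -d ' ')
fast_size=$(wc -c < fast.gz | tr -d ' ')
best_size=$(wc -c < best.gz | tr -d ' ')
echo "original: $orig_size bytes, gzip -1: $fast_size bytes, gzip -9: $best_size bytes"

# Either file decompresses the same way, regardless of the level used.
gunzip -c fast.gz | cmp - data.txt && gunzip -c best.gz | cmp - data.txt && echo "both levels restore the original"
```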
gzip is relatively new in the Unix world. The compression programs used on most Unix systems are compress and uncompress, which were included in the original Berkeley versions of Unix. compress and uncompress are very much like gzip and gunzip, respectively; compress saves compressed files as filename.Z as opposed to filename.gz, and uses a slightly less efficient compression algorithm.
However, the free software community has been moving to gzip for several reasons. First of all, gzip works better. Second, there has been a patent dispute over the compression algorithm used by compress, the results of which could prevent third parties from implementing the compress algorithm on their own. Because of this, the Free Software Foundation urged a move to gzip, which at least the Linux community has embraced. gzip has been ported to many architectures, and many others are following suit. Happily, gunzip is able to uncompress the .Z format files produced by compress.
Another compression/decompression program has also emerged to take the lead from gzip. bzip2 is the new kid on the block and sports even better compression (on the average about 10-20% better than gzip), at the expense of longer compression times. You cannot use bunzip2 to uncompress files compressed with gzip and vice versa, and because you cannot expect everybody to have bunzip2 installed on their machine, you might want to confine yourself to gzip for the time being if you want to send the compressed file to somebody else. However, it pays to have bzip2 installed because more and more FTP servers now provide bzip2-compressed packages in order to conserve disk space and bandwidth. You can recognize bzip2-compressed files by their .bz2 filename extension.
While the command-line options of bzip2 are not exactly the same as those of gzip, those that have been described in this section are. For more information, see the bzip2(1) manual page.
The bottom line is that you should use gzip/gunzip or bzip2/bunzip2 for your compression needs. If you encounter a file with the extension .Z, it was probably produced by compress, and gunzip can uncompress it for you.
Earlier versions of gzip used .z (lowercase) instead of .gz as the compressed-filename extension. Because of the potential confusion with .Z, this was changed. At any rate, gunzip retains backwards compatibility with a number of filename extensions and file types.
tar is a general-purpose archiving utility capable of packing many files into a single archive file, while retaining information needed to restore the files fully, such as file permissions and ownership. The name tar stands for tape archive because the tool was originally used to archive files as backups on tape. However, use of tar is not at all restricted to making tape backups, as we’ll see.
The format of the tar command is:
tar functionoptions files...
where function is a single letter indicating the operation to perform, options is a list of (single-letter) options to that function, and files is the list of files to pack or unpack in an archive. (Note that function is not separated from options by any space.)
function can be one of the following:
c
To create a new archive
x
To extract files from an archive
t
To list the contents of an archive
r
To append files to the end of an archive
u
To update files that are newer than those in the archive
d
To compare files in the archive to those in the filesystem
You’ll rarely use most of these functions; the more commonly used are c, x, and t.
options can include:
v
To print verbose information when packing or unpacking archives. This makes tar show the files it is archiving or restoring; it is good practice to use this so that you can see what actually happens (unless, of course, you are writing shell scripts).
k
To keep any existing files when extracting; that is, to not overwrite any existing files which are contained within the tar file.
f filename
To specify that the tar file to be read or written is filename.
z
To specify that the data to be written to the tar file should be compressed or that the data in the tar file is compressed with gzip.
j
Like z, but uses bzip2 instead of gzip; works only with newer versions of tar. Some intermediate versions of tar used I instead; older ones don’t support bzip2 at all.
There are others, which we will cover later in this section.
Although the tar syntax might appear complex at first, in practice it’s quite simple. For example, say we have a directory named mt, containing these files:
rutabaga% ls -l mt
total 37
-rw-r--r-- 1 root root 24 Sep 21 1993 Makefile
-rw-r--r-- 1 root root 847 Sep 21 1993 README
-rwxr-xr-x 1 root root 9220 Nov 16 19:03 mt
-rw-r--r-- 1 root root 2775 Aug 7 1993 mt.1
-rw-r--r-- 1 root root 6421 Aug 7 1993 mt.c
-rw-r--r-- 1 root root 3948 Nov 16 19:02 mt.o
-rw-r--r-- 1 root root 11204 Sep 5 1993 st_info.txt
We wish to pack the contents of this directory into a single tar archive. To do this, we use the command:
tar cf mt.tar mt
The first argument to tar is the function (here, c, for create) followed by any options. Here, we use the option f mt.tar to specify that the resulting tar archive be named mt.tar. The last argument is the name of the file or files to archive; in this case, we give the name of a directory, so tar packs all files in that directory into the archive.
Note that the first argument to tar must be the function letter and options. Because of this, there’s no reason to use a hyphen (-) to precede the options as many Unix commands require. tar allows you to use a hyphen, as in:
tar -cf mt.tar mt
but it’s really not necessary. In some versions of tar, the first letter must be the function, as in c, t, or x. In other versions, the order of letters does not matter.
The function letters as described here follow the so-called “old option style.” There is also a newer “short option style” in which you precede the function options with a hyphen, and a “long option style” in which you use long option names with two hyphens. See the Info page for tar for more details if you are interested.
Be careful not to omit the archive filename when you use the cf function letters. Otherwise, tar will take the first file in your list of files to pack as the name of the archive and overwrite it!
It is often a good idea to use the v option with tar; this lists each file as it is archived. For example:
rutabaga% tar cvf mt.tar mt
mt/
mt/st_info.txt
mt/README
mt/mt.1
mt/Makefile
mt/mt.c
mt/mt.o
mt/mt
If you use v multiple times, additional information will be printed, as in:
rutabaga% tar cvvf mt.tar mt
drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
-rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
-rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
-rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
-rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
-rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
-rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
-rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt
This is especially useful as it lets you verify that tar is doing the right thing.
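To experiment without touching real data, you can build a small stand-in directory and archive it. The following is a minimal sketch, assuming a POSIX shell and a tar that accepts the old option style used in this section; the directory name project and its files are invented.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a small directory to stand in for mt.
mkdir project
echo "all: prog" > project/Makefile
echo "notes" > project/README

# c = create, v = list each member as it is added, f = archive filename follows.
tar cvf project.tar project

# t = list the table of contents without extracting anything.
tar tf project.tar
```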
In some versions of tar, f must be the last letter in the list of options. This is because tar expects the f option to be followed by a filename: the name of the tar file to read from or write to. If you don’t specify f filename at all, tar assumes for historical reasons that it should use the device /dev/rmt0 (that is, the first tape drive). In Section 8.1, in Chapter 8, we’ll talk about using tar in conjunction with a tape drive to make backups.
Now, we can give the file mt.tar to other people, and they can extract it on their own system. To do this, they would use the command:
tar xvf mt.tar
This creates the subdirectory mt and places all the original files into it, with the same permissions as found on the original system. The new files will be owned by the user running the tar xvf (you) unless you are running as root, in which case the original owner is preserved. The x option stands for “extract.” The v option is used again here to list each file as it is extracted. This produces:
courgette% tar xvf mt.tar
mt/
mt/st_info.txt
mt/README
mt/mt.1
mt/Makefile
mt/mt.c
mt/mt.o
mt/mt
We can see that tar saves the pathname of each file relative to the location where the tar file was originally created. That is, when we created the archive using tar cf mt.tar mt, the only input filename we specified was mt, the name of the directory containing the files. Therefore, tar stores the directory itself and all the files below that directory in the tar file. When we extract the tar file, the directory mt is created and the files placed into it, which is the exact inverse of what was done to create the archive.
By default, tar extracts all tar files relative to the current directory where you execute tar. For example, if you were to pack up the contents of your /bin directory with the command:
tar cvf bin.tar /bin
tar would give the warning:
tar: Removing leading / from absolute pathnames in the archive.
What this means is that the files are stored in the archive within the subdirectory bin. When this tar file is extracted, the directory bin is created in the working directory of tar, not as /bin on the system where the extraction is being done. This is very important and is meant to prevent terrible mistakes when extracting tar files. Otherwise, extracting a tar file packed as, say, /bin would trash the contents of your /bin directory when you extracted it. If you really wanted to extract such a tar file into /bin, you would extract it from the root directory, /. You can override this behavior using the P option when packing tar files, but it’s not recommended you do so.
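The effect of the stripped leading slash is easy to observe with a harmless absolute path. The sketch below assumes a POSIX shell and a tar that strips leading slashes by default, as GNU tar does; it archives a scratch directory rather than /bin, and the names data, dest, and abs.tar are invented.

```shell
# Create a scratch directory with an absolute path.
workdir=$(mktemp -d)
mkdir "$workdir/data"
echo "hello" > "$workdir/data/file.txt"

# Archive using the absolute path. tar prints its warning on standard
# error, which is discarded here to keep the output short.
tar cf "$workdir/abs.tar" "$workdir/data" 2>/dev/null

# The stored member names have no leading slash.
tar tf "$workdir/abs.tar"

# Extract somewhere else: the files land below dest, not at the
# original absolute location.
mkdir "$workdir/dest"
cd "$workdir/dest" || exit 1
tar xf "$workdir/abs.tar"
find . -type f
```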
Another way to create the tar file mt.tar would have been to cd into the mt directory itself, and use a command, such as:
tar cvf mt.tar *
This way the mt subdirectory would not be stored in the tar file; when extracted, the files would be placed directly in your current working directory. One fine point of tar etiquette is to always pack tar files so that they have a subdirectory at the top level, as we did in the first example with tar cvf mt.tar mt. Therefore, when the archive is extracted, the subdirectory is also created and any files placed there. This way you can ensure that the files won’t be placed directly in your current working directory; they will be tucked out of the way and prevent confusion. This also saves the person doing the extraction the trouble of having to create a separate directory (should they wish to do so) to unpack the tar file. Of course, there are plenty of situations where you wouldn’t want to do this. So much for etiquette.
When creating archives, you can, of course, give tar a list of files or directories to pack into the archive. In the first example, we have given tar the single directory mt, but in the previous paragraph we used the wildcard *, which the shell expands into the list of filenames in the current directory.
Before extracting a tar file, it’s usually a good idea to take a look at its table of contents to determine how it was packed. This way you can determine whether you do need to create a subdirectory yourself where you can unpack the archive. A command, such as:
tar tvf tarfile
lists the table of contents for the named tarfile. Note that when using the t function, only one v is required to get the long file listing, as in this example:
courgette% tar tvf mt.tar
drwxr-xr-x root/root 0 Nov 16 19:03 1994 mt/
-rw-r--r-- root/root 11204 Sep 5 13:10 1993 mt/st_info.txt
-rw-r--r-- root/root 847 Sep 21 16:37 1993 mt/README
-rw-r--r-- root/root 2775 Aug 7 09:50 1993 mt/mt.1
-rw-r--r-- root/root 24 Sep 21 16:03 1993 mt/Makefile
-rw-r--r-- root/root 6421 Aug 7 09:50 1993 mt/mt.c
-rw-r--r-- root/root 3948 Nov 16 19:02 1994 mt/mt.o
-rwxr-xr-x root/root 9220 Nov 16 19:03 1994 mt/mt
No extraction is being done here; we’re just displaying the archive’s table of contents. We can see from the filenames that this file was packed with all files in the subdirectory mt so that when we extract the tar file, the directory mt will be created and the files placed there.
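Checking how an archive was packed can be partly automated. The sketch below defines a small helper, top_level (a hypothetical name, not part of tar), that prints the distinct first path components found in an archive; a single line of output suggests the archive unpacks into one directory. It assumes a POSIX shell with tar, sed, sort, and grep, and the sample archives are invented.

```shell
# top_level ARCHIVE: print the distinct first path components of its members.
top_level () {
    tar tf "$1" | sed -e 's|^\./||' -e 's|/.*||' | sort -u | grep -v '^$'
}

# Build two sample archives in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo a > mt/README
echo b > mt/mt.c
tar cf tidy.tar mt                            # packed with a top-level directory
(cd mt && tar cf ../loose.tar README mt.c)    # packed without one

top_level tidy.tar     # a single entry: mt
top_level loose.tar    # two entries, so create a directory before unpacking
```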
You can also extract individual files from a tar archive. To do this, use the command:
tar xvf tarfile files
where files is the list of files to extract. As we’ve seen, if you don’t specify any files, tar extracts the entire archive.
When specifying individual files to extract, you must give the full pathname as it is stored in the tar file. For example, if we wanted to grab just the file mt.c from the previous archive mt.tar, we’d use the command:
tar xvf mt.tar mt/mt.c
This would create the subdirectory mt and place the file mt.c within it.
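Extracting a single member can be rehearsed on a small archive. This is a minimal sketch, assuming a POSIX shell and tar; the archive contents are stand-ins for the real mt directory, and the directory name unpack is invented.

```shell
# Build a small archive in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir mt
echo "int main(void) { return 0; }" > mt/mt.c
echo "docs" > mt/README
tar cf mt.tar mt

# Extract only one member into a separate directory.
mkdir unpack
cd unpack || exit 1
# The member name must match what tar tf shows, here mt/mt.c.
tar xvf ../mt.tar mt/mt.c
ls -R
```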
tar has many more options than those mentioned here. These are the features that you’re likely to use most of the time, but GNU tar, in particular, has extensions that make it ideal for creating backups and the like. See the tar manual page and the following section for more information.
tar does not compress the data stored in its archives in any way. If you are creating a tar file from three 200K files, you’ll end up with an archive of about 600K. It is common practice to compress tar archives with gzip (or the older compress program). You could create a gzipped tar file using the commands:
tar cvf tarfile files...
gzip -9 tarfile
But that’s so cumbersome, and requires you to have enough space to store the uncompressed tar file before you gzip it.
A much trickier way to accomplish the same task is to use an interesting feature of tar that allows you to write an archive to standard output. If you specify - as the tar file to read or write, the data will be read from or written to standard input or output. For example, we can create a gzipped tar file using the command:
tar cvf - files... | gzip -9 > tarfile.tar.gz
Here, tar creates an archive from the named files and writes it to standard output; next, gzip reads the data from standard input, compresses it, and writes the result to its own standard output; finally, we redirect the gzipped tar file to tarfile.tar.gz.
We could extract such a tar file using the command:
gunzip -c tarfile.tar.gz | tar xvf -
gunzip uncompresses the named archive file and writes the result to standard output, which is read by tar on standard input and extracted. Isn’t Unix fun?
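Both pipelines can be exercised end to end on a small tree. The sketch below assumes a POSIX shell with tar, gzip, gunzip, and diff; the directory names docs and restore are invented.

```shell
# Build a small tree in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir docs
echo one > docs/a.txt
echo two > docs/b.txt

# tar writes the archive to standard output (-); gzip compresses the stream.
tar cf - docs | gzip -9 > docs.tar.gz

# Reverse the process: gunzip writes the archive to standard output and
# tar reads it from standard input inside a separate directory.
mkdir restore
gunzip -c docs.tar.gz | (cd restore && tar xf -)

# diff -r prints nothing and returns 0 when the two trees are identical.
diff -r docs restore/docs && echo "trees match"
```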
Of course, both commands are rather cumbersome to type. Luckily, the GNU version of tar provides the z option which automatically creates or extracts gzipped archives. (We saved the discussion of this option until now, so you’d truly appreciate its convenience.) For example, we could use the commands:
tar cvzf tarfile.tar.gz files...
and:
tar xvzf tarfile.tar.gz
to create and extract gzipped tar files. Note that you should name the files created in this way with the .tar.gz filename extensions (or the equally often used .tgz, which also works on systems with limited filename capabilities) to make their format obvious. The z option works just as well with other tar functions such as t.
Only the GNU version of tar supports the z option; if you are using tar on another Unix system, you may have to use one of the longer commands to accomplish the same tasks. Nearly all Linux systems use GNU tar.
When you want to use tar in conjunction with bzip2, you need to tell tar about your compression program preferences, like this:
tar cvf tarfile.tar.bz2 --use-compress-program=bzip2 files...
or, shorter:
tar cvf tarfile.tar.bz2 --bzip2 files...
or, shorter still:
tar cvjf tarfile.tar.bz2 files
The last version works only with newer versions of GNU tar that support the j option.
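The compression options can be compared side by side on a small directory. The following sketch assumes a POSIX shell and a tar that supports the z and j options, as GNU tar does; because bzip2 may not be installed everywhere, the j example is skipped when the program is missing. The directory name src is invented.

```shell
# Build a small tree in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir src
echo "payload" > src/file.txt

tar czf src.tar.gz src          # z: filter the archive through gzip
tar tzf src.tar.gz              # z is also needed when listing

if command -v bzip2 >/dev/null 2>&1; then
    tar cjf src.tar.bz2 src     # j: filter the archive through bzip2
    tar tjf src.tar.bz2
else
    echo "bzip2 is not installed; skipping the j example"
fi
```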
Keeping this in mind, you could write short shell scripts or aliases to handle cookbook tar file creation and extraction for you. Under bash, you could include the following functions in your .bashrc:
tarc () { tar czvf "$1.tar.gz" "$1"; }
tarx () { tar xzvf "$1"; }
tart () { tar tzvf "$1"; }
With these functions, to create a gzipped tar file from a single directory, you could use the command:
tarc directory
The resulting archive file would be named directory.tar.gz. (Be sure that there’s no trailing slash on the directory name; otherwise the archive will be created as .tar.gz within the given directory.) To list the table of contents of a gzipped tar file, just use:
tart file.tar.gz
Or, to extract such an archive, use:
tarx file.tar.gz
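The trailing-slash pitfall can be avoided inside the function itself. The variant below is a sketch rather than a drop-in replacement: it uses the shell's ${1%/} expansion to strip one trailing slash before building the archive name, and it assumes a tar with the z option. The directory name demo is invented.

```shell
# tarc DIR: create DIR.tar.gz; one trailing slash on DIR is stripped first.
tarc () { tar czvf "${1%/}.tar.gz" "${1%/}"; }
tarx () { tar xzvf "$1"; }
tart () { tar tzvf "$1"; }

# Try it out in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir demo
echo hi > demo/file

tarc demo/          # creates demo.tar.gz despite the trailing slash
tart demo.tar.gz
```

Quoting "$1" also keeps the functions working for names that contain spaces.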
As a final note, we would like to mention that files created with gzip and/or tar can be unpacked with the well-known WinZip utility on Windows systems. WinZip doesn’t have support for bzip2 yet, though. If you, on the other hand, get a file in .zip format, you can unpack it on your Linux system using the unzip command.
Because tar saves the ownership and permissions of files in the archive and retains the full directory structure, as well as symbolic and hard links, using tar is an excellent way to copy or move an entire directory tree from one place to another on the same system (or even between different systems, as we’ll see). Using the - syntax described earlier, you can write a tar file to standard output, which is read and extracted on standard input elsewhere.
For example, say that we have a directory containing two subdirectories: from-stuff and to-stuff. from-stuff contains an entire tree of files, symbolic links, and so forth, something that is difficult to mirror precisely using a recursive cp. In order to mirror the entire tree beneath from-stuff to to-stuff, we could use the commands:
cd from-stuff
tar cf - . | (cd ../to-stuff; tar xvf -)
Simple and elegant, right? We start in the directory from-stuff and create a tar file of the current directory, which is written to standard output. This archive is read by a subshell (the commands contained within parentheses); the subshell does a cd to the target directory, ../to-stuff (relative to from-stuff, that is), and then runs tar xvf, reading from standard input. No tar file is ever written to disk; the data is sent entirely via pipe from one tar process to another. The second tar process has the v option that prints each file as it’s extracted; in this way, we can verify that the command is working as expected.
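The mirroring technique can be tested on a small tree that includes a symbolic link. The sketch below assumes a POSIX shell and tar; it uses && rather than ; inside the subshells so that tar does not run in the wrong directory if a cd fails. The files placed in from-stuff are invented.

```shell
# Build a small source tree, including a symbolic link.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p from-stuff/sub to-stuff
echo data > from-stuff/sub/file.txt
ln -s sub/file.txt from-stuff/link.txt

# Pack in one directory and unpack in the other; no archive touches the disk.
(cd from-stuff && tar cf - .) | (cd to-stuff && tar xf -)

# The symbolic link arrives as a link, not as a copy of its target.
ls -lR to-stuff
```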
In fact, you could transfer directory trees from one machine to another (via the network) using this trick; just include an appropriate rsh (or ssh) command within the subshell on the right side of the pipe. The remote shell would execute tar to read the archive on its standard input. (Actually, GNU tar has facilities to read or write tar files automatically from other machines over the network; see the tar(1) manual page for details.)