rsync is a program that can be used to synchronize files and directories across a variety of local and remote locations. It can interact with multiple operating systems, work over SSH, provide incremental backups, execute commands on a remote machine, and replace the need for the cp and scp commands. The rsync program is an invaluable asset for any system administrator who intends to run a server or manage a network of computers: it not only simplifies the process of making backups in general, but it can also form the basis of a complete backup solution. For this reason, the purpose of this recipe is to offer a suitable starting point for a small utility that will quickly become your trusted friend.
To complete this recipe, you will require a working installation of the CentOS 7 operating system with root privileges, a console-based text editor of your choice, and a connection to the Internet in order to facilitate the download of additional packages.
During the course of this recipe, it will be assumed that you know the location of the source files and directories that you wish to synchronize, and that a suitable destination is available:
To begin, install rsync by typing:
yum install rsync
Create a target directory to synchronize into:
mkdir ~/sync-target
Run the synchronization, replacing /path/to/source/files/ with something more applicable to your needs:
rsync -avz --delete /path/to/source/files/ ~/sync-target
Verify the result with the diff command (if both are the same, no output will be written):
diff -r /path/to/source/files/ ~/sync-target
In this recipe, we considered the use of rsync through the command line. Of course, this is only one of the many ways that this tool can be used, but by using this approach we were able to explore a handful of the features provided by this very valuable utility.
So, what did we learn from this experience?
Rsync is not intended to be complicated. It is a fast and efficient file synchronization tool that is designed to be versatile by giving you complete access to an array of features on the command line. It can be used to maintain an exact copy (or mirror) of the source directory on the same machine or on a completely different system, and it does this by copying all the files once and then only updating the files that have changed the next time you run it. This can save a great deal of bandwidth, which makes rsync a strong first choice when copying data over the network. The --delete option is important, as it instructs rsync to delete files on the target that do not exist in the source. The remaining flags tell rsync to use -a (archive mode) in order to recursively copy files and directories while keeping all permissions and time-based information; -v (verbose mode) so you can see what is happening; and -z to compress the data during the file transfer in order to save bandwidth and reduce the amount of time required to complete the entire process.
As you can see, rsync is very flexible and has many options that go beyond the scope of this recipe, but if you want to exclude certain files you can extend the original instruction with the --exclude flag. By doing this, you tell rsync to back up an entire directory while skipping any files and folders that match a given pattern. For example, if you are copying files from your server to a USB device and you do not want to include large files (such as an .iso image) or ZIP files, then your command may look similar to this:
rsync --delete -avz --exclude="*.zip" --exclude="*.iso" /path/to/source/ /path/to/external/disk/
On a final note, there is the subject of verbosity. Verbose output is very useful, but rsync reports file sizes and transfer totals in bytes by default, which can be hard to read for large transfers. To change this, you can invoke rsync with the -h (or human-readable) option, as shown next:
rsync -avzh --exclude="/path/to/file.txt" /home/ /path/to/external/disk/