Rsync
rsync is a command line tool which synchronizes files and directories from one location to another while minimizing data transfer using delta encoding when appropriate. An important feature of rsync not found in most similar programs/protocols is that the mirroring takes place with only one transmission in each direction.
rsync can copy or display directory contents and copy files, optionally using compression and recursion.
When run as a daemon, rsync listens on TCP port 873 by default.
The rsync algorithm
Abstract: This report presents an algorithm for updating a file on one machine to be identical to a file on another machine. We assume that the two machines are connected by a low-bandwidth high-latency bi-directional communications link. The algorithm identifies parts of the source file which are identical to some part of the destination file, and only sends those parts which cannot be matched in this way. Effectively, the algorithm computes a set of differences without having both files on the same machine. The algorithm works best when the files are similar, but will also function correctly and reasonably efficiently when the files are quite different.[1]
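The effect described in the abstract can be observed on a single machine. The following is a minimal sketch, assuming rsync and seq are installed; the file names are made up for illustration. The --no-whole-file option is used because rsync skips the delta algorithm for purely local copies by default.

```shell
#!/bin/sh
# Sketch: observe delta transfer between two similar local files.
# All paths are temporary and hypothetical.
if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  seq 1 200000 > "$work/old.txt"            # roughly 1.3 MB of text
  cp "$work/old.txt" "$work/new.txt"
  echo "one appended line" >> "$work/new.txt"

  # Update old.txt from new.txt, forcing the delta algorithm for a local copy.
  stats=$(rsync --no-whole-file --stats "$work/new.txt" "$work/old.txt")

  # "Matched data" was reused from the destination; "Literal data" was sent.
  echo "$stats" | grep -E 'Literal data|Matched data'
  rm -rf "$work"
else
  echo "rsync is not installed; skipping" >&2
fi
```

With files this similar, the matched figure should be close to the size of the old file and the literal figure should be small.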
Some features of rsync include:
- can update whole directory trees and filesystems
- optionally preserves symbolic links, hard links, file ownership, permissions, devices and times
- requires no special privileges to install
- internal pipelining reduces latency for multiple files
- can use rsh, ssh, or direct sockets as the transport
- supports anonymous rsync which is ideal for mirroring
Usage
Examples
Let us say you wish to rsync your home directory (e.g. /home/bob) with a backup directory/disk (e.g. /backup). The following command will accomplish this:
$ rsync -avz --delete /home/bob/ /backup/
where -a means archive mode, -v means do it verbosely, -z means compress the data during transfer, and --delete means to delete a file from the backup if the original (i.e. in /home/bob) has been deleted since the last rsync.
The same can be done over ssh when the backup disk is on a remote machine.
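Because --delete removes files from the destination, it can help to preview a run before committing to it. The following is a minimal sketch that uses throwaway directories in place of /home/bob and /backup; the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: preview what --delete would remove, without changing anything.
if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  mkdir -p "$work/home" "$work/backup"
  echo "keep me" > "$work/home/notes.txt"
  echo "stale"   > "$work/backup/old-report.txt"   # no longer in the source

  # -n (--dry-run) lists the planned actions but performs none of them.
  preview=$(rsync -avn --delete "$work/home/" "$work/backup/")
  echo "$preview"

  # The stale file is still present because the run was only a preview.
  remaining=$(ls "$work/backup")
  echo "still in backup: $remaining"
  rm -rf "$work"
else
  echo "rsync is not installed; skipping" >&2
fi
```

Dropping the -n flag performs the same actions for real.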
- Copy files to/from computers in your local area network
Consider a case where you have two computers plugged into your home router. To copy files/directories between the two, first find out their local IPv4 addresses (e.g., on the eth0 interface) using `ifconfig` or `ip a`. The following command will copy files from one machine to the other (make sure your firewall rules allow connections via port 22 for SSH):
$ rsync -e 'ssh -p 22' -avl --stats --progress /home/bob/source bob@192.168.0.2:/home/bob/destination
- Copy multiple files at the same time
Consider a case where you have multiple image files (e.g., foo-1.jpg, foo-2.jpg, and foo-3.jpg) and you wish to copy them from a source host to a destination host. You can use shell brace expansion like so:
$ rsync -e 'ssh -p 22' -avl --stats --progress /home/source/foo-{1..3}.jpg bob@192.168.0.2:/home/destination
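The foo-{1..3}.jpg pattern is expanded by the local shell before rsync ever runs, so rsync simply receives three source paths. A quick way to check what will be passed is to echo the pattern first. This sketch invokes bash explicitly, on the assumption that it is installed, because a plain POSIX sh such as dash does not perform brace expansion.

```shell
#!/bin/sh
# Sketch: show the arguments the shell would hand to rsync.
# Brace expansion happens whether or not the files exist.
expanded=$(bash -c 'echo /home/source/foo-{1..3}.jpg')
echo "$expanded"
```

If the pattern is printed back unchanged, the shell in use does not support brace expansion and the files need to be listed individually.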
- Exclude certain directories:
$ rsync -e ssh -a --exclude '/dev' --exclude '/proc' --exclude '/sys' / bob@192.168.0.2:/arc/2010-03-24
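A leading slash anchors an exclude pattern to the root of the transfer, whereas a bare name matches at any depth. The following is a minimal sketch of the anchored form on a throwaway tree; the directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch: '/dev' excludes only the top-level dev directory of the transfer.
if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  mkdir -p "$work/src/dev" "$work/src/proc" "$work/src/home/bob/dev" "$work/dst"
  echo a > "$work/src/dev/device-node"
  echo b > "$work/src/proc/status"
  echo c > "$work/src/home/bob/dev/project.c"

  rsync -a --exclude '/dev' --exclude '/proc' "$work/src/" "$work/dst/"

  # Only the nested dev directory should have been copied.
  copied=$(cd "$work/dst" && find . -type f | sort)
  echo "$copied"
  rm -rf "$work"
else
  echo "rsync is not installed; skipping" >&2
fi
```

Writing the pattern as 'dev' instead would also skip home/bob/dev.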
rsyncd.conf
The rsyncd.conf file is the runtime configuration file for rsync when run as an rsync daemon. The rsyncd.conf file controls authentication, access, logging, and available modules. See man rsyncd.conf for more information.
- Sample rsyncd.conf file:
motd file = /etc/rsyncd.motd
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock

[simple_path_name]
    path = /rsync_files_here
    comment = My Very Own Rsync Server
    uid = nobody
    gid = nobody
    read only = no
    list = yes
    auth users = username
    secrets file = /etc/rsyncd.scrt
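The secrets file named in the sample holds one user:password pair per line. The following is a minimal sketch of creating one; a temporary path stands in for /etc/rsyncd.scrt, the password is a placeholder, and server.example is a hypothetical host name.

```shell
#!/bin/sh
# Sketch: create a secrets file for the "auth users = username" entry.
secrets=$(mktemp)
printf '%s\n' 'username:CHANGE_ME' > "$secrets"   # one user:password per line

# With the default "strict modes" setting, the daemon refuses a secrets
# file that other users can read.
chmod 600 "$secrets"
perms=$(ls -l "$secrets" | cut -c1-10)
echo "$perms"

# A client could then list the modules and pull from the sample module:
#   rsync rsync://server.example/
#   rsync -av username@server.example::simple_path_name/ ./local_copy/
rm -f "$secrets"
```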
Examples (extended)
Note: The following were taken directly from the rsync website (with some modifications).
Backup to a central backup server with 7 day incremental
#!/bin/sh

# This script does personal backups to a rsync backup server. You will end up
# with a 7 day rotating incremental backup. The incrementals will go
# into subdirectories named after the day of the week, and the current
# full backup goes into a directory called "current"
# tridge@linuxcare.com

# directory to backup
BDIR=/home/$USER

# excludes file - this contains a wildcard pattern per line of files to exclude
EXCLUDES=$HOME/cron/excludes

# the name of the backup machine
BSERVER=owl

# your password on the backup server
export RSYNC_PASSWORD=XXXXXX

########################################################################

BACKUPDIR=`date +%A`
OPTS="--force --ignore-errors --delete-excluded --exclude-from=$EXCLUDES --delete --backup --backup-dir=/$BACKUPDIR -a"

export PATH=$PATH:/bin:/usr/bin:/usr/local/bin

# the following line clears the last weeks incremental directory
[ -d $HOME/emptydir ] || mkdir $HOME/emptydir
rsync --delete -a $HOME/emptydir/ $BSERVER::$USER/$BACKUPDIR/
rmdir $HOME/emptydir

# now the actual transfer
rsync $OPTS $BDIR $BSERVER::$USER/current
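The rotation in this script depends on --backup and --backup-dir, which move a file's previous version aside instead of overwriting it. The following is a minimal local sketch of that behaviour, assuming rsync is installed; the directories are temporary and the file name is hypothetical.

```shell
#!/bin/sh
# Sketch: keep the previous version of a changed file in a per-day directory.
if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  mkdir -p "$work/src" "$work/current"
  echo "version 1" > "$work/src/report.txt"
  rsync -a "$work/src/" "$work/current/"          # initial full backup

  # Change the file (and its size, so rsync's quick check notices it).
  echo "version 2, with more text" > "$work/src/report.txt"

  day=$(date +%A)
  rsync -a --delete --backup --backup-dir="$work/$day" "$work/src/" "$work/current/"

  saved=$(cat "$work/$day/report.txt")            # the version that was replaced
  latest=$(cat "$work/current/report.txt")
  echo "$day: $saved"
  echo "current: $latest"
  rm -rf "$work"
else
  echo "rsync is not installed; skipping" >&2
fi
```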
Backup to a spare disk
I do local backups on several of my machines using rsync. I have an extra disk installed that can hold all the contents of the main disk. I then have a nightly cron job that backs up the main disk to the backup. This is the script I use on one of those machines.
#!/bin/sh

export PATH=/usr/local/bin:/usr/bin:/bin

LIST="rootfs usr data data2"

for d in $LIST; do
    mount /backup/$d
    rsync -ax --exclude fstab --delete /$d/ /backup/$d/
    umount /backup/$d
done

DAY=`date "+%A"`

rsync -a --delete /usr/local/apache /data2/backups/$DAY
rsync -a --delete /data/solid /data2/backups/$DAY
The first part does the backup on the spare disk. The second part backs up the critical parts to daily directories. I also back up the critical parts using rsync over ssh to a remote machine.
Mirroring vger CVS tree
The vger.rutgers.edu cvs tree is mirrored onto cvs.samba.org via anonymous rsync using the following script.
#!/bin/bash

cd /var/www/cvs/vger/
PATH=/usr/local/bin:/usr/freeware/bin:/usr/bin:/bin

RUN=`lps x | grep rsync | grep -v grep | wc -l`
if [ "$RUN" -gt 0 ]; then
    echo already running
    exit 1
fi

rsync -az vger.rutgers.edu::cvs/CVSROOT/ChangeLog $HOME/ChangeLog

sum1=`sum $HOME/ChangeLog`
sum2=`sum /var/www/cvs/vger/CVSROOT/ChangeLog`

if [ "$sum1" = "$sum2" ]; then
    echo nothing to do
    exit 0
fi

rsync -az --delete --force vger.rutgers.edu::cvs/ /var/www/cvs/vger/
exit 0
Note in particular the initial rsync of the ChangeLog to determine if anything has changed. This could be omitted but it would mean that the rsyncd on vger would have to build a complete listing of the cvs area at each run. As most of the time nothing will have changed, I wanted to save the time on vger by only doing a full rsync if the ChangeLog has changed. This helped quite a lot because vger is low on memory and generally quite heavily loaded, so doing a listing on such a large tree every hour would have been excessive.
Automated backup at home
The cron job looks like this:
#!/bin/sh
cd ~stine
{
echo
date
dest=~/backup/`date +%A`
mkdir $dest.new
find . -xdev -type f \( -mtime 0 -or -mtime 1 \) -exec cp -aPv "{}" $dest.new \;
cnt=`find $dest.new -type f | wc -l`
if [ $cnt -gt 0 ]; then
    rm -rf $dest
    mv $dest.new $dest
fi
rm -rf $dest.new
rsync -Cavze ssh . samba:backup
} >> ~/backup/backup.log 2>&1
Note that most of this script has nothing to do with rsync; it just creates a daily backup of Stine's work in a ~stine/backup/ directory so she can retrieve any version from the last week. The last line does the rsync of her directory across the modem link to the host samba. Note that I am using the -C option, which allows me to add entries to .cvsignore for stuff that doesn't need to be backed up.
Fancy footwork with remote file lists
One little known feature of rsync is the fact that when run over a remote shell (such as rsh or ssh) you can give any shell command as the remote file list. The shell command is expanded by your remote shell before rsync is called. For example, see if you can work out what this does:
rsync -avR remote:'`find /home -name "*.[ch]"`' /tmp/
Note that those are backquotes enclosed by single quotes.
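A related way to hand rsync a computed file list is the --files-from option, which reads the paths from a file or from standard input. The following is a minimal local sketch of selecting only *.c and *.h files that way; it is an alternative illustration rather than the remote-shell trick itself, and the directory layout is hypothetical.

```shell
#!/bin/sh
# Sketch: feed a find-generated list of source files to rsync on stdin.
if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  mkdir -p "$work/home/bob/src" "$work/tmp"
  echo 'int main(void){return 0;}' > "$work/home/bob/src/main.c"
  echo 'notes' > "$work/home/bob/src/README"

  # Paths in the list are relative to the source directory (here ".").
  (cd "$work" && find home -name '*.[ch]' | rsync -a --files-from=- . "$work/tmp/")

  # The relative path is preserved under the destination, as with -R.
  picked=$(cd "$work/tmp" && find . -type f | sort)
  echo "$picked"
  rm -rf "$work"
else
  echo "rsync is not installed; skipping" >&2
fi
```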
Rsync exit values
Note: Obtained via the man page. Capture exit value with `$?`.
0   Success
1   Syntax or usage error
2   Protocol incompatibility
3   Errors selecting input/output files, dirs
4   Requested action not supported: an attempt was made to manipulate 64-bit files on a platform that cannot support them; or an option was specified that is supported by the client and not by the server.
5   Error starting client-server protocol
6   Daemon unable to append to log-file
10  Error in socket I/O
11  Error in file I/O
12  Error in rsync protocol data stream
13  Errors with program diagnostics
14  Error in IPC code
20  Received SIGUSR1 or SIGINT
21  Some error returned by waitpid()
22  Error allocating core memory buffers
23  Partial transfer due to error
24  Partial transfer due to vanished source files
25  The --max-delete limit stopped deletions
30  Timeout in data send/receive
35  Timeout waiting for daemon connection
For example, if one were using the "--max-delete" option for rsync(1), one could check a call's return value to see whether rsync(1) hit the threshold for deleted file count and write a message to a logfile appropriately:
$ rsync --archive --delete --max-delete=5 source destination
$ if (($? == 25)); then
      printf '%s\n' 'Deletion limit was reached' >"$logfile"
  fi
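When several exit values matter, a small helper can turn them into log messages. The following is a minimal sketch in POSIX sh; describe_rsync_exit is a hypothetical function name, not part of rsync, and it covers only a few of the values listed above.

```shell
#!/bin/sh
# Sketch: map selected rsync exit values to short messages.
# describe_rsync_exit is a hypothetical helper, not an rsync feature.
describe_rsync_exit() {
  case "$1" in
    0)     echo "success" ;;
    23)    echo "partial transfer due to error" ;;
    24)    echo "partial transfer due to vanished source files" ;;
    25)    echo "the --max-delete limit stopped deletions" ;;
    30|35) echo "timeout" ;;
    *)     echo "rsync failed with exit value $1" ;;
  esac
}

# Typical use: save $? immediately, because the next command overwrites it.
#   rsync --archive --delete --max-delete=5 source destination
#   status=$?
#   describe_rsync_exit "$status" >> "$logfile"
describe_rsync_exit 25
```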
Variations
rdiff and rdiff-backup
There also exists a utility called rdiff, which uses the rsync algorithm to generate delta files with the difference from file A to file B (like the utility diff, but in a different delta format). The delta file can then be applied to file A, turning it into file B (similar to the patch utility).
Unlike diff, the process of creating a delta file has two steps: first a signature file is created from file A, and then this (relatively small) signature and file B are used to create the delta file. Also unlike diff, rdiff works well with binary files.
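The signature, delta and patch steps can be sketched as follows, assuming the rdiff utility (shipped with librsync) is installed. The file names A and B follow the text above; everything else is a temporary, hypothetical path.

```shell
#!/bin/sh
# Sketch: turn file A into file B using an rdiff signature and delta.
if command -v rdiff >/dev/null 2>&1; then
  work=$(mktemp -d)
  seq 1 5000 > "$work/A"
  { cat "$work/A"; echo "extra line"; } > "$work/B"

  rdiff signature "$work/A" "$work/A.sig"                 # step 1: signature of A
  rdiff delta "$work/A.sig" "$work/B" "$work/AtoB.delta"  # step 2: delta from signature and B
  rdiff patch "$work/A" "$work/AtoB.delta" "$work/B.new"  # apply the delta to A

  # The patched file should be byte-for-byte identical to B.
  if cmp -s "$work/B" "$work/B.new"; then result=identical; else result=different; fi
  echo "$result"
  rm -rf "$work"
else
  echo "rdiff is not installed; skipping" >&2
fi
```

Only the signature needs to travel to the machine that holds B, and only the delta needs to travel back.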
Using rdiff, a utility called rdiff-backup has been created, capable of maintaining a backup mirror of a file or directory over the network, on another server. rdiff-backup stores incremental rdiff deltas with the backup, with which it is possible to recreate any backup point.
See Automated Backups With rdiff-backup for example usage.
See also
- mt
- ssh
- Unison — allows bidirectional synchronization
- Xdelta — alternative implementation of file differencing and delta encoding
- duplicity — encrypted bandwidth-efficient backup using the rsync algorithm
References
- [1] Tridgell A, Mackerras P (1998). "The rsync algorithm". Department of Computer Science, Australian National University, Canberra, ACT 0200, Australia.
External links
- rsync homepage
- Tutorial: Using rsync
- Tutorial: Mirroring with rsync
- Tutorial: Backing up files with rsync
- Easy Automated Snapshot-Style Backups with Linux and Rsync
- rsync algorithm
- OpenSSH/Cookbook/Automated Backup on Wikibooks
- wikipedia:rsync
Examples
- Mirroring an Entire Site using Rsync over SSH — by AskApache