How to Transfer Only New Files with rsync
The rsync command is a powerful tool for file synchronization, but sometimes you only need to transfer the files that are missing from the destination or newer on the source. This is where the --update option comes in handy.
Why Use rsync --update?
Imagine you have a large folder containing thousands of files, and you want to copy them to a remote server. You could use a simple scp command, but this will transfer all files, even if they already exist on the server. This can be time-consuming and wasteful, especially if your network connection is slow.
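By default, rsync already skips files whose size and modification time are the same on both sides. The --update option (short form -u) goes one step further: it also skips any file whose copy on the destination has a newer modification time than the source. What gets transferred is therefore what is missing from the destination or newer on the source. This is great for: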
- Saving time and bandwidth: By only transferring new files, you can significantly reduce the transfer time and network traffic.
- Maintaining consistency: If you have multiple copies of a dataset, rsync --update helps keep them synchronized without overwriting destination files that were modified more recently than the source.
- Incremental backups: You can use rsync --update to create incremental backups, transferring only the files that have changed since the last backup.
How Does rsync --update Work?
With --update, rsync compares the modification times of the files on the source and destination. If a file on the source is newer than the corresponding file on the destination, or does not exist there yet, rsync transfers it. If the destination copy is newer, rsync leaves it alone.
Here is a breakdown of how it works:
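- File List: rsync builds a list of the files in the source and looks up each one on the destination.
- Comparison: rsync compares the size and modification time of each pair of files. It does not compute full-file checksums for this comparison unless you add the --checksum option.
- Skip Newer Destination Files: With --update, any file whose destination copy has a newer modification time than the source is skipped.
- Transfer the Rest: Files that are missing from the destination, or whose source copy is newer, are transferred. If the two modification times are equal but the sizes differ, the file is updated as well.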
Using rsync --update
Here's a simple example of how to use rsync --update to transfer only new files from a local directory to a remote server:
rsync --update -avz /local/directory user@remote_server:/remote/directory
Explanation:
- --update: This option tells rsync to skip files that are newer on the destination than on the source.
- -a: Archive mode. It recurses into directories and preserves metadata such as permissions, timestamps, and symbolic links.
- -v: This option provides verbose output, showing the files being transferred.
- -z: This option compresses data during the transfer, which can reduce transfer time on slow links when the files compress well.
Additional Tips:
- Exclude Specific Files: You can use the --exclude option to exclude specific files or directories from the transfer. For example:
rsync --update -avz --exclude='*.log' /local/directory user@remote_server:/remote/directory
This will transfer all files except those with the .log extension.
- Dry Run: Before running rsync --update on a large dataset, it's a good idea to perform a dry run using the --dry-run option. This will show you which files would be transferred without actually transferring them.
rsync --update -avz --dry-run /local/directory user@remote_server:/remote/directory
- Delete Files: If you want to delete files on the destination that are no longer present on the source, you can use the --delete option. However, use this option with caution, as it can permanently delete files from the destination.
Conclusion
The rsync --update option is a powerful and efficient way to transfer only the files that are new or newer on the source. It saves time and bandwidth, helps maintain consistency, and allows for incremental backups. By understanding the rsync --update option, you can streamline your file synchronization tasks and avoid unnecessary data transfers.