There is a --max-size option:
--max-size=SIZE don't transfer any file larger than SIZE
So:
# rsync -rv --max-size=1.5m root@tss01:/tmp/dm
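This transfers only files no larger than 1.5 MiB. Note that with a single source argument and no destination, rsync lists the source files instead of copying them, so append a destination path to actually transfer anything.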
Regarding size suffixes, from the man page:
The suffixes are as follows: "K" (or "KiB") is a kibibyte (1024), "M" (or "MiB") is a mebibyte (1024*1024), and "G" (or "GiB") is a gibibyte (1024*1024*1024). If you want the multiplier to be 1000 instead of 1024, use "KB", "MB", or "GB". (Note: lower-case is also accepted for all values.) Finally, if the suffix ends in either "+1" or "-1", the value will be offset by one byte in the indicated direction.
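As a quick check of those multipliers, here is a minimal sketch that works out the byte counts for a few --max-size arguments by hand. The variable names are made up for illustration, and the script only does the arithmetic; it does not call rsync.

```shell
#!/bin/sh
# Byte counts for three --max-size arguments, using the multipliers
# quoted from the man page. Variable names are illustrative only.
mib=$((1024 * 1024))   # multiplier for "m" / "MiB"
mb=$((1000 * 1000))    # multiplier for "mb" / "MB"

size_1_5m=$((3 * mib / 2))              # --max-size=1.5m
size_1_5mb=$((3 * mb / 2))              # --max-size=1.5mb
size_1_5mb_minus1=$((size_1_5mb - 1))   # --max-size=1.5mb-1

echo "1.5m    = $size_1_5m bytes"
echo "1.5mb   = $size_1_5mb bytes"
echo "1.5mb-1 = $size_1_5mb_minus1 bytes"
```

So 1.5m is 1572864 bytes, 1.5mb is 1500000 bytes, and 1.5mb-1 is 1499999 bytes.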
You can do very well with Bacula/Amanda. Hitting your requirements:
Revisions (SVN-style): a file has to be backed up each time it gets modified (and multiple versions of the same file can exist on the server; in fact they must)
Bacula and Amanda will grab a file each time it changes.
Scalability: if I attach a USB drive to the computer, I want its data to be backed up as well (well... on Linux that might be quite easy, simply back up all of /media/ except CDs and DVDs, but for Windows?)
Not bad on Unix (Just back up everything under / and it will grab the media), but probably not possible on Windows -- I believe you need to specify the drives you want to grab because the filesystem isn't a tree hierarchy under a specific root (there's a root for each drive).
That said, it's probably NOT a good idea (What if you attach a full 1TB drive to a machine being backed up? Your backups just ballooned).
Near real-time (~ 5 minutes at most) file backup: I lost a LaTeX report and it was hard to reconstruct it from scratch
Not happening -- You CAN specify a 5 minute backup window, but your logs will be filled with jobs being killed because there's already a duplicate running.
You can schedule nightly backups, or even every 12 hours without much trouble.
(Even Apple's Time Machine only does hourly backups... think about the largest file that may change and have to be shoved over the wire...)
No duplication: for instance, if I back up the USB disk's content from 2 different computers, I do not want the data to be backed up twice (symlink instead of hard copy in the worst case)
Bacula doesn't have deduplication at this time. Not sure about Amanda.
Manual restore / automatic restore: it's the same for me (simply not like described here below)
Restores are (and should be) a manual process. I have no idea what an "automatic restore" would look like (the backup server decides on its own to restore a file? :)
Maybe ability to remove / exclude large files from backups
You can include or exclude specific parts of the filesystem (down to file-level granularity) in Bacula.
Good logs
Database-backed lists of jobs and results, with the ability to write to log files, email, etc. in the event of errors.
BackupPC may also be able to hit these requirements (not certain - haven't used it) - other commercial backup solutions almost certainly can as well.
You may also want to consider tarsnap, though I'm not sure how the Windows support is.
Best Answer
OK, after several tries I sorted this out:
Another approach
in case you don't mind syncing empty dirs, just:
The key was to put --exclude=.svn/ before the --include options.
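The ordering matters because rsync checks its filter rules in the order given and acts on the first one that matches. The original commands are not shown here, so the following is only a sketch under assumptions: the function name sync_without_svn and the *.txt pattern are hypothetical, chosen just to show the rule order.

```shell
#!/bin/sh
# Hypothetical example: copy only *.txt files from one tree to another,
# skipping every .svn directory. The function name and the *.txt pattern
# are assumptions made for illustration.
sync_without_svn() {
    src=$1
    dest=$2
    # Rule order is significant (first match wins):
    #   1. exclude .svn directories before any include can match them
    #   2. include all other directories so rsync descends into them
    #   3. include the wanted files
    #   4. exclude everything else
    rsync -a \
        --exclude='.svn/' \
        --include='*/' \
        --include='*.txt' \
        --exclude='*' \
        "$src"/ "$dest"/
}
```

Called as `sync_without_svn /path/to/src /path/to/dest`. If --include='*/' came first, the .svn directories would match it and be copied. This form still creates directories that end up empty; adding rsync's -m (--prune-empty-dirs) option should drop those, if that matters.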