Linux – Robocopy exclude directory in recursive file structure

linux, robocopy, windows-server-2008

This seems to be a popular question on the net, but everyone else seems to be OK with being able to specify only a portion of the full path in order to achieve a sort of wildcard. I need to be more specific, I think, as I've got a recursive file structure, and can't use /XJ.

Context

We run nightly backups on all our servers from a Windows 2008 file server, which runs robocopy scripts to perform differential backups via Samba file shares. The particular backup I am having issues with is for a Linux shared webserver.

The peculiar thing about backing up this particular server is that its file structure is recursive, built with symbolic links. The reason for this is that the server is running suPHP in a chroot jail, and a recursive file structure is required in order to trick suPHP into working.

An example of the directory structure is like so:

Home directory for user: /home/websites/www.customersite.com/home/www.customersite.com/

Their website is in: /home/websites/www.customersite.com/home/www.customersite.com/public_html/

The recursion begins at: /home/websites/www.customersite.com/home/websites/

And so when this recursion begins, in my robocopy log I see entries like so:

\\10.230.0.25\backup-share\home\websites\customer\home\websites\customer\home\websites\customer\home\websites\customer\home\websites\customer\home\websites\customer…

Now, it does seem that it goes on like that for a while, then reaches some directory depth limit and actually backs up the files, for example:

…\home\websites\customer\home\websites\customer\home\websites\customer\home\customer\public_html\administrator\components\com_jce\img\

But I would like to get rid of this, as it's causing the backups to run well into the daytime, and no doubt putting extra load on the server.

What I've Tried

I've seen this kind of recursive file structure issue with robocopy before, when copying from Windows Vista+ based systems. For example, this issue when backing up C:\Users:

https://superuser.com/questions/478882/delete-recursive-directory-created-by-robocopy-when-the-file-name-is-too-long

As mentioned there, I've previously been able to use /XJ to resolve the issue. However, I'm guessing that since the files are being copied from a Linux system, the recursive symlink does not appear as a junction point to robocopy.

I haven't yet found a way to make the symlinks appear as junction points.

Of course, I can't see a way of using /XD here either, since I need to be able to back up all data in /home/websites/, so doing /XD /home/websites/ would surely exclude that whole folder from the backup.

Ideally, I'd want something like:

/XD /home/websites/.+/home/websites/.+

Or even just:

/XD /home/websites/.+/home/websites/
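Since /XD takes directory names and paths rather than regular expressions, one possible workaround is to work out the concrete recursive paths up front and pass those as exclusions. A minimal sketch of that matching rule in Python, using the Linux-side paths from the example above; the function names and the idea of generating one exclusion per site are assumptions for illustration, not something robocopy provides:

```python
import re

# Pattern for the second, recursive "home/websites" below a site directory.
# The site name is matched with [^/]+ so it cannot span several path levels.
RECURSIVE = re.compile(r"^/home/websites/[^/]+/home/websites(/|$)")


def is_recursive(path: str) -> bool:
    """Return True when the path re-enters /home/websites under a site directory."""
    return RECURSIVE.match(path) is not None


def exclusion_list(sites):
    """Build one full exclusion path per site (hypothetical helper)."""
    return [f"/home/websites/{site}/home/websites" for site in sites]
```

The resulting paths would still need to be translated into the share's UNC form before being handed to /XD.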

Any help appreciated.

Thanks!

Best Answer

Perhaps an rsync cron job on the linux server that dumps it to a tar that your robocopy can grab? Then you can simply exclude that entire original path.
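As a rough illustration of the "dump it to a tar" step, here is a minimal sketch that uses Python's tarfile module in place of rsync and tar. The function name and paths are hypothetical. The relevant point is that with dereference=False a symlink is stored as a link entry rather than followed, so a link pointing back up the tree should not cause the archive to recurse:

```python
import os
import tarfile


def dump_to_tar(source_dir: str, tar_path: str) -> None:
    """Archive source_dir, storing symlinks as links instead of following them."""
    # dereference=False (the default) records each symlink as a link entry,
    # so the recursive home/websites link is not descended into.
    top_level = os.path.basename(source_dir.rstrip("/"))
    with tarfile.open(tar_path, "w:gz", dereference=False) as archive:
        archive.add(source_dir, arcname=top_level)
```

A cron job on the Linux server could call something like dump_to_tar("/home/websites", "/backup/websites.tar.gz"), where the output path is an assumption, and robocopy would then only need to pick up the single archive file.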

For me I do the following:

I have a few Linux servers where I just need /etc and a few folders backed up. I do an rsync to a tar and then use PSCP to simply grab the tar file (since it will change nightly). Then my PSCP script deletes anything it grabbed that is over 7 days old (the Windows backups also grab the "dumped to" folder, so technically I have backups of the backup offsite).
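The 7-day cleanup described above could look roughly like the following Python sketch. It is not the actual PSCP script; the function name, the .tar.gz suffix, and the use of file modification time as the age are all assumptions:

```python
import os
import time
from typing import List, Optional


def prune_old_dumps(folder: str, max_age_days: int = 7,
                    now: Optional[float] = None) -> List[str]:
    """Delete .tar.gz files in folder older than max_age_days; return their names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # age limit in seconds
    removed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Only touch regular archive files whose modification time is past the cutoff.
        if name.endswith(".tar.gz") and os.path.isfile(path) \
                and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```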

Hope that helps.