Bash shell scripting assistance: backup script that extracts metadata from the filename and moves the file accordingly

backup bash scripting

So I need to build a shell script (a skill I am rubbish at; I think too linearly and make everything a pipe). It has to connect to a remote machine, go to a specific directory, and slurp up all the files older than 5 minutes. It then has to extract information from each file's name (encoding details below) and scatter the files into the relevant directories on the local backup host based on that, creating the directories if they don't exist.

On a dozen machines I have a directory (let us call it /Prod/Data/) containing thousands of files named data-HOST-v.7.mmddyy.csv

example: date-web2-v.7.052509.csv

Files older than 5 minutes need to be pulled from the remote machines to a local folder /backup/archive/host/year/month/day/csvs

example: /backup/archive/web2/2009/05/29/csvs

I'm sure I can do something like ls -1 | cut -d"." -f3 to extract the date section of the file name, then use sed or awk to split it into month, day and year variables that decide which directory each file goes into, and do something similar to grab the host. What I don't know is how to tie those values back to the file they came from, so that I can run the move on that file.

I'm also not sure how to execute this remotely. Perhaps it is better to scp everything over from the remote machine first (skipping any file younger than 5 minutes; perhaps a find -mmin +5 statement can be used to suss that out?) and then do the sorting once everything is on the backup machine.

Would someone be so kind as to point me in the direction of an example script that may provide similar functionality? Everything I write tends to be command | command | command | etc… and I imagine this task will require some dimensionality.

Thank you for your time.

Best Answer

Pure Bash Solution, using parameter expansion. See this for an explanation of PE.

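foo='date-web2-v.7.052509.csv'
file=${foo%.csv}      # strip the extension: date-web2-v.7.052509
date=${file##*.}      # keep what follows the last dot: 052509

month=${date:0:2}     # 05
day=${date:2:2}       # 25
year=20${date:4:2}    # 2009 (assumes every date is in the 2000s)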
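
To apply this to a whole directory, the same expansions can be wrapped in a loop that builds the destination path and moves each file. What follows is only a minimal sketch, assuming the files have already been copied from the remote host into a local staging directory. The function name sort_csvs and its two arguments are hypothetical, the host is assumed to be the text between the first and second hyphen (so it cannot itself contain a hyphen), and two-digit years are assumed to mean 20yy.

```shell
#!/usr/bin/env bash
# Sketch: sort files named like date-web2-v.7.052509.csv into
# ARCHIVE/host/year/month/day/csvs. sort_csvs is a hypothetical helper.
sort_csvs() {
    local staging=$1 archive=$2
    local path name base stamp rest host dest
    for path in "$staging"/*.csv; do
        [ -e "$path" ] || continue     # the glob matched nothing
        name=${path##*/}               # strip the directory part
        base=${name%.csv}              # strip the extension
        stamp=${base##*.}              # mmddyy, e.g. 052509
        rest=${name#*-}                # drop everything up to the first '-'
        host=${rest%%-*}               # host runs up to the next '-'
        # Skip names whose date field is not exactly six digits.
        [[ $stamp == [0-9][0-9][0-9][0-9][0-9][0-9] ]] || continue
        # Assumption: two-digit years belong to the 2000s.
        dest=$archive/$host/20${stamp:4:2}/${stamp:0:2}/${stamp:2:2}/csvs
        mkdir -p "$dest" && mv -- "$path" "$dest/"
    done
}
```

It could be called as sort_csvs /tmp/staging /backup/archive, where /tmp/staging is a made-up location for the copied files. Because each file is handled inside the loop, the parsed values and the file they came from stay together, which avoids having to correlate the output of a pipeline with file names afterwards.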

Personally, I would probably use Perl for this, using parentheses in a regular expression to capture the parts I want as groups.
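
The capture-group idea is not limited to Perl: Bash's [[ string =~ regex ]] test stores each parenthesized group in the BASH_REMATCH array. The sketch below is one way that might look. The helper name parse_name is hypothetical, the pattern is inferred from the single example file name rather than from a full specification, and the year is again assumed to be 20yy.

```shell
#!/usr/bin/env bash
# Sketch: capture the host and date fields with one regular expression.
# parse_name is a hypothetical helper; it sets the global variables
# host, month, day and year, and returns non-zero on names that do not match.
parse_name() {
    # Assumed layout: PREFIX-HOST-v.N.mmddyy.csv
    local re='^[^-]+-([^-]+)-v\.[0-9]+\.([0-9]{2})([0-9]{2})([0-9]{2})\.csv$'
    [[ $1 =~ $re ]] || return 1
    host=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
    day=${BASH_REMATCH[3]}
    year=20${BASH_REMATCH[4]}    # assumes every date is in the 2000s
}

if parse_name 'date-web2-v.7.052509.csv'; then
    echo "/backup/archive/$host/$year/$month/$day/csvs"
fi
# prints /backup/archive/web2/2009/05/25/csvs
```

Compared with the parameter-expansion version, the regular expression also validates the name, so files that do not fit the expected layout can be skipped or logged instead of being moved to a wrong directory.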