How to process a space-delimited environment variable with xargs

xargs

I am trying to execute a long-running process multiple times in parallel. The parameter for each execution of the process is stored in a space-separated environment variable. Here is a contrived example of what I am trying to execute:

$ echo -e 1 2 3 4 | xargs --max-procs=3 --max-args=1 --replace=% echo % is the number being processed

Here is the output from that command:

1 2 3 4 is the number being processed

Why is --max-args seemingly ignored? I then tried setting the delimiter explicitly, which gives better results:

$ echo -e 1 2 3 4 | xargs -d " " --max-procs=3 --max-args=1 --replace=% echo % is the number being processed
1 is the number being processed
2 is the number being processed
3 is the number being processed
4
 is the number being processed

What is xargs doing when it processes the 4th argument?

After some searching, I did manage to almost get what I want. The arguments are processed correctly, but parallelism does not work (verified with another command not shown here):

$ echo -e 1 2 3 4 | xargs -n 1 | xargs --max-procs=3 --max-args=1 --replace=% echo % is the number being processed
1 is the number being processed
2 is the number being processed
3 is the number being processed
4 is the number being processed

What am I missing?

Best Answer

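With --replace, xargs treats each input line as a single item, so a space-separated list on one line is passed through as one argument. Converting the whitespace to newlines with sed before piping to xargs should accomplish the task:

echo -e 1 2 3 4 | sed -e 's/\s\+/\n/g' | xargs --max-procs=3 --max-args=1 --replace=% echo % is the number being processed

The output seems about right: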

1 is the number being processed
2 is the number being processed
3 is the number being processed
4 is the number being processed
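Since the real input lives in an environment variable, here is a minimal sketch of the same idea applied to a variable. NUMBERS is a hypothetical name standing in for the actual variable. The second command keeps the -d " " approach from the question (a GNU xargs option) but uses printf '%s' so that no trailing newline is emitted; with echo, that newline becomes part of the last item, which is why "4" was printed on its own line. --max-args is left out because it is mutually exclusive with --replace, and GNU xargs honors whichever is given last.

```shell
# NUMBERS is a hypothetical stand-in for the space-separated environment variable.
NUMBERS="1 2 3 4"

# Translate spaces to newlines (tr -s also squeezes repeated separators),
# so that xargs --replace sees one item per line.
printf '%s\n' "$NUMBERS" | tr -s ' ' '\n' |
  xargs --max-procs=3 --replace=% echo % is the number being processed

# Alternative (GNU xargs only): split on spaces directly, emitting no trailing
# newline so the last item is "4" rather than "4" followed by a newline.
printf '%s' "$NUMBERS" |
  xargs -d ' ' --max-procs=3 --replace=% echo % is the number being processed
```

Because up to three processes run at once, the lines are not guaranteed to come out in input order.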

I also tried replacing echo with sleep to confirm it executes in parallel, and it does:

echo -e 1 2 3 4 5 6 7 8 9 9 9 9 9 9 9 9 9 9 9 9 | sed -e 's/\s\+/\n/g' | xargs --max-procs=20 --max-args=1 --replace=% sleep 1
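One way to check the parallelism more directly is to time the run. This is only a sketch: it assumes a date command that supports +%s, and the four one-second sleeps are arbitrary. With at most three processes at a time, the wall-clock time should come out at roughly two seconds rather than four.

```shell
# Record the start time in whole seconds.
start=$(date +%s)

# Four items, one per line; each spawns "sleep 1", at most three at a time.
printf '%s\n' 1 2 3 4 | xargs --max-procs=3 --replace=% sleep 1

# Record the end time and report the elapsed wall-clock time.
end=$(date +%s)
echo "elapsed: $((end - start)) seconds"
```

If the commands ran one after another, the elapsed time would be about four seconds instead.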