Bash – Parsing Command Output in Bash Script

bashparse-server

I need a bash script that takes the output of a shell command and parses it to pull out the id and website URL from each row of the table, so that they can then be used to execute additional bash commands.

Here's an example of the command output.

+----+-------------------------------+----------------------------------------+---------+
| id | name                          | url                                    | version |
+----+-------------------------------+----------------------------------------+---------+
| 25 | example.com                   | http://www.example.com/                | 3.8     |
| 34 | anotherexample.com            | https://anotherexample.com/            | 3.2     |
| 62 | yetanotherexample.com         | https://yetanotherexample.com/         | 3.9     |
+----+-------------------------------+----------------------------------------+---------+

Pseudo code for the script would be along the lines of:

$output = `command --list`
for each row in $output {
    $siteid = extracted_id
    $url = extracted_url

    $process_result = `new_command $siteid`
    log "$siteid, $url, $process_result" > log.txt
}

Note that the numeric id could be more than 2 digits.

Is anyone able to give me a starting point on how to parse each line of the original output command and pull the id and url as variables while ignoring the first 3 lines and last line that are the table border and header?

I can figure the rest out, it's just parsing each line that I'm stuck on.

Any suggestions / advice would be greatly appreciated.

Thanks in advance.

Best Answer

Welcome Phill Coxon,

Method 1

This pure bash script seems to fit your needs:

#!/usr/bin/env bash
declare id
declare name
declare url
declare version

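# IFS= and -r keep each line intact (no trimming, no backslash processing)
while IFS= read -r line; do
  # Skip the +----+ border lines; the header row has no numeric id, so the match below rejects it
  if [[ ! ${line} =~ ^\+ ]]; then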
    if [[ ${line} =~ \|[[:space:]]*([[:digit:]]+)[[:space:]]*\|[[:space:]]+([[:alnum:]\.]+)[[:space:]]+\|[[:space:]]+(https?:\/\/(www\.)?[[:alnum:]]+\.[[:alpha:]]+\/?)[[:space:]]*\|[[:space:]]*([[:digit:]](\.[[:digit:]])?)[[:space:]]*\|  ]]; then
      id="${BASH_REMATCH[1]}"
      name="${BASH_REMATCH[2]}"
      url="${BASH_REMATCH[3]}"
      version="${BASH_REMATCH[5]}"
      echo "${id}:${name}:${url}:${version}"
    fi
  fi
done

Method 2

You can also create a bash function and use it in your script as follows:

#!/usr/bin/env bash
parse_result(){
  local id
  local name
  local url
  local version

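  # IFS= and -r keep each line intact (no trimming, no backslash processing)
  while IFS= read -r line; do
    # Skip the +----+ border lines; the header row has no numeric id, so the match below rejects it
    if [[ ! ${line} =~ ^\+ ]]; then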
      if [[ ${line} =~ \|[[:space:]]*([[:digit:]]+)[[:space:]]*\|[[:space:]]+([[:alnum:]\.]+)[[:space:]]+\|[[:space:]]+(https?:\/\/(www\.)?[[:alnum:]]+\.[[:alpha:]]+\/?)[[:space:]]*\|[[:space:]]*([[:digit:]](\.[[:digit:]])?)[[:space:]]*\|  ]]; then
        id="${BASH_REMATCH[1]}"
        name="${BASH_REMATCH[2]}"
        url="${BASH_REMATCH[3]}"
        version="${BASH_REMATCH[5]}"
        echo "${id}:${name}:${url}:${version}"
      fi
    fi
  done
}

parse_result < <(cat cmd.out)

Here I use process substitution, but you can also use a pipe.
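
One practical difference between the two is worth knowing. By default bash runs each part of a pipeline in a subshell, so variables set inside a while read loop fed by a pipe are gone once the loop ends, whereas a loop fed by process substitution runs in the current shell. A minimal sketch, where sample_rows is a hypothetical stand-in for the real command:

```shell
#!/usr/bin/env bash
# sample_rows is a hypothetical stand-in for the real command.
sample_rows() {
  printf '%s\n' '| 25 | example.com |' '| 34 | anotherexample.com |'
}

# Loop fed by a pipe: the loop body runs in a subshell.
count_with_pipe() {
  local count=0 line
  sample_rows | while IFS= read -r line; do
    count=$((count + 1))   # changes the subshell's copy only
  done
  echo "${count}"
}

# Loop fed by process substitution: the loop body runs in the current shell.
count_with_procsub() {
  local count=0 line
  while IFS= read -r line; do
    count=$((count + 1))   # changes the variable seen after the loop
  done < <(sample_rows)
  echo "${count}"
}

echo "pipe: $(count_with_pipe)"                      # pipe: 0
echo "process substitution: $(count_with_procsub)"   # process substitution: 2
```

If you only print from inside the loop, as the methods above do, either form works. The difference matters once you want to keep id or url around after the loop.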

Result and discussion

As an example, cmd.out holds the command output to parse. In your case, replace cat cmd.out with your own command.
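
To connect this with the follow-up step from the question, here is a rough sketch of the whole loop. Note that it swaps the regular expression for plain field splitting on the | character, which is a different and looser technique than the two methods above. The functions list_sites and process_site are hypothetical stand-ins for "command --list" and "new_command", so that the sketch runs on its own:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: list_sites plays the role of "command --list",
# process_site plays the role of "new_command".
list_sites() {
  cat <<'EOF'
+----+-------------------------------+----------------------------------------+---------+
| id | name                          | url                                    | version |
+----+-------------------------------+----------------------------------------+---------+
| 25 | example.com                   | http://www.example.com/                | 3.8     |
| 34 | anotherexample.com            | https://anotherexample.com/            | 3.2     |
| 62 | yetanotherexample.com         | https://yetanotherexample.com/         | 3.9     |
+----+-------------------------------+----------------------------------------+---------+
EOF
}
process_site() { echo "processed-$1"; }

process_all() {
  local id name url version result
  # Split each row on "|"; the text before the first "|" goes into _.
  while IFS='|' read -r _ id name url version _; do
    id="${id//[[:space:]]/}"                    # strip column padding
    url="${url//[[:space:]]/}"
    [[ ${id} =~ ^[[:digit:]]+$ ]] || continue   # skips borders and the header
    result="$(process_site "${id}")"
    echo "${id}, ${url}, ${result}"
  done < <(list_sites)
}

process_all   # append ">> log.txt" here to write the lines to a log file
```

The numeric check on id is what filters out the border and header lines, and it accepts ids of any length.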

result 1:

$ cat cmd.out | ./app.bash
25:example.com:http://www.example.com/:3.8
34:anotherexample.com:https://anotherexample.com/:3.2
62:yetanotherexample.com:https://yetanotherexample.com/:3.9

result 2:

$ bash app2.bash
25:example.com:http://www.example.com/:3.8
34:anotherexample.com:https://anotherexample.com/:3.2
62:yetanotherexample.com:https://yetanotherexample.com/:3.9
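
If you want to read those colon-separated lines back into variables, keep in mind that the url field itself contains colons, so a plain IFS=: split would cut the URL apart. A minimal sketch using parameter expansion, which takes id and name from the left and version from the right (split_record is a hypothetical helper name, and it sets the global variables id, name, url and version):

```shell
#!/usr/bin/env bash
# split_record is a hypothetical helper; it sets the globals id, name, url, version.
split_record() {
  local record="$1" rest
  id="${record%%:*}"        # text before the first ":"
  rest="${record#*:}"       # drop "id:"
  name="${rest%%:*}"        # text before the next ":"
  rest="${rest#*:}"         # now "url:version"
  version="${rest##*:}"     # text after the last ":"
  url="${rest%:*}"          # everything before the last ":"
}

split_record '25:example.com:http://www.example.com/:3.8'
echo "id=${id} url=${url} version=${version}"
# prints: id=25 url=http://www.example.com/ version=3.8
```

Alternatively, change the echo in the parsing loop to use a separator that cannot appear in a URL, such as a tab.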