Bash – Regular expression doesn't work for domains with hyphens

bash, regex, regular expressions, shell, shell-scripting

I have a script that checks a server's name and extracts the domain name from it.
For example, given the server name example.ru01, I need to get example.ru.
My script:

#!/bin/bash

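hostname=example.com01
echo "$hostname"
reg0="\(\(\w*\.[a-z]*\)\|\(\w*\.[a-z]*\.[a-z]*\)\)"
# expr match anchors the pattern at the start of the string
# and prints the text captured by the first \( ... \) group
domain=$(expr match "$hostname" "$reg0")
echo "$domain"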

This works. The output is:

example.com01
example.com

However, my infrastructure has some domains with hyphens, for example test-test.com01, and the script does not work for them. How can I fix this? I tried changing my regular expression like this:

\(\(\w*\.[a-z_-]*\)\|\(\w*\.[a-z_-]*\.[a-z_-]*\)\)

But it still doesn't work. Where is my mistake? Thanks for your attention.

Best Answer

The problem with your regular expression is that it requires the string to start with zero or more \w characters (\w "matches any word character including underscore", i.e. letters, digits and _), followed by a literal dot: \w*\.

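In the case of test-test.com01 this does not match, because the hyphen is not a word character: \w* stops after test, and the next character is - rather than the required dot. Your change added - only to the classes after the dot, so the leading \w* still stops at the hyphen. If you change that part to also match -, it will work the way you want it to: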

\(\([a-z_-]*\.[a-z_-]*\)\|\([a-z_-]*\.[a-z_-]*\.[a-z_-]*\)\)
      ^ replace \w            ^ replace \w
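
As a quick check, here is a minimal version of the script with \w replaced by [a-z_-] in both alternatives, run against both hostnames from the question. It assumes GNU expr, since the match keyword and \| inside a basic regular expression are GNU extensions:

```bash
#!/bin/bash
# Assumption: GNU expr (the "match" keyword and \| are GNU extensions).
reg='\(\([a-z_-]*\.[a-z_-]*\)\|\([a-z_-]*\.[a-z_-]*\.[a-z_-]*\)\)'

for hostname in example.com01 test-test.com01; do
    # expr match anchors the pattern at the start of the string and
    # prints the text captured by the first \( ... \) group.
    domain=$(expr match "$hostname" "$reg")
    echo "$hostname -> $domain"
done
# prints:
# example.com01 -> example.com
# test-test.com01 -> test-test.com
```

Keep in mind that [a-z_-] is narrower than \w: it no longer accepts digits or uppercase letters before the dot. If your names can contain those, [[:alnum:]_-] is probably the safer class.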

There are several ways to improve this regular expression, but IMO the amount of time you put into making a good one should be proportional to the complexity of the text you parse.
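
One possible improvement, if you don't mind depending on bash itself, is to drop expr and use the built-in [[ =~ ]] operator with an extended regular expression. This is only a sketch: the function name extract_domain is made up for illustration, and it assumes the name part may contain letters, digits, _ and -, followed by one or two dot-separated labels of lowercase letters and hyphens.

```bash
#!/bin/bash
# Hypothetical helper (not from the question): prints the domain part of a
# server name such as test-test.com01, or returns 1 if nothing matches.
extract_domain() {
    # name part: letters, digits, _ or -
    # then one or two labels: a dot followed by lowercase letters or -
    local re='^([[:alnum:]_-]+(\.[a-z-]+){1,2})'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

extract_domain example.com01      # prints: example.com
extract_domain test-test.com01    # prints: test-test.com
```

The trailing digits are dropped simply because [a-z-] does not match them, so the match ends at the last label. Whether two labels are enough depends on your naming scheme, so adjust the {1,2} to taste.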