Here is Wikipedia's handy chart of the pursuit of nines:
Interestingly, only 3 of the top 20 websites were able to achieve the mythical 5 nines or 99.999% uptime in 2007. They were Yahoo, AOL, and Comcast. In the first 4 months of 2008, some of the most popular social networks didn't even come close to that.
From the chart, it should be evident how ridiculous the pursuit of 100% uptime is...
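To make the arithmetic behind the nines concrete, here is a minimal sketch that converts a number of nines into the downtime it allows per year. It assumes a 365-day year, and the function name is made up for illustration. Under that assumption, five nines works out to roughly 5.26 minutes of downtime per year.

```python
# Assumption: a 365-day year; leap years are ignored for simplicity.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def allowed_downtime_seconds(nines: int) -> float:
    """Return the yearly downtime budget, in seconds, for N nines of uptime.

    Two nines means 99% uptime, five nines means 99.999%, and so on.
    """
    unavailable_fraction = 10.0 ** -nines
    return SECONDS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for n in range(1, 7):
        uptime_percent = 100 * (1 - 10.0 ** -n)
        minutes = allowed_downtime_seconds(n) / 60
        print(f"{n} nines ({uptime_percent:.4f}%): {minutes:,.2f} minutes/year")
```

Each extra nine cuts the budget by a factor of ten, which is why the last few nines get so expensive.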
I would choose a consistent approach across the entire environment. Both solutions work fine and will remain compatible with most applications. There is a difference in manageability, though.
I go with the short name as the HOSTNAME setting, and set the FQDN as the first column in /etc/hosts for the server's IP, followed by the short name.
I have not encountered many software packages that enforce or display a preference between the two. I find the short name to be cleaner for some applications, specifically logging. Maybe I've been unlucky in seeing internal domains like server.northside.chicago.rizzomanufacturing.com. Who wants to see that in the logs or a shell prompt?
Sometimes, I'm involved in company acquisitions or restructuring where internal domains and/or subdomains change. I like using the short hostname in these cases because logging, kickstarts, printing, systems monitoring, etc. do not need full reconfiguration to account for the new domain names.
A typical RHEL/CentOS setup for a server named "rizzo" with the internal domain "ifp.com" would look like:
/etc/sysconfig/network:
HOSTNAME=rizzo
...
-
/etc/hosts:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.100.13 rizzo.ifp.com rizzo
-
[root@rizzo ~]# hostname
rizzo
-
/var/log/messages snippet:
Dec 15 10:10:13 rizzo proftpd[19675]: 172.16.100.13 (::ffff:206.15.236.182[::ffff:206.15.236.182]) - Preparing to chroot to directory '/app/upload/GREEK'
Dec 15 10:10:51 rizzo proftpd[20660]: 172.16.100.13 (::ffff:12.28.170.2[::ffff:12.28.170.2]) - FTP session opened.
Dec 15 10:10:51 rizzo proftpd[20660]: 172.16.100.13 (::ffff:12.28.170.2[::ffff:12.28.170.2]) - Preparing to chroot to directory '/app/upload/ftp/SRRID'
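If you want to apply the same convention across many servers, the /etc/hosts layout above (IP, then FQDN, then short name) is easy to generate and check with a script. This is only a rough sketch: the helper names are hypothetical, and it handles a single host line rather than a whole file.

```python
def hosts_entry(ip: str, short_name: str, domain: str) -> str:
    """Build a hosts line with the FQDN first and the short name as an alias."""
    return f"{ip} {short_name}.{domain} {short_name}"


def fqdn_listed_first(line: str, short_name: str) -> bool:
    """Check that a hosts line follows the 'IP FQDN shortname' convention."""
    # Drop any trailing comment, then split on whitespace.
    fields = line.split("#", 1)[0].split()
    if len(fields) < 3:
        return False
    canonical, aliases = fields[1], fields[2:]
    return canonical.startswith(short_name + ".") and short_name in aliases


if __name__ == "__main__":
    line = hosts_entry("172.16.100.13", "rizzo", "ifp.com")
    print(line)
    print(fqdn_listed_first(line, "rizzo"))
```

When a domain changes after an acquisition, only the `domain` argument changes; the short name that shows up in logs and prompts stays the same.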
Best Answer
I think heartbeat / pacemaker would be the best solution, since they can take care of a lot of race conditions, fencing, etc. for you in order to ensure the job only runs on one host at a time. It's possible to design something yourself, but it likely won't account for all the scenarios those packages do, and you'll eventually end up reinventing most, if not all, of the wheel.
If you don't really care about such things and you want a simpler setup, I suggest staggering the cron jobs on the servers by a few minutes. Then when the job starts on the primary, it can leave a marker on whatever shared resource the jobs operate on (you don't specify this, so I'm being intentionally vague). If it's a database, the job can update a field in a table; if it's a shared filesystem, it can lock a file.
When the job runs on the second server, it can check for the presence of the marker and abort if it is there.
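For the shared-filesystem case, the marker check could look something like the sketch below. It assumes both servers see the same marker path and that the filesystem honors exclusive file creation (worth verifying on network filesystems). The function names are invented for illustration, and cleaning up old markers (for example by putting the run date in the file name) is left out.

```python
import os
import socket


def claim_run(marker_path: str) -> bool:
    """Try to create the marker atomically; return False if it already exists."""
    try:
        # O_CREAT | O_EXCL fails if the file is already present, so only
        # one host can succeed in creating the marker.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as marker:
        # Record who claimed the run, which helps when debugging.
        marker.write(f"{socket.gethostname()} pid={os.getpid()}\n")
    return True


def run_job_once(marker_path: str, job) -> bool:
    """Run job() only if this host is the first to claim the marker."""
    if not claim_run(marker_path):
        return False  # another host already started this run
    job()
    return True
```

Each server's cron entry would call `run_job_once` with the same marker path; the staggered secondary finds the marker and exits without doing any work. Note that this does nothing about a primary that crashes halfway through, which is the kind of scenario heartbeat / pacemaker is built to handle.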