Linux – How to find the cause for a huge difference in performance between two identical Ubuntu servers

java linux performance Ubuntu ubuntu-10.04

I am running two Dell R410 servers in the same rack of a data center, behind a load balancer. Both have the same hardware configuration, run Ubuntu 10.04, have the same packages installed, and run the same Java web servers (no other load). Still, I'm seeing a substantial performance difference between the two.

The difference is most obvious in the average response times of the two servers (measured inside the Java app itself, so without network latency): one of them is consistently 20-30% faster than the other.
I used dstat to check whether there were more context switches, IO, swapping, or anything similar, but I found no reason for the difference. With the same workload (no swapping, virtually no IO), CPU usage and load are higher on one server.

So the difference appears to be mainly CPU-bound. However, a simple CPU benchmark using sysbench (with all other load turned off) did show a difference, but only 6%. So maybe it is not only CPU but also memory performance.
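To separate raw CPU speed from memory throughput, one option is to time a compute-only loop and a large buffer copy on each server and compare the ratios. The following is only a rough sketch, assuming Python 3.6+ is available on both machines; the function names, iteration counts, and buffer sizes are illustrative choices, not part of the original setup, and interpreter overhead makes the absolute numbers meaningless outside a side-by-side comparison.

```python
import time


def time_cpu_bound(iterations=2_000_000):
    """Time a tight integer-arithmetic loop with a tiny working set."""
    start = time.perf_counter()
    acc = 0
    for i in range(iterations):
        # Small modular arithmetic: almost no memory traffic.
        acc = (acc * 31 + i) % 1_000_003
    return time.perf_counter() - start


def time_memory_bound(size_mb=64, passes=5):
    """Time repeated copies of a buffer that is larger than typical CPU caches."""
    buf = bytearray(size_mb * 1024 * 1024)
    start = time.perf_counter()
    for _ in range(passes):
        # bytes(buf) reads the whole buffer and writes a full copy.
        copy = bytes(buf)
    return time.perf_counter() - start
```

Running both functions several times on each server (with the Java load turned off) and comparing the medians gives a hint: if the memory-bound timing differs far more than the CPU-bound one, memory performance is the more likely suspect.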

So far I've checked:

  • Firmware revisions on all components (identical)
  • BIOS settings (a dump using dmidecode showed no differences)
  • /proc/cpuinfo (no difference)
  • Output of cpufreq-info (no difference)
  • Java / JVM Parameters (same version and parameters on both systems)
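Comparisons like the ones in this list are easier to trust when they are done mechanically rather than by eye. A minimal sketch, assuming the output of each command (for example dmidecode or cpufreq-info) has been saved to a text file on each server and copied to one machine; the function name and the labels are hypothetical.

```python
import difflib


def diff_dumps(text_a, text_b, name_a="server-a", name_b="server-b"):
    """Return the unified-diff lines between two captured command outputs.

    An empty list means the two dumps are identical line by line.
    """
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

Reading the two saved files and printing each line returned by diff_dumps shows exactly which settings differ. Note that this can only reveal differences that the dumped tool actually reports; a setting that a tool does not expose will not show up in any diff of its output.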

Also, I completely replaced the RAM some months ago, without any effect.

I am lost. What can I do to figure out what is going on?

UPDATE:
Yay! Both servers perform equally now. It was the "power CRAP" settings, as jim_m_somewhere named them in the comments. The BIOS options for "Power Management" were set to "Maximum Performance" on the fast server, and to "Active Power Controller" (the default setting from Dell) on the other one. Obviously I forgot that I had changed that setting two years ago, and that I hadn't done so on all servers. Thanks to all for your very helpful input!
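For anyone chasing a similar problem, one cross-check is to sample the clock speeds and governors that the OS reports while the server is under load. This is a sketch only: it assumes Python 3 and the standard Linux /proc/cpuinfo and cpufreq sysfs files, the function names are hypothetical, and firmware-controlled power management may not be visible to the OS at all (cpufreq-info showed no difference in this case), so matching values do not rule out a BIOS setting.

```python
import glob
import re


def parse_cpu_mhz(cpuinfo_text):
    """Extract the per-core 'cpu MHz' values from /proc/cpuinfo content."""
    matches = re.findall(r"^cpu MHz\s*:\s*([\d.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]


def sample_mhz(path="/proc/cpuinfo"):
    """Read the current per-core clock speeds as reported by the kernel."""
    with open(path) as f:
        return parse_cpu_mhz(f.read())


def read_governors(
    pattern="/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor",
):
    """Map each CPU directory name (e.g. 'cpu0') to its cpufreq governor.

    Returns an empty dict when no cpufreq driver exposes these files.
    """
    governors = {}
    for path in sorted(glob.glob(pattern)):
        cpu_name = path.split("/")[5]
        with open(path) as f:
            governors[cpu_name] = f.read().strip()
    return governors
```

Calling sample_mhz() a few times on each server during the same workload and comparing the results is a quick sanity check before rebooting into the BIOS setup.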

Best Answer

Two ideas, depending on how far you want to go with this:

  1. Swap the disks of both servers and see whether the performance difference stays with the hardware or moves with the software.

  2. Compare the output of /opt/dell/toolkit/bin/syscfg -o complete-bios-config.out from both servers, if you can somehow trick this package into installing.
