Java fat client slow when connecting to localhost, fast with remote

java, sles

I'm having problems with a (usually) latency-bound desktop Java application connecting to a custom database server.

While it's running on a remote host (Windows XP) it's fast (a big form opens in under 2 seconds). When it's running on the same host the database is on (accessed via X11vnc and NX) it is very slow (the same form opens in around 20 seconds). The server is running SuSE Linux Enterprise Server 10.

What I checked:

  • iptables are clean (no rules in the filter, raw, mangle or nat tables, all chains set to ACCEPT)
  • routing is normal (just default route and local network)
  • ebtables is not even installed
  • tc is clean
  • ping latency to localhost is around 0.007 ms for both 64-byte and 1500-byte packets; latency to the remote host is around 0.8 ms
  • loopback throughput is around 500 MiB/s (tested with netcat)
  • I tried different Java VMs (both 1.5 and 1.6)

While looking in atop there doesn't seem to be any bottlenecks:

  • CPU utilisation of the java and database processes is very low (java: ~20%, database: <5%), while it is moderate on remote access (around 30% for the database). The server is a Quad Core 2.66 GHz, the client a Core 2 Duo 2.33 GHz, and the system is otherwise idle
  • there are hardly any disk reads/writes during the long query (in total about 5-10 reads)

The only thing that differs between the remote and local runs is network utilisation: the local process pulls data at about 1200 kbps, while the remote one pulls it at about 15 Mbps.

I'm currently working on reproducing the problem on my own hardware, so any tips along those lines are welcome too.

EDIT: Changing the lo interface MTU from the default 16k to 1500 fixes the issue.
The issue has also been reproduced on Debian Lenny 64-bit.

Best Answer

All I've got for you are more debugging ideas in no particular order...

  • is it really loopback that's the problem? Have you tried connecting to the network IP address of the server rather than 127.0.0.1? (this uses the loopback interface on my system, but it should rule out odd DNS issues)
  • is the system load high during this time? (I/O issues can cause system load to be higher than the individual processes together would suggest)
  • Check netstat's information for each process; if the client's Recv-Q is high, the client process isn't reading from its socket on a regular basis.
  • Try watching tcpdump -i lo; it might help figure out whether there's an obvious pattern to the packets being transmitted.
  • Does "vmstat 1" show drastically different behavior for the remote access vs the local access versions (i.e. is holding the dataset in RAM in both the db server and the java client forcing you to swap)?
  • Try increasing the MTU on the loopback device; mine defaults to 16436. This won't help much if you're sending lots of little bitty packets, and little bitty packets seem to have problems of their own. I'm not a Java programmer, so I don't know exactly how one would do it, but try setting TCP_NODELAY (the setsockopt system call) on the connection (see the first sketch after this list). This one seems to have a cargo cult following, but supposedly if the communication is one-way, the client will respond with TCP ACKs more regularly and keep the data flowing.
  • Another thing to try tweaking: echo 1 > /proc/sys/net/ipv4/tcp_low_latency
  • While you're playing with setsockopt, see what happens if you increase the send and receive buffers in the client (see the second sketch after this list).
  • You're not using some sort of ancient 2.2 kernel, are you? There was apparently some sort of huge loopback bug fixed back in 2.3.x, according to the 2.4 TODO.
  • Maybe there's a bug in the client (or in Java); have you run the exact same client on a separate Linux system with the same Java runtime?
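
Since the question of how to set TCP_NODELAY from Java came up, here's a minimal sketch, assuming the client opens its own java.net.Socket (the host and port below are placeholders, not your actual server's):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class NoDelayTest {
        public static void main(String[] args) throws IOException {
            Socket socket = new Socket();

            // Disable Nagle's algorithm so small writes go out immediately
            // instead of being coalesced while waiting for the previous ACK.
            socket.setTcpNoDelay(true);

            // Placeholder address/port -- replace with the database server's.
            socket.connect(new InetSocketAddress("127.0.0.1", 5432), 5000);

            // ... hand the socket over to the protocol layer here ...
            socket.close();
        }
    }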
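
And in the same vein, a sketch of bumping the socket buffers from the Java side; the receive buffer should be requested before connecting so a larger TCP window can be negotiated (the 256 KiB value here is just an arbitrary test size, not a recommendation):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class BigBufferTest {
        public static void main(String[] args) throws IOException {
            Socket socket = new Socket();

            // Request larger per-socket buffers before connecting.
            socket.setReceiveBufferSize(256 * 1024);
            socket.setSendBufferSize(256 * 1024);

            // Placeholder address/port -- replace with the database server's.
            socket.connect(new InetSocketAddress("127.0.0.1", 5432), 5000);

            // The kernel may clamp the request; check what was actually granted.
            System.out.println("recv=" + socket.getReceiveBufferSize()
                    + " send=" + socket.getSendBufferSize());
            socket.close();
        }
    }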