The only way to get more info from haproxy than what you already have would be to use the show sess or show sess <id> command periodically to watch the state of each TCP connection, though I'm not sure you would get any more useful information.
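If you do want to poll those states, and assuming your config exposes the haproxy stats socket (the socket path below is hypothetical), something like this would do it:

    # Assumes haproxy.cfg has a stats socket declared, e.g.:
    #   global
    #       stats socket /var/run/haproxy.sock
    # Dump all current sessions, including their timers and states:
    echo "show sess" | socat stdio /var/run/haproxy.sock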
The cD termination state is the most helpful piece of info you have. What it means exactly is that an established connection with the client was timed out. This is controlled in haproxy via the timeout client parameter in the config, set globally or in a frontend or listen section.
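For reference, here's a minimal sketch of where that parameter lives (the values are illustrative, not recommendations):

    defaults
        mode tcp
        timeout connect 5s
        timeout client  75s   # how long an idle client-side connection is kept
        timeout server  75s   # usually kept in step with timeout client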
You said that you don't see concurrent connections go over 7, and this log entry shows that the failure happened when there were only 3 connections, so I doubt you have a connection limit problem (even outside of haproxy's control).
So what it looks like is happening is that occasionally the pool adds a new connection, which handles some queries and then sits idle. When that connection sits idle longer than the timeout client setting in haproxy, haproxy is going to terminate the connection itself. The next time that connection is used from the pool, you get the above error.
Haproxy has a default timeout of 10 seconds (and the example configs have 50 seconds, I think). I'm not too familiar with JDBC, but going from Tomcat's docs, there is a setting minEvictableIdleTimeMillis, which will evict idle connections from the pool. It defaults to 60 seconds, and may effectively be up to 65 seconds because timeBetweenEvictionRunsMillis is 5 seconds by default. Basically, you need to make sure your haproxy timeout is high enough to account for these idle connections in the pool.
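As a sketch, assuming the pool is configured as a JNDI Resource in Tomcat's context.xml (the resource name, URL, and credentials here are made up), those two attributes look like this:

    <!-- Eviction runs every 5s; connections idle longer than 60s are closed,
         so the haproxy client timeout must comfortably exceed ~65s. -->
    <Resource name="jdbc/mydb"
              auth="Container"
              type="javax.sql.DataSource"
              driverClassName="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbproxy:3306/mydb"
              username="app"
              password="secret"
              timeBetweenEvictionRunsMillis="5000"
              minEvictableIdleTimeMillis="60000"/>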
Another approach would be to use testWhileIdle and validationQuery to keep the connections active, since a few packets of traffic every few seconds would probably alleviate the issue as well.
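Added to the same hypothetical Resource definition as above, that would be:

    <!-- Run the validation query against idle connections during each
         eviction run; the resulting traffic also resets haproxy's idle
         timer on the connection. -->
    testWhileIdle="true"
    validationQuery="SELECT 1"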
[edit] In response to @quanta's additional information:
Even though the haproxy timeout is now 75 sec, you are definitely still getting session timeouts. There may be some additional play in the total lifetime of a JDBC connection that I'm not aware of. Since very few connections are needed for this type of service, there is also nothing wrong with increasing the timeouts to something extremely high, on the order of an hour or more. If the JDBC pool really is having problems releasing old connections, this would only mask the issue, but it could be an easy fix as well.
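Concretely, that would just be bumping the same timeouts from earlier (again a sketch, not a recommendation):

    defaults
        timeout client 1h   # extreme, but fine for a handful of
        timeout server 1h   # long-lived pooled database connections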
Best Answer
How long are you actually waiting? Your configuration means you could have to wait up to 10 minutes, since the test frequency is set to 600 seconds (10 minutes).

Connection Creation Retry Frequency relates to a scenario in which you restarted WebLogic during the DB outage (or had any other trouble connecting to the DB) and the data source didn't get created during startup. This parameter tells WebLogic how often it should retry the connection creation. With your configuration, the data source would indeed stay down forever, but it wouldn't be in a suspended state; it wouldn't even show up in the monitoring tab, because it never got initialized, and your managed server would start up in ADMIN mode. I personally like to set this parameter to something > 0 in all of my data sources. In my default WLST script, I set it to 300 seconds (5 minutes).
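For what it's worth, here is a hedged sketch of the WLST calls involved (the credentials, URL, and data source name "MyDataSource" are all hypothetical; the two setters correspond to the real JDBCConnectionPoolParams attributes):

    # WLST: set retry and test frequency on an existing data source
    connect('weblogic', 'password', 't3://localhost:7001')
    edit()
    startEdit()
    cd('/JDBCSystemResources/MyDataSource/JDBCResource/MyDataSource/JDBCConnectionPoolParams/MyDataSource')
    cmo.setConnectionCreationRetryFrequencySeconds(300)  # retry creation every 5 min
    cmo.setTestFrequencySeconds(600)                     # test connections every 10 min
    save()
    activate()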