TNS-12535: TNS:operation timed out — Oracle DB


Many queries from a .NET data access server I maintain are timing out. The timeouts appear to be completely random, with no relationship to the data or to data locks. For example, the following query timed out!

 SELECT NULL FROM DUAL 

The system logs show that when it happened, CPU was at 20%, memory at 42%, and disk at 3%. What is going on?

The DB is version 10.2.0.3.0 on HPUX.
The ODP.NET driver is version 2.111.6.20 (the 11g driver).

I checked out the sqlnet.log and found a large number of these error messages:

***********************************************************************
Fatal NI connect error 12170.

  VERSION INFORMATION:
        TNS for HPUX: Version 10.2.0.3.0 - Production
        Oracle Bequeath NT Protocol Adapter for HPUX: Version 10.2.0.3.0 - Production
        TCP/IP NT Protocol Adapter for HPUX: Version 10.2.0.3.0 - Production
  Time: 29-JUN-2009 06:42:04
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535
    TNS-12535: TNS:operation timed out
    ns secondary err code: 12606
    nt main err code: 0
    nt secondary err code: 0

Best Answer

This particular error, combined with the fact that it happens to all applications running against the database, strongly points to a network hiccup as the source of the problem.

  • How are TNS aliases resolved? Are you using a local tnsnames.ora file?

  • Assuming you are using a local tnsnames.ora file, is the TNS alias for the database using an IP address or a host name? Using an IP address eliminates the need to hit DNS, so it may be worthwhile to try that in case the problem is that your DNS server is briefly going nuts.

  • You could also try configuring a backup listener and adding a failover option to the TNS alias. If the network is randomly dropping packets between the client and the listener, a second address that can be tried next may resolve the vast majority of failures without your having to figure out which piece of the network is flaky. Of course, that assumes the problem corrects itself quickly enough for the next connection attempt to succeed, but that may well be a reasonable assumption. If adding a backup listener resolves the problem, you can be all but certain that it's a network issue.
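
As a rough sketch of the last two suggestions, a tnsnames.ora entry that uses an IP address instead of a host name and lists a second listener address for connect-time failover might look like the following. The alias MYDB, the address 192.0.2.10, the ports 1521 and 1522, and the service name are all placeholders, not values from the question. It also assumes a second listener has already been configured on the server (in listener.ora) on the second port.

```text
# Hypothetical alias; substitute your own alias, address, ports and service name.
MYDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      # Try the addresses in order; move to the next one if a connect attempt fails.
      (FAILOVER = ON)
      (LOAD_BALANCE = OFF)
      # Primary listener, addressed by IP so that no DNS lookup is needed.
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.0.2.10)(PORT = 1521))
      # Backup listener (assumed to be running on the same host on port 1522).
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.0.2.10)(PORT = 1522))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = mydb.example.com)
    )
  )
```

Connect-time failover only helps while a connection is being established. It will not rescue a session that is already connected when the network drops.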