OracleClient not flushing bad connections from pool

B

Barry Frank

We're using System.Data.OracleClient and are seeing the following
behavior. It's driving us crazy and we'd very much appreciate any
help. TIA.

We have a .NET service running on a middle-tier Windows server, that
executes database requests (on behalf of clients over remoting, and
also for internal requests within the service). The Oracle database is
on a different machine, a Solaris.

When there's a problem with connecting to Oracle, e.g. the listener
goes down, or the network is cut, the service will respond with the
correct error, namely "TNS-12535: TNS:blush:peration timed out".

However, when the outage is over, we will still get the same errors. A
tnsping works fine, but our service keeps responding with TNS-12535.
Restarting the service fixes the problem. I suspect the connection
pool retains the bad connections, and restarting the process refreshes
the pool.

1. Is this a bug? Shouldn't the connection pool clear itself out
anyway after a few minutes of non-use? Even if we pause the service
for twenty minutes, the problem does not correct itself. (It does
eventually correct itself, but it can take an hour).

2. How can we manually clear the connection pool, and *automate it* so
we do not have to manually intervene when there's an outage recovery?
(Short of automating a service reboot).
 
M

Miha Markic

Hi Barry,

AFAIK the connection should be removed from pool when you try the connection
and it fails (ado.net should see that it is a bad connection).
As a workaround you might add a space (blank) char to connection string when
connection fails - it will force a new connection.
 
G

General Protection Fault

Barry said:
We're using System.Data.OracleClient and are seeing the following
behavior. It's driving us crazy and we'd very much appreciate any
help. TIA.

We have a .NET service running on a middle-tier Windows server, that
executes database requests (on behalf of clients over remoting, and
also for internal requests within the service). The Oracle database is
on a different machine, a Solaris.

When there's a problem with connecting to Oracle, e.g. the listener
goes down, or the network is cut, the service will respond with the
correct error, namely "TNS-12535: TNS:blush:peration timed out".

However, when the outage is over, we will still get the same errors. A
tnsping works fine, but our service keeps responding with TNS-12535.
Restarting the service fixes the problem. I suspect the connection
pool retains the bad connections, and restarting the process refreshes
the pool.

1. Is this a bug? Shouldn't the connection pool clear itself out
anyway after a few minutes of non-use? Even if we pause the service
for twenty minutes, the problem does not correct itself. (It does
eventually correct itself, but it can take an hour).

2. How can we manually clear the connection pool, and *automate it* so
we do not have to manually intervene when there's an outage recovery?
(Short of automating a service reboot).

Program your service to end when a connection fails.

On the Service properties, on the Recovery tab, set it to Restart the
Service.
 
A

Angel Saenz-Badillos

Barry,
Unfortunatelly this is a bug. there is a QFE fix available for v1.1 of
the Oracle managed provider, please contact PSS. I believe this is
Q829195.

The problem is that we where not checking for all
possible Oracle exceptions that indicate that a connection is no
longer
valid. When we received an error that was not in our expected list but
that
still killed the connection we would fail to recognize it and the pool
would
get corrupt. We are now checking for the following set of oracle
exceptions,
if you are aware of any other exception that may kill the connection
please
contact me. Thank you.

oracle exceptions checked
case 18: // max sessions exceeded
case 19: // max session licenses exceeded
case 24: // server is in single process mode
case 28: // session has been killed
case 436: // oracle isn't licensed
case 1012 //not logged on error
case 1033: // startup/shutdown in progress
case 1034: // oracle not available
case 1075: // currently logged on
case 3113: // end-of-file on communication channel
case 3114: // not connected to ORACLE
case 12xxx //any error starting with 12 thousand

Thanks,
 
A

Alvaro

Hello:

I have a problem
I'm using System.Data.OracleClient with a client oracle
local instalation, the oracle database is on a unix
system, however .Net framework send the following message

Unable to load DLL (oci.dll).
Exception Details: System.DllNotFoundException: Unable to
load DLL (oci.dll).

the oracle client works fine, i can connect using
sqlplusw without problems.

this behavior ocurrs when the web application run on a
windows 2000 server, if i use windows XP there is no
problem.

Thanks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top