Connection pooling questions

crbd98

Hello All,

Some time ago, I implemented a data access layer that included a simple
connection pool. At the time, I built it entirely myself: I created N
connections, each associated with a worker thread that would execute
the db commands. The pool was fixed, and all the connections were
created when the db access class was instantiated. The connections
remained open for the whole execution. If no connection was available
at the moment, the request would block until one was freed.

In my approach, I did not use the SQL Server connection pooling (I
actually disabled it in my connection strings).
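A minimal sketch of this scheme (in Python for illustration; the original was C++/ADO, and the connection factory here is a stand-in):

```python
import queue

class FixedPool:
    """Fixed-size pool: N connections created up front, kept open for
    the life of the application; acquire() blocks until one is free."""

    def __init__(self, connect, size):
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(connect())   # eager creation, all N at startup

    def acquire(self):
        return self._free.get()         # blocks while every connection is busy

    def release(self, conn):
        self._free.put(conn)            # hand the connection back

# usage with a stand-in factory (a real one would open a db connection)
pool = FixedPool(connect=object, size=3)
conn = pool.acquire()
# ... execute db commands on conn ...
pool.release(conn)
```

A real `connect` would open a database connection; the blocking `Queue.get` gives the "wait until one is freed" behavior for free.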

That was done in C++ using ADO. Now I am in a position where I need to
implement a db access layer in .NET, and I was wondering what kind of
support I can get from .NET.

I want to implement a similar model. I would appreciate it if you could
weigh in on my approach and on the ways to accomplish something similar
in ADO.NET. I would also appreciate your comments on the approach itself.

Thank you
CD
 
William (Bill) Vaughn

I have a few thoughts:
1. Is this a Windows Forms or an ASP.NET application? The former does not really need a connection pooling mechanism, as the entire app might be able to share a common connection.
2. If it's ASP.NET, then it makes little sense to code, debug, test and deploy your own pooler. The code built into ADO.NET's .NET data providers should work fine.

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook)
 
Cor Ligthert [MVP]

CRB,

I think you have a different interpretation of connection pooling than
we do; you see it from your side of the application. The regulars here
mostly think of it as the database handling all the connections it
needs. AFAIK that limit is set to 100, which is often enough to serve
all applications at the same time even with all connections open (this
can even be with multiple ASP.NET applications running). Hence the
usual advice to close your connection as often as possible (in other
words, whenever it is not needed, which in a disconnected situation is,
in computer time, most of the time).

I hope this answers your question a little bit.

Cor
 
crbd98

Hello Cor,

Thank you for your message.

You are correct. I am considering connection pooling at the client
side. I am aware of, and have used, connection pooling at the server
level as you describe. In my case, for example, I open 10 connections
and all of them remain open for the life of the application, each
associated with a worker thread that executes the client's db
operations.

I know that this is very different from the general advice: open a
connection for every transaction and close it as soon as possible.
Nevertheless, I had some motivations for implementing this pooling at
the client side:

- The machine where SQL Server was installed was shared with other
applications, and I found it easier to tune the cpu load by controlling
the throughput at the client side.
- I did not want my transactions to fail when the maximum size of the
pool was reached, which is the typical behavior when all connections in
the pool are in use.
- In some cases (when a failover occurred, for example) I ended up with
"bogus" connections re-enlisted in the pool. When those connections
were reused, the application would simply get stuck and had to be
restarted. The only way for me to eliminate this problem back then was
to disable connection pooling.
- By having all connections already open, I thought I would save some
time.

This was my experience. I am not saying that this is what I would like
to do again, or that these motivations are still valid with .NET. The
nature of my application might have made the difference: it was a
real-time video application.

The application that I want to implement now is a typical
client-server one using .NET (no ASP, no web services, no Internet;
everything on the local network). The server will be very busy due to
the projected load of users (automated and human).

Can you see any situation where my approach would be beneficial?
Am I just making things more complicated for myself?
Do I gain anything by keeping the connections open?
Are my original motivations valid?

Your input is greatly appreciated.

Kind regards
 
crbd98

Hello Bill,

Thank you for your message.

No, this will be a simple client-server application using .NET over a
local network, without ASP or Windows Forms.

I would appreciate it if you could comment on the reply I sent to "Cor"
in this thread.

Thank you
CD
 
Marc Gravell

Personally, I would *tend* to just use the inbuilt pooling in this
scenario, with the "open, use, close, dispose" model, which in turn
uses the inbuilt pooling. To throttle the connections, I might put a
semaphore (or equivalent) around my DAL, such that only <x> threads can
be using the DAL at once. Assuming correct "close, dispose" steps, this
should effectively limit the connection count, and you may find that in
reality it uses *fewer* connections than your cap. You could even use a
shared semaphore (perhaps named using the server/db pair) to limit the
throughput per client (rather than per process/appdomain); compared to
network IO, the cost of a Win32 semaphore (as opposed to a .NET
counter) is negligible.
Re your bullets, I believe this addresses them:
- throttle: the semaphore
- failing on acquire: the thread blocks on acquiring the semaphore
[although if you are hitting the pool limit, it sounds like you are
simply using too many connections?]
- dead connections: hmmm... not sure how to "fix" this one; but
assuming you handle the immediate exception, doesn't the connection
de-pool itself after finding itself invalid?
- already open: they will be
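That throttling idea might be sketched as follows (a Python stand-in, not .NET; `ThrottledDal` and `FakeConn` are made-up names for illustration):

```python
import threading

class ThrottledDal:
    """Keep the "open, use, close" model (built-in pool underneath) but
    gate the DAL with a semaphore, so at most `limit` threads hold a
    connection at once. Extra callers block at the gate instead of
    failing when the pool is exhausted."""

    def __init__(self, open_connection, limit):
        self._open = open_connection          # factory: would open a pooled db connection
        self._gate = threading.BoundedSemaphore(limit)

    def run(self, work):
        with self._gate:                      # throttle: blocks, never throws, at the cap
            conn = self._open()               # cheap re-open thanks to the pool
            try:
                return work(conn)
            finally:
                conn.close()                  # return the connection to the pool

# stand-in connection so the sketch is runnable without a database
class FakeConn:
    def close(self):
        pass

dal = ThrottledDal(open_connection=FakeConn, limit=4)
result = dal.run(lambda conn: "done")
```

Note that a `threading` semaphore throttles only within one process; the named Win32 semaphore mentioned above would be needed to share the cap across processes on the same machine.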


Just my thoughts,

Marc
 
Cor Ligthert [MVP]

Hi crbd,
> Can you see any situation where my approach would be beneficial?
No.

> Am I just making things more complicated for me?
Very, very much yes.

> Do I gain anything from keeping the connections open?
Yes: more problems.

> Are my original motivations valid?
I cannot see it, but if it solved something, they should be valid.

I hope this helps,

Cor
 
William (Bill) Vaughn

Let's review: (see >>>)

 
crbd98

Hello Bill,

Thank you for your detailed reply.

Based on your response, maybe the only advantage of keeping the
connections to the db open for the life of the application would be to
avoid the delay of opening and closing the connection. Without caching,
this would be prohibitive, of course. With caching, I imagine that the
time to open and close is significantly reduced. I wonder, however,
whether this time would still compound significantly for applications
that interact with the db at a very high frequency. For instance, my
application is driven by a (multithreaded) automatic process that can
issue more than 100 transactions/second (sustained). Would the compound
cost of opening/closing the connection in this scenario justify my idea
of keeping a pool of open connections at the client? So, the question
at the center of my problem is: "What is the cost of
opening/executing/closing a connection in the case where caching is used?"

Thank you
CD
 
Marc Gravell

Depends on what you mean by caching in this scenario... do you mean the
in-built pooling? In which case the answer would tend to be "not much
as long as you re-use the same connection string", where "not much"
obviously includes a few lookups inside the pool, which will be
absolutely trivial compared to network IO.

But: in general, the performance of something operating in a tight loop
can only really be assessed by testing it... since it wouldn't take
much code (less, in fact), I would suggest giving it a go...

Marc
 
William (Bill) Vaughn

If you use your own connection pooler (for this specific case), you'll
find that the additional cost of re-authenticating the SSPI credentials
is eliminated, but you'll also have to deal with server state
persistence issues. The built-in pooler deals with a lot of issues, and
since you are holding connections open and trying to share them, you
really need to consider these other issues as well. Architecturally, I
would find a strategy to use a dedicated connection for those
high-traffic operations.

Only you and your own tests can really measure the impact of any
pooling architecture configuration. Remember not to measure the speed
in debug mode, as it's notoriously slow.
 
crbd98

Hello All,

Thank you for all your input.

Based on your suggestions, I decided to perform a simple test to assess
the impact of pooling. I tested 3 scenarios:

(1) SQL pooling disabled, doing open/execute/close (I knew this was
going to be the one with the worst performance, but I just wanted to
quantify it)
(2) SQL pooling enabled, doing open/execute/close
(3) SQL pooling off, using a global connection (opened at the beginning
of the application and closed only at the end)

In each scenario, I executed 1000 and 10000 queries in a loop (the same
queries in each case). Here are the results (times in milliseconds):

#queries    (3)    (2)     (1)
1000        172    200     1900
10000       984    1250    18406

I should note that in my tests both the database and the client
application resided in the SAME MACHINE. I do not know whether this is
a factor.
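For reference, the shape of such a timing loop might look like this (a hedged Python sketch; the actual test used ADO.NET and real queries, and the connection factory here is a placeholder):

```python
import time

def bench(n, run_one, setup=lambda: None, teardown=lambda state: None):
    """Time n iterations of run_one(state), returning elapsed milliseconds.

    setup/teardown model scenario (3): a global connection opened once
    and closed only at the end. For scenarios (1) and (2) they stay as
    no-ops, and run_one does its own open/execute/close each iteration.
    """
    state = setup()                      # e.g. open the global connection
    start = time.perf_counter()
    for _ in range(n):
        run_one(state)                   # execute one query
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    teardown(state)                      # e.g. close the global connection
    return elapsed_ms

# scenario (3) shape, with a stand-in for a real connection object:
ms = bench(1000, run_one=lambda conn: None, setup=lambda: object())
```

Timing only the loop (and keeping setup/teardown outside the clock) keeps the per-query comparison between the scenarios fair.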

THE FUNDAMENTAL QUESTION:
In scenario (2), is any network access required to authenticate the
credentials when the connection is opened? Or is everything done at the
client side without requiring a round trip? If validation at the server
is required, then the times in (2) over a real network will be
considerably higher.

If all the validation is performed at the client side, then the results
above are valid in general. They show an overhead of about 20% when
using the built-in pooling (2) compared to the global connection
approach (3). I think this would not justify implementing my own
connection pool. Once I implement my full client-side connection pool
with all the synchronization, threads, etc., I will probably lose the
advantage.

Please let me know what you think.
Kind regards
CD
 
William (Bill) Vaughn

See >>>


Chapter 9 in my book provides a lot more detail on getting connected,
monitoring the pool, performance issues and much more.

hth
 
Roger Wolter[MSFT]

As Bill said, SQL Server cleans up pooled connections before each use,
so if you do your own connection pooling you will have to deal with the
consequences of not doing that. For example, if someone leaves a
transaction open, any uncommitted updates will be added to your new
transaction. The rechecking of security credentials prevents the
security hole of a user being deleted from Windows but still being able
to access SQL Server because he still has a connection open in the
cache. This is a pretty small hole, so it's probably not something you
would need to handle in your own solution.
 
CD

Hello Bill,

Thank you for your message. Based on all the discussions, I decided to
use the built-in pooling (and also buy your book :) ).

I tried my tests against a database over the network and found some
results that surprised me quite a bit. I compared the times of (1)
reusing a global connection versus (2) using a connection from the
connection pool. In each case, I perform a sequence of queries (1000
and 10000 respectively). The times are in milliseconds.

# of queries    (1)     (2)
1000            562     578
10000           6109    5735

I did this many times. I was expecting the global connection approach
to be faster (as it was in the case where the db server was on the same
machine as the client), but they are basically equivalent, as the
numbers show. I monitored the activity using the SQL Profiler at the
server and perfmon at the client, and noticed the extra
sp_reset_connection sent for every new query.

Does this make sense?

Cheers
CD
 
William (Bill) Vaughn

Yup. The folks at MS have really tuned up the pooler. Note that the new
version (2.0) supports the ability (and the intelligence) to flush the
pool if any of the connections fail to connect, which means the server
has gone down. I expect there are a dozen hidden features and issues
that you would have to replicate (and support) if you take your own
(slower) approach...

If you like the book, please post a review on Amazon.
Thanks

 
