Iterating through selects - what is the best approach?


Guest

Hi;

I am porting some code (from java) and I need to do the following on SQL
databases:
I am passed the DbConnection object so I only have the one connection.
Call a select to return a list of rows.
Iterate forward and backwards through the returned rows.
While on a row, perform a second select and iterate through the inner
select. The nested selects can be N layers deep but are not interleaved - an
inner select will be completed and closed before getting the next/previous
row from the outer select.
I retrieve column values based on column name and/or column index. I never
need the results to be placed in a DataSet or any other .net object.

So what is the best approach? In the java world I did this all with
low-level JDBC calls. But from reading the docs it looks like DbDataReader
cannot do a previous and that I cannot nest calls to DbCommand.ExecuteReader
on a single connection.

A lot of the sample code I have seen does not call DbConnection.Open() or
Close(). I assume the Open() is called for me when ExecuteReader() is called.
But don't I have to call Close() to close the connection?

Anything else?
 

Sahil Malik [MVP]

> I am passed the DbConnection object so I only have the one connection.

Terrible idea, please see -
http://codebetter.com/blogs/sahil.malik/archive/2005/01/30/48948.aspx
> Call a select to return a list of rows.
> Iterate forward and backwards through the returned rows.
> While on a row, perform a second select and iterate through the inner
> select. The nested selects can be N layers deep but are not interleaved - an
> inner select will be completed and closed before getting the next/previous
> row from the outer select.
> I retrieve column values based on column name and/or column index. I never
> need the results to be placed in a DataSet or any other .net object.

Is there a specific need that requires you to iterate over the same row and,
on the very same connection, execute another query? How much data (number of
rows) are you dealing with? Unless you need the selects to run in
transactions that share the same isolation level, you're much better off
either filling disconnected data (in case you don't have many hundreds of MB
of data) or simply running multiple commands on multiple connections (not
recommended if you can go infinitely many levels deep). The ideal solution
would be to rearchitect so this dependency on keeping your connection open
for an extremely long time is removed.
> A lot of the sample code I have seen does not call DbConnection.Open() or
> Close(). I assume the Open() is called for me when ExecuteReader() is called.
> But don't I have to call Close() to close the connection?

No, it isn't - for ExecuteReader to work, you need the connection to already
be open. And yes, you must clean up when you are done: either call Close on
the connection yourself, or pass CommandBehavior.CloseConnection to
ExecuteReader and close the data reader.
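The two cleanup styles look roughly like this (a sketch; the command and open connection are assumed to be set up elsewhere):

```csharp
using System.Data;
using System.Data.Common;

// Style 1: close the connection yourself once the reader is done.
static void CloseConnectionYourself(DbConnection conn, DbCommand cmd)
{
    using (DbDataReader r = cmd.ExecuteReader())
    {
        while (r.Read()) { /* ... */ }
    }
    conn.Close();
}

// Style 2: tie the connection's lifetime to the reader.
static void CloseViaReader(DbCommand cmd)
{
    using (DbDataReader r = cmd.ExecuteReader(CommandBehavior.CloseConnection))
    {
        while (r.Read()) { /* ... */ }
    }   // disposing/closing the reader also closes the connection
}
```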

- Sahil Malik [MVP]
ADO.NET 2.0 book -
http://codebetter.com/blogs/sahil.malik/archive/2005/05/13/63199.aspx
 

Guest

Hello;

First off, the connections are not held open indefinitely. This is for the
product www.windwardreports.com and what happens is the caller passes us the
connection and report template. We populate the template with data using the
selects, and then return to the caller where they hopefully close the
connection.

So we do a bunch of selects on the connection, but we are hitting it
non-stop and returning so this is not a case of keeping an open connection
but not using it.

Second, the entire architecture is implemented around getting one connection
and performing nested selects and doing next/previous on the returned rows.
This is a piece of cake with JDBC and so everything was designed assuming
this capability.

We are .NET 2.0 only. We make no transaction calls. The code calling us does
have the ability to wrap the call to us in a transaction. And all we do is
read the database, no update/delete/etc. Just selects.

But it can be a lot of data being read from the database. Tens of megabytes
is not uncommon. And while hundreds of megabytes may not be occurring today -
I don't want to rule it out in the future - or worse, say that it is only
possible with our java-based solution.
 

Sahil Malik [MVP]

Yes, but they are open for a considerable time - you should try and
minimize the time the connection is actually open, and the standard way of
doing that is to fetch the data into a disconnected object and iterate over
the disconnected object instead. At some point you have to make the
judgement call of keeping an open connection versus higher memory usage -
in your case the data is too huge to warrant the use of disconnected
objects.
> So we do a bunch of selects on the connection, but we are hitting it
> non-stop and returning so this is not a case of keeping an open connection
> but not using it.

Umm okay .. it truly depends on your architecture, if you're comfortable
with it, then fine.
> Second, the entire architecture is implemented around getting one connection
> and performing nested selects and doing next/previous on the returned rows.
> This is a piece of cake with JDBC and so everything was designed assuming
> this capability.

Bad idea!!! The "one connection - let's reinvent the wheel" approach is
typical java/oracle mentality; server-side scrolling and pessimistic locking
are brothers and sisters of that mentality. A global SqlConnection instance
in .NET is a bad idea.
> We are .NET 2.0 only. We make no transaction calls. The code calling us does
> have the ability to wrap the call to us in a transaction. And all we do is
> read the database, no update/delete/etc. Just selects.
>
> But it can be a lot of data being read from the database. Tens of megabytes
> is not uncommon. And while hundreds of megabytes may not be occurring today -
> I don't want to rule it out in the future - or worse, say that it is only
> possible with our java-based solution.


Okay, that draws a good picture. With tens of megabytes, I'd be a bit
uncomfortable suggesting DataTables. Now here's a question - how many levels
would you typically go? 10? 100? 1000?

Why is that important? Because if you are going many levels deep - say
anything over 1000 levels - you may hit the maximum connection pool size,
EVEN with MARS. Not only that, if you didn't use MARS you would hit the
limit a lot sooner. But not using MARS gives you a simpler paradigm,
i.e. just open a new connection and execute a command - then close the
connection. This way subsequent requests can leverage the same physical
connection because you closed it and now it's poolable.
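That open-execute-close rhythm is exactly what connection pooling is built for: "closing" only returns the physical connection to the pool, so a subsequent Open on the same connection string is cheap. A minimal sketch (the connection string is assumed):

```csharp
using System.Data.SqlClient;

// Each iteration "opens" and "closes" a connection, but the pool keeps
// reusing the same physical connection underneath, so this stays cheap.
static void PooledLoop(string connStr)
{
    for (int i = 0; i < 1000; i++)
    {
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();                               // pulled from the pool
            SqlCommand cmd = new SqlCommand("SELECT 1", conn);
            cmd.ExecuteScalar();
        }                                              // returned to the pool, not torn down
    }
}
```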

However if you were using MARS and NOT closing connections, you would hit
the internal session pool limit of about 10, and then internally you would
open another connection and start using that. That way, your physical
database connection count stays low - but you use MARS (eww). Why I say eww
to MARS is because in simplistic scenarios - like yours, which is pretty
much readonly with implicit transactions, it may not be such a problem. But
I am wondering if ever some developer will want to come in and start adding
update/insert statements on the same connection(s), while all that code is
running. That will create serious issues because there is more than meets
the eye about MARS.
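For what it's worth, MARS is opt-in via the connection string in .NET 2.0's SqlClient; the server and database names below are placeholders:

```csharp
// With MultipleActiveResultSets=True, more than one DataReader can be
// open on the same SqlConnection at once - which is exactly what the
// nested-select pattern needs. Server/database names are illustrative.
string connStr = "Data Source=myServer;Initial Catalog=myDb;" +
                 "Integrated Security=SSPI;MultipleActiveResultSets=True";
```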

So what I'd recommend is, fully understand MARS, and if you feel comfortable
going ahead with it - then use it. That will give you a much higher
concurrently nested limit, and may be technically the better solution.
But if your needs are ever any more than selects (or could ever become more
than selects only) - and your concurrently nested levels will never go over
100 - then just open new connections and keep a simpler paradigm for a
possibly more predictable system in the future. There will also be a very
slight performance hit - small enough to ignore.

Finally - the Max connection pool size is configurable via the connection
string. BTW, all of this is explained in depth in my book.

And if this is too much information to digest - then just do MARS and pray
to god LOL :)

Alternatively, you can try to do this recursive stuff inside the database.
I assume you are on SQL 2005 (you need that for MARS anyway): just run a big
FOR XML query or CTE, process the results as tabular data, and you wouldn't
have to deal with all this headache. Honestly, if you can implement that, it
would be a better solution.
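As a rough sketch of that idea: a single recursive CTE (SQL Server 2005+) can flatten the parent/child levels into one result set that is read with a single DbDataReader. The table and column names here are invented:

```csharp
using System.Data.Common;

// Sketch: one round trip instead of N nested selects.
// Table/column names (Items, Id, ParentId) are illustrative.
static void ReadTree(DbConnection conn)
{
    DbCommand cmd = conn.CreateCommand();
    cmd.CommandText = @"
        WITH Tree (Id, ParentId, Depth) AS
        (
            SELECT Id, ParentId, 0 FROM Items WHERE ParentId IS NULL
            UNION ALL
            SELECT i.Id, i.ParentId, t.Depth + 1
            FROM Items i JOIN Tree t ON i.ParentId = t.Id
        )
        SELECT Id, ParentId, Depth FROM Tree ORDER BY Depth";

    using (DbDataReader r = cmd.ExecuteReader())
    {
        while (r.Read()) { /* one flat, ordered result set */ }
    }
}
```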

- Sahil Malik [MVP]
ADO.NET 2.0 book -
http://codebetter.com/blogs/sahil.malik/archive/2005/05/13/63199.aspx
 

Cor Ligthert [MVP]

David,

What you describe looks to me like an application derived from COBOL with
minimal architecture changes.

In my opinion, doing that again for .Net will only cost extra. Would you not
first review your architecture for where the bottlenecks can be, rather than
adapting it straight into another development environment?

Without those reviews there will not be a best approach.

Just my thought,

Cor
 

Guest

Nope, not COBOL. Written originally in java - but we have to continue in java
with our existing API as we have a ton of existing customers. We are porting
over to .net also, not moving to just .net.

And because JDBC allows multiple nested selects, we used it <g>.
 

Guest

Hi;

Thank you for the detailed answer - it helped me a lot. First off, we can't
use MARS because we allow connections to any database so we have to go with
the capabilities that they all support.

We generally nest 1, maybe 2 layers deep. It depends on how the customer
designs their report but I would be surprised at more than 2 layers. I think
we are going to have to say in the .net world they have to pass us the
DbConnection class type and connection string and we create the connection
object(s). It's a bit more limiting than the java API but it should work for
99% of our potential customers.

As to the previous() method call, I think I will have to copy the last N
rows on a next (I do know the value of N and it will usually be 1 - 5). Any
suggestions on the best way to get a copy of a row of data?
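One way to sketch that buffering, assuming the approach the poster describes: DbDataReader.GetValues copies the current row's column values into a plain object array, detached from the reader, so the last N rows can be kept to serve a previous() call:

```csharp
using System.Collections.Generic;
using System.Data.Common;

// Sketch: keep a snapshot of the last N rows so previous() can be
// emulated from the buffer; maxLookBack corresponds to the known N (1-5).
static List<object[]> ReadWithLookBack(DbDataReader reader, int maxLookBack)
{
    List<object[]> buffer = new List<object[]>();
    while (reader.Read())
    {
        object[] row = new object[reader.FieldCount];
        reader.GetValues(row);            // copy of the current row's values
        buffer.Add(row);
        if (buffer.Count > maxLookBack)
            buffer.RemoveAt(0);           // drop rows older than N back
        // buffer[buffer.Count - 1] is the current row; earlier entries
        // serve previous(). GetOrdinal(name) maps column names to indexes.
    }
    return buffer;
}
```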

Again - thanks to you and Cor - you have helped me a lot (and I have ordered
your book).
 

Sahil Malik [MVP]

For 1 or 2 layers deep - just open a new connection :), it's a no-brainer at
that point.
> As to the previous() method call, I think I will have to copy the last N
> rows on a next (I do know the value of N and it will usually be 1 - 5). Any
> suggestions on the best way to get a copy of a row of data?

Search my blog for "Databindable datareader" - that may be the midway
approach you need. It's also there in Chapter 5 of my book. And thanks for
ordering it, I hope you will like it - I did put a lot of work into it :)


--

- Sahil Malik [MVP]
ADO.NET 2.0 book -
http://codebetter.com/blogs/sahil.malik/archive/2005/05/13/63199.aspx
 

Guest

Hi;

This is great. If I understand your code right, the DbDataRecord still holds
the old values even after DbDataReader.Read() is called. That is very sweet
and gives me everything I need.
 
