Filtering a DB Call Based on Local Data

Bill

This may be a simple problem (let's hope). I am trying to extract a
filtered set of records from a SQL Server database. The set of rows,
however, is to be filtered based on data external to the database.

Specifically, this is an Excel Add-in.


The user selects a range which contains order numbers (a single
column). Loading that data into a DataTable or array is easy
enough.


Using those (say 30) order_id values, I then want to get data
specific to each order from the orders table in a SQL Server database
(assume a 1:1 relationship).


I am new to ADO.NET, and I do not know how best to represent this
data (which objects) within the code.


If I use a DataSet where the short list of order ids is in one
DataTable, with another DataTable connected to the SQL Server orders
table, will I end up loading all 300K orders before I get a chance to
filter the set?


Is there an easier way to select the data from the database with a
different design?


Ultimately it breaks down to this question: can I somehow relate data
that exists only in memory (.NET objects) to a table within the
database?

The query could be written as below, but the list of orders could
contain as many as 10K values.

SELECT a, b, c FROM orders WHERE order_id IN ( 100, 200, 300, ..., 999 )


Is there a way to do this without building that query?
 

Nicholas Paldino [.NET/C# MVP]

Bill,

Not really. You should build a dynamic query using an IN clause,
appending a new parameter for each of the items you want to filter on.
SQL Server will take care of the rest.

Note that SQL Server limits the number of parameters you can use in a
single request (2,100), so a very large list might exceed that
limitation.

If you are using SQL Server 2008, you can use table-valued parameters,
but I doubt you are doing that.

If you are using SQL Server 2005, you can create a table-valued function
in SQL Server which takes a comma-delimited string and returns a table of
ids, which you can then join against (if the list of ids is unique) or use
in an IN operation. The one I wrote is a CLR function and is quite fast,
even parsing 10,000 items (it beats any other method that you can code in
T-SQL).

Then, all you have to do is take your delimited string, pass it as a
single parameter to your query (it can be a stored procedure or dynamic
query at this point) and you should be done.
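A minimal sketch of the parameterized-IN approach (assuming ADO.NET with System.Data.SqlClient and an already-open connection; the table and column names are taken from Bill's example query, everything else is illustrative):

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;

static DataTable GetOrders(SqlConnection conn, IList<int> orderIds)
{
    // Build "@p0, @p1, ..." -- one parameter per id, so the values are
    // never concatenated into the SQL text itself.
    string paramList = string.Join(", ",
        orderIds.Select((id, i) => "@p" + i).ToArray());

    using (var cmd = new SqlCommand(
        "SELECT a, b, c FROM orders WHERE order_id IN (" + paramList + ")",
        conn))
    {
        for (int i = 0; i < orderIds.Count; i++)
            cmd.Parameters.AddWithValue("@p" + i, orderIds[i]);

        var table = new DataTable();
        new SqlDataAdapter(cmd).Fill(table);
        return table;
    }
}
```

SQL Server caps a single request at 2,100 parameters, so lists approaching 10K ids need one of the other approaches discussed in this thread.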
 

LVP

Depending on the speed you are looking for:

Send the ids as XML, parse it in a stored procedure into a temp table,
then join and return your result set.

Depending on the number of ids, one method or another may be the better
choice.

How long does it take to put 10K rows into a temp table, by whatever
means? Then join.

LVP



Nicholas Paldino [.NET/C# MVP]

I've done the testing using XML, and the overhead of generating the XML
on the client side, as well as parsing it on the server side, is WAY too
large to justify using it this way. Granted, it takes a little less
code, since you can use the XML parsing built into SQL Server, but the
overhead just kills it.

Also, using XML will more than likely complicate the client code, as
generating a comma-delimited string (with, say, a StringBuilder) is much
easier than generating an XML document (one that conforms to the
appropriate schema, which you SHOULD have).

--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)


Peter Bromberg [C# MVP]

I'd have to agree with Nick; using a TVF like "fn_Split" that accepts a
single delimited string of items and returns a table you can join on is
fast and easy. Moreover, with a delimited string you have a lot less
glop going over the wire.
-- Peter
Site: http://www.eggheadcafe.com
UnBlog: http://petesbloggerama.blogspot.com
MetaFinder: http://www.blogmetafinder.com
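"fn_Split" is not a built-in function; a pure T-SQL version might look like the sketch below (a CLR implementation like the one Nicholas describes will be faster on large lists; all names here are illustrative):

```sql
-- Split a delimited string of ids into a table (T-SQL sketch).
CREATE FUNCTION dbo.fn_Split (@list NVARCHAR(MAX), @delim NCHAR(1))
RETURNS @ids TABLE (id INT NOT NULL)
AS
BEGIN
    DECLARE @pos INT, @next INT;
    SET @pos = 1;
    WHILE @pos <= LEN(@list)
    BEGIN
        -- Find the next delimiter; if none, take the rest of the string.
        SET @next = CHARINDEX(@delim, @list, @pos);
        IF @next = 0 SET @next = LEN(@list) + 1;
        INSERT INTO @ids (id)
            VALUES (CAST(SUBSTRING(@list, @pos, @next - @pos) AS INT));
        SET @pos = @next + 1;
    END;
    RETURN;
END;
GO

-- Usage: a join instead of a 10K-item IN clause.
SELECT o.a, o.b, o.c
FROM orders AS o
JOIN dbo.fn_Split(@orderIdList, ',') AS s ON s.id = o.order_id;
```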



LVP

I agree with you, hence MVP vs. LVP



Marc Gravell

using a TVF like "fn_Split" that accepts a
single delimited string of items

Complete aside: I've just had a similar conversation over in
sqlserver.xml, and I've just got to say it again: why the *heck* isn't
there a built-in, optimised "Split" function in T-SQL? You can write
one easily enough with SUBSTRING and CHARINDEX, but T-SQL isn't really
optimal at this (it is designed for set-based operations), and CLR is
massive overkill.

</rant>

Marc
 

Nicholas Paldino [.NET/C# MVP]

Marc,

What prompts you to say that CLR is massive overkill? It is actually
perfectly suited to the task at hand: a non-set-based, computationally
intensive operation, which is exactly the kind of work SQL Server has
always been weak at.
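For completeness, the SQL Server 2008 table-valued-parameter route Nicholas mentioned avoids string parsing entirely; a sketch (the type and procedure names are illustrative):

```sql
-- SQL Server 2008+: pass the ids as a table-valued parameter.
CREATE TYPE dbo.OrderIdList AS TABLE (order_id INT PRIMARY KEY);
GO
CREATE PROCEDURE dbo.GetOrdersByIds
    @ids dbo.OrderIdList READONLY
AS
    SELECT o.a, o.b, o.c
    FROM orders AS o
    JOIN @ids AS i ON i.order_id = o.order_id;
GO
```

From ADO.NET, populate a DataTable with a single order_id column and pass it as a SqlParameter with SqlDbType.Structured and TypeName = "dbo.OrderIdList".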
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top