Deduping with Linq / Where Exists

R

Rich P

Here is a Tsql sample to query a table for listing 2 employees from each
department. How would this look using Linq if I pulled the data
contents below to a dataTable in my C# app? Note: the Emp column is
unique

create table #tmp1 (Emp int unique, Dept varchar(5))

insert into #tmp1
select 1, 'a' union all
select 2, 'a' union all
select 3, 'a' union all
select 4, 'a' union all
select 5, 'a' union all
select 6, 'b' union all
select 7, 'b' union all
select 8, 'b' union all
select 9, 'b' union all
select 10, 'b' union all
select 11, 'c' union all
select 12, 'c' union all
select 13, 'c' union all
select 14, 'c' union all
select 15, 'c'


SELECT * FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 2 Emp, Dept FROM #tmp1 t2 WHERE t2.Dept =
t1.Dept Order By t1.Dept) t3 WHERE t3.Emp = t1.Emp)

Here is the output:

Emp Dept
1 a
2 a
6 b
7 b
11 c
12 c

Thanks,

Rich
 
M

Mr. Arnold

Rich said:
Here is a Tsql sample to query a table for listing 2 employees from each
department. How would this look using Linq if I pulled the data
contents below to a dataTable in my C# app? Note: the Emp column is
unique


<http://dotnetarchitect.wordpress.com/2009/03/18/using-linq-to-manipulate-data-in-datasetdat

Datatable again? :)

Why don't just use Linq-2-SQL?

Just add the items needed to your existing project and have at it.

<http://msdn.microsoft.com/en-us/vbasic/bb688085.aspx>

<http://www.codeproject.com/KB/database/LINQtoSQL.aspx>
 
R

Rich P

Thank you for your reply and for the links. But I am still having a
problem with the Where Exists subquery. I modified my scenario a little
bit. Lets say I have the following xml file (which is valid/formatted/I
just tested it and its ready for pickup - same datalist as my original
post except in xml format)

<?xml version="1.0" encoding="utf-8" ?>
<Emp_Dept>
<EmpDept>
<emp>1</emp>
<dept>a</dept>
</EmpDept>
<EmpDept>
<emp>2</emp>
<dept>a</dept>
</EmpDept>
<EmpDept>
<emp>3</emp>
<dept>a</dept>
</EmpDept>
<EmpDept>
<emp>4</emp>
<dept>a</dept>
</EmpDept>
<EmpDept>
<emp>5</emp>
<dept>a</dept>
</EmpDept>
<EmpDept>
<emp>6</emp>
<dept>b</dept>
</EmpDept>
<EmpDept>
<emp>7</emp>
<dept>b</dept>
</EmpDept>
<EmpDept>
<emp>8</emp>
<dept>b</dept>
</EmpDept>
<EmpDept>
<emp>9</emp>
<dept>b</dept>
</EmpDept>
<EmpDept>
<emp>10</emp>
<dept>b</dept>
</EmpDept>
<EmpDept>
<emp>11</emp>
<dept>c</dept>
</EmpDept>
<EmpDept>
<emp>12</emp>
<dept>c</dept>
</EmpDept>
<EmpDept>
<emp>13</emp>
<dept>c</dept>
</EmpDept>
<EmpDept>
<emp>14</emp>
<dept>c</dept>
</EmpDept>
<EmpDept>
<emp>15</emp>
<dept>c</dept>
</EmpDept>
</Emp_Dept>

Once I read this into a dataset I want to perform a query like the
following one and display the output below in a datagridview control.

SELECT * FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 2 Emp, Dept FROM #tmp1 t2 WHERE t2.Dept =
t1.Dept Order By t1.Dept) t3 WHERE t3.Emp = t1.Emp)

Here is the output:

Emp Dept
1 a
2 a
6 b
7 b
11 c
12 c

how would I do this with linq?


Rich
 
M

Mr. Arnold

Rich said:
Once I read this into a dataset I want to perform a query like the
following one and display the output below in a datagridview control.

SELECT * FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 2 Emp, Dept FROM #tmp1 t2 WHERE t2.Dept =
t1.Dept Order By t1.Dept) t3 WHERE t3.Emp = t1.Emp)

I don't think you can do it against a table in a dataset.

The "Exists" seems to be pointing to a Linq statement used in a
Linq-2-SQL query where a SQL Server table, an entity, on the virtual
database "model" is being queried using T-SQL statements that are
generated by the Linq-2-SQL query to query a SQL server table.

The same would be true if using ADO.NET Entity Framework where T-SQL
statements would be generated by the Linq-2-Object to actually query an
entity/SQL server table on the virtual database 'model'.

I could be wrong.
 
R

Rich P

Thank you again for your response. I think my real question should have
been if it is possible to use linq against a dataTable. I googled
around and saw some posts with solutions that I tried but did not work
out.

At this point, I am trying to figure out the benefit of the Entity thing
over writing plain Tsql on a sqlDataAdapter. That seems pretty easy and
straightforward compared to the entity thing, and it appears I would
have more control over my query in this scenario using Tsql. Or would
linq still offer some benefit here? Not knocking linq (I already have
it in use very nicely for some other apps), but how would I benefit from
linq/Entities in my current situation? Is it possible to write a linq
Query against a dataTable contained within the app? How to do that?

Rich
 
M

Mr. Arnold

Rich said:
Thank you again for your response. I think my real question should have
been if it is possible to use linq against a dataTable. I googled
around and saw some posts with solutions that I tried but did not work
out.

At this point, I am trying to figure out the benefit of the Entity thing
over writing plain Tsql on a sqlDataAdapter. That seems pretty easy and
straightforward compared to the entity thing, and it appears I would
have more control over my query in this scenario using Tsql. Or would
linq still offer some benefit here? Not knocking linq (I already have
it in use very nicely for some other apps), but how would I benefit from
linq/Entities in my current situation?

The end results of Linq-2-SQL or even ADO.NET Entity Framework is that
it would generate the T-SQL itself, execute it, populate an object or
List<T> of objects to present back to you. Either way you look at it
Link-2-SQL or you doing inline or T-SQL or from an sproc, T-SQL was
executed.

You load the data to a datatable only to use Linq to query it. On the
other hand, you use Linq-2-SQL to query the database table or tables and
come back with an entity or entities in a collection.

You can do a New Shape and only come out with the properties you want
for the object and still load it to a List<T>. If you have to query
again, you can query the List<T>.

Look at the example in the linq, and what is going to be generated and
executed behind the Linq-2-SQL query is T-SQL.

<http://msdn.microsoft.com/en-us/vbasic/bb737925.aspx>
Is it possible to write a linq
Query against a dataTable contained within the app? How to do that?

Did you not look at the link below I gave you in the previous post? :)

<http://dotnetarchitect.wordpress.com/2009/03/18/using-linq-to-manipulate-data-in-datasetdatatable/>

I am just the opposite of you. I will not use a dataset, datatable,
adapter, inline T-SQL or a sproc as long as there is Link-2-SQL or
ADO.NET Entity Framework, because objects are easy to use and are too
powerful not to use.


I made this post to someone earlier today. I'll make it to you too.

<copied>

The purpose of Linq is what?

Microsoft LINQ defines a set of query operators that can be used to
query, project and filter data in arrays, enumerable classes, XML,
relational database, and third party data sources. While it allows any
data source to be queried, it requires that the data be encapsulated as
*objects*. So, if the data source does not natively store data as
objects, the data must be mapped to the object domain. Queries written
using the query operators are executed either by the LINQ query
processing engine or, via an extension mechanism, handed over to LINQ
providers which either implement a separate query processing engine or
translate to a different format to be executed on a separate data store
(such as on a database server as SQL queries). The results of a query
are returned as a collection of in-memory *objects* that can be
enumerated using a standard iterator function such as C#'s foreach.

What is Linq-to-SQL?

LINQ to SQL, a component of Visual Studio Code Name "Orcas", provides a
run-time infrastructure for managing relational data as *objects*
without losing the ability to query. It does this by translating
language-integrated queries into SQL for execution by the database, and
then translating the tabular results back into objects you define. Your
application is then free to manipulate the objects while LINQ to SQL
stays in the background tracking your changes automatically.


The above holds true for ADO.NET Entity Framework, an ORM, and
Linq-2-Object and object encapsulation, inheritance and all that comes
with an object.

What is Object-oriented-programming?

(OOP) is a programming paradigm that uses "objects" and their
interactions to design applications and computer programs.

The key concepts of OOP are the following:

Class
Object
Instance
Method
Message passing
Inheritance
Abstraction
Encapsulation
Polymorphism
Decoupling

http://en.wikipedia.org/wiki/Object-oriented_programming

One gets use to using objects and things become simple. :)
 
R

Rich P

I have experimented briefly with Entities,a nd I actually have a .dbml
file in my "testing" app from AdventureWorks DB. Like
everything/anything -- I will have to play with it for a bit (which is
technically what I am doing here with the linq against a dataTable).

And yes, I did go to

http://dotnetarchitect.wordpress.com/2009/03/18/using-linq-to-manipulate
-data-in-datasetdatatable/

It was a lot of information, so I went to a few other articles, with
less info -- none of which worked for me, so back to your article :). I
will give this a go, and then (when I come up from air and the day job
people aren't bugging me for something) I will experiment some more with
the .dbml file.

Thanks

Rich
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top