Compare Two Identical DataTables by Content

inpuarg

I have two DataTables with identical contents, and I want to compare
them by cell content.

But dt1 == dt2 and
dt1.GetHashCode() == dt2.GetHashCode() both return false.

There are a large number of rows in these DataTables, so I don't want
to enumerate every row. That is not efficient and is unacceptable for my
current application.

So how can I compare these DataTables by their contents?



Let me describe it in more detail:

I have a DataTable, dtOld.
In the form load event I fill this DataTable from the database and
bind it to a grid.

The user may change some data in the grid and then press the save button.
At this point I can detect which rows changed (.GetChanges) and write
those rows to the database.

This is where my problem begins:

After this update I requery the table from the database and name the
result dtNew.

I already have the DataTable that is bound to the grid, dtOld, and
I accept changes on it.

If no other user made any changes, dtOld and dtNew are the same.

In debug mode I can see that they hold exactly the same data.
For instance, both have 10 rows with the same values in every column.

But when I compare them, C# says they are not equal. I understand that
they are not equal by reference, but I want to compare them by their
content.

How can I do this in an efficient way?

Regards.
 
Guest

Hi,
You can perform a virtual union of the two DataTables. If the number of
rows in the resulting DataTable is the same as in each input, both
DataTables are the same; otherwise the row count of the result will be larger.

Here is one implementation of the union function we created. It takes two
DataTables as input and returns a DataTable holding the rows that result
from the union of the two.


public DataTable Union(DataTable First, DataTable Second)
{
    // Result table
    DataTable table = new DataTable("Union");
    // Build new columns, copying name and type from the first table
    DataColumn[] newcolumns = new DataColumn[First.Columns.Count];
    for (int i = 0; i < First.Columns.Count; i++)
    {
        newcolumns[i] = new DataColumn(First.Columns[i].ColumnName,
            First.Columns[i].DataType);
    }
    // Add new columns to result table
    table.Columns.AddRange(newcolumns);
    // LoadDataRow merges rows only by primary key; without a key every row
    // is appended. Keying on all columns assumes no cell is null.
    table.PrimaryKey = newcolumns;
    table.BeginLoadData();
    // Load data from first table
    foreach (DataRow row in First.Rows)
    {
        table.LoadDataRow(row.ItemArray, true);
    }
    // Load data from second table
    foreach (DataRow row in Second.Rows)
    {
        table.LoadDataRow(row.ItemArray, true);
    }
    table.EndLoadData();
    return table;
}
 
inpuarg

To be sure you would need to go through each cell in the entire table till
you find the first difference.
This is not efficient for me because there are too many rows in the
DataTables.
You could put shortcuts in first like,
object.Equals to check if they are actually the same table, then compare row
counts and column counts, compare the schema (column datatypes). You could
try random sampling but it doesn't give you certainty.
I can see in the debug window that the data are the same, but C# says the
tables are not equal. The hash codes are not the same either. I think this
comes from .NET's object comparison model. Here is a sample:


Suppose that I have a class called Person:

public class Person
{
    public string Name = "";
}

and let me test the == operation:


Person a = new Person();
a.Name = "Person1";

Person b = new Person();
b.Name = "Person1";

if (a == b)
{
    MessageBox.Show("Equal");
}
else
{
    MessageBox.Show("a : " + a.GetHashCode() + Environment.NewLine +
        "b: " + b.GetHashCode());
}


If you run this code you'll see that these objects are not equal,
because .NET does not compare them by the values of their Name fields
(which is the expected result; I have no objection to it).
But there must be a way of overriding the == operator for the Person class
so that the comparison uses the Name values, and I know that such a way
exists.

So what I am asking is: is this possible for the DataTable object?
Is there such a method, approach, or workaround?
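
For reference, a minimal sketch of the kind of override described above, applied to the Person class from this post. It only shows the general C# pattern for value equality; DataTable itself does not override Equals or ==, and operators cannot be added to a type you do not own, so this does not carry over to DataTable directly.

```csharp
using System;

public sealed class Person : IEquatable<Person>
{
    public string Name = "";

    // Value equality: two Person objects are equal when their names match.
    public bool Equals(Person other)
    {
        return !ReferenceEquals(other, null) && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    // Objects that compare equal must return the same hash code.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }

    public static bool operator ==(Person left, Person right)
    {
        if (ReferenceEquals(left, null)) return ReferenceEquals(right, null);
        return left.Equals(right);
    }

    public static bool operator !=(Person left, Person right)
    {
        return !(left == right);
    }
}
```

With this in place, the a == b test in the sample above takes the "Equal" branch. Because Name is mutable, the hash code changes when Name changes, so such objects should not be modified while they are used as dictionary keys.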
 
inpuarg

Thank you for your suggestion. But as I mentioned before, this is not an
efficient approach for millions of records. If I were going to loop
through every row anyway, I could compare the cells manually with
cell1.ToString() == otherCell1.ToString().

But I'll keep it in mind. Thank you.


 
Ignacio Machin ( .NET/ C# MVP )

Hi,

inpuarg said:
This is not effective for me cause there are too many rows in the
datatables.

Then how do you expect to know whether there is a difference or not?

Iterating is the only way. Believe me, even if you do not do it yourself
and find a framework method that does it for you, in the end something will
have to iterate over ALL the rows and compare the values of the columns.
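
A minimal sketch of such an iteration, with the cheap shortcuts mentioned earlier in the thread (same reference, row count, column count, schema) placed first and an early exit on the first differing cell. The names DataTableComparer and ContentEquals are made up for illustration. It assumes both tables were loaded in the same row order (for example, the same ORDER BY) and that neither contains deleted rows pending AcceptChanges.

```csharp
using System;
using System.Data;

public static class DataTableComparer
{
    public static bool ContentEquals(DataTable first, DataTable second)
    {
        // Cheap checks first: same instance, nulls, row and column counts.
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null) return false;
        if (first.Rows.Count != second.Rows.Count) return false;
        if (first.Columns.Count != second.Columns.Count) return false;

        // Schema check: column names and data types must match by position.
        for (int c = 0; c < first.Columns.Count; c++)
        {
            if (first.Columns[c].ColumnName != second.Columns[c].ColumnName ||
                first.Columns[c].DataType != second.Columns[c].DataType)
            {
                return false;
            }
        }

        // Cell-by-cell comparison, stopping at the first difference.
        for (int r = 0; r < first.Rows.Count; r++)
        {
            DataRow a = first.Rows[r];
            DataRow b = second.Rows[r];
            for (int c = 0; c < first.Columns.Count; c++)
            {
                if (!object.Equals(a[c], b[c])) return false;
            }
        }
        return true;
    }
}
```

The worst case is still one pass over every cell, which happens exactly when the tables are equal. Note that object.Equals compares array-typed cells such as byte[] by reference, so binary columns would need their own element-wise comparison.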
 
Ignacio Machin ( .NET/ C# MVP )

Hi,

amit_mitra said:
Hi
You can perform a virtual union operation between the two datatables. If
the no. of rows in the resulting datatable is same then both the datatables
are same, otherwise the row count in the resulting datatable will
increase.

If I read the OP correctly, he wants to compare the values of ALL the columns
of all the rows. The only way of doing that is by iterating and comparing.
 
Ignacio Machin ( .NET/ C# MVP )

Hi,


inpuarg said:
thank you for your offer. But as i 've mentioned before this is not an
effective way for millions of records. Otherwise if i would loop
through each rows i can manually compare cells by using
cell1.ToString() == otherCell1.ToString() methods.

Can you provide more details about what you are trying to do? I mean where
the data comes from, what it consists of, etc.
 
inpuarg

I have a DataTable on a Windows Forms form called oldDt.
I fill it from the database in the form_load event.
Then I bind this DataTable to a DataGrid.

Users manipulate the data in a multi-user environment. After the user
clicks the save button, I write the changed rows to the database. At this
point I have the DataTable that is bound to the grid, oldDt, and I requery
the database to get a new DataTable called newDt.

I want to compare these two DataTables by their contents, because while
user1 is editing data in the grid, another user might delete or add a row.
If I can compare the two DataTables, I'll check whether they are the same.
If they are, no other user has changed the database table, so I don't need
to refresh the grid. If they are not the same, that is, if newDt has
modified, added, or deleted rows, I'll refresh the grid.
 
Marc Gravell

Why do it this way? Personally, I'd just look for the records that the
user has actively changed, and attempt to save them to the database,
letting the database tell me about concurrency issues (for instance,
by verifying a "timestamp" column at the point of save). Otherwise, to
ensure that no users change things behind your back you would have to
lock all of the data while checking, which doesn't seem very
efficient. Of course, locking the rows in question is a good idea
(TransactionScope or an ADO.NET transaction). Also, by the time you have
fetched all the data from the database, updating the UI is the least
of your issues, so if you have the data, just update it. But if the
data is large, you shouldn't really attempt to do it this way
anyway...
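
A minimal sketch of the "timestamp" check at the point of save, using only the provider-neutral ADO.NET interfaces. The table and column names (Customers, Id, Name, RowVersion) and the helper names are hypothetical, and the @name parameter syntax is the SQL Server style; other providers such as Oracle use a different parameter prefix. It assumes RowVersion is a column the database changes on every update.

```csharp
using System;
using System.Data;

public static class OptimisticSave
{
    // Returns true if the row was updated; false if another user
    // changed or deleted the row after it was originally read.
    public static bool TryUpdateName(IDbConnection connection, int id,
        string newName, byte[] originalVersion)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            // The WHERE clause matches only if the version is unchanged.
            command.CommandText =
                "UPDATE Customers SET Name = @name " +
                "WHERE Id = @id AND RowVersion = @version";
            AddParameter(command, "@name", newName);
            AddParameter(command, "@id", id);
            AddParameter(command, "@version", originalVersion);

            // Zero affected rows signals a concurrency conflict.
            return command.ExecuteNonQuery() == 1;
        }
    }

    private static void AddParameter(IDbCommand command, string name, object value)
    {
        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}
```

When the call returns false, only that row needs to be re-read and shown to the user, so the full table never has to be fetched and compared.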

There are also (in .NET 2.0) database-oriented change notifications (these
work best with SQL Server 2005), but I'm not a big fan of these... not sure
about the scalability... maybe I'm prejudiced though...

Marc
 
Marc Gravell

Example using your names... note that "baseclass" and "derivedclass"
are a little vague when removed from inheritance, but I thought it would
be clearer to keep the same names for reference purposes. Note also that
the base (either BaseClass or IBaseClass) could be made cloneable
(ICloneable) if you needed to make copies of just the base section.

Marc

using System;
class Program
{
    static void Main()
    {
        DerivedClass dc = new DerivedClass();
        dc.PropertyA = "A";
        dc.PropertyB = "B";
        dc.PropertyC = "C";
        IBaseClass bc = dc.BaseClass;
    }
}
public interface IBaseClass
{
    string PropertyA { get; set; }
    string PropertyB { get; set; }
}
public class BaseClass : IBaseClass
{
    private string propertyA;
    public string PropertyA
    {
        get { return propertyA; }
        set { propertyA = value; }
    }
    private string propertyB;
    public string PropertyB
    {
        get { return propertyB; }
        set { propertyB = value; }
    }
}

public class DerivedClass : IBaseClass
{
    public IBaseClass BaseClass { get { return baseClass; } }
    private readonly IBaseClass baseClass;
    public DerivedClass() : this(new BaseClass()) { }
    public DerivedClass(IBaseClass baseClass)
    {
        if (baseClass == null) throw new ArgumentNullException();
        this.baseClass = baseClass;
    }
    private string propertyC;
    public string PropertyC
    {
        get { return propertyC; }
        set { propertyC = value; }
    }
    public string PropertyA
    {
        get { return baseClass.PropertyA; }
        set { baseClass.PropertyA = value; }
    }
    public string PropertyB
    {
        get { return baseClass.PropertyB; }
        set { baseClass.PropertyB = value; }
    }
}
 
inpuarg

Why do it this way? Personally, I'd just look for the records that the
user has actively changed, and attempt to save them to the database,
letting the database tell me about concurrency issues (for instance,
by verifying a "timestamp" column at hte point of save). Otherwise, to
ensure that no users change things behind your back you would have to
lock all of the data while checking, which doesn't seem very
efficient. Of course, locking the rows in question is a good idea
(TransactionScope or ADO.Net transaction).

Assume that I have 5 rows when the form loads:
1
2
3
4
5

The user changes row 4 and I write that change to the database. Meanwhile,
user 2 deletes row 2 and user 3 updates row 4.

I want the current user to see fresh data. That's why I'm checking.
Also, by the time you have
fetched all the data from the database, updating the UI is the least
of your issues, so if you have the data, just update it. But if the
data is large, you shouldn't really attempt to do it this way
anyway...

The data is too large.
There are also (in 2.0) database-oriented change notifications (works
best with SqlServer 2005), but I'm not a big fan of these... not sure
about the scalability... maybe I'm prejudiced though...

Marc
My application uses MSSQL 2005, Oracle, and other databases
that support stored procedures.
 
