More Efficient: Hashtable or List

Guest

Hi,

My app needs to potentially store a large number of custom objects and be
able to iterate through them quickly. I was wondering which data structure
would be the most efficient for this: a hashtable or a generic list.

Is using enumerators to iterate through the data structure a good idea?

I'd appreciate any suggestions or advice,

Thanks
Macca
 
Champika Nirosh

List should be the lighter, simpler object, but why not use the most
general collection type, ArrayList?

When you say large, does that mean larger than 80 K?

Also, why do you need a collection at all? Why not use a fixed-size array,
assuming the custom objects are all of the same type and the length of the
array is known? As you may know, non-generic collections store their items
as object, so value types are boxed when added and must be unboxed (and
reference types cast) when read back, which is a costly operation.

Nirosh.
 
Champika Nirosh

Also, iterating through a collection using enumerators is the recommended
way.

Nirosh.
Note: at least until generics come along and remove this barrier from
collections.
 
Guest

If you want to iterate through an array of dynamic length, use either List<T>
or ArrayList. I recommend List<T> over ArrayList if you know the type of the
objects that you are iterating through; otherwise ArrayList would be fine.

I prefer List<T> or ArrayList over Dictionary<TKey, TValue> or Hashtable if
you do not need to search your objects using a key.

When it comes to iteration technique, index-based iteration is 50%-75%
faster than iterating using IEnumerator for List<T> or ArrayList.

So make your choice wisely and based on the requirement.
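
As a rough illustration of the choice described above, here is a minimal sketch, assuming a hypothetical Item class with an Id and a Name (neither appears in the original question). It uses a List<T> when every element is visited in turn and a Dictionary<TKey, TValue> when an element is looked up by key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type, standing in for the "custom objects" in the question.
class Item
{
    public int Id;
    public string Name;
    public Item(int id, string name) { Id = id; Name = name; }
}

static class CollectionChoice
{
    // Sequential pass over every element: a natural fit for List<T>.
    public static int TotalNameLength(List<Item> items)
    {
        int total = 0;
        foreach (Item item in items)
            total += item.Name.Length;
        return total;
    }

    // Keyed lookup: Dictionary<TKey, TValue> finds an item by Id without scanning.
    public static string NameFor(Dictionary<int, Item> byId, int id)
    {
        Item found;
        return byId.TryGetValue(id, out found) ? found.Name : null;
    }

    static void Main()
    {
        List<Item> items = new List<Item>();
        Dictionary<int, Item> byId = new Dictionary<int, Item>();
        for (int i = 0; i < 5; i++)
        {
            Item item = new Item(i, "item" + i);
            items.Add(item);
            byId[item.Id] = item;
        }
        Console.WriteLine(TotalNameLength(items)); // 5 names of 5 characters each
        Console.WriteLine(NameFor(byId, 3));
    }
}
```

If both access patterns are needed, nothing prevents keeping the same objects in both structures, as Main does here, at the cost of extra memory for the second set of references.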
 
Champika Nirosh

Quoting you:
"When it comes to iteration technique, index-based iteration is 50%-75%
faster than iterating using IEnumerator for List<T> or ArrayList."

This is true only when you are iterating without doing anything. If you
actually use the corresponding object, you need to unbox it. That is when
IEnumerator or a foreach loop produces much faster, more efficient results.

Nirosh.
 
Guest

Hi Nirosh,

I doubt that is the case. The index based iteration on a List<T> or
ArrayList is always faster than IEnumerator based iteration.

Check the code below. It has both index based iteration and IEnumerator
based iteration. Run it for yourself and see the difference.

    using System;
    using System.Collections;
    using System.Diagnostics;

    class IterationTest
    {
        static void Main()
        {
            ArrayList al = new ArrayList();
            for (int i = 1; i <= 1000000; i++)
                al.Add(i);   // each int is boxed when it is added

            Stopwatch sw = Stopwatch.StartNew();

            // Enumerator based iteration
            // (to time this variant, uncomment it and comment out the
            // index based block below)
            //
            //IEnumerator enumu = al.GetEnumerator();
            //int pop = 0;
            //while (enumu.MoveNext())
            //{
            //    pop = Convert.ToInt32(enumu.Current);
            //}

            // Index based iteration
            //
            int length = al.Count;
            int pop = 0;
            for (int k = 0; k < length; k++)
            {
                pop = Convert.ToInt32(al[k]);
            }

            sw.Stop();
            Console.WriteLine(pop);
            Console.WriteLine("Time: {0} ms", sw.ElapsedMilliseconds);
        }
    }

I declared pop outside the while/for loop and print its value after the
loop so that the C# compiler won't optimize it out.

On my system (1.7 GHz, 1 GB RAM) the IEnumerator approach takes an average of
100 milliseconds. The index-based approach takes an average of 50-60
milliseconds.
 
Champika Nirosh

Your code is not written to reflect the real usage of collections.

First, you are storing a value type (an integer) inside a collection
specifically optimized for reference-type objects.
Second, you are using Convert.ToInt32 in the middle of the iteration, but the
better way would be

    foreach (int k in al)
    {
        pop = k;
    }

So I am not going further with the sample, but half of your statement is
perfectly correct.

According to your comment, you say that this applies to List<T> as well as
ArrayList.

I would agree if you said the statement applies only to List<T>. You may
know what boxing and unboxing are. When you use List<T>, no unboxing is
needed, because the list stores data in its original format, so it is
closely similar to a static array of type T. But IEnumerator and foreach are
specifically built to reduce the performance lost to boxing and unboxing
when you iterate through a collection that stores custom (or any) objects
after boxing them to the generic object type.

Nirosh.
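
For readers following the boxing argument, here is a minimal sketch of where boxing and unboxing occur; the class and method names are made up for illustration. One point worth checking against the claim above: foreach (int k in al) over an ArrayList still reads each element as object and the compiler inserts the same (int) cast, so foreach does not avoid the unboxing. Only List<int> does.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingDemo
{
    // The ArrayList indexer returns object, so each int must be unboxed.
    public static long SumByIndex(ArrayList al)
    {
        long sum = 0;
        for (int k = 0; k < al.Count; k++)
            sum += (int)al[k];          // explicit unbox
        return sum;
    }

    // foreach over ArrayList also yields object; the compiler inserts the
    // same (int) cast on the enumerator's Current, so unboxing still happens.
    public static long SumByForeach(ArrayList al)
    {
        long sum = 0;
        foreach (int k in al)
            sum += k;
        return sum;
    }

    // List<int> stores the ints directly: no boxing on Add, no unboxing on read.
    public static long SumGeneric(List<int> list)
    {
        long sum = 0;
        foreach (int k in list)
            sum += k;
        return sum;
    }

    static void Main()
    {
        ArrayList al = new ArrayList();
        List<int> list = new List<int>();
        for (int i = 1; i <= 100; i++)
        {
            al.Add(i);    // boxes i into a new object
            list.Add(i);  // stores i as a plain int
        }
        // All three calls compute the same total (5050 for 1..100).
        Console.WriteLine(SumByIndex(al));
        Console.WriteLine(SumByForeach(al));
        Console.WriteLine(SumGeneric(list));
    }
}
```

Reference types are never boxed in either collection; reading them back from an ArrayList costs only a cast.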

 
Ignacio Machin ( .NET/ C# MVP )

Hi,

It depends on your objects. A list is just a linear collection of objects;
you could potentially have to iterate over the entire list to find an
element.
A hashtable is a good start, even better if you find a good hash key that
lets you divide the list into several sublists, for example the first letter
of a string property; now you could potentially have 23 sublists. Of course,
this only helps if you search by the hashed property.
 
Guest

Hi Nirosh,

If I use List<int> and iterate through it using foreach(int k in <ListObject>)
and compare its performance with index-based iteration on the same List<int>
object, the results are almost the same, _but_ the index-based approach
still beats the foreach-based approach by around 10-15%. This difference is
negligible when using List<T> with known types.

Whereas if you do not know the type of the objects you are placing in the
list, ArrayList would be your choice, and if you are using ArrayList,
index-based iteration performs better. Convert.ToInt32(...) in my sample does
not matter, because I'm using the same call when comparing IEnumerator and
index-based iteration for ArrayList. For the sake of argument, even if I
remove Convert.ToInt32(..) and just retrieve the object held by the ArrayList
during my iterations, the index-based approach is still 50-75% faster than
the IEnumerator approach.

To summarize: for ArrayList (for value or object types) I recommend
index-based iteration. For List<T> I still prefer index-based iteration, but
the easy and maintainable syntax of foreach may be better (it is again one's
own choice).
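
The List<int> comparison described above could be reproduced with something like the following sketch. It is not the poster's code: the class and method names are invented here, and it uses Stopwatch with a warm-up pass, which is an assumption about how one might measure this fairly. Actual timings depend on the runtime and machine, so none are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class ListIterationTiming
{
    public static long SumByIndex(List<int> list)
    {
        long sum = 0;
        int length = list.Count;
        for (int k = 0; k < length; k++)
            sum += list[k];
        return sum;
    }

    public static long SumByForeach(List<int> list)
    {
        long sum = 0;
        foreach (int k in list)
            sum += k;
        return sum;
    }

    static void Main()
    {
        List<int> list = new List<int>();
        for (int i = 1; i <= 1000000; i++)
            list.Add(i);

        // Warm up both paths so JIT compilation is not included in the timings.
        SumByIndex(list);
        SumByForeach(list);

        Stopwatch sw = Stopwatch.StartNew();
        long a = SumByIndex(list);
        sw.Stop();
        Console.WriteLine("index:   {0} in {1} ticks", a, sw.ElapsedTicks);

        sw = Stopwatch.StartNew();
        long b = SumByForeach(list);
        sw.Stop();
        Console.WriteLine("foreach: {0} in {1} ticks", b, sw.ElapsedTicks);
    }
}
```

Running each variant several times and comparing medians gives a steadier picture than a single run.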

--
Regards,
Aditya.P


 
Ignacio Machin ( .NET/ C# MVP )

Hi,

I forgot to mention that the hashtable structure I described in the other
post should have a generic list as the hashtable value; that list is the one
that contains the elements matching the hash key.
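
A minimal sketch of that structure, assuming plain strings as the stored objects and their upper-cased first letter as the key (the names FirstLetterIndex, Build and Contains are made up for illustration): a Dictionary<char, List<string>> in which each value is the sublist of elements sharing a key.

```csharp
using System;
using System.Collections.Generic;

static class FirstLetterIndex
{
    // Groups names into sublists keyed by their upper-cased first letter.
    public static Dictionary<char, List<string>> Build(IEnumerable<string> names)
    {
        Dictionary<char, List<string>> buckets = new Dictionary<char, List<string>>();
        foreach (string name in names)
        {
            if (string.IsNullOrEmpty(name))
                continue;                       // nothing to key on
            char key = char.ToUpperInvariant(name[0]);
            List<string> bucket;
            if (!buckets.TryGetValue(key, out bucket))
            {
                bucket = new List<string>();
                buckets[key] = bucket;
            }
            bucket.Add(name);
        }
        return buckets;
    }

    // Searches only the sublist for the name's first letter, not the whole set.
    public static bool Contains(Dictionary<char, List<string>> buckets, string name)
    {
        if (string.IsNullOrEmpty(name)) return false;
        List<string> bucket;
        return buckets.TryGetValue(char.ToUpperInvariant(name[0]), out bucket)
            && bucket.Contains(name);
    }

    static void Main()
    {
        Dictionary<char, List<string>> buckets =
            Build(new string[] { "Macca", "Nirosh", "Aditya", "Ignacio", "Laurent", "Marc" });
        Console.WriteLine(buckets['M'].Count);            // 2: "Macca" and "Marc"
        Console.WriteLine(Contains(buckets, "Laurent"));  // True
        Console.WriteLine(Contains(buckets, "Jon"));      // False
    }
}
```

As noted in the post, this only pays off when searches go through the keyed property; iterating over everything still touches every sublist.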
 
Laurent Bugnion

Hi,

Ignacio Machin said:
It depends on your objects. A list is just a linear collection of objects;
you could potentially have to iterate over the entire list to find an
element.
A hashtable is a good start, even better if you find a good hash key that
lets you divide the list into several sublists, for example the first letter
of a string property; now you could potentially have 23 sublists. Of course,
this only helps if you search by the hashed property.

Depending on the implementation, you might even have 26 of them ;-)

Greetings,
Laurent
 
Champika Nirosh

Adityanand,

Your test is not really capable of testing the real usage of foreach or
IEnumerator.

Do something like this. Change the first part of the code to:

    for (int i = 1; i <= 300000; i++)
    {
        Button b = new Button();
        b.Text = "Test" + i;
        al.Add(b);   // al is the ArrayList from your sample
    }

Then do a *repetitive field or property access* and take the timing. You
will be surprised by the result you see.

Probably something like

    for (int k = 0; k < length; k++)
    {
        Button bk = (Button)al[k];
        s = bk.Text;   // s is a string declared outside the loop
    }

and

    foreach (Button t in al)
    {
        s = t.Text;
    }

The theory is:
if you are storing reference-type objects, store them in a collection;
if you are processing data inside the collection, use foreach or
IEnumerator to iterate across the objects.

This is obviously with the exception of generics.

Nirosh.
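
The test proposed above can be put together as a self-contained sketch. To keep it runnable without Windows Forms, it assumes a hypothetical Widget class in place of Button, and it measures with Stopwatch after a warm-up pass. It makes no claim about which loop wins; that is for the reader to measure on their own machine.

```csharp
using System;
using System.Collections;
using System.Diagnostics;

// Hypothetical stand-in for Button, so the sketch runs without Windows Forms.
class Widget
{
    public string Text;
}

static class ReferenceTypeIteration
{
    public static int ByIndex(ArrayList al)
    {
        int chars = 0;
        int length = al.Count;
        for (int k = 0; k < length; k++)
        {
            Widget w = (Widget)al[k];   // cast only; reference types are never boxed
            chars += w.Text.Length;
        }
        return chars;
    }

    public static int ByForeach(ArrayList al)
    {
        int chars = 0;
        foreach (Widget w in al)        // the compiler inserts the same cast
            chars += w.Text.Length;
        return chars;
    }

    static void Main()
    {
        ArrayList al = new ArrayList();
        for (int i = 1; i <= 300000; i++)
        {
            Widget w = new Widget();
            w.Text = "Test" + i;
            al.Add(w);
        }
        ByIndex(al); ByForeach(al);     // warm-up so JIT time is excluded

        Stopwatch sw = Stopwatch.StartNew();
        int a = ByIndex(al);
        long indexTicks = sw.ElapsedTicks;
        sw = Stopwatch.StartNew();
        int b = ByForeach(al);
        long foreachTicks = sw.ElapsedTicks;
        Console.WriteLine("index: {0} ticks, foreach: {1} ticks, same result: {2}",
            indexTicks, foreachTicks, a == b);
    }
}
```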

 
Jon Skeet [C# MVP]

Adityanand Pasumarthi said:
If you want to iterate through an array of dynamic length, use either List<T>
or ArrayList. I recommend List<T> over ArrayList if you know the type of the
objects that you are iterating through; otherwise ArrayList would be fine.

I prefer List<T> or ArrayList over Dictionary<TKey, TValue> or Hashtable if
you do not need to search your objects using a key.

When it comes to iteration technique, index-based iteration is 50%-75%
faster than iterating using IEnumerator for List<T> or ArrayList.

So make your choice wisely and based on the requirement.

Iteration is rarely the performance bottleneck of the loop though,
unless you're doing almost nothing with the data. I'd make my choice
based mostly on readability - and that favours foreach in most
situations. I'd only look at performance when I knew it was an issue.
 
Dustin Campbell

Jon Skeet said:
Iteration is rarely the performance bottleneck of the loop though,
unless you're doing almost nothing with the data. I'd make my choice
based mostly on readability - and that favours foreach in most
situations. I'd only look at performance when I knew it was an issue.

And, if you run into performance issues with List<T>, I posted about this
on my blog: http://diditwith.net/PermaLink,guid,506c0888-8c5f-40e5-9d39-a09e2ebf3a55.aspx.


Best Regards,
Dustin Campbell
Developer Express Inc.
 
