"cost" of generic dictionaries ?

  • Thread starter Thread starter Bill Woodruff
  • Start date Start date
B

Bill Woodruff

For me Generic Dictionaries in .NET (I'm still back with 2.0 as yet ... I do
play in my favorite key of C#, thanks) are very tasty, very useful.

But, I do find myself wondering what the "overhead" (memory, look-up times)
in using them is like. And, as often as I admonish myself that I really
should go look at what the compiler writes, as often shirk descent into
those op-code lands. Yes, I know I would be a better person if I got L.
Roeder's "Reflector" and ... :)

Obviously using a value type (including structs) as keys you are not going
boxing, as you would have done using the older flavour of non-generic
dictionary. For reference types used as keys I am less certain; I am only
aware that you might have to roll your own comparer depending on what you
were storing and ...

But, let's say that you have a program that is doing a lot multiple look-ups
in GD's. Perhaps even one of the GD involved has keys, or values, or both,
that is a GD.

Appreciate hearing if anyone has developed any heuristics about the
strategic use of GD's given certain types of application scenarios.

A friend recently gave me a talking-to about using 'foreach : telling me it
was much better to just go ahead and do a for loop in terms of performance.
This has stimulated my interest in thinking about other lately arrived .NET
features like GD's in terms of memory use and look-up times..

thanks !

Bill
 
[Dictionary]
It works pretty well; lets put it this way - I wouldn't want to try to write
anything quicker, if I had the same use case (key/value pairs wanting fast
lookup); I'd use Dictionary<TKey,TValue> happily. If the data volumes were
bigger I'd let the database worry about it.

Note that in .NET 3.5 there is a single-key=>multi-value
but note that the implementation provided is said:
A friend recently gave me a talking-to about
using 'foreach : telling me it was much better to
just go ahead and do a for loop in terms of performance.
Then your friend needs some educating; yes, "for" can
out-perform "foreach" for a few array scenarios - but not by
as much as you might think. Actually when I last profiled it,
the most significant factor was whether the type was known
and sealed, avoiding the virtcalls...

For example code (below) gives following results (last col is checksum):
For: 3486 4999500000000
Foreach: 3483 4999500000000

So with 10^9 operations, they are 3ms apart (in fact foreach was quicker on
this run, but it varies). Foreach is more flexible. Use it ;-p

Of course, IEnumerable/IEnumerable<T> is more flexible, but has the virtcall
overhead - but that is identical to passing an IList/IList<T> around.

Test rig:
using System.Diagnostics;
using System;
static class Program
{
static void Main()
{
const int SIZE = 10000, LOOPS = 100000;
int[] data = new int[SIZE];
for (int i = 0; i < SIZE; i++) data = i;

Stopwatch watch = new Stopwatch();
watch.Start();
long sum = 0;
for (int i = 0; i < LOOPS; i++)
{
for (int j = 0; j < SIZE; j++)
{
sum += data[j];
}
}
watch.Stop();
Console.WriteLine("For: {0}\t{1}", watch.ElapsedMilliseconds, sum);
watch.Reset();

watch.Start();
sum = 0;
for (int i = 0; i < LOOPS; i++)
{
foreach(int val in data)
{
sum += val;
}
}
watch.Stop();
Console.WriteLine("Foreach: {0}\t{1}", watch.ElapsedMilliseconds,
sum);
watch.Reset();
}
}
 
avoiding the virtcalls...
Or pehaps (more likely?) it was the JIT doing its thing; but either way,
there really isn't much in it.

I also forgot to say that "foreach" better expresses intent; consider a
linked list that provided an int indexer. I would expect "foreach" to simply
walk the nodes yielding (very quick and simple) - however, it is also
entirely likely that the indexer starts at the beginning each time (since
there is no notional state to retain) - hence the indexer (for) would
actually telescope with O(n^2) performance, where-as foreach would be linear
O(n). Of course, the standard LinkedList<T> doesn't provide an indexer, so
it is a mute point - but ir highlights that you should be worrying about
what you want to achieve, and (until proven otherwise by repeatable
benchmarks) let the runtime and base-libraries worry about the rest.

Anything else is premature optimisation.

Marc
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top