Peter,
You misunderstood probably because I never explained it well enough.
Yes, it's hard to know what a person is talking about when they don't
explain themselves.
If I understand your suggestion correctly, you are proposing creating an
array that will be scanned searching for the index value, at which point
the count can be updated.
Nope, no scanning whatsoever.
If you know you'll have say 1000 possible indices, then define an array
of
1000 structs, then when you see a value of 605, you simply do
"array[605].count++" (i.e. your basic histogram code). I bet that is
faster
than any dictionary implementation.
Yes, that's probably true. Based on your updated description, I'd agree
the dictionary implementation will be slower. However, it will only be
slower by a very tiny amount (assuming the rest of the code is doing
anything interesting, it would be difficult to even measure the
difference...the Dictionary class is still constant order just like your
lookup-array), and your implementation will use a lot more memory than a
dictionary would unless the range of values is completely used or at least
nearly so.
It also won't be flexible if the input data changes for any reason, since
the implementation itself is so dependent on the input data.
Finally, one could use the SortedDictionary class for simplicity. It
would be much slower for the counting part of the algorithm (O(log n)
instead of constant order), but when you're done counting the data would
already be sorted. In the end, I'd be surprised if the total time spent
was significantly different (for sure, the order of the algorithm would be
the same: O(n log n) for all three variations).
"Way faster" isn't a well-defined phrase, but I doubt that any reasonable
interpretation of "way faster" would properly describe the minimal
difference in performance between using a dictionary and your lookup-array
proposal.
In fact, you don't even need the struct
if most of the work is 'reading' the value and the sorting doesn't need
to
be 'high-performance' - then you can define an array of ints and not
structs
and spend a little more time doing the sorting. To decide, we'd need to
understand more about the nature of the data.
Well, personally I would almost never implement a design like the one you
proposed. I certainly would never do it based solely on performance
reasons. If my input data was a reasonably narrow range (say, 16-bit
values or less), _and_ I expected the input values to cover the range with
very few exceptions, then I might choose a solution like the one you
proposed. But mainly because it's simpler, not because it might be a
little faster.
Frankly, when you write stuff like "Being a CF engineer, I'm always
looking to optimize", you are REALLY insulting to those of us who don't
write CF code. As if we aren't concerned with optimization as well.
Furthermore, it's pretty clear from this thread and some others we've seen
that you have a very specific and narrow idea of what "optimize" means,
and if someone doesn't find your choice of trade-offs useful, you dismiss
them as someone who isn't optimizing their code.
The fact is, here you've suggested a solution that isn't really going to
gain any significant performance, and which has its own costs that may not
be appropriate for the situation (and in fact is likely to be
inappropriate for most situations, _especially_ in CF where memory is at a
much greater premium than on a desktop computer).
Calling the dictionary solution "inefficient" is just plain dumb; it might
not be quite as fast as a straight array lookup, but it's not like it's a
_slow_ design. It's not "inefficient" at all. It's just a different way
of approaching the problem and your implication that anyone who might use
a dictionary implementation is writing inefficient code is very annoying.
So, if you're wondering why your posts seem to generate what you think are
"flames", you might take a good look at how you write your "suggestions"
and whether they really qualify as "more efficient" anyway.
Pete