Use BinarySearch to find an entry. Work backward from the entry to
find the first entry. Work forward from the entry that BinarySearch
found to find the last entry. Done.
There is no guarantee that a binary search will find the first entry,
since it doesn't examine all of the elements, it can't possibly. Only
a linear search will guarantee to find the first matching entry.
Well, sure a binary search could find the first entry. It could do
exactly what you propose the OP do, which is to upon finding a matching
element, work its way back until it knows where the first matching one
is. It all just depends on how the behavior is defined. Microsoft
decided to go with the faster-but-ambiguous "first match found" design,
but they could just as easily have defined the behavior to always return
the "first match in array" result instead.
How long is this ArrayList, by the way?
If the ArrayList is truly huge, and this search is a common operation,
consider the following: trade memory for speed. Use a SortedList of
objects, where each object in the main list contains the key and a
list of items with that key. When you need all items with a certain
key, hash into the SortedList (no searching required) and grab the
list of equivalent items. Lookups will be as fast as they can possibly
be.
I'm not exactly clear on what you propose here. First of all, according
to the docs, SortedList does not allow duplicate keys. So the question of
finding the first of a given key is moot.
Second, hashing only works if you've stored the items according to the
hash value in the first place. I see no advantage in applying some sort
of hash value to a SortedList. Now, a Hashtable would be different and of
course much faster. But then that's true generally and is a completely
different issue.
Finally, as near as I can tell the implementation of SortedList involves a
separate array containing just the keys and indexes into the array
containing the actual value data. This means that every time something is
added to, or removed from the collection, the key index array has to be
adjusted, shifting all the items after the insertion or removal. This is
probably okay if the array is built once and then persists, but it could
get very expensive if there's a lot of data and the array is in constant
flux. Note that this issue is worst in the very situation in which you're
suggesting one use SortedList: "If the ArrayList is truly huge".
And of course, given that you can't just "hash into the SortedList",
searching a SortedList is going to take the same time as searching a
sorted ArrayList, since both would require a binary search for optimal
results.
All of the above is with respect to the non-generic SortedList. There is
also a generic SortedList<> that would be useful in some situations, but
which has similar problems (in particular, the requirement that keys be
unique). It's not clear from the documentation whether it's a balanced
binary tree or not, but if it's not then it would also have the problem
that the worst-case scenario for searching is considerably worse than for
a binary search.
If constant-order searching is desired, then a hashtable-based data
structure needs to be used, like Hashtable or the generic Dictionary<>.
Pete