find closest item in keyed collection

  • Thread starter Thread starter Steve Richter
  • Start date Start date
S

Steve Richter

Does the .NET framework provide a class which will find the item in
the collection with a key which is closest ( greater than or equal,
less than or equal ) to the keys of the collection?

ex: collection keys are 20, 30, 40, 50, 60, 70, 80

find key which is >= 35.

would return the 30 key.

thanks,
 
find the item in
the collection with a key which is closest (greater than or equal,
less than or equal) to the keys of the collection?

ex: collection keys are 20, 30, 40, 50, 60, 70, 80
find key which is >= 35 would return the 30 key.

First off; they're values, not keys; However, I don't understand what
exactly you want: what do you mean "which is closest to the keys (...)
of the collection"? Closest to *what* in the collection? Also, why
would something ">= 35" return 30?

Do you mean "the key closest to the 35" (using the > or < to choose
between 30 and 40?).

If you are using .NET 3.5 and C# 3, then there may be some LINQ ways
to do what you want... again without understanding your exact query I
can't write it 100% (for instance, should the ">= 35" be a "where"?) -
but:

Marc

// setup
int[] vals = { 20, 30, 40, 50, 60, 70, 80 };
int find = 35;

// find min distance to "find", and prefer lower
// values in a tie situation
var query = from i in vals
orderby Math.Abs(find-i), i
select i;
int answer = query.Single();

/* (method syntax)
int altAnswer = vals.OrderBy(i => Math.Abs(find - i))
.ThenBy(i => i).Single();
*/
 
First off; they're values, not keys

Oh, right - I think I see what you mean. This also works (since this
part of LINQ is on IEnumerable<T>):

Dictionary<int, string> dict = NotShown();
var query = from i in dict.Keys
orderby Math.Abs(find-i), i
select i;
int key = query.Single();

Marc
 
Marc Gravell said:
find the item in
the collection with a key which is closest (greater than or equal,
less than or equal) to the keys of the collection?

ex: collection keys are 20, 30, 40, 50, 60, 70, 80
find key which is >= 35 would return the 30 key.

First off; they're values, not keys; However, I don't understand what
exactly you want: what do you mean "which is closest to the keys (...)
of the collection"? Closest to *what* in the collection? Also, why
would something ">= 35" return 30?

Do you mean "the key closest to the 35" (using the > or < to choose
between 30 and 40?).

If you are using .NET 3.5 and C# 3, then there may be some LINQ ways
to do what you want... again without understanding your exact query I
can't write it 100% (for instance, should the ">= 35" be a "where"?) -
but:

Marc

// setup
int[] vals = { 20, 30, 40, 50, 60, 70, 80 };
int find = 35;

// find min distance to "find", and prefer lower
// values in a tie situation
var query = from i in vals
orderby Math.Abs(find-i), i
select i;
int answer = query.Single();

/* (method syntax)
int altAnswer = vals.OrderBy(i => Math.Abs(find - i))
.ThenBy(i => i).Single();
*/

If the poster is looking for the nearest property to a particular address,
then his numbers may indeed be keys. For example, I'm looking for the
building at address 1234 in a collection of addresses on Main Street of
{1210, 1220, 1233, 1245, 1260}. They could also be phone numbers or any
other indexing scheme.
 
If the poster is looking for the nearest property to a particular
address,
then his numbers may indeed be keys.

Yes; I was thrown by the OP *only* mentioning a single series of
numeric data, rather than two.
I followed up with a post acknowledging this. Although lets be fair -
a lot of things could have been a little clearer in the original
question.

Marc
 
Marc Gravell said:
Yes; I was thrown by the OP *only* mentioning a single series of
numeric data, rather than two.
I followed up with a post acknowledging this. Although lets be fair -
a lot of things could have been a little clearer in the original
question.

Marc

You are right on both counts. Sorry, I responded before seeing the other
posts. You gave a slick solution.
 
Steve Richter said:
Does the .NET framework provide a class which will find the item in
the collection with a key which is closest ( greater than or equal,
less than or equal ) to the keys of the collection?

ex: collection keys are 20, 30, 40, 50, 60, 70, 80

find key which is >= 35.

would return the 30 key.

Maintain a SortedList containing these keys, SortedList can find the closest
(or following, or preceding) entry in O(log N) time.
 
Ben Voigt said:
Maintain a SortedList containing these keys, SortedList can find the
closest (or following, or preceding) entry in O(log N) time.


with what method? do you mean by iterating the list ?

I think the OP was looking for a function of the sort you used to see in
xBASE dialects ... think it was called SOFTSEEK(), which would return either
the key sought or the next highest-valued key, if any ... I'm not aware of
such a function in the Framework
 
with what method? do you mean by iterating the list ?

I think the OP was looking for a function of the sort you used to see in
xBASE dialects ... think it was called SOFTSEEK(), which would return either
the key sought or the next highest-valued key, if any ... I'm not aware of
such a function in the Framework

thanks for the responses ( and sorry for the confusion on what I
intended as <= 35 ).

what I have is a text file concatenated as a single string. As I scan
and parse the string I want to be able to determine the line number
and position in the line of any offset in the string. As I concat the
text lines, the line number and start/end offset in the string of the
line would be stored in a collection. When I find something at offset
2022 in the string, I want to find in the collection of "line numbers/
start offset in the string" the offset which is <= 2022. That would
tell me the text file line number.

-Steve
 
One-way searches are fortunately easier than 2-way proximity... you
get to stop sooner ;-p
If this is volume usage, then the LINQ approach I posted earlier would
be a poor choice (it would sort each time); Ben mentioned a SortedList
which may be helpful... I can't only see equality methods myself, but
you could perhaps simply walk the list until you find one that is >
2022 and return the previous (if you see what I mean...).

Of course, if you are adding to the list in sequence, you could just
use a regular list, and do the same walk.
If you are adding to the list while iterating, it is probably just the
last element.

Just for completeness (to contrast to my prior post); in LINQ terms
this is:

IDictionary<int, string> data = new SortedList<int,
string> {
{20, "a"}, {30, "b"}, {40, "c"}, {50, "d"},
{60, "e"}, {70, "f"}, {80, "g"}
};
KeyValuePair<int, string> result =
data.TakeWhile(x => x.Key <= 35).Last();

(I'm using TakeWhile here since we know the data is sorted; hence it
will stop reading when it finds something that doesn't meet the <= 35
condition).

Of course, if you are generally accessing recent data you may be
better keeping the list inverted (perhaps even just a List<T> and
Insert at 0; or use a custom comparer to automatically invert the list
in-place [very easy]) and check for the first match:

KeyValuePair<int, string> result =
data.SkipWhile(x => x.Key > 35).First();

Marc
 
Marc Gravell said:
One-way searches are fortunately easier than 2-way proximity... you get to
stop sooner ;-p
If this is volume usage, then the LINQ approach I posted earlier would be
a poor choice (it would sort each time); Ben mentioned a SortedList which
may be helpful... I can't only see equality methods myself, but you could
perhaps simply walk the list until you find one that is > 2022 and return
the previous (if you see what I mean...).

Ack! SortedList.IndexOfKey doesn't return the index for not found items,
while List.BinarySearch does. But you can't run BinarySearch on a sorted
list! An extension method would be good...

So either reimplement BinarySearch to work on a SortedList or use List and
List.BinarySearch and keep the list sorted yourself.
Of course, if you are adding to the list in sequence, you could just use a
regular list, and do the same walk.
If you are adding to the list while iterating, it is probably just the
last element.

Just for completeness (to contrast to my prior post); in LINQ terms this
is:

IDictionary<int, string> data = new SortedList<int, string> {
{20, "a"}, {30, "b"}, {40, "c"}, {50, "d"},
{60, "e"}, {70, "f"}, {80, "g"}
};
KeyValuePair<int, string> result =
data.TakeWhile(x => x.Key <= 35).Last();

(I'm using TakeWhile here since we know the data is sorted; hence it will
stop reading when it finds something that doesn't meet the <= 35
condition).

Of course, if you are generally accessing recent data you may be better
keeping the list inverted (perhaps even just a List<T> and Insert at 0; or
use a custom comparer to automatically invert the list in-place [very
easy]) and check for the first match:

KeyValuePair<int, string> result =
data.SkipWhile(x => x.Key > 35).First();

Marc
 
// Array.BinarySearch (find next closest match)

Array i2 = Array.CreateInstance(typeof(string), 8);

i2.SetValue("Anne", 0);
i2.SetValue("Bob", 1);
i2.SetValue("Chris", 2);
i2.SetValue("Donna", 3);
i2.SetValue("Elizabeth", 4);
i2.SetValue("Glenn", 5);
i2.SetValue("Franklin", 6);
i2.SetValue("Helene", 7);

int j = Array.BinarySearch(i2, "DonnaB");
if (j<0)
j = ~j;

MessageBox.Show(i2.GetValue(j).ToString()); // ==> Elizabeth




Ben Voigt said:
Marc Gravell said:
One-way searches are fortunately easier than 2-way proximity... you get
to stop sooner ;-p
If this is volume usage, then the LINQ approach I posted earlier would be
a poor choice (it would sort each time); Ben mentioned a SortedList which
may be helpful... I can't only see equality methods myself, but you could
perhaps simply walk the list until you find one that is > 2022 and return
the previous (if you see what I mean...).

Ack! SortedList.IndexOfKey doesn't return the index for not found items,
while List.BinarySearch does. But you can't run BinarySearch on a sorted
list! An extension method would be good...

So either reimplement BinarySearch to work on a SortedList or use List and
List.BinarySearch and keep the list sorted yourself.
Of course, if you are adding to the list in sequence, you could just use
a regular list, and do the same walk.
If you are adding to the list while iterating, it is probably just the
last element.

Just for completeness (to contrast to my prior post); in LINQ terms this
is:

IDictionary<int, string> data = new SortedList<int, string> {
{20, "a"}, {30, "b"}, {40, "c"}, {50, "d"},
{60, "e"}, {70, "f"}, {80, "g"}
};
KeyValuePair<int, string> result =
data.TakeWhile(x => x.Key <= 35).Last();

(I'm using TakeWhile here since we know the data is sorted; hence it will
stop reading when it finds something that doesn't meet the <= 35
condition).

Of course, if you are generally accessing recent data you may be better
keeping the list inverted (perhaps even just a List<T> and Insert at 0;
or use a custom comparer to automatically invert the list in-place [very
easy]) and check for the first match:

KeyValuePair<int, string> result =
data.SkipWhile(x => x.Key > 35).First();

Marc
 
Ben, Tested that on the original q, as I understand it, the "close
find" shuld give 40 if asked 39...

Array i2 = Array.CreateInstance(typeof(short),7);

// given 20, 30, 40, 50, 60, 70, 80

i2.SetValue((short)20, 0);
i2.SetValue((short)30, 1);
i2.SetValue((short)40, 2);
i2.SetValue((short)50, 3);
i2.SetValue((short)60, 4);
i2.SetValue((short)70, 5);
i2.SetValue((short)80, 6);

// given 35
int j = Array.BinarySearch(i2, (short)35);
if (j < -1)
j = ~j-1;

Console.WriteLine("35 -> " +
i2.GetValue(j).ToString()); // 30

// testing low 10
j = Array.BinarySearch(i2, (short)10);
if (j < -1)
j = ~j - 1;
else // Oops
j = 0;
Console.WriteLine("10 -> " +
i2.GetValue(j).ToString()); // 20

// testing high 100
j = Array.BinarySearch(i2, (short)100);
if (j < -1)
j = ~j - 1;

Console.WriteLine("100 -> " +
i2.GetValue(j).ToString()); // 80

// testing close 39
j = Array.BinarySearch(i2, (short)39);
if (j < -1)
j = ~j - 1;

Console.WriteLine("39 -> " +
i2.GetValue(j).ToString()); // 30 = Fail, should return 40 or what?

//CY
 
Ben, Tested that on the original q, as I understand it, the "close
find" shuld give 40 if asked 39...

that was my doing, not Ben's; yes, you should get 40 if the predicate is
39, and you do; what I'm not clear on is why you're subtracting 1 from the
bitwise complement of the result ... and why you're testing for "j < -1"
if (j < -1)
j = ~j-1;

should be:

if (j < 0)
j = ~j;

BUT, I should have done some bounds checking where the predicate is greater
than the highest valued item in the array:

if (j < i2.Length)
MessageBox.Show(i2.GetValue(j).ToString());
else
MessageBox.Show("Cannot return a value");
 
BinarySearch doesn't give you the closest element by default, it gives you
the address of the closest greater element. The element right before is
(due to sorting) the closest smaller element, and you may therefore use
whichever of the two you prefer. This is still O(log N), because there is
no extra expense as the length of the list increases.

If your list is less than about 64 elements you are better off with a single
pass linear search.

If your elements are fairly evenly distributed (or according to a known
distribution) then you can do much better than a binary search by estimating
the location of the element sought.
 
Liz said:
// Array.BinarySearch (find next closest match)

Array i2 = Array.CreateInstance(typeof(string), 8);

i2.SetValue("Anne", 0);
i2.SetValue("Bob", 1);
i2.SetValue("Chris", 2);
i2.SetValue("Donna", 3);
i2.SetValue("Elizabeth", 4);
i2.SetValue("Glenn", 5);
i2.SetValue("Franklin", 6);
i2.SetValue("Helene", 7);
<snip>
<off topic>
Why would you declare an array of strings with that syntax?
I mean, it works, but why subject yourself to that icky looking code.
How about
string[] X = new string[]{"Anne", "Bob", "Chris", "Donna", "Elizabeth",
"Glenn", "Franklin", "Helene"};

Just curious
Bill
 
Why would you declare an array of strings with that syntax?
I mean, it works, but why subject yourself to that icky looking code.
How about
string[] X = new string[]{"Anne", "Bob", "Chris", "Donna", "Elizabeth",
"Glenn", "Franklin", "Helene"};


The original question presented was how to get a "soft" match in an array
using BinarySearch, i.e.: if no match, return the next highest value in the
array; so far as I know, there's no (straightforward) way to get there from
string[] x = {..,..,.. } syntax ...

who cares about the "ickiness" of syntax? it's all machine code out the
door; in a real-world example, you'd iterate it up with no muss, no fuss in
5-10 lines of code wrapping array.SetValue(a,n) in a loop ... but if you
have a better solution to do the soft match, I'm all ears ... assume the
array would be of significant size, not length = 8 ...
 
Liz said:
Why would you declare an array of strings with that syntax?
I mean, it works, but why subject yourself to that icky looking code.
How about
string[] X = new string[]{"Anne", "Bob", "Chris", "Donna",
"Elizabeth", "Glenn", "Franklin", "Helene"};


The original question presented was how to get a "soft" match in an
array using BinarySearch, i.e.: if no match, return the next highest
value in the array; so far as I know, there's no (straightforward)
way to get there from string[] x = {..,..,.. } syntax ...

I did Add "off topic" right before I asked (which you edited out).
I have no quibble over the Binary search suggestion, It is what I would
have suggested.
My question has nothing to do with that.
I simply wanted to know why you would Array.CreateInstance instead of
the more common syntax for array creation?
Is there some benefit that I am unaware of?

As I said "Just Curious"

Bill
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top