need suggestion on data structure for this data

bahoo · Aug 11, 2007

Hi,

My data is organized like this, where A always goes from 1 to some
number N:
A 1 2 3 ...
B 53,26 42,18 15,86 ...
C 59 43 31 ...

I often need to look up in this way:
Given B=18, what is the corresponding value of A and C? (Answer is 2
and 43).
My current implementation is a for-loop that goes through all A's and
checks if the B value is the query, and if yes, return the
corresponding A and C. But this is very inefficient.
Could anyone please suggest me a better data structure for doing this,
better be a built-in one?
Observe that B always appears in pairs, and when I look up, I may use
either value within the pair.
Thanks!
bahoo

Peter Duniho · Aug 11, 2007

bahoo said:
[...]
My current implementation is a for-loop that goes through all A's and
checks if the B value is the query, and if yes, return the
corresponding A and C. But this is very inefficient.
Could anyone please suggest me a better data structure for doing this,
better be a built-in one?

Are the values of B unique? I assume they are, otherwise even your
brute-force algorithm seems to have problems.

The usual way to address something like this is to create an index. In
..NET, the Dictionary<> generic class would work well for this. Using
the values from B as your index, and the corresponding column number (A)
as your value.

You would have to scan through all of the data once, adding two new
entries to the Dictionary for each column (one for each value of B).
Then given a value for B to look up, you would simply use that value as
the index for the Dictionary<>, which would return your A value, which
you could then use to also get the C value.

There are lots of other ways to implement an index; don't feel you have
to use Dictionary<>. It just happens to be relatively convenient. An
alternative might be to maintain an array sorted based on the B values,
and containing a data structure that includes the A value. Then a
binary search would quickly find the right array element based on the B
value, which would then yield the A value.

Any index will, of course, improve performance at the cost of memory
usage. You don't say how many columns of data you might have, but if
it's small, your brute-force iterative method might well be just fine.

Pete

=?ISO-8859-1?Q?Arne_Vajh=F8j?= · Aug 11, 2007

bahoo said:
My data is organized like this, where A always goes from 1 to some
number N:
A 1 2 3 ...
B 53,26 42,18 15,86 ...
C 59 43 31 ...

I often need to look up in this way:
Given B=18, what is the corresponding value of A and C? (Answer is 2
and 43).
My current implementation is a for-loop that goes through all A's and
checks if the B value is the query, and if yes, return the
corresponding A and C. But this is very inefficient.
Could anyone please suggest me a better data structure for doing this,
better be a built-in one?
Observe that B always appears in pairs, and when I look up, I may use
either value within the pair.

Make a class Info with 4 fields/properties A, B1, B2 and C.

Have a Dictionary<int,Info> instance where you insert each
object twice using B1 and B2 as keys.

Arne

bahoo · Aug 11, 2007

bahoo said:
bahoo said:

[...]
My current implementation is a for-loop that goes through all A's and
checks if the B value is the query, and if yes, return the
corresponding A and C. But this is very inefficient.
Could anyone please suggest me a better data structure for doing this,
better be a built-in one?

Click to expand...

Are the values of B unique? I assume they are, otherwise even your
brute-force algorithm seems to have problems.

The usual way to address something like this is to create an index. In
.NET, the Dictionary<> generic class would work well for this. Using
the values from B as your index, and the corresponding column number (A)
as your value.

You would have to scan through all of the data once, adding two new
entries to the Dictionary for each column (one for each value of B).
Then given a value for B to look up, you would simply use that value as
the index for the Dictionary<>, which would return your A value, which
you could then use to also get the C value.

There are lots of other ways to implement an index; don't feel you have
to use Dictionary<>. It just happens to be relatively convenient. An
alternative might be to maintain an array sorted based on the B values,
and containing a data structure that includes the A value. Then a
binary search would quickly find the right array element based on the B
value, which would then yield the A value.

Any index will, of course, improve performance at the cost of memory
usage. You don't say how many columns of data you might have, but if
it's small, your brute-force iterative method might well be just fine.

Pete

Thanks Pete, but sorry that I didn't mention the B values are NOT
unique: different A's can have the same B value. When I do look up, I
wish to retrieve all these A's and C's that correspond to the query
B.
What I did in my brute-force approach is to scan through all the B's,
and every time I find a match, I record the A's and C's in a list.
If "Dictionary" doesn't apply here, what would you suggest next?

Thanks!!
bahoo

bahoo · Aug 11, 2007

bahoo said:
bahoo said:

[...]
My current implementation is a for-loop that goes through all A's and
checks if the B value is the query, and if yes, return the
corresponding A and C. But this is very inefficient.
Could anyone please suggest me a better data structure for doing this,
better be a built-in one?

Click to expand...

Are the values of B unique? I assume they are, otherwise even your
brute-force algorithm seems to have problems.

The usual way to address something like this is to create an index. In
.NET, the Dictionary<> generic class would work well for this. Using
the values from B as your index, and the corresponding column number (A)
as your value.

You would have to scan through all of the data once, adding two new
entries to the Dictionary for each column (one for each value of B).
Then given a value for B to look up, you would simply use that value as
the index for the Dictionary<>, which would return your A value, which
you could then use to also get the C value.

There are lots of other ways to implement an index; don't feel you have
to use Dictionary<>. It just happens to be relatively convenient. An
alternative might be to maintain an array sorted based on the B values,
and containing a data structure that includes the A value. Then a
binary search would quickly find the right array element based on the B
value, which would then yield the A value.

Any index will, of course, improve performance at the cost of memory
usage. You don't say how many columns of data you might have, but if
it's small, your brute-force iterative method might well be just fine.

Pete

Thanks Pete, but sorry that I didn't mention the B values are NOT
unique: different A's can have the same B value. When I do look up, I
wish to retrieve all these A's and C's that correspond to the query
B.
What I did in my brute-force approach is to scan through all the B's,
and every time I find a match, I record the A's and C's in a list.
If "Dictionary" doesn't apply here, what would you suggest next?

Thanks!!
bahoo

=?ISO-8859-1?Q?G=F6ran_Andersson?= · Aug 11, 2007

bahoo said:
Thanks Pete, but sorry that I didn't mention the B values are NOT
unique: different A's can have the same B value. When I do look up, I
wish to retrieve all these A's and C's that correspond to the query
B.
What I did in my brute-force approach is to scan through all the B's,
and every time I find a match, I record the A's and C's in a list.
If "Dictionary" doesn't apply here, what would you suggest next?

You can use a Dictionary<int, List<...>> where you put the items in the
lists.

For each item you add you check if the dictionary contains the key. If
it does, just add the item to the existing list, if it doesn't, create a
list, add the item to it and add the list to the dictionary.

Peter Duniho · Aug 11, 2007

bahoo said:
[...]
What I did in my brute-force approach is to scan through all the B's,
and every time I find a match, I record the A's and C's in a list.
If "Dictionary" doesn't apply here, what would you suggest next?

Well, as I mentioned, using an array sorted by the B values, with a
binary search to look them up, is an alternative. With duplicated
values, you'd have to scan forward and back in the array once you've
found a match, to ensure that you retrieve all of the records. But it
would be reasonably fast.

There may be some hashtable-based structure in .NET that allows for
duplicate values. If there is, I don't know it off the top of my head
and I don't have time to look it up. However, generally speaking that
would be an option too (whether you have to implement it yourself or
not, I don't know).

I will also reiterate my suggestion that if the brute-force method works
for the size of data you have, there may be no need to do anything more
complicated than that.

Pete

Peter Duniho · Aug 11, 2007

Göran Andersson said:
You can use a Dictionary<int, List<...>> where you put the items in the
lists.

Yup, that's a pretty good approach too. I'm embarrassed to not have
mentioned it myself, if only because I have used that exact technique in
at least one of my own programs.

Pete

Efficient Data Structures	22	Dec 13, 2005
Excel How to find the most common pairs and triplets numbers?	5	Nov 23, 2018
Data Structures	6	Feb 2, 2009
data structure	3	Jul 20, 2008
Data structure suggestion wanted...	2	Sep 17, 2006
How do most programmers protect their internal data structures?	17	Feb 15, 2011
how to exchange a structure between c++/cli and c, c++	1	Jun 6, 2008
Expand Data Structure to One Record per Time Period	6	Jun 17, 2013

need suggestion on data structure for this data

bahoo

Peter Duniho

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

bahoo

bahoo

=?ISO-8859-1?Q?G=F6ran_Andersson?=

Peter Duniho

Peter Duniho

Ask a Question

Similar Threads