selecting a group of records that have a constant average

jseger22

I have a table that changes daily and contains 2 fields: ID (a unique
value) and Score (with a range of 20 to 200).

I want to be able to choose the maximum number of records that will
always have an average score of 45.
 
You have to have a series of values before you can use the "average" function. I don't think you can do that if you only ever have one score for each record. Are you updating the numbers in "Score" every day?
 
Not an easy problem. It seems to be a variation of the well-known "knapsack"
problem, which is a hard combinatorial problem (and generally, people only use a
heuristic, giving an approximate solution, rather than an exact algorithm).

Indeed, define, for each record, the variable x such that x=0 if we
don't pick the record and x=1 if we pick it.

We want:

Maximize
    SUM( x )                             ' summed over the records with x=1,
                                         ' i.e. the number of records we picked
under
    SUM( x * Score ) = 45 * SUM( x )     ' the mean constraint
and
    x element_of {0, 1}


But, since on the feasible set SUM( x * Score ) = 45 * SUM( x ), and since
MAX{ constant * z } = constant * MAX{ z } for a positive constant, that is the same as:


Maximize
    SUM( x * Score )
under
    SUM( x * Score ) = 45 * SUM( x )
and
    x element_of {0, 1}


which looks harder than the standard knapsack problem:

Maximize
    SUM( x * Score )
under
    SUM( x * Score ) <= W_constant
and
    x element_of {0, 1}

So, for an exact solution, I would look for an algorithm from the Balas
family of algorithms, or some other dynamic-programming approach, but as far as I
know that is a far stretch to do in SQL, unless you consider brute-force implicit
enumeration.
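
For what it's worth, because the scores are bounded (20 to 200), a dynamic-programming pass done outside SQL is actually practical: shift every score by 45 and look for the largest subset whose shifted values sum to zero. Below is a minimal sketch in Python, not Access SQL or VBA; the function name is made up for illustration, and it assumes the scores have already been read out of the table into a list.

    # dp maps "sum of (Score - 45) over the chosen records" to the largest
    # number of records that reaches that sum. A non-empty pick whose
    # shifted sum is 0 has an average of exactly 45.
    def max_records_with_average(scores, target=45):
        dp = {0: 0}                        # the empty pick reaches shifted sum 0
        for s in scores:
            d = s - target                 # shift so the constraint becomes "sum = 0"
            new_dp = dict(dp)
            for total, count in dp.items():
                if new_dp.get(total + d, -1) < count + 1:
                    new_dp[total + d] = count + 1
            dp = new_dp
        return dp[0] if dp[0] > 0 else None    # None: no non-empty pick averages 45

    print(max_records_with_average([44, 46, 45, 200, 20]))   # -> 3 (44, 46 and 45)

With 500 records and scores between 20 and 200, the shifted sums stay roughly between -12,500 and +77,500, so the dictionary of reachable sums remains small enough to be workable.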



Sorry, but still, I could be wrong; it may just look harder than the knapsack
problem while actually being easier, after all.


Vanderghast, Access MVP
 
I have a table that changes daily and contains 2 fields: ID (a unique
value) and Score (with a range of 20 to 200).

I want to be able to choose the maximum number of records that will
always have an average score of 45.

jseger22,

If you were going to calculate that on a piece of paper, how would you do it?


Sincerely,

Chris O.
 
I may be wrong on this, but as I understand your problem, I will give you
a simple example. The UK national lottery involves choosing 6 balls out of
49. To get the same 6 numbers as those drawn, I have to pick one of about 14
million different combinations.

Imagine each ball has a score attached to it, like your records.

To find the maximum number of records that hits the target average, I can choose
ANY 1 ball, or ANY pair of balls, or ANY trio of balls, all the way up to
choosing ALL 49 balls. Every combination of every group size would need to
be checked and its average calculated.

If I had just 20 records, that would mean 1,048,575 different groups of records
to check:

Records in group    Number of groups
 1                      20
 2                     190
 3                   1,140
 4                   4,845
 5                  15,504
 6                  38,760
 7                  77,520
 8                 125,970
 9                 167,960
10                 184,756
11                 167,960
12                 125,970
13                  77,520
14                  38,760
15                  15,504
16                   4,845
17                   1,140
18                     190
19                      20
20                       1


You have 500 records.

If I understand you correctly, you are trying to take on a massive task.
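
As an illustration only, the counts in that table (and their total) can be checked with a couple of lines of Python; nothing here is Access-specific:

    # Number of ways to pick k records out of 20, and the total over all k:
    # every non-empty group of records you would have to average.
    from math import comb

    counts = [comb(20, k) for k in range(1, 21)]
    print(counts)        # 20, 190, 1140, ... as in the table above
    print(sum(counts))   # 1048575, i.e. 2**20 - 1 non-empty groups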
 
Well, there are 500 ways of picking the first number, times 499 ways of
picking the second; but it does not matter which order you picked those two
in, so you get to divide 500 * 499 by 2.

There are 498 ways to pick the third, and now you get to divide by 3 as well,
because again the order does not matter.

The pattern is:
500/1 * 499/2 * 498/3 * 497/4 ....

you may need a faster PC :-<
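
That running pattern is easy to turn into code. A rough Python sketch (the choose helper below is purely for illustration) shows how quickly the group counts blow up for 500 records:

    # Build "n choose k" by multiplying in one pick at a time and dividing
    # out the orderings, exactly as in the pattern above.
    def choose(n, k):
        result = 1
        for i in range(1, k + 1):
            result = result * (n - i + 1) // i    # 500/1 * 499/2 * 498/3 * ...
        return result

    print(choose(500, 3))                                # 20,708,500 groups of three
    print(sum(choose(500, k) for k in range(1, 501)))    # about 3.3e150 groups in all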
 
I have a table that changes daily and contains 2 fields: ID (a unique
value) and Score (with a range of 20 to 200).

I want to be able to choose the maximum number of records that will
always have an average score of 45.

This is a variant of the NP-complete "Satisfiability Problem", which is
strongly suspected (but not proved, despite decades of effort) to have no
"efficient" solution. The other replies in this thread give you some feel for
the magnitude of the "brute force" solution, and I'd strongly suspect that if
you were to come up with a fast and elegant solution, you'd be on the cover
of Time Magazine as the mathematical genius of the century.

In short... good luck. This probably cannot be done.

John W. Vinson [MVP]
 
Thanks for everyone's help. I think I will give up on this one :)
 
Thanks for everyone's help. I think I will give up on this one :)

The question is "How important is it?". If there are megabucks at stake, or
close is good enough, or if your data is very friendly, it might be possible
and worthwhile.

I had to face the "Travelling salesman" problem when working out the optimum path
for a drilling machine with patterns of 2000 holes. It is beyond exact practical
solution. However, I found a way that gave a very close answer within a
couple of minutes. That program made loads of bucks (for somebody else :-<).
How important is it?
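
In that "close is good enough" spirit, and only as a rough illustration (this is a heuristic, not an exact method, and the function name is made up), a greedy pass in Python might already give a large group whose average sits very near 45:

    # Keep adding whichever remaining score leaves the running average closest
    # to the target, and stop once adding anything would move the average away.
    def greedy_near_average(scores, target=45):
        remaining = list(scores)
        picked, total = [], 0
        while remaining:
            best = min(remaining,
                       key=lambda s: abs((total + s) / (len(picked) + 1) - target))
            new_gap = abs((total + best) / (len(picked) + 1) - target)
            if picked and new_gap > abs(total / len(picked) - target):
                break                       # every remaining score makes it worse
            picked.append(best)
            total += best
            remaining.remove(best)
        return picked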
 
