Question for Math whizzes

M

Meenie

I have Excel 2003 - but doesn't make much difference for this question.

I know you aren't supposed to do an average of averages but we do.
Here's what we do and why we do it... is there a better way?

We have a number of units that do a monthly audit of things in patients
chart.
there are 26 questions on this audit and they have to do it on 20 charts
each month.
They turn in 5 audits a week.
the questions are all answered with yes, no or n/a. examples of questions
are: 1. Is the admission form completed. 2. Was the database signed by an RN.
3. If the patient's pain greater than 4, was the intervention documented.

Ok, so a score is assigned on each audit like this:
each audit is on 5 charts so there are 5 answers in each row.
If the question is answered "Yes" for each chart that would equal 100%
Each no = minus 20 so if there were 2 no's the score would be 60%
(seems like tough scoring, but all the answers should always be either yes
or n/a)
N/a does not count off anything.

then the scores for all 26 questions are averaged for the unit's average
weekly score.

These audits are done each week and at the end of the month, the average
scores for each week per unit are averaged for a score for the month. (that's
the 2nd time the numbers are averaged)

Each month the monthly averages are averaged for a YTD score for each unit.
(average #3)

Now, we aren't looking for an exact number. Just a kind of baseline score to
show how each unit is doing on these different questions - so is the
averaging of the averages a big deal?

We don't actually call them "averages" we call them "scores" and make a bar
chart as a "dashboard" of how things are going on each unit.
And in the long run, all the units come out well, but is there a better way
to do this without making it into something that is way too much work for a
simple check on how they're following procedures?

Thanks to anyone who has hung in this far :)
 
M

Mike H

Meenie,

Without any qualification whatsoever an average of an averge is very likely
to give you misleading results and you shouldn't do it. Consider this data

Col A Col B
1000 1
1000 1
1000
1000
1000
1000
1000
1000
1000
1000

Now with even doing the sums it's clear the average of column A is 1000 and
the average of column B is 1.

If we then average these 2 numbers we might conclude the average of columns
A & B is 1000+1 = 1001 /2 =500.5

But if we average all the numbers we would conclude that the correct average
of all of these numbers is in fact 833.5 a considerable variance from the
untrue result gained by averaging 2 averages.

Mike
 
J

joeu2004

Now, we aren't looking for an exact number. Just a kind of baseline
score to show how each unit is doing on these different questions -
so is the averaging of the averages a big deal?

No, at least not based on what you describe. I question whether your
description is entirely accurate. But even if it is not, as you say,
you are not looking for an exactly number.

(I do quibble with your scoring method. I suspect that is more likely
to introduce misleading results than the average of averages. But
that's another subject altogether. And it is just a feeling right
now; nothing concrete.)

First, an average of averages is mathematically the same as the
average of all the individual numbers when each average has the same
denominator.

You say that each of your first set of averages is each based on 5
data. I assume you mean a sampling of patients, with 5 in each
sample. We must presume that there is never less than 5 patients in a
week in any unit.

Be that as it may, __if__ the first set of averages is each based on 5
data, the average of those averages is an accurate average of all the
data together, at least mathematically. (See the caveat below.)

Likewise, the average of the monthly averages would be an accurate
representation of all the data if you use the same number of weeks per
month for the monthly averages. Note: Some months might be
considered to have 5 weeks, depending on how you count a week.

Caveat: The average of averages is accurate in this manner
__mathematically__. A difference may arise due to the way Excel does
arithmetic (i.e. binary computer arithmetic); but the difference is
not likely to be significant. A more significant difference arise is
you do an average of __rounded__ averages; for example, if you compute
=ROUND(AVERAGE(...),2) for the individual averages. Rounding is not a
significant issue if you rely on Excel formatting to perform the
rounding.

is there a better way to do this without making it into
something that is way too much work for a simple check
on how they're following procedures?

At a minimum, if you keep track of the number of data used to compute
each average, you could use SUMPRODUCT to compute a more accurate
average. Again, this should be needed only if the number of data in
each average varies.

(But this will not work if you explicitly round the averages.)

Arguably, a better approach is to keep track of the sum of the data as
well as the number of data used to compute each average. For example,
if you now compute AVERAGE(A1:A5), also compute SUM(A1:A5).

Each of those improvements do not sound like major changes to me. But
it depends on the layout of your worksheet. Also note that you can
hide the columns or row that contain these additional computations so
they do not cause confusion, if that's a concern.

If you need help with implementing these ideas, post back with some
details of the worksheet. Alternatively, you could send a file to me
at joeu2004 "at" hotmail.com.

But I want to reiterate that you probably do not need to implement
either improvement.


----- original posting -----
 
M

Mike H

Hi,
so is the averaging of the averages a big deal?
No, at least not based on what you describe

Yes it probably is a big deal. The OP refers to N/A as being a possible
answer which attracts no score so it therefore follows that there could be a
different number of scoring questions in each survey (some could all be N/A)
and that will lead to incorrect conclusions being drawn from taking an
average of an average.


Mike
 
J

joeu2004

Yes it probably is a big deal. The OP refers to N/A as being a possible
answer which attracts no score so it therefore follows that there could be a
different number of scoring  questions in each survey (some could all be N/A)
and that will lead to incorrect conclusions being drawn from taking an
average of an average.

I think you assume that "Meenie" is taking the average of the answers
(yes, no, N/A). I do not read it that way. My interpretation is that
"Meenie" is taking the average of 5 "scores", where each score is (to
put it more simply) 100% minus 20% for each "no" answer.

But I would agree that "Meenie's" explanation leaves much open to
interpretation.

Nonetheless, it does not matter if N/A is zero. In your
interpretation (as I understand it), the number of answers is the
same: 26.

The average of averages is the same as the average of all data if the
divisor is the same each average. It does not matter if some data are
zero.

That is not the case in the example you gave: an average of two
columns with different lengths.
 
J

joeu2004

Errata....

My interpretation is that "Meenie" is taking the average of 5 "scores",
where each score is (to put it more simply) 100% minus 20% for each
"no" answer.

What "Meenie" actually wrote is: "If the question is answered "Yes"
for each chart that would equal 100%[.] Each no = minus 20 so if
there were 2 no's the score would be 60% [...]. N/a does not count
off anything."

It really is not clear (to me) what "Meenie" is averaging. It still
sounds like the number of data being averaged is the same.
Nonetheless, I believe I gave "Meenie" all the information necessary
to correctly compute the average in either case.
 
M

Meenie

Let me clarify the "n/a" score.
Example: a question will ask if the pt was assessed for a risk for Falling
in the Emergency Room.
The audit is done on the unit after the patient is admitted. If the pt was
not admitted via the Emergency room, then the answer is n/a . (there is
another question later about the pt being evaluated for Falls on the unit
itself)
Thus an audit done on 5 pts, with one patient not being admitted from the
ER, and the other four all having been documented as being assessed for Falls
in the Er, scoring would be:
100% (four yeses and one n/a answer)
5 patients, one not from the ER and two of the others not assessed
appropriately for Falls in the Er would equal:
60% (two no answers would subtract 40% )
Scoring is based on the idea that you start with 100% and each negative
answer takes away 20 from that score.

NOW that 40% score would attract attention and the ER would have to document
what happened that these patients weren't appropriately evaluated for their
Fall Risk.
The averages of all the units and then the YTD are only meant to give an
overall snapshot of how all the units are doing as a whole. One poor number
would be helped tremendously by the majority of good numbers for the overall
average, but the individual numbers are looked at and evaluated also.
Just want to clarify that the overall average is not to outweigh the poor
numbers, but to put them in perspective.
Any single unit that consistently reports low numbers on the audits are
investigated.
 
J

joeu2004

Scoring is based on the idea that you start with 100% and
each negative answer takes away 20 from that score.

Wow, so I had it right the first time! Imagine my surprise.

(But I was wrong about how many answers were counted for each score.)

Just want to clarify that the overall average is not to
outweigh the poor numbers, but to put them in perspective.

How you interpret the statistic is between you and your peers and
management.

What you need to clarify is: does your first-level average always
involve the same number of data?

And does your monthly average always involve 4 weeks of data?

If "yes", then the average of the averages is an accurate
representation of the average of all data.

If "no", then the average of the averages is not. Can you use the
second technique that I described?

If "N/A", then ... :).

Bottom line: Have we answered your question, or do you still have a
problem or questions?
 
M

Meenie

What you need to clarify is: does your first-level average always
involve the same number of data?

And does your monthly average always involve 4 weeks of data?

Yes and Yes :0
If "yes", then the average of the averages is an accurate
representation of the average of all data.

If "no", then the average of the averages is not. Can you use the
second technique that I described?

If "N/A", then ... :).

Bottom line: Have we answered your question, or do you still have a
problem or questions?

Sounds great to me. Thanks
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top