Multiple State and product based analysis project

T

The Hit Man

I am doing a multiple state based analysis project. I can have thousands of
orders over multiple states (one, none or multiple orders per state are all
valid result possibilities). The result I am working with is an order
completion time (2.44 days, 0.56 days or 30.25 days for example) from a
master list.



I can determine the 90% rank of the entire list as follows:



"=SMALL('OOS Detail'!H:H,ROUNDUP((COUNT('OOS Detail'!H:H)*0.9),0))"



I have three product recaps that I create independent data columns for to
return the rank by product.



The problem I am having is I need to return 90% rank by state and I do not
want to have to create 50 independent data columns foreach state each week I
run the report. I cannot figure out how to get my formula above, which works
when based on a single state criteria to work when the list contains multiple
states.



I think I need to "=IF" by state variable and return the above formula rank
result by state from the original master list of values.

The data file is simple, one column state name (AL,AR,etc) and a
corresponding result value (1.01,2.02,etc). Maybe I am using the wrong
approach so suggestions are appreciated.

Goal: return the x% variable value (90%, 95%, etc) from a list of numeric
values for an entire list of values and for the same list of values based on
product and state variables.

State Result Product
AL 1.06 A
AL 2,o7 B
AR 1.99 A
 
L

Lars-Åke Aspelin

On Sun, 7 Sep 2008 08:03:01 -0700, The Hit Man <The Hit
I am doing a multiple state based analysis project. I can have thousands of
orders over multiple states (one, none or multiple orders per state are all
valid result possibilities). The result I am working with is an order
completion time (2.44 days, 0.56 days or 30.25 days for example) from a
master list.



I can determine the 90% rank of the entire list as follows:



"=SMALL('OOS Detail'!H:H,ROUNDUP((COUNT('OOS Detail'!H:H)*0.9),0))"



I have three product recaps that I create independent data columns for to
return the rank by product.



The problem I am having is I need to return 90% rank by state and I do not
want to have to create 50 independent data columns foreach state each week I
run the report. I cannot figure out how to get my formula above, which works
when based on a single state criteria to work when the list contains multiple
states.



I think I need to "=IF" by state variable and return the above formula rank
result by state from the original master list of values.

The data file is simple, one column state name (AL,AR,etc) and a
corresponding result value (1.01,2.02,etc). Maybe I am using the wrong
approach so suggestions are appreciated.

Goal: return the x% variable value (90%, 95%, etc) from a list of numeric
values for an entire list of values and for the same list of values based on
product and state variables.

State Result Product
AL 1.06 A
AL 2,o7 B
AR 1.99 A


If your simple data file only have two columns, one for state name and
the other for the result value, where do the products come from?

Lars-Åke
 
T

The Hit Man

I apologize, the dataset has many columns but the process I require needs
only two at a time. I need to return the result by state and then as a
seperate operation the result by product. The process should be the same
regardless of the criteria. Any operation will have one set of flag data
(state or product) and one shared set of results data.

Thank you
 
L

Lars-Åke Aspelin

Assuming that your result data are all greater than 0.
If G is the column where you have the "flag data" and H is the column
with your result data (order completion time), and C1 is the value of
the flag data for which you would like the 90 percentil, you may try
the following formula in cell D1

=LARGE((B:B)*(A:A=C1),ROUNDUP((COUNTIF(A:A,C1)*(1-0.9)),0))

Note that this is an array formula that has to be entered with
CTRL+SHIFT+ENTER rather than just ENTER.

If you want you can put the 0.9 in a cell that you refer to instead,
but in your original formula it was a constant so I just kept it that
way.

Hope this helps / Lars-Åke
 
T

The Hit Man

The results were not satisfactory but maybe I am doing something wrong.

Order State Order TT Group State 90% TT
AL 1.04 AL =LARGE((B:B)*(A:A=G2),ROUNDUP((COUNTIF(A:A,G2)*(1-0.9)),0))
AL 4.07 AR =LARGE((B:B)*(A:A=G3),ROUNDUP((COUNTIF(A:A,G3)*(1-0.9)),0))
AZ 0.11 AZ =LARGE((B:B)*(A:A=G4),ROUNDUP((COUNTIF(A:A,G4)*(1-0.9)),0))
AZ 0.11 CA =LARGE((B:B)*(A:A=G5),ROUNDUP((COUNTIF(A:A,G5)*(1-0.9)),0))
AZ 0.18 CO =LARGE((B:B)*(A:A=G6),ROUNDUP((COUNTIF(A:A,G6)*(1-0.9)),0))
CA 0.11 CT =LARGE((B:B)*(A:A=G7),ROUNDUP((COUNTIF(A:A,G7)*(1-0.9)),0))
CA 0.11 DC =LARGE((B:B)*(A:A=G8),ROUNDUP((COUNTIF(A:A,G8)*(1-0.9)),0))
CA 0.11 DE =LARGE((B:B)*(A:A=G9),ROUNDUP((COUNTIF(A:A,G9)*(1-0.9)),0))
CA 0.11 FL =LARGE((B:B)*(A:A=G10),ROUNDUP((COUNTIF(A:A,G10)*(1-0.9)),0))
CA 0.11 GA =LARGE((B:B)*(A:A=G11),ROUNDUP((COUNTIF(A:A,G11)*(1-0.9)),0))
CA 0.19 IA =LARGE((B:B)*(A:A=G12),ROUNDUP((COUNTIF(A:A,G12)*(1-0.9)),0))
CA 0.2 ID =LARGE((B:B)*(A:A=G13),ROUNDUP((COUNTIF(A:A,G13)*(1-0.9)),0))
CA 0.38 IL =LARGE((B:B)*(A:A=G14),ROUNDUP((COUNTIF(A:A,G14)*(1-0.9)),0))
CO 0.11 IN =LARGE((B:B)*(A:A=G15),ROUNDUP((COUNTIF(A:A,G15)*(1-0.9)),0))
CO 0.84 KS =LARGE((B:B)*(A:A=G16),ROUNDUP((COUNTIF(A:A,G16)*(1-0.9)),0))
DE 0.11 KY =LARGE((B:B)*(A:A=G17),ROUNDUP((COUNTIF(A:A,G17)*(1-0.9)),0))
DE 0.31 LA =LARGE((B:B)*(A:A=G18),ROUNDUP((COUNTIF(A:A,G18)*(1-0.9)),0))
FL 0.11 MA =LARGE((B:B)*(A:A=G19),ROUNDUP((COUNTIF(A:A,G19)*(1-0.9)),0))
GA 0.3 MD =LARGE((B:B)*(A:A=G20),ROUNDUP((COUNTIF(A:A,G20)*(1-0.9)),0))
IL 0.11 ME =LARGE((B:B)*(A:A=G21),ROUNDUP((COUNTIF(A:A,G21)*(1-0.9)),0))
IL 0.11 MI =LARGE((B:B)*(A:A=G22),ROUNDUP((COUNTIF(A:A,G22)*(1-0.9)),0))
IL 0.11 MN =LARGE((B:B)*(A:A=G23),ROUNDUP((COUNTIF(A:A,G23)*(1-0.9)),0))
IL 0.19 MO =LARGE((B:B)*(A:A=G24),ROUNDUP((COUNTIF(A:A,G24)*(1-0.9)),0))
IL 0.19 MS =LARGE((B:B)*(A:A=G25),ROUNDUP((COUNTIF(A:A,G25)*(1-0.9)),0))
IL 0.19 MT =LARGE((B:B)*(A:A=G26),ROUNDUP((COUNTIF(A:A,G26)*(1-0.9)),0))
IL 0.3 NC =LARGE((B:B)*(A:A=G27),ROUNDUP((COUNTIF(A:A,G27)*(1-0.9)),0))
IL 0.3 ND =LARGE((B:B)*(A:A=G28),ROUNDUP((COUNTIF(A:A,G28)*(1-0.9)),0))
IL 0.3 NE =LARGE((B:B)*(A:A=G29),ROUNDUP((COUNTIF(A:A,G29)*(1-0.9)),0))
IL 0.95 NH =LARGE((B:B)*(A:A=G30),ROUNDUP((COUNTIF(A:A,G30)*(1-0.9)),0))
IL 0.95 NJ =LARGE((B:B)*(A:A=G31),ROUNDUP((COUNTIF(A:A,G31)*(1-0.9)),0))
IL 0.95 NM =LARGE((B:B)*(A:A=G32),ROUNDUP((COUNTIF(A:A,G32)*(1-0.9)),0))
IL 1.06 NV =LARGE((B:B)*(A:A=G33),ROUNDUP((COUNTIF(A:A,G33)*(1-0.9)),0))
IL 1.06 NY =LARGE((B:B)*(A:A=G34),ROUNDUP((COUNTIF(A:A,G34)*(1-0.9)),0))
IL 1.06 OH =LARGE((B:B)*(A:A=G35),ROUNDUP((COUNTIF(A:A,G35)*(1-0.9)),0))
IL 1.12 OK =LARGE((B:B)*(A:A=G36),ROUNDUP((COUNTIF(A:A,G36)*(1-0.9)),0))
IL 1.57 OR =LARGE((B:B)*(A:A=G37),ROUNDUP((COUNTIF(A:A,G37)*(1-0.9)),0))
IL 2.15 PA =LARGE((B:B)*(A:A=G38),ROUNDUP((COUNTIF(A:A,G38)*(1-0.9)),0))
IN 0.25 RI =LARGE((B:B)*(A:A=G39),ROUNDUP((COUNTIF(A:A,G39)*(1-0.9)),0))
IN 2.06 SC =LARGE((B:B)*(A:A=G40),ROUNDUP((COUNTIF(A:A,G40)*(1-0.9)),0))
IN 3.06 SD =LARGE((B:B)*(A:A=G41),ROUNDUP((COUNTIF(A:A,G41)*(1-0.9)),0))
MA 1.13 TN =LARGE((B:B)*(A:A=G42),ROUNDUP((COUNTIF(A:A,G42)*(1-0.9)),0))
MD 0.22 UT =LARGE((B:B)*(A:A=G43),ROUNDUP((COUNTIF(A:A,G43)*(1-0.9)),0))
MD 0.23 VA =LARGE((B:B)*(A:A=G44),ROUNDUP((COUNTIF(A:A,G44)*(1-0.9)),0))
ME 1.17 VT =LARGE((B:B)*(A:A=G45),ROUNDUP((COUNTIF(A:A,G45)*(1-0.9)),0))
ME 3.28 WA =LARGE((B:B)*(A:A=G46),ROUNDUP((COUNTIF(A:A,G46)*(1-0.9)),0))
MI 0.11 WI =LARGE((B:B)*(A:A=G47),ROUNDUP((COUNTIF(A:A,G47)*(1-0.9)),0))
MI 0.86 WV =LARGE((B:B)*(A:A=G48),ROUNDUP((COUNTIF(A:A,G48)*(1-0.9)),0))
MI 1.27 WY =LARGE((B:B)*(A:A=G49),ROUNDUP((COUNTIF(A:A,G49)*(1-0.9)),0))

Assuming that your result data are all greater than 0.

(True)


If G is the column where you have the "flag data" and H is the column with
your result data (order completion time),

(My sample Column “A†= individual order state result and Column “B†=
individual order completion time)



and C1 is the value of the flag data for which you would like the 90
percentile,

(Column “G†G2 = AL, G3 = AR, etc)



you may try the following formula in cell D1



I modified the formula to read as below in red as I did not clearly
understand your reference values:



"=LARGE((B:B)*(A:A=G2),ROUNDUP((COUNTIF(A:A,G2)*(1-0.9)),0))"



Compared to your original below:



“=LARGE((B:B)*(A:A=C1),ROUNDUP((COUNTIF(A:A,C1)*(1-0.9)),0))â€

Note that this is an array formula that has to be entered with
CTRL+SHIFT+ENTER rather than just ENTER.

If you want you can put the 0.9 in a cell that you refer to instead,
but in your original formula it was a constant so I just kept it that
way.
 
L

Lars-Åke Aspelin

If you use the entire columns in the formula, like B:B, you can not
have any (non numerical) headers on row 1.
I suggest that you change B:B to B2:B49 and A:A to A2:A49 in two
places.
Like this:

=LARGE((B2:B49)*(A2:A49=G2),ROUNDUP((COUNTIF(A2:A49,G2)*(1-0.9)),0))

Adjust the 49 to fit your data size.
Don't forget to enter the formula as an array formula.

I got reasonable output using your sample date, e.g.
IL got 1.67
All other states have less than 10 samples so you just get to top
score if you use 90%. By changing 0.9 to 0.5 I got
the result 2.06 (the median) for IN as expected.

Lars-Åke
 
T

The Hit Man

Thanks, I was not aware of the header row limitation with array formulas. The
initial test results appear to be exactly what I expect so your method gets a
big two thumbs up!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top