How long for the first group

PAL

I have a spreadsheet with many rows (1000s) and columns. I am trying to
determine how long it took for a percentage of franchises to open. Let's say
I have:

Column A: Company Name
Column B: Date when product was ready (i.e., Company A would have the same
date on each row)
Column C: Franchise Name
Column D: Date franchise opened.
Column E: Col D - Col B

If there were 1000 rows, 500 of which were for Company A, I would like to
find the amount of time it took for 50% (the first 250 sites for this
example) of the franchises to open. I am thinking it has to order the dates
in column D, then find the franchise that was 250th, and then take column E
as the answer.
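
The approach described above (order one company's opening dates, pick the franchise at the target rank, read off its elapsed days) can be sketched outside Excel. This is a minimal Python illustration, not a worksheet formula: the function name `days_to_fraction`, the tuple layout, and the sample dates are all made up for the example, and the rank is assumed to be the company's row count times the fraction, truncated to a whole number.

```python
from datetime import date

def days_to_fraction(rows, company, fraction):
    """Days from product-ready to the opening that completes `fraction`
    of the company's franchises, or None if it cannot be computed.

    rows: iterable of (company, ready_date, franchise, open_date) tuples,
    mirroring columns A-D in the question (hypothetical layout).
    """
    mine = [r for r in rows if r[0] == company and r[1] is not None]
    k = int(len(mine) * fraction)      # rank of the target opening
    if k < 1:
        return None                    # too few franchises for this fraction
    mine.sort(key=lambda r: r[3])      # order by open date (column D)
    _, ready, _, opened = mine[k - 1]  # k-th opening, 1-based
    return (opened - ready).days       # column E: D minus B

# Made-up sample data: four franchises for company "A", one for "B".
sample = [
    ("A", date(2020, 1, 1), "F1", date(2020, 3, 1)),
    ("A", date(2020, 1, 1), "F2", date(2020, 2, 1)),
    ("A", date(2020, 1, 1), "F3", date(2020, 6, 1)),
    ("A", date(2020, 1, 1), "F4", date(2020, 4, 1)),
    ("B", date(2020, 5, 1), "G1", date(2020, 5, 15)),
]
half = days_to_fraction(sample, "A", 0.5)
```

With the sample data, the second of company A's four openings is F1, so `half` is the number of days between 2020-01-01 and 2020-03-01.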

I get that it is an array, but get stuck after that.

Any ideas?
 
Bernie Deitrick

Pal,

=INDEX(E:E,SUMPRODUCT((D1:D20000=(LARGE(IF(A1:A20000="Company A",D1:D20000),COUNTIF(A1:A20000,"Company A")/2)))*(A1:A20000="Company A")*ROW(A1:A20000)))

Array entered using Ctrl-Shift-Enter. "Company A" can also be a cell
reference, in case you want to make a table - use advanced filtering to
extract the unique list from your column of company names.

HTH,
Bernie
MS Excel MVP
 
Bernie Deitrick

Also, I should have noted that I have assumed that the dates are unique in
column D for any one company - i.e., they did not open multiple franchises
on the same day.

Bernie
 
PAL

This is great. Thank you. A few points to fine tune....

I changed "large" to "small" in order for it to start from the smallest
(10th, 25th percentile....).

Also,

1) Multiple franchises can be opened on the same day.
2) It is possible that product ready, col B, may be blank. Any way to force
a blank instead of #VALUE!?
3) If the number of franchises is 1, we get #NUM!. Any way to force a
blank?
4) I also noticed that if the number of franchises is 2 or 3, it
calculates a 50th or 80th percentile but gives #NUM! for the 25th
percentile.

Any thoughts to clean this up are appreciated, but regardless, thanks much.
 
Bernie Deitrick

See my comments in-line....

> This is great. Thank you.

You're welcome.

> A few points to fine tune....
>
> I changed "large" to "small" in order for it to start from the smallest
> (10th, 25th percentile....).
>
> Also,
>
> 1) multiple franchises can be opened on the same day.

On second thought, this really shouldn't matter, since the value in column E
should be the same.

> 2) it is possible that product ready, col B, may be blank. Any way to
> force a blank instead of #VALUE!?

We weren't using column B anywhere... perhaps wrap the formula in

=IF(B2="","",LongFormula)

> 3) if the number of franchises is 1 we get #NUM!. Any way to force a
> blank?

=IF(COUNTIF(A1:A20000,"Company A")=1,"",IF(B2="","",LongFormula))
or
=IF(COUNTIF(A1:A20000,"Company A")=1,E2,IF(B2="","",LongFormula))

> 4) I also noticed that if the number of franchises is 2 or 3, it
> calculates a 50th or 80th percentile but gives #NUM! for the 25th
> percentile.

Use the same technique, along the lines of

=IF(COUNTIF(A1:A20000,"Company A")<4,"Something other than 25th percentile",Rest of the formula here)
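
To see which counts need guarding, it can help to tabulate the rank that ends up being passed to SMALL. The sketch below is Python rather than a worksheet formula. It assumes that SMALL truncates a fractional k and returns #NUM! when the truncated rank is below 1, which is consistent with the errors reported in this thread; `rank_for` is a made-up helper name.

```python
def rank_for(count, divisor):
    """Rank handed to SMALL for COUNTIF(...)/divisor, or None where the
    truncated rank would be 0 (the #NUM! case, by assumption)."""
    k = int(count / divisor)   # assumed truncation of a fractional k
    return k if k >= 1 else None

# divisor 4 corresponds to the 25th percentile, divisor 2 to the 50th
quarter = {n: rank_for(n, 4) for n in range(1, 6)}
half = {n: rank_for(n, 2) for n in range(1, 6)}
```

Under that assumption, the 25th percentile needs at least 4 franchises and the 50th needs at least 2, which lines up with the #NUM! cases listed in points 3 and 4.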

HTH,
Bernie
MS Excel MVP
 
PAL

Lots going on in this one. If I focus on #2 only, I get the blank I am
looking for, but when it is not blank the formula returns FALSE.

=IF(COUNTIF(A1:A20000,P2)=1,"",IF(B2=""",""",INDEX(H:H,SUMPRODUCT(($F$2:$F$1045=(SMALL(IF($A$2:$A$1045=P2,$F$2:$F$1045),COUNTIF($A$2:$A$1045,P2)/4)))*($A$2:$A$1045=P2)*ROW($A$2:$A$1045)))))
 
Bernie Deitrick

PAL,

Try changing

IF(B2=""","""

to

IF(B2="","",

You have three double quotes in a row instead of just two.

HTH,
Bernie
MS Excel MVP
 
PAL

Hi Bernie,

Wasn't sure if I should post this as a new thread....anyway....

The formula you provided works great. I have pasted in two of them: one
for 50%, the other for 90%. Both formulas seem to work 95% of the time.
Occasionally, I get a 0 for the 50% (or another percentage), while the 90%
works fine, and I am not sure why. I thought this might be a statistics
problem, but the number of cells it is using is greater than 10, and the
other fields look fine. Any ideas?

50%

=IF(Work!G32="","",IF(COUNTIF(Work!$A$2:$A$2000,A33)<=5,"",INDEX(Work!H:H,SUMPRODUCT((Work!$F$2:$F$2000=(SMALL(IF(Work!$A$2:$A$2000=A33,Work!$F$2:$F$2000),COUNTIF(Work!$A$2:$A$2000,A33)/2)))*(Work!$A$2:$A$2000=A33)*ROW(Work!$A$2:$A$2000)))))

90%

=IF(Work!G32="","",IF(COUNTIF(Work!$A$2:$A$2000,A33)<=5,"",INDEX(Work!H:H,SUMPRODUCT((Work!$F$2:$F$2000=(SMALL(IF(Work!$A$2:$A$2000=A33,Work!$F$2:$F$2000),COUNTIF(Work!$A$2:$A$2000,A33)/1.11)))*(Work!$A$2:$A$2000=A33)*ROW(Work!$A$2:$A$2000)))))
 
Bernie Deitrick

It is very hard to diagnose the problems without the data - what I like to do is cut down the data
set to, say, 20 rows, and change the formula to reflect that restricted range. Get the problem to
manifest at that level, then select parts of the formula in Edit mode and press F9 to get them to
evaluate - then you can process the formula part by part to figure out what is happening.

HTH,
Bernie
MS Excel MVP
 
PAL

As near as I can tell, it's a stats problem. I have tried it two ways:
removing all the data except the rows generating these numbers, and keeping
it all in. The 90th was working originally.

The numbers in the data set are
152,239,239,244,244,244,263,273,364,455,482,482

I can get the 50th and 90th percentile if I delete 3 of the numbers: 482,
244, 244. Apparently the duplicates are messing it up. If I remove two of
the 244s, it works. I can't control the data, so it seems like I have a
problem.

Not sure where to go from here.
 
Bernie Deitrick

What results would you expect? With those numbers I get 244 as the 50%ile and 455 as the 90%ile.
But if I change

COUNTIF(Work!$A$2:$A$2000,A33)/1.11

to

ROUND(COUNTIF(Work!$A$2:$A$2000,A33)/1.11,0)

then I get 482 as the 90%ile.
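
The difference between the two results comes down to which rank reaches SMALL. Here is a small Python sketch using the twelve numbers PAL posted; it assumes that a fractional k is truncated, which is what the 455 result suggests, and `kth_smallest` is a made-up helper standing in for SMALL with a whole-number k.

```python
data = [152, 239, 239, 244, 244, 244, 263, 273, 364, 455, 482, 482]

def kth_smallest(values, k):
    """k-th smallest value, 1-based, like SMALL with a whole-number k."""
    return sorted(values)[k - 1]

raw = len(data) / 1.11      # about 10.81
truncated = int(raw)        # 10: assumed handling of a fractional k
rounded = round(raw)        # 11: what the ROUND(...,0) wrapper passes on

p90_truncated = kth_smallest(data, truncated)
p90_rounded = kth_smallest(data, rounded)
p50 = kth_smallest(data, int(len(data) / 2))
```

Here `p90_truncated` is the 10th value (455), `p90_rounded` is the 11th (482), and `p50` is the 6th (244), matching the numbers quoted above. Python's `round` treats exact .5 ties differently from Excel's ROUND, but no tie arises with this data.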

HTH,
Bernie
MS Excel MVP
 
PAL

Sorry for the delay. Got pulled away, but it was probably good to get away.
Not sure what I did last time, as it was the wrong data. I need to backtrack
to your previous suggestion and recreate it.

Within the big spreadsheet, for the 50th, I get 0. For the 90th, I get 552.
 
PAL

Same thing with the shortened data set....

249,336,370,341,336,341,461,341,360,552,579,579

If I pull the numbers out and do a straight percentile, i.e.
=PERCENTILE(range,0.50) etc., I get 350 and 578, which is what I would
expect.
 
Bernie Deitrick

PAL,

I'm not sure why you are getting a zero - perhaps you have duplicates, and the SUMPRODUCT is
actually summing for two or more rather than just finding one (that is one drawback to SUMPRODUCT).

Can you email me a book or sheet that shows the problem?

HTH,
Bernie
MS Excel MVP
 
PAL

Yes, I can send it. I am trying to remove the noise and the company name
from the properties. Where should I post it? Thanks. My guess is your
assumption is correct.
 
Bernie Deitrick

PAL,

deitbe at consumer dot org

Bernie

 
Bernie Deitrick

PAL,

The problem was that the SUMPRODUCT matched multiple rows and summed their
row numbers. You can array enter:

=IF(Work!$B$2:$B$23="","",IF(COUNTIF(Work!$A$2:$A$23,A4)<=5,"",INDEX(Work!D:D,SUMPRODUCT(MAX((Work!$C$2:$C$23=(SMALL(IF(Work!$A$2:$A$23=A4,Work!$C$2:$C$23),COUNTIF(Work!$A$2:$A$23,A4)/2)))*(Work!$A$2:$A$23=A4)*ROW(Work!$A$2:$A$23))))))

The sumproduct in this will return the last entry that meets the criteria
(the MAX that I added), but since you are looking at identical numbers, it
doesn't really matter.
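
The effect of the duplicates, and of the added MAX, can be illustrated with a few made-up rows. This is a Python sketch of the arithmetic only, not of Excel's evaluation; the row numbers and values are hypothetical.

```python
# Hypothetical worksheet rows 2..7 for one company: row number -> value.
rows = {2: 152, 3: 239, 4: 244, 5: 244, 6: 244, 7: 263}
target = 244                 # value picked by SMALL

matches = [r for r, v in rows.items() if v == target]

summed_row = sum(matches)    # what summing (match * ROW) produces
last_row = max(matches)      # what taking MAX of (match * ROW) produces
```

With three matching rows (4, 5 and 6), the sum is 15, which points INDEX at a row outside this data. If that cell happens to be empty, the formula would show 0, which may be where the occasional zeros came from. The maximum is 6, the last matching row.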

I will send you the fixed workbook example.

HTH,
Bernie
MS Excel MVP
 
