Identifying Outliers

G

Guest

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 
G

Guest

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
 
G

Guest

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".
 
G

Guest

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
 
G

Guest

It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.
 
G

Guest

It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
 
G

Guest

The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


Dave F said:
It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.
 
G

Guest

Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257>(0.781+(3*0.253))),"outlier?","")
resolves to a blank.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


Dave F said:
It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.

:

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".


:

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 
G

Guest

OK. What Excel version are you using? I still get the FALSE value as a
result. This is very strange. I am now thinking my Excel version is not
reading correctly my formula. Is that possible?

I am using the trial version of Excel 2007.

Eugenio


Dave F said:
Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257>(0.781+(3*0.253))),"outlier?","")
resolves to a blank.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


Dave F said:
It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.

:

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".


:

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 
G

Guest

I'm using XL 03.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
OK. What Excel version are you using? I still get the FALSE value as a
result. This is very strange. I am now thinking my Excel version is not
reading correctly my formula. Is that possible?

I am using the trial version of Excel 2007.

Eugenio


Dave F said:
Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257>(0.781+(3*0.253))),"outlier?","")
resolves to a blank.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


:

It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.

:

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".


:

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 
G

Guest

Do not retype Dave's formula; copy from his post and paste it into the
formula bar.

The structure of Dave's formula is
=IF(<condition>,"outlier?","")
the only possible return values (in any version of Excel) are therefore
"outlier?" or "". For you to get FALSE, you must have mistyped the formula.

Jerry

Quco said:
OK. What Excel version are you using? I still get the FALSE value as a
result. This is very strange. I am now thinking my Excel version is not
reading correctly my formula. Is that possible?

I am using the trial version of Excel 2007.

Eugenio


Dave F said:
Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257>(0.781+(3*0.253))),"outlier?","")
resolves to a blank.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


:

It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.

:

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".


:

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 
D

David Biddulph

You shouldn't be getting FALSE from that formula, whatever the inputs,
because both outcomes of the IF are defined as strings.
Did you literally copy and paste from the formula bar to the newsgroup
posting, or might something have been mistyped?
Try copying back from your posting below to the spreadsheet, or try copying
=IF(OR(G3027<G3035-3*G3040,G3027>G3037+3*G3040),"outlier?","")
into your spreadsheet to check.
 
G

Guest

I HAVE FOUND THE CAUSE!!!

I did type the formulas, and then I did copied them from this discussion,
but it was still not working.

I then opened the same spreadsheet using another computer with Excel 2003.
Excel 2003 does not show the formula as Excel 2007 does. Here's what it was
causing the problem:

Excel 2007 shows by default a one-line portion of the formula bar affecting
the cell. There is an icon at the far right side of the formula type area. If
you place the cursor there, a legend that says "Expand Formula Bar (Ctrl +
Shift + U)" appears.

Excel 2003 showed me the whole content of the formula bar when I opened the
same spreadsheet. It shows the whole content of the formula bar area!

Excel 2007 did not showed me the remaining portions of an older formula
within that cell. That's why it was acting that way. Can you beleive this?
Thanks all for your help!



Jerry W. Lewis said:
Do not retype Dave's formula; copy from his post and paste it into the
formula bar.

The structure of Dave's formula is
=IF(<condition>,"outlier?","")
the only possible return values (in any version of Excel) are therefore
"outlier?" or "". For you to get FALSE, you must have mistyped the formula.

Jerry

Quco said:
OK. What Excel version are you using? I still get the FALSE value as a
result. This is very strange. I am now thinking my Excel version is not
reading correctly my formula. Is that possible?

I am using the trial version of Excel 2007.

Eugenio


Dave F said:
Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257>(0.781+(3*0.253))),"outlier?","")
resolves to a blank.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


:

It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.

:

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".


:

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 
G

Guest

That's a weird one. I say you sue Microsoft.
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


Quco said:
I HAVE FOUND THE CAUSE!!!

I did type the formulas, and then I did copied them from this discussion,
but it was still not working.

I then opened the same spreadsheet using another computer with Excel 2003.
Excel 2003 does not show the formula as Excel 2007 does. Here's what it was
causing the problem:

Excel 2007 shows by default a one-line portion of the formula bar affecting
the cell. There is an icon at the far right side of the formula type area. If
you place the cursor there, a legend that says "Expand Formula Bar (Ctrl +
Shift + U)" appears.

Excel 2003 showed me the whole content of the formula bar when I opened the
same spreadsheet. It shows the whole content of the formula bar area!

Excel 2007 did not showed me the remaining portions of an older formula
within that cell. That's why it was acting that way. Can you beleive this?
Thanks all for your help!



Jerry W. Lewis said:
Do not retype Dave's formula; copy from his post and paste it into the
formula bar.

The structure of Dave's formula is
=IF(<condition>,"outlier?","")
the only possible return values (in any version of Excel) are therefore
"outlier?" or "". For you to get FALSE, you must have mistyped the formula.

Jerry

Quco said:
OK. What Excel version are you using? I still get the FALSE value as a
result. This is very strange. I am now thinking my Excel version is not
reading correctly my formula. Is that possible?

I am using the trial version of Excel 2007.

Eugenio


:

Using the number values instead of the cell references,
=IF(OR(0.257<(0.528-(3*0.253)),0.257>(0.781+(3*0.253))),"outlier?","")
resolves to a blank.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

The following are the values and respective format I have on the cells:

G3027 value is 0.257 and Format Cell as Number using 3 decimal places.
G3035 value is 0.528 and Format Cell as Number using 3 decimal places.
G3040 value is 0.253 and Format Cell as Number using 3 decimal places.
G3037 value is 0.781 and Format Cell as Number using 3 decimal places.

Eugenio


:

It's hard to say what the issue is. What are the actual values in G3027,
etc. Are they formatted as numbers, as opposed to text?

OR tests for TRUE or FALSE values: =OR(1+2=10,2+2=4) resolves to TRUE
because one calculation is TRUE. On the other hand =OR(1+2=2,2+2=24332)
resolves to FALSE because neither calculation is true. Therefore your
formula says: "If [calculation A] OR [calculation B] is correct, or both are
correct, then consider it an outlier, else return an empty string."

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

It appears to be right. I don't know what is wrong. I am comparing:
=IF(OR(G3027<(G3035-(3*G3040)),G3027>(G3037+(3*G3040))),"outlier?","")
=IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

My formula vs. your notation for the formula I am using.

The cells are all values resulting from other formulas on my spreadsheet;
all are numbers.

I was reading more about the OR function, it appears that the OR function
always return a True or FALSE value. But how do I make understand Excel that
it needs to treat the content of the OR formula as valid (in this case) so
that the result is printed as "outlier?" ???

Maybe there is something I am missing... do you think is because my cells
are not number-values but numbers resulting from other formulas? everything
else seems to be working properly on my Excel spreadsheet.

:

Well, FALSE is an improvement because it indicates the formula is calculating.

IF statements work with the following logic: IF A, Then B, Else C. Stated
another way: If A, value if A is true, value if A is false. Absent a value
in the false argument, FALSE is returned.

So: check your formula and the values in the cells that the formula refers
to. Are the values in the cells numbers? Are they formatted as numbers? Do
you have all your parentheses in order? Etc.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

Thanks Dave,

This is is very interesting, now I am not getting the error message #VALUE!;
now I am getting the word FALSE in the cell as a result instead getting the
text "outlier?".


:

This works for me: =IF(OR(D1<(A1-(3*B1)),D1>(A2+(3*B1))),"outlier?","")

You're missing many parentheses in your formula below; also the spaces make
it hard to read what you're trying to do.

See if my proposal works.

Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.


:

I have a population that I belive it will be represent a Gaussian (normal)
distribution if I am able to identify and remove the outliers properly. I now
understand I cannot use the Grubb's Test as outlined in a couple of websites
I found (see http://www.graphpad.com/articles/outlier.htm and
http://www.itl.nist.gov/div898/software/dataplot/refman1/auxillar/grubtest.htm) because I have many outliers and the distribution is not normal.

Based on a discussion I found on this forums (subject: "how do I identify
statistical outliers with excel", dated: 1/7/2007), I decided to use the the
Interquartile Range method (IQR); but the following formula is giving me an
annoying #VALUE! error:

=if(or(D1 < A1 - 3*B1, D1 > A2 + 3*B1), "outlier?", "")

I modified that formula so that D1 is the value I am testing, A1 is the
value of the 1st quartile, B1 is the value between the 3rd quartile minus the
1st quartile, and A2 is the value of the 3rd quartile. But I got the same
error message. Then I removed the spaces in the formula, but I still get the
same error. It is very frustating, and I don't know how to fix this.

Any suggestions?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top