Counting Unique Values

G

Guest

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 
G

Guest

If memory serves, I believe I've seen this formula suggested by others. It
should ignore blank cells. Change range as needed.

=SUMPRODUCT(--(A1:A6<>""),1/COUNTIF(A1:A6,A1:A6&""))

If there are no empty cells in your data you could shorten it to
=SUMPRODUCT(1/COUNTIF(A1:A5,A1:A5))
 
E

Epinn

Biff,

That's brilliant! Some time ago, I found the following array formula suggested by an expert.

{=SUM(COUNTIF(A1:A15,A1:A15)/IF(NOT(COUNTIF(A1:A15,A1:A15)),1,COUNTIF(A1:A15,A1:A15))^2)}

Like your formula, it takes care of blanks in the array as well. Can you confirm that both your formula and the above formula do exactly the same thing i.e. count unique text and numeric values in a range which may contain blanks. If yes, I'll replace the above with your formula.

Regarding the formula

=SUMPRODUCT((A1:A15<>"")/COUNTIF(A1:A15,A1:A15&""))

can you explain the &"" part please? I know if I remove &"" I will get the #DIV/0! error. But I can't tell from "evaluate formula" what &"" is doing?

One more question:-

I know from my record and JMB's comment that this formula

=SUMPRODUCT(1/COUNTIF(A1:A15,A1:A15))

will give an error when there is a blank in the array. When I change it to

=SUMPRODUCT(1/COUNTIF(A1:A15,A1:A15&""))

I get a result which counts the blank(s) as well. I guess if I want blanks counted, I can use this formula, right?

Another alternative to count unique records is to use Advanced filter, unique records, and COUNTA. This doesn't require any analysis and understanding. :)

Biff, I look forward to your guidance. Thanks!

Epinn

Try this:

=SUMPRODUCT(--(A1:A9<>"")/COUNTIF(A1:A9,A1:A9&""))

Biff
 
E

Epinn

I only do that when I'm bored and there's nothing to play with!

Do you want to play with the following?

{=SUM(IF(FREQUENCY(IF(LEN(A2:A10)>0,MATCH(A2:A10,A2:A10,0),""), IF(LEN(A2:A10)>0,MATCH(A2:A10,A2:A10,0),""))>0,1))}

If I remember correctly, the above is from Help which usually tells us the "inefficient" way of doing things. Glad to have someone like you, JMB and Luc around. By the way, Luc, you were brilliant the other day switching roles between the LOOKUP table and the data set (LOOKUP values).

Bob, I hope you don't mind me showing up. Also, please ignore the formulae I posted because I don't want to confuse you.

Epinn


I only do that when I'm bored and there's nothing to play with!

Biff
 
B

Bob Phillips

This is my standard reply to this question

Let's start by defining the range A1:A20 to talk specifics.

Bob,John,Bob,Bob,John,John,Bob,Bill,Bill,Max


or data in just A1:A10


The basic formula to count unique items is
=SUMPRODUCT(1/COUNTIF($A$1:$A$10,$A$1:$A$10))
The COUNTIF($A$1:$A$10,$A$1:$A$10) part of the formula builds an array of
the number of occurrences of each item, in this case{4;3;4;4;3;3;4;2;2;1}.
As can be seen, each occurrence of the repeated value is counted, so there
are four occurrences of Bob in the array. There will always be the same
number of occurrences of value as the count of that value, unless two or
more items are repeated the same number of times, in which case it will be
some multiple of that count.
Thus the item that is repeated 4 times has 4 instances of that count,
dividing 1 by the count of 4, gives 0.25 4 times. The full array of values
is
{0.25;0.333333333333333;0.25;0.25;0.333333333333333;0.333333333333333;0.25;0
..5;0.5;1}.
The item that repeats 4 times sums to 1. The item that repeats 3 times also
sums to 1. It should be clear from this that every value works in the same
way and sums to 1. In other words, 1 is returned for every unique item. The
sum of these values becomes the count of unique items.
As our test range is A1:A20, and some of the items in A1:A20 are blank,
extending this formula to A1:A20 would return a #DIV/0! Error.
The reason for the error is blank cells in the full range A1:A20. Each blank
cell returns a 0 value from the COUNTIF formula, which gives the #DIV/0!
Error when divided into 1.
The solution to this is to force it to count the empty cells as well, and
not return a zero. Adding &"" to the end of the COUNTIF formula forces a
count of the blanks.
This addition on its own removes the #DIV/0! error, but will cause the
blanks to be counted as a unique item. A further addition to the formula
resolves this by testing for those blanks. Instead of dividing the array of
counts into 1 each time, adding the test creates an array of TRUE/FALSE
values to be divided by the equivalent element in the counts array. Each
blank will resolve to FALSE in the dividend array, and the count of the
blanks in the divisor array. The result of this will be 0, so the blanks do
not get counted.


--
HTH

Bob Phillips

(replace somewhere in email address with gmail if mailing direct)

Biff,

That's brilliant! Some time ago, I found the following array formula
suggested by an expert.

{=SUM(COUNTIF(A1:A15,A1:A15)/IF(NOT(COUNTIF(A1:A15,A1:A15)),1,COUNTIF(A1:A15
,A1:A15))^2)}

Like your formula, it takes care of blanks in the array as well. Can you
confirm that both your formula and the above formula do exactly the same
thing i.e. count unique text and numeric values in a range which may contain
blanks. If yes, I'll replace the above with your formula.

Regarding the formula

=SUMPRODUCT((A1:A15<>"")/COUNTIF(A1:A15,A1:A15&""))

can you explain the &"" part please? I know if I remove &"" I will get the
#DIV/0! error. But I can't tell from "evaluate formula" what &"" is doing?

One more question:-

I know from my record and JMB's comment that this formula

=SUMPRODUCT(1/COUNTIF(A1:A15,A1:A15))

will give an error when there is a blank in the array. When I change it to

=SUMPRODUCT(1/COUNTIF(A1:A15,A1:A15&""))

I get a result which counts the blank(s) as well. I guess if I want blanks
counted, I can use this formula, right?

Another alternative to count unique records is to use Advanced filter,
unique records, and COUNTA. This doesn't require any analysis and
understanding. :)

Biff, I look forward to your guidance. Thanks!

Epinn

Try this:

=SUMPRODUCT(--(A1:A9<>"")/COUNTIF(A1:A9,A1:A9&""))

Biff
 
B

Bob Phillips

Biff,
That's brilliant! Some time ago, I found the following array formula
suggested by an expert.
{=SUM(COUNTIF(A1:A15,A1:A15)/IF(NOT(COUNTIF(A1:A15,A1:A15)),1,COUNTIF(A1:A15
,A1:A15))^2)}

Like your formula, it takes care of blanks in the array as well.
Can you confirm that both your formula and the above formula do
exactly the same thing i.e. count unique text and numeric values in
a range which may contain blanks. If yes, I'll replace the above with
your formula.


They are essentially the same formula. You can simplify it by removing the ^
2 and testing for blan k

=SUM((A1:A15<>"")/IF(NOT(COUNTIF(A1:A15,A1:A15)),1,COUNTIF(A1:A15,A1:A15)))

which starts to look more like the SP version.
 
G

Guest

I got the same answer with
=SUMPRODUCT(--(FREQUENCY(A2:A10,A2:A10)>0))
as the original formula. If the data was text and I wanted to use
Frequency, I think you would have to use the formula you posted. Frequency
doesn't like text, but it ignores blanks - hence the IF(Len(..), Match(..),
"") combination.

If dealing with a contiguous range, however, using Sumproduct/Countif should
work w/both numbers and text (and you can include or exclude blanks) - so I
wouldn't use Frequency.

I think Frequency would be useful for dealing w/non-contigous cells
(containing numeric data - I think it was Harlan I saw use it for that
purpose).

From testing a little, it seems Frequency ignores blanks. If data is
numeric and you want to evaluate D3:E5 and G7:H9, then

=SUMPRODUCT(--(FREQUENCY((D3:E5,G7:H9),(D3:E5,G7:H9))>0))

Not sure about if the cells are non-contiguous and the data is text.
Frequency doesn't like text and Len/Match don't seem to work w/two
dimensional arrays, much less non-contiguous cells - just based on my
observations, which doesn't mean someone can't do it.
 
E

Epinn

Bob,

Thank you for asking the question. I have learned a lot. I hope you have got what you wanted. You asked for a formula and I think you have a choice of two formulae now (COUNTIF and FREQUENCY). Please feel free to post any followup questions.

Bob P.,

Thank you for the formula. It looks smarter now. I never like ^2. The last paragraph of your standard reply is precious. I know the purpose of &"" but I don't think there is really an explanation for its formation. I'll take it as part of the syntax.

JMB,

Thank you for taking the time to experiment with FREQUENCY. I appreciate it very much.

This information is invaluable. I never thought of "non-contiguous" arrays. The formula on my record looks like this

{=SUM(IF(FREQUENCY(A2:A10,A2:A10)>0,1))} and your use of SUMPRODUCT makes it look smarter.


Bob's question is on dates which are numeric. So, I think the following formula which takes care of blanks can serve his purpose.

=SUMPRODUCT(--(FREQUENCY(A1:A15,A1:A15)>0))

I tested it and it gave the same results as =SUMPRODUCT((A1:A15<>"")/COUNTIF(A1:A15,A1:A15&""))

Bob can pick and choose now.

Thank you all for a wonderful lesson.

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 
B

Bob Phillips

The ^2 is necessary in the first formula because COUNTIF(A1:A15,A1:A15)
returns a count a of each instance of each item. (A1:A15<>"") just returns a
TRUE/FALSE for each instance. Both are divided by the count of the
instances, but as the first is already a count of the instances, it has to
be squared so as to get the correct fractional components.

--
HTH

Bob Phillips

(replace somewhere in email address with gmail if mailing direct)

Bob,

Thank you for asking the question. I have learned a lot. I hope you have
got what you wanted. You asked for a formula and I think you have a choice
of two formulae now (COUNTIF and FREQUENCY). Please feel free to post any
followup questions.

Bob P.,

Thank you for the formula. It looks smarter now. I never like ^2. The
last paragraph of your standard reply is precious. I know the purpose of
&"" but I don't think there is really an explanation for its formation.
I'll take it as part of the syntax.

JMB,

Thank you for taking the time to experiment with FREQUENCY. I appreciate it
very much.

This information is invaluable. I never thought of "non-contiguous" arrays.
The formula on my record looks like this

{=SUM(IF(FREQUENCY(A2:A10,A2:A10)>0,1))} and your use of SUMPRODUCT makes it
look smarter.


Bob's question is on dates which are numeric. So, I think the following
formula which takes care of blanks can serve his purpose.

=SUMPRODUCT(--(FREQUENCY(A1:A15,A1:A15)>0))

I tested it and it gave the same results as
=SUMPRODUCT((A1:A15<>"")/COUNTIF(A1:A15,A1:A15&""))

Bob can pick and choose now.

Thank you all for a wonderful lesson.

Epinn
 
E

Epinn

The formulae discussed in this thread should serve Bob's needs.

Experts, please confirm that the formulae are only good as long as there are no trailing spaces.

e.g. AA111 and AA111__ ( __ are trailing spaces) will be counted twice.

If we are really picky, we'll do LEN( ), SUBSTITUTE ( ) etc., right?

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 
E

Epinn

This is how I would remove trailing spaces.

=SUBSTITUTE(A1,CHAR(32),REPT("",255))

Please correct me if I am wrong. Thanks.

Epinn

The formulae discussed in this thread should serve Bob's needs.

Experts, please confirm that the formulae are only good as long as there are no trailing spaces.

e.g. AA111 and AA111__ ( __ are trailing spaces) will be counted twice.

If we are really picky, we'll do LEN( ), SUBSTITUTE ( ) etc., right?

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 
R

Rob

Epinn,

The following formula entered as an array i.e. Ctrl+Shift+Enter will
correctly report 4 unique dates.

=SUM(1/COUNTIF(A1:A9,A1:A9))

HTH Rob

The formulae discussed in this thread should serve Bob's needs.

Experts, please confirm that the formulae are only good as long as there are
no trailing spaces.

e.g. AA111 and AA111__ ( __ are trailing spaces) will be counted twice.

If we are really picky, we'll do LEN( ), SUBSTITUTE ( ) etc., right?

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 
E

Epinn

Try test your formula with a range that contains a blank and you'll know why the experts and I won't recommend it. Also, I am a fan of SUMPRODUCT, so I'll use SUMPRODUCT instead of SUM and CSE if there are no blanks.

Thanks for dropping by and sharing.

Epinn

Epinn,

The following formula entered as an array i.e. Ctrl+Shift+Enter will
correctly report 4 unique dates.

=SUM(1/COUNTIF(A1:A9,A1:A9))

HTH Rob

The formulae discussed in this thread should serve Bob's needs.

Experts, please confirm that the formulae are only good as long as there are
no trailing spaces.

e.g. AA111 and AA111__ ( __ are trailing spaces) will be counted twice.

If we are really picky, we'll do LEN( ), SUBSTITUTE ( ) etc., right?

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 
R

Roger Govier

Absolutely no difference to the much simpler
=SUBSTITUTE(A1," ","")

--
Regards

Roger Govier


This is how I would remove trailing spaces.

=SUBSTITUTE(A1,CHAR(32),REPT("",255))

Please correct me if I am wrong. Thanks.

Epinn

The formulae discussed in this thread should serve Bob's needs.

Experts, please confirm that the formulae are only good as long as there
are no trailing spaces.

e.g. AA111 and AA111__ ( __ are trailing spaces) will be counted twice.

If we are really picky, we'll do LEN( ), SUBSTITUTE ( ) etc., right?

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in
the
column. Can anyone help?

thanks,
Bob
 
R

Ragdyer

Don't understand this comment Epinn:
(emphasis mine)
<<<"so I'll use SUMPRODUCT instead of SUM and CSE *IF THERE ARE NO
BLANKS*.">>>

The Sumproduct() formula Biff posted *will work* with blanks!

--
Regards,

RD

---------------------------------------------------------------------------
Please keep all correspondence within the NewsGroup, so all may benefit !
---------------------------------------------------------------------------

Try test your formula with a range that contains a blank and you'll know why
the experts and I won't recommend it. Also, I am a fan of SUMPRODUCT, so
I'll use SUMPRODUCT instead of SUM and CSE if there are no blanks.

Thanks for dropping by and sharing.

Epinn

Epinn,

The following formula entered as an array i.e. Ctrl+Shift+Enter will
correctly report 4 unique dates.

=SUM(1/COUNTIF(A1:A9,A1:A9))

HTH Rob

The formulae discussed in this thread should serve Bob's needs.

Experts, please confirm that the formulae are only good as long as there are
no trailing spaces.

e.g. AA111 and AA111__ ( __ are trailing spaces) will be counted twice.

If we are really picky, we'll do LEN( ), SUBSTITUTE ( ) etc., right?

Epinn

I have a column with the following dates:

11/23/05
11/23/05
11/23/05
11/25/05
11/25/05
11/28/05
12/1/05
12/1/05
12/1/05

I want a formula that will tell me how many unique dates there are in the
column. Can anyone help?

thanks,
Bob
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top