Counting Unique Values

R

RJL0323

Hello All,
I have a question related to counting unique values in a column of
data. I will try to illustrate my question. I have a column of data
with 1000 rows. In this column there are duplicated values. I would
like to be able to use a function count how many unique values are in
the column. Let's say there were 4 duplicates of 250 values in the
column. I would like to be able to write a function to calculate the
250. I am very familiar with Excel and am able to acheive the number
through subtotals and/or pivot tables. I know I can find how many
instances one specific value appears in the column through sumproduct
and/or countif statements, but the function to calculate the number of
unique values has really got me stumped.

Does anyone have any ideas?

Thanks in advance!!
RJ
 
B

Bob Phillips

=SUMPRODUCT(1/COUNTIF(A1:A1000,A1:A1000))

--

HTH

Bob Phillips

(remove nothere from the email address if mailing direct)
 
B

Bill Kuunders

One way.
enter =count(E2:E1002) in cell E1
select the range in question
go to ....data....filter....advanced filter....copy to another location
select E2
tick unique records only............. no need for a criteria range

Greetings from New Zealand
Bill K
 
B

Bill Kuunders

Bob, I check this news group frequently as a means to learn stuff.

Could you please explain why and how your formula works?

Thank You
 
P

Peo Sjoblom

I would personally use the variant

=SUMPRODUCT(--(A1:A1000<>""),1/COUNTIF(A1:A1000,A1:A1000&""))

otherwise you'll get DIV/0 errors if there are blank cells, it works as
follows

the 1/countif part returns an array of numbers, if there is one value unique
it will return 1,
if there are 2 values that are the same it will return 2 times 0.5 (1/2 =
0.5), if 3 it will return 0.333333, 4 0.25 and so on

assume we have this in A1:A10

1
2
3
4
65
6
1
2
3
4


it would be 6 unique values, the 1/countif returns

{0.5;0.5;0.5;0.5;1;1;0.5;0.5;0.5;0.5}

sumproduct will sum them to return 6, if we change the last number 4 to 1 so
there would be 3 1

{0.333333333333333;0.5;0.5;1;1;1;0.333333333333333;0.5;0.5;0.333333333333333}

still returns the total of 6

I believe former MVP Dave Hager was the originator of it although it has
been converted from

=SUM(1/COUNTIF))

to sumproduct thus it can be entered normally

--

Regards,

Peo Sjoblom

Northwest Excel Solutions

Portland, Oregon
 
B

Bill Kuunders

Thanks Peo,

When I only enter = 1/COUNTIF(A1:A1000,A1:A1000&"") the answer is 0.5
I suppose it is showing me the most frequent occuring in the array??

Thanks again, excellent formula
Bill Kuunders
 
P

Peo Sjoblom

Actually, the 0.5 is the first value in the array, meaning that whatever is
in A1 occurs twice in the range

select B1:B10, with B1 as the active cell click in the formula bar and put
that formula, now enter it with ctrl + shift & enter, now if you sum B1:B10
you'll get the total of uniques, that is how array formulas can work

--

Regards,

Peo Sjoblom

Northwest Excel Solutions

Portland, Oregon
 
B

Bob Phillips

From the OPs post there was a full set of data, so in his case there was no
need to handle blanks.

--

HTH

Bob Phillips

(remove nothere from the email address if mailing direct)
 
B

Bob Phillips

Bill,

I had a go at explaining it a while back in http://tinyurl.com/dhbxe. This
explanation is based upon the version that caters for blanks, but the
fundamental principle is the same.

--

HTH

Bob Phillips

(remove nothere from the email address if mailing direct)
 
B

Bill Kuunders

Thank You Both

Peo and Bob

I appreciate your time and effort very much.

Bill Kuunders
 
V

via135

hi Peo!

for a single column of data the formula is ok!

can i use the same formula for data with more than one column with
duplicates?
for example

col"a" col"b" col"c" col"d"

xxx yyy 10 zzz
xyz abc 20 rst
yzx cab 10 mno
bac def 30 xyz
xyz abc 20 rst
xyz abc 10 rst
yzx cab 10 mno

-now i want to count the number of records excluding the duplicates!
in the above example 2nd & 5th, 3rd & 6th are duplicates.

if i make a count of total records without repetition, i must a get an
answer of 5 ie.(7-2)
 
A

Aladin Akyurek

1] If you download & install the morefunc.xll add-in:

=COUNTDIFF(A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,,"")

which needs to be confirmed with control+shift+enter, not just with enter.

2] With built-in functions:

=SUMPRODUCT(--((MATCH(A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,0)=ROW(INDEX(A1:A7,0,0))-ROW(A1)+1)))
 
V

via135

hi Mr Aladin!

as far as i know i never heard abt the worksheet function "COUNTDIFF"
in XL!

COULD YOU PL KINDLY EXPLAIN ME ABT THAT???!!!!

-via135




Aladin said:
1] If you download & install the morefunc.xll add-in:

=COUNTDIFF(A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,,"")

which needs to be confirmed with control+shift+enter, not just with
enter.

2] With built-in functions:

=SUMPRODUCT(--((MATCH(A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,0)=ROW(INDEX(A1:A7,0,0))-ROW(A1)+1)))

hi Peo!

for a single column of data the formula is ok!

can i use the same formula for data with more than one column with
duplicates?
for example

col"a" col"b" col"c" col"d"

xxx yyy 10 zzz
xyz abc 20 rst
yzx cab 10 mno
bac def 30 xyz
xyz abc 20 rst
xyz abc 10 rst
yzx cab 10 mno

-now i want to count the number of records excluding the duplicates!
in the above example 2nd & 5th, 3rd & 6th are duplicates.

if i make a count of total records without repetition, i must a get an
answer of 5 ie.(7-2)
 
P

Peo Sjoblom

It's from an add-in that can be downloaded here

http://xcell05.free.fr/

name is Morefunc

OTOH you can easily get this using the subtotal function and advanced
filter, assume that your sample data included headers are in A1:D8, in let's
say E1 put

=SUBTOTAL(3,A2:A8)

will return 7, not select the table (click one cell and then press ctrl and
*

do data>filter>advanced filter, select unique entries only, click OK, check
the subtotal formula which now has changed to 5. to take off the filter
data>filter>show all

another way, in column E in cell E2 put

=A2&B2&C2&D2

copy down to E8, then use the original formula on this help column

=SUMPRODUCT(--(E2:E8<>""),1/COUNTIF(E2:E8,E2:E8&""))


returns 5

HTH



--

Regards,

Peo Sjoblom

Northwest Excel Solutions

Portland, Oregon




via135 said:
hi Mr Aladin!

as far as i know i never heard abt the worksheet function "COUNTDIFF"
in XL!

COULD YOU PL KINDLY EXPLAIN ME ABT THAT???!!!!

-via135




Aladin said:
1] If you download & install the morefunc.xll add-in:

=COUNTDIFF(A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,,"")

which needs to be confirmed with control+shift+enter, not just with
enter.

2] With built-in functions:

=SUMPRODUCT(--((MATCH(A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,A1:A7&"#"&B1:B7&"#"&C1:C7&"#"&D1:D7,0)=ROW(INDEX(A1:A7,0,0))-ROW(A1)+1)))

hi Peo!

for a single column of data the formula is ok!

can i use the same formula for data with more than one column with
duplicates?
for example

col"a" col"b" col"c" col"d"

xxx yyy 10 zzz
xyz abc 20 rst
yzx cab 10 mno
bac def 30 xyz
xyz abc 20 rst
xyz abc 10 rst
yzx cab 10 mno

-now i want to count the number of records excluding the duplicates!
in the above example 2nd & 5th, 3rd & 6th are duplicates.

if i make a count of total records without repetition, i must a get an
answer of 5 ie.(7-2)






Peo Sjoblom Wrote:

I would personally use the variant

=SUMPRODUCT(--(A1:A1000<>""),1/COUNTIF(A1:A1000,A1:A1000&""))

otherwise you'll get DIV/0 errors if there are blank cells, it works
as
follows

the 1/countif part returns an array of numbers, if there is one value
unique
it will return 1,
if there are 2 values that are the same it will return 2 times 0.5 (1/2
=
0.5), if 3 it will return 0.333333, 4 0.25 and so on

assume we have this in A1:A10

1
2
3
4
65
6
1
2
3
4


it would be 6 unique values, the 1/countif returns

{0.5;0.5;0.5;0.5;1;1;0.5;0.5;0.5;0.5}

sumproduct will sum them to return 6, if we change the last number 4 to
1 so
there would be 3 1

{0.333333333333333;0.5;0.5;1;1;1;0.333333333333333;0.5;0.5;0.333333333333333}

still returns the total of 6

I believe former MVP Dave Hager was the originator of it although it
has
been converted from

=SUM(1/COUNTIF))

to sumproduct thus it can be entered normally

--

Regards,

Peo Sjoblom

Northwest Excel Solutions

Portland, Oregon





Bob, I check this news group frequently as a means to learn stuff.

Could you please explain why and how your formula works?

Thank You
--
Greetings from New Zealand
Bill K




=SUMPRODUCT(1/COUNTIF(A1:A1000,A1:A1000))

--

HTH

Bob Phillips

(remove nothere from the email address if mailing direct)

"RJL0323" <[email protected]>

wrote in

message


Hello All,
I have a question related to counting unique values in a column of
data. I will try to illustrate my question. I have a column of

data

with 1000 rows. In this column there are duplicated values. I

would

like to be able to use a function count how many unique values are

in

the column. Let's say there were 4 duplicates of 250 values in the
column. I would like to be able to write a function to calculate

the

250. I am very familiar with Excel and am able to acheive the

number

through subtotals and/or pivot tables. I know I can find how many
instances one specific value appears in the column through

sumproduct

and/or countif statements, but the function to calculate the number

of

unique values has really got me stumped.

Does anyone have any ideas?

Thanks in advance!!
RJ


--
RJL0323


------------------------------------------------------------------------

RJL0323's Profile:

http://www.excelforum.com/member.php?action=getinfo&userid=19456

View this thread:
http://www.excelforum.com/showthread.php?threadid=513331
 
P

Peo Sjoblom

Try this variant

=SUM(IF(FREQUENCY(IF((A2:A8<>"")*(B2:B8<>"")*(C2:C8<>"")*(D2:D8<>""),MATCH(A2:A8&B2:B8&C2:C8&D2:D8,A2:A8&B2:B8&C2:C8&D2:D8,0)),ROW(INDIRECT("1:"&ROWS(A2:A8))))>0,1))

entered with ctrl + shift & enter

Note that I used the same ranges in earlier samples thus your data
(excluding headers are in A2:D8)
also note that if this range is large the formula might choke excel and make
it very slow

the last ROWS part is just to make it flexible, if you know that you always
will have for instance 30 rows (A2:A31)
you could use

=SUM(IF(FREQUENCY(IF((A2:A31<>"")*(B2:B31<>"")*(C2:C31<>"")*(D2:D31<>""),MATCH(A2:A31&B2:B31&C2:C31&D2:D31,A2:A31&B2:B31&C2:C31&D2:D31,0)),ROW(INDIRECT("1:30")))>0,1))

--

Regards,

Peo Sjoblom

Northwest Excel Solutions

Portland, Oregon
 
A

Aladin Akyurek

The data in range A1:E12 including the range with concatenation and headers:

{"Name1","Name2","Name3","Name4",0;
"xxx","yyy",10,"zzz","xxxyyy10zzz";
"xyz",0,20,"cascade","xyz20cascade";
"yzx","cab",10,"mno","yzxcab10mno";
"bac","def",30,0,"bacdef30";
"xyz","abc",20,"rst","xyzabc20rst";
"xyz","abc",10,"rst","xyzabc10rst";
"yzx","cab",10,"mno","yzxcab10mno";
0,0,0,0,"";
0,0,0,0,"";
0,0,0,0,"";
"wer","ewrt",879,"q","werewrt879q"}

The zeroes stand for empty cells.

1]

=SUMPRODUCT(--(E2:E12<>""),1/COUNTIF(E2:E12,E2:E12&""))

2]

{=SUM(IF(FREQUENCY(IF((A2:A12<>"")*(B2:B12<>"")*(C2:C12<>"")*(D2:D12<>""),
MATCH(A2:A12&B2:B12&C2:C12&D2:D12,A2:A12&B2:B12&C2:C12&D2:D12,0)),
ROW(INDIRECT("1:30")))>0,1))}

3]

{=COUNTDIFF(A2:A12&"#"&B2:B12&"#"&C2:C12&"#"&D2:D12,,"")}

4]

=SUMPRODUCT(--((MATCH(A2:A12&"#"&B2:B12&"#"&C2:C12&"#"&D2:D12,
A2:A12&"#"&B2:B12&"#"&C2:C12&"#"&D2:D12,0)
=ROW(INDEX(A2:A12,0,0))-ROW(A2)+1)))

[1] yields: 7, while [2] delivers: 5.

[3] and [4] both yield: 8.
 
V

via135

i'm afraid Peo!
that your sample download gives the result as "7" i/o "5"!
while your earlier post throws the correct result of "5"!!
maybe bcoz of extending the range upto row31 in your sample
download??!!

-via135
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top