Counting unique records in a field

L

Loris

In Excel 2003, I have a large list (database). In one of the fields I have
multiple company names. Each company may be listed many times. Is there a
way to count the number of different companies that are in that field? I
don't care how many times any one company appears, just how many different
companies there are.
 
G

Gary Mc

Loris,

=COUNTIF(RANGE,"COMPANY NAME") where "RANGE" contains the list of company
names should give you what you want.
 
T

T. Valko

Try this:

If there might be empty cells and they are to be excluded:

=SUMPRODUCT((A1:A100<>"")/COUNTIF(A1:A100,A1:A100&""))

If there will not be empty cells:

=SUMPRODUCT(1/COUNTIF(A1:A100,A1:A100))

Note that these are slow to calculate on large amounts of data (1000's of
rows).
 
G

Gary Mc

Loris,

I apologize for my earlier response as it was incorrect; I misread your
request.

GMc
 
L

Lars-Åke Aspelin

Don't know if it will speed up anything, but you can replace
SUMPRODUCT with SUM in these two formulas.

Lars-Åke
 
L

Loris

This is exactly what I needed. I wasn't familiar with the SUMPRODUCT
function. Now I have to figure out what it is doing so I can understand why
it works. Thanks much.
 
S

Shane Devenshire

Hi,

1. Gary - Your idea can be modified as follow, provided there are no spaces:

=SUMPRODUCT(1/COUNTIF(A1:A10,A1:A10))

2. Antares - to understand the formula select a portion of the formula on
the formula bar, such as COUNTIF(A1:A10,A1:A10) and press the F9 key.
Examine the results and press Esc. test another complete piece...

If this helps, please click the Yes button

Cheers,
Shane Devenshire
 
S

Shane Devenshire

Hi,

If you know there are going to be blanks in your range then you can use:
=SUMPRODUCT(1/COUNTIF(A3:A19,A3:A19&""))-1
if there won't be any blanks for sure then
=SUMPRODUCT(1/COUNTIF(A3:A19,A3:A19))
but if you don't know it's better to use the previously suggested formula.

If this helps, please click the Yes button

Cheers,
Shane Devenshire
 
S

Shane Devenshire

Hi Lars,

SUMPRODUCT is faster than SUM array entered. It might only be 15-25% faster
but its something.

If this helps, please click the Yes button

Cheers,
Shane Devenshire
 
T

T. Valko

In this application the SUM array is marginally faster than SUMPRODUCT.

=SUMPRODUCT((A1:A1000<>"")/COUNTIF(A1:A1000,A1:A1000&""))

Average calc time (5 calculations): 0.063036 seconds

=SUM((A1:A1000<>"")/COUNTIF(A1:A1000,A1:A1000&""))

Average calc time (5 calculations): 0.062782 seconds

=SUMPRODUCT(1/COUNTIF(A1:A1000,A1:A1000))

Average calc time (5 calculations): 0.31791 seconds

=SUM(1/COUNTIF(A1:A1000,A1:A1000))

Average calc time (5 calculations): 0.316408 seconds

Calc times measured using Charles Williams' RangeTimer method.
 
T

T. Valko

Try this little experiemnt. Let's breakdown the formula into the steps it
takes to arrive at a result.

=SUMPRODUCT((A1:A5<>"")/COUNTIF(A1:A5,A1:A5&""))

Enter this data in A1:A5 - A, B, C, D, A

Enter this formula in B1 and copy down to B5:

=A1<>""

Enter this formula in C1 and copy down to C5:

=COUNTIF(A$1:A$5,A1&"")

Enter this formula in D1 and copy down to D5:

=B1/C1

Enter this formula in E1: =SUM(D1:D5)

That's how many unique items are in A1:A5.

Now, change the formula C1 to:

=COUNTIF(A$1:A$5,A1)

Copy down to C5

Now, clear cell C3 and see what happens.
 
T

T. Valko

Hi Dom!

Long time no see!

I've tested COUNTDIFF on 10s of 1000's of rows and it is by far the fastest.
I haven't tested it yet in Excel 2007 on 100,000's of rows.

The standard SUMPRODUCT((R<>"")/COUNTIF(R,R&"")) brings my machine to its
knees when R=65535 rows.

Too bad MS doesn't have a built-in function to do this.

I'll take a look at the SUM(IF(FREQUENCY(.....))) version.
 
D

Domenic

T. Valko said:

Hi Biff!
Long time no see!

Yep, it's been a while! :)
I've tested COUNTDIFF on 10s of 1000's of rows and it is by far the fastest.

Yeah, if efficiency is a concern, it seems the route to go. :)
I haven't tested it yet in Excel 2007 on 100,000's of rows.

If you get around to testing it, can you share the results with us?
The standard SUMPRODUCT((R<>"")/COUNTIF(R,R&"")) brings my machine to its
knees when R=65535 rows.

Yeah, same here. And SUM(IF(FREQUENCY(.....))) is pretty well the same.
Too bad MS doesn't have a built-in function to do this.

You would think that MS would have included it with their latest
version. After all, it does seem to be a common task.
I'll take a look at the SUM(IF(FREQUENCY(.....))) version.

It's not only more efficient, but it avoids the bug associated with
SUMPRODUCT, discussed here...

http://groups.google.com/group/microsoft.public.excel.worksheet.functions
/browse_thread/thread/140c19c8fb483fb8/c20ec397f954a6ce?lnk=st&q=sumprodu
ct+group%3A*Excel*+author%3Aharlan&rnum=134#c20ec397f954a6ce

And it's somewhat more robust against special characters, discussed
here...

http://www.mrexcel.com/forum/showpost.php?p=1186773&postcount=20
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top