Counting unique records in a field

  • Thread starter Thread starter Loris
  • Start date Start date
L

Loris

In Excel 2003, I have a large list (database). In one of the fields I have
multiple company names. Each company may be listed many times. Is there a
way to count the number of different companies that are in that field? I
don't care how many times any one company appears, just how many different
companies there are.
 
Loris,

=COUNTIF(RANGE,"COMPANY NAME") where "RANGE" contains the list of company
names should give you what you want.
 
Try this:

If there might be empty cells and they are to be excluded:

=SUMPRODUCT((A1:A100<>"")/COUNTIF(A1:A100,A1:A100&""))

If there will not be empty cells:

=SUMPRODUCT(1/COUNTIF(A1:A100,A1:A100))

Note that these are slow to calculate on large amounts of data (1000's of
rows).
 
Loris,

I apologize for my earlier response as it was incorrect; I misread your
request.

GMc
 
Don't know if it will speed up anything, but you can replace
SUMPRODUCT with SUM in these two formulas.

Lars-Åke
 
This is exactly what I needed. I wasn't familiar with the SUMPRODUCT
function. Now I have to figure out what it is doing so I can understand why
it works. Thanks much.
 
Hi,

1. Gary - Your idea can be modified as follow, provided there are no spaces:

=SUMPRODUCT(1/COUNTIF(A1:A10,A1:A10))

2. Antares - to understand the formula select a portion of the formula on
the formula bar, such as COUNTIF(A1:A10,A1:A10) and press the F9 key.
Examine the results and press Esc. test another complete piece...

If this helps, please click the Yes button

Cheers,
Shane Devenshire
 
Hi,

If you know there are going to be blanks in your range then you can use:
=SUMPRODUCT(1/COUNTIF(A3:A19,A3:A19&""))-1
if there won't be any blanks for sure then
=SUMPRODUCT(1/COUNTIF(A3:A19,A3:A19))
but if you don't know it's better to use the previously suggested formula.

If this helps, please click the Yes button

Cheers,
Shane Devenshire
 
Hi Lars,

SUMPRODUCT is faster than SUM array entered. It might only be 15-25% faster
but its something.

If this helps, please click the Yes button

Cheers,
Shane Devenshire
 
In this application the SUM array is marginally faster than SUMPRODUCT.

=SUMPRODUCT((A1:A1000<>"")/COUNTIF(A1:A1000,A1:A1000&""))

Average calc time (5 calculations): 0.063036 seconds

=SUM((A1:A1000<>"")/COUNTIF(A1:A1000,A1:A1000&""))

Average calc time (5 calculations): 0.062782 seconds

=SUMPRODUCT(1/COUNTIF(A1:A1000,A1:A1000))

Average calc time (5 calculations): 0.31791 seconds

=SUM(1/COUNTIF(A1:A1000,A1:A1000))

Average calc time (5 calculations): 0.316408 seconds

Calc times measured using Charles Williams' RangeTimer method.
 
Try this little experiemnt. Let's breakdown the formula into the steps it
takes to arrive at a result.

=SUMPRODUCT((A1:A5<>"")/COUNTIF(A1:A5,A1:A5&""))

Enter this data in A1:A5 - A, B, C, D, A

Enter this formula in B1 and copy down to B5:

=A1<>""

Enter this formula in C1 and copy down to C5:

=COUNTIF(A$1:A$5,A1&"")

Enter this formula in D1 and copy down to D5:

=B1/C1

Enter this formula in E1: =SUM(D1:D5)

That's how many unique items are in A1:A5.

Now, change the formula C1 to:

=COUNTIF(A$1:A$5,A1)

Copy down to C5

Now, clear cell C3 and see what happens.
 
Hi Dom!

Long time no see!

I've tested COUNTDIFF on 10s of 1000's of rows and it is by far the fastest.
I haven't tested it yet in Excel 2007 on 100,000's of rows.

The standard SUMPRODUCT((R<>"")/COUNTIF(R,R&"")) brings my machine to its
knees when R=65535 rows.

Too bad MS doesn't have a built-in function to do this.

I'll take a look at the SUM(IF(FREQUENCY(.....))) version.
 
T. Valko said:

Hi Biff!
Long time no see!

Yep, it's been a while! :-)
I've tested COUNTDIFF on 10s of 1000's of rows and it is by far the fastest.

Yeah, if efficiency is a concern, it seems the route to go. :-)
I haven't tested it yet in Excel 2007 on 100,000's of rows.

If you get around to testing it, can you share the results with us?
The standard SUMPRODUCT((R<>"")/COUNTIF(R,R&"")) brings my machine to its
knees when R=65535 rows.

Yeah, same here. And SUM(IF(FREQUENCY(.....))) is pretty well the same.
Too bad MS doesn't have a built-in function to do this.

You would think that MS would have included it with their latest
version. After all, it does seem to be a common task.
I'll take a look at the SUM(IF(FREQUENCY(.....))) version.

It's not only more efficient, but it avoids the bug associated with
SUMPRODUCT, discussed here...

http://groups.google.com/group/microsoft.public.excel.worksheet.functions
/browse_thread/thread/140c19c8fb483fb8/c20ec397f954a6ce?lnk=st&q=sumprodu
ct+group%3A*Excel*+author%3Aharlan&rnum=134#c20ec397f954a6ce

And it's somewhat more robust against special characters, discussed
here...

http://www.mrexcel.com/forum/showpost.php?p=1186773&postcount=20
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top