Finding the most common cell duplicate (text)

J

juggo

Hey,

Here's my problem/question.

I have a big excel file with about 6000 rows of data (mostly text).
The 6,000 entries in Column B are all text and that is the data I'm
interested in. Each entry contains a person's name. My goal is to
figure out which name's are the most commonly listed. Yes I could just
sort by that particular column and I've done that. But with 6000+
entries, it's still hard to tell which are listed the most.

So my aim would be something like the following...

If I have this in Column B (shorter version, what is listed after the
'--' would be the cell contents):
Row1 -- Jake Scott
Row2 -- Donald Smith
Row3 -- Tim Matthews
Row4 -- Donald Smith
Row5 -- Jake Scott
Row6 -- Donald Smith

For it to be able to tell me that 'Jake Scott' occurs 2 times, 'Donald
Smith' occurs 3 times, and 'Tim Matthews' occurs 1 time. And then sort
those numbers so I can have a list of the most common entries.

Would this be possible? If so, how? If not, any other ideas outside
of Excel?

regards,

Jon
 
D

duane

try this

1) sort your list (maybe copy to another sheet first)
2) in cell c1 and copied down
=IF(C1>C2,C1,0)
3) in cell d1 and copied down
=IF(C2>C3,C2,0)
4) copy columns c and d and paste special as values
5) sort columns B thru d based on column d (descending) to get a list
of names and occurances - any duplicate names will have a zero in
column d
 
J

juggo

Duane,

Thanks for the reply. I followed you until step #5.

How, exactly, do I "sort columns B thru d based on column d
(descending)"?

I went to the Data Sort dialogue box, but I don't see anything about
sorting a column 'through' or 'based on' anything.

regards,

Jon
 
J

juggo

Nevermind, I think I found a way to do it through the 'Subtotals'
function.

Not the prettiest output, but it works.

Thanks anyway.
 
R

Ron Rosenfeld

Hey,

Here's my problem/question.

I have a big excel file with about 6000 rows of data (mostly text).
The 6,000 entries in Column B are all text and that is the data I'm
interested in. Each entry contains a person's name. My goal is to
figure out which name's are the most commonly listed. Yes I could just
sort by that particular column and I've done that. But with 6000+
entries, it's still hard to tell which are listed the most.

So my aim would be something like the following...

If I have this in Column B (shorter version, what is listed after the
'--' would be the cell contents):
Row1 -- Jake Scott
Row2 -- Donald Smith
Row3 -- Tim Matthews
Row4 -- Donald Smith
Row5 -- Jake Scott
Row6 -- Donald Smith

For it to be able to tell me that 'Jake Scott' occurs 2 times, 'Donald
Smith' occurs 3 times, and 'Tim Matthews' occurs 1 time. And then sort
those numbers so I can have a list of the most common entries.

Would this be possible? If so, how? If not, any other ideas outside
of Excel?

regards,

Jon

You could use a Pivot Table, or Data/Sort Data/Subtotals

1. Insert a "Label Row" at Row 1
2. B1: Names
C1: Frequency
C2: =COUNTIF(rng,B2)
(where rng is a named reference or an absolute reference to your list of names
in B2:Bn)

3. Fill down from C2:Cn

4. Data/Sort Descending on Frequency

5. Data/Subtotals
At each change in Names
Use Function Count
Add Subtotal to Frequency

Click on the <2> to collapse to just the Count of Names.

------------------
For a Pivot Table, no need for the second column.

Just use the Pivot Table wizard; drag Names to Rows and Names to Data area. It
should give you a count and, if necessary, you should be able to sort it.


--ron
 
B

Biff

Hi!

The easy way:

Use an advanced filter and copy the unique entries to another column. Then
use a Countif formula.

Biff
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top