Finding the most common cell duplicate (text)

  • Thread starter Thread starter juggo
  • Start date Start date
J

juggo

Hey,

Here's my problem/question.

I have a big excel file with about 6000 rows of data (mostly text).
The 6,000 entries in Column B are all text and that is the data I'm
interested in. Each entry contains a person's name. My goal is to
figure out which name's are the most commonly listed. Yes I could just
sort by that particular column and I've done that. But with 6000+
entries, it's still hard to tell which are listed the most.

So my aim would be something like the following...

If I have this in Column B (shorter version, what is listed after the
'--' would be the cell contents):
Row1 -- Jake Scott
Row2 -- Donald Smith
Row3 -- Tim Matthews
Row4 -- Donald Smith
Row5 -- Jake Scott
Row6 -- Donald Smith

For it to be able to tell me that 'Jake Scott' occurs 2 times, 'Donald
Smith' occurs 3 times, and 'Tim Matthews' occurs 1 time. And then sort
those numbers so I can have a list of the most common entries.

Would this be possible? If so, how? If not, any other ideas outside
of Excel?

regards,

Jon
 
try this

1) sort your list (maybe copy to another sheet first)
2) in cell c1 and copied down
=IF(C1>C2,C1,0)
3) in cell d1 and copied down
=IF(C2>C3,C2,0)
4) copy columns c and d and paste special as values
5) sort columns B thru d based on column d (descending) to get a list
of names and occurances - any duplicate names will have a zero in
column d
 
Duane,

Thanks for the reply. I followed you until step #5.

How, exactly, do I "sort columns B thru d based on column d
(descending)"?

I went to the Data Sort dialogue box, but I don't see anything about
sorting a column 'through' or 'based on' anything.

regards,

Jon
 
Nevermind, I think I found a way to do it through the 'Subtotals'
function.

Not the prettiest output, but it works.

Thanks anyway.
 
Hey,

Here's my problem/question.

I have a big excel file with about 6000 rows of data (mostly text).
The 6,000 entries in Column B are all text and that is the data I'm
interested in. Each entry contains a person's name. My goal is to
figure out which name's are the most commonly listed. Yes I could just
sort by that particular column and I've done that. But with 6000+
entries, it's still hard to tell which are listed the most.

So my aim would be something like the following...

If I have this in Column B (shorter version, what is listed after the
'--' would be the cell contents):
Row1 -- Jake Scott
Row2 -- Donald Smith
Row3 -- Tim Matthews
Row4 -- Donald Smith
Row5 -- Jake Scott
Row6 -- Donald Smith

For it to be able to tell me that 'Jake Scott' occurs 2 times, 'Donald
Smith' occurs 3 times, and 'Tim Matthews' occurs 1 time. And then sort
those numbers so I can have a list of the most common entries.

Would this be possible? If so, how? If not, any other ideas outside
of Excel?

regards,

Jon

You could use a Pivot Table, or Data/Sort Data/Subtotals

1. Insert a "Label Row" at Row 1
2. B1: Names
C1: Frequency
C2: =COUNTIF(rng,B2)
(where rng is a named reference or an absolute reference to your list of names
in B2:Bn)

3. Fill down from C2:Cn

4. Data/Sort Descending on Frequency

5. Data/Subtotals
At each change in Names
Use Function Count
Add Subtotal to Frequency

Click on the <2> to collapse to just the Count of Names.

------------------
For a Pivot Table, no need for the second column.

Just use the Pivot Table wizard; drag Names to Rows and Names to Data area. It
should give you a count and, if necessary, you should be able to sort it.


--ron
 
Hi!

The easy way:

Use an advanced filter and copy the unique entries to another column. Then
use a Countif formula.

Biff
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top