Removing Duplicates

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi

I am running v.2000 and need some help. I have an excel sheet containing
multiple columns and 1000s of rows.

I know how to delete rows which are completely identical (advanced filter
etc) but I need to know the following:

If there are two rows which are generally different (and so won't be deleted
by the advanced filter method), but each have an identical value in column D,
how would I get rid of the entire row containing the duplicate cell?

So for example if I had three rows as follows:

Mr | Joe | Bloggs | London
Mr | Tom | Jones | Glasgow
Mr | Ben | Jones | Essex

None of these would count as traditional duplicates via the advanced filter
method.

But what if I wanted to get rid of duplicate surnames - in the above example
I would ideally want to be left with just one of the "Jones" rows - how can I
do this?

It is worth noting that I have only basic experience with excel and so am
not familiar with macros etc - I tried one macro tutorial that didnt work :-S

Hope someone can help!

Thanks
Danielle
 
Perhaps something like this:

With your list in Cells A3:D4000
and list headings in cells A3:D3 (Title, FName, Lname, Location)

F1: DupTest
F2: =COUNTIF(C$3:C4,C4)>1

Select your list (A3:D4000)
Data>Filter>Advanced Filter
List Range: (already selected)
Criteria Range: $F$1:$F$2

Click the [OK] button to filter the list in place.

That will hide the first occurrence of a LName and display only the
duplicates.

Select the visible cells below the column headings
Edit|Delete (you can only delete entire rows in a filtered list)

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro
 
Hi Ron

Thanks for your answer. I am afraid I dont understand some parts of your
reply, though - can you clarify for me?:

How would the list be in D4000 when there are only 3 items (in the example)?
Or is this based on there being 4000 records?

What is F1 and F2 referring to? What do I have to do with them?

If you can let me know and I can try it again :-)

Thanks
Danielle

Ron Coderre said:
Perhaps something like this:

With your list in Cells A3:D4000
and list headings in cells A3:D3 (Title, FName, Lname, Location)

F1: DupTest
F2: =COUNTIF(C$3:C4,C4)>1

Select your list (A3:D4000)
Data>Filter>Advanced Filter
List Range: (already selected)
Criteria Range: $F$1:$F$2

Click the [OK] button to filter the list in place.

That will hide the first occurrence of a LName and display only the
duplicates.

Select the visible cells below the column headings
Edit|Delete (you can only delete entire rows in a filtered list)

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


Danielle said:
Hi

I am running v.2000 and need some help. I have an excel sheet containing
multiple columns and 1000s of rows.

I know how to delete rows which are completely identical (advanced filter
etc) but I need to know the following:

If there are two rows which are generally different (and so won't be deleted
by the advanced filter method), but each have an identical value in column D,
how would I get rid of the entire row containing the duplicate cell?

So for example if I had three rows as follows:

Mr | Joe | Bloggs | London
Mr | Tom | Jones | Glasgow
Mr | Ben | Jones | Essex

None of these would count as traditional duplicates via the advanced filter
method.

But what if I wanted to get rid of duplicate surnames - in the above example
I would ideally want to be left with just one of the "Jones" rows - how can I
do this?

It is worth noting that I have only basic experience with excel and so am
not familiar with macros etc - I tried one macro tutorial that didnt work :-S

Hope someone can help!

Thanks
Danielle
 
OK

Regarding the assumptions and criteria:
1)As you know, for Advanced Filters to work, they need column headings on
the list. Hence the: Title, FName, Lname, Location

2)The range A3:D4000 is just as an example, but I made sure the data rows of
the list (under the column headings) would be under the criteria rows so they
wouldn't be deleted in the duplicate removal.

3)Since I used a formulaic criteria, I couldn't use a list column heading. I
could have left cell F1 blank, but instead i just used a descriptive heading
that didn't match any of the list headings.

4)Notice the single dollar sign ($) in the criteria forumula. That formula
will play out this way:

Testing cell C4, the formula will be: =COUNTIF(C$3:C4,C4)>1
Testing cell C5, the formula will be: =COUNTIF(C$3:C5,C5)>1
Testing cell C6, the formula will be: =COUNTIF(C$3:C6,C6)>1
etc

Consequently, the first time the formula encounters a value, its count will
be 1. All other instances will have a count of greater than 1.

The Advanced filter only displays records where the item count is greater
than 1....the duplicates.

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


Danielle said:
Hi Ron

Thanks for your answer. I am afraid I dont understand some parts of your
reply, though - can you clarify for me?:

How would the list be in D4000 when there are only 3 items (in the example)?
Or is this based on there being 4000 records?

What is F1 and F2 referring to? What do I have to do with them?

If you can let me know and I can try it again :-)

Thanks
Danielle

Ron Coderre said:
Perhaps something like this:

With your list in Cells A3:D4000
and list headings in cells A3:D3 (Title, FName, Lname, Location)

F1: DupTest
F2: =COUNTIF(C$3:C4,C4)>1

Select your list (A3:D4000)
Data>Filter>Advanced Filter
List Range: (already selected)
Criteria Range: $F$1:$F$2

Click the [OK] button to filter the list in place.

That will hide the first occurrence of a LName and display only the
duplicates.

Select the visible cells below the column headings
Edit|Delete (you can only delete entire rows in a filtered list)

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


Danielle said:
Hi

I am running v.2000 and need some help. I have an excel sheet containing
multiple columns and 1000s of rows.

I know how to delete rows which are completely identical (advanced filter
etc) but I need to know the following:

If there are two rows which are generally different (and so won't be deleted
by the advanced filter method), but each have an identical value in column D,
how would I get rid of the entire row containing the duplicate cell?

So for example if I had three rows as follows:

Mr | Joe | Bloggs | London
Mr | Tom | Jones | Glasgow
Mr | Ben | Jones | Essex

None of these would count as traditional duplicates via the advanced filter
method.

But what if I wanted to get rid of duplicate surnames - in the above example
I would ideally want to be left with just one of the "Jones" rows - how can I
do this?

It is worth noting that I have only basic experience with excel and so am
not familiar with macros etc - I tried one macro tutorial that didnt work :-S

Hope someone can help!

Thanks
Danielle
 
Hello again,

Thanks - it's much clearer now and I understand what you mean.

However I still dont seem to be getting the right result :(

I am basically writing the TestDup in the header, then putting the formula
into the F2 column and the copying it down for each record.

I have tried various methods including selecting and not selecting the F
colum, choosing in-place or copy to another location, and choosing and not
choosing unique fields only, but dont seem to be getting anywhere.

Its worth noting that for the F fields with the formula in, it comes back
with either a FALSE or TRUE value, but only a couple of my records are
showing true, and they arent duplicates. The ducplicates that are there still
say False. Could this be a problem?

Sorry to be such a difficult one!

Danielle

Ron Coderre said:
OK

Regarding the assumptions and criteria:
1)As you know, for Advanced Filters to work, they need column headings on
the list. Hence the: Title, FName, Lname, Location

2)The range A3:D4000 is just as an example, but I made sure the data rows of
the list (under the column headings) would be under the criteria rows so they
wouldn't be deleted in the duplicate removal.

3)Since I used a formulaic criteria, I couldn't use a list column heading. I
could have left cell F1 blank, but instead i just used a descriptive heading
that didn't match any of the list headings.

4)Notice the single dollar sign ($) in the criteria forumula. That formula
will play out this way:

Testing cell C4, the formula will be: =COUNTIF(C$3:C4,C4)>1
Testing cell C5, the formula will be: =COUNTIF(C$3:C5,C5)>1
Testing cell C6, the formula will be: =COUNTIF(C$3:C6,C6)>1
etc

Consequently, the first time the formula encounters a value, its count will
be 1. All other instances will have a count of greater than 1.

The Advanced filter only displays records where the item count is greater
than 1....the duplicates.

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


Danielle said:
Hi Ron

Thanks for your answer. I am afraid I dont understand some parts of your
reply, though - can you clarify for me?:

How would the list be in D4000 when there are only 3 items (in the example)?
Or is this based on there being 4000 records?

What is F1 and F2 referring to? What do I have to do with them?

If you can let me know and I can try it again :-)

Thanks
Danielle

Ron Coderre said:
Perhaps something like this:

With your list in Cells A3:D4000
and list headings in cells A3:D3 (Title, FName, Lname, Location)

F1: DupTest
F2: =COUNTIF(C$3:C4,C4)>1

Select your list (A3:D4000)
Data>Filter>Advanced Filter
List Range: (already selected)
Criteria Range: $F$1:$F$2

Click the [OK] button to filter the list in place.

That will hide the first occurrence of a LName and display only the
duplicates.

Select the visible cells below the column headings
Edit|Delete (you can only delete entire rows in a filtered list)

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


:

Hi

I am running v.2000 and need some help. I have an excel sheet containing
multiple columns and 1000s of rows.

I know how to delete rows which are completely identical (advanced filter
etc) but I need to know the following:

If there are two rows which are generally different (and so won't be deleted
by the advanced filter method), but each have an identical value in column D,
how would I get rid of the entire row containing the duplicate cell?

So for example if I had three rows as follows:

Mr | Joe | Bloggs | London
Mr | Tom | Jones | Glasgow
Mr | Ben | Jones | Essex

None of these would count as traditional duplicates via the advanced filter
method.

But what if I wanted to get rid of duplicate surnames - in the above example
I would ideally want to be left with just one of the "Jones" rows - how can I
do this?

It is worth noting that I have only basic experience with excel and so am
not familiar with macros etc - I tried one macro tutorial that didnt work :-S

Hope someone can help!

Thanks
Danielle
 
Try just following the instructions exactly as i posted them.

(With the Advanced Filter method I posted, you don't need to copy the
formula down the side of your data. The filter will adjust the formula
"internally" as it processes records. Consequently, you'll only need the one
formula in the criterial.
And DON'T put a column heading labelled TestDup in your data or it will
confuse the Advanced Filter.)


Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


Danielle said:
Hello again,

Thanks - it's much clearer now and I understand what you mean.

However I still dont seem to be getting the right result :(

I am basically writing the TestDup in the header, then putting the formula
into the F2 column and the copying it down for each record.

I have tried various methods including selecting and not selecting the F
colum, choosing in-place or copy to another location, and choosing and not
choosing unique fields only, but dont seem to be getting anywhere.

Its worth noting that for the F fields with the formula in, it comes back
with either a FALSE or TRUE value, but only a couple of my records are
showing true, and they arent duplicates. The ducplicates that are there still
say False. Could this be a problem?

Sorry to be such a difficult one!

Danielle

Ron Coderre said:
OK

Regarding the assumptions and criteria:
With your list in Cells A3:D4000
and list headings in cells A3:D3 (Title, FName, Lname, Location)
F1: DupTest
F2: =COUNTIF(C$3:C4,C4)>1

1)As you know, for Advanced Filters to work, they need column headings on
the list. Hence the: Title, FName, Lname, Location

2)The range A3:D4000 is just as an example, but I made sure the data rows of
the list (under the column headings) would be under the criteria rows so they
wouldn't be deleted in the duplicate removal.

3)Since I used a formulaic criteria, I couldn't use a list column heading. I
could have left cell F1 blank, but instead i just used a descriptive heading
that didn't match any of the list headings.

4)Notice the single dollar sign ($) in the criteria forumula. That formula
will play out this way:

Testing cell C4, the formula will be: =COUNTIF(C$3:C4,C4)>1
Testing cell C5, the formula will be: =COUNTIF(C$3:C5,C5)>1
Testing cell C6, the formula will be: =COUNTIF(C$3:C6,C6)>1
etc

Consequently, the first time the formula encounters a value, its count will
be 1. All other instances will have a count of greater than 1.

The Advanced filter only displays records where the item count is greater
than 1....the duplicates.

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


Danielle said:
Hi Ron

Thanks for your answer. I am afraid I dont understand some parts of your
reply, though - can you clarify for me?:

How would the list be in D4000 when there are only 3 items (in the example)?
Or is this based on there being 4000 records?

What is F1 and F2 referring to? What do I have to do with them?

If you can let me know and I can try it again :-)

Thanks
Danielle

:

Perhaps something like this:

With your list in Cells A3:D4000
and list headings in cells A3:D3 (Title, FName, Lname, Location)

F1: DupTest
F2: =COUNTIF(C$3:C4,C4)>1

Select your list (A3:D4000)
Data>Filter>Advanced Filter
List Range: (already selected)
Criteria Range: $F$1:$F$2

Click the [OK] button to filter the list in place.

That will hide the first occurrence of a LName and display only the
duplicates.

Select the visible cells below the column headings
Edit|Delete (you can only delete entire rows in a filtered list)

Does that help?

***********
Regards,
Ron

XL2002, WinXP-Pro


:

Hi

I am running v.2000 and need some help. I have an excel sheet containing
multiple columns and 1000s of rows.

I know how to delete rows which are completely identical (advanced filter
etc) but I need to know the following:

If there are two rows which are generally different (and so won't be deleted
by the advanced filter method), but each have an identical value in column D,
how would I get rid of the entire row containing the duplicate cell?

So for example if I had three rows as follows:

Mr | Joe | Bloggs | London
Mr | Tom | Jones | Glasgow
Mr | Ben | Jones | Essex

None of these would count as traditional duplicates via the advanced filter
method.

But what if I wanted to get rid of duplicate surnames - in the above example
I would ideally want to be left with just one of the "Jones" rows - how can I
do this?

It is worth noting that I have only basic experience with excel and so am
not familiar with macros etc - I tried one macro tutorial that didnt work :-S

Hope someone can help!

Thanks
Danielle
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top