finding and deleting duplicated lines

  • Thread starter: Jezebel

Jezebel

You can do this with Find and Replace. Tick 'Use wildcards', search for: (*^13)\1
and replace with: \1

For some purposes, a better approach is to copy the whole lot to Excel, use
the data filters, then copy back to Word.
 
Hello all

I work as a translator (German to English)
and keep my terminology lists in MS-Word-2000.

These lists have the following format per line :
GERMAN tab ENGLISH tab SUBJECT para mark;
(so they can easily be converted to Excel or TXT).

One of my lists is the result of mixing several smaller lists
and contains a lot of exact duplicate lines.

My question is:
How can I automatically find and delete such unwanted duplicates?
(The list is far too big to do this by hand.)
Can this be done with a macro?

Thanks in advance for any help
from Ray
 
Hello Jezebel
search for (*^13)\1
replace with \1

I'm afraid that didn't work.

Can you give me a few more detailed tips for trying with Excel filters ?

Thanks, Ray
 
If it didn't work then either your lines are not strictly identical or are
not discrete paragraphs (search and replace can deal with that, too); or one
of us made a typo! :)

The quick way in Excel is to select the column, then select 'unique records
only' from the Filter > Advanced dialog.
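
For anyone who would rather do this outside Office, here is a rough Python sketch of the same idea as the 'unique records only' filter: keep the first copy of each line and drop later exact copies, without changing the order. The sample entries and the name unique_lines are made up for illustration; the lines use the GERMAN tab ENGLISH tab SUBJECT layout described in the question.

```python
def unique_lines(lines):
    """Return the lines with exact duplicates dropped, keeping first occurrences in order."""
    # A dict keeps its keys in insertion order (Python 3.7+), so fromkeys
    # retains the first copy of every line and silently skips later copies.
    return list(dict.fromkeys(lines))


# Hypothetical glossary entries: GERMAN <tab> ENGLISH <tab> SUBJECT
glossary = [
    "Haus\thouse\tgeneral",
    "Baum\ttree\tbotany",
    "Haus\thouse\tgeneral",          # exact duplicate of the first entry
    "Haus\tbuilding\tarchitecture",  # same German term, different line: kept
]

deduped = unique_lines(glossary)
print("\n".join(deduped))
```

Only lines that match character for character are treated as duplicates, so a stray trailing space or a different subject column keeps both copies.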
 
The duplicates must be consecutively placed in the list, so you would have
to sort the list for this to work.
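
To illustrate why the sort matters, here is a small Python sketch (not Word itself, and the sample terms are invented). The helper collapse_adjacent only removes a line when it repeats the line directly before it, which is the same limitation the wildcard search has.

```python
from itertools import groupby


def collapse_adjacent(lines):
    """Drop a line only when it repeats the line directly before it."""
    # groupby starts a new group whenever the value changes, so each run of
    # identical neighbouring lines collapses to a single key.
    return [key for key, _group in groupby(lines)]


# Hypothetical entries where the duplicates are NOT next to each other.
terms = ["Haus\thouse", "Baum\ttree", "Haus\thouse", "Baum\ttree"]

unsorted_result = collapse_adjacent(terms)         # nothing removed
sorted_result = collapse_adjacent(sorted(terms))   # duplicates now adjacent

print(len(unsorted_result), len(sorted_result))
```

On the unsorted list all four lines survive; after sorting, the copies sit next to each other and only two lines remain.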

--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP

My web site www.gmayor.com

<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
 
I'd just assumed that, but you're right; I shouldn't have jumped to that
conclusion.
 
Hello again Jezebel

The Excel approach seems to have worked
- but I'm curious to know how the Word solution works.

Would you mind explaining what the search string represents ?

Thanks again
from Ray
 
Taking it from the inside out, *^13 means any number of characters followed
by character code 13, which is the paragraph mark (i.e., this expression
matches any paragraph). By putting an expression in brackets you can refer
back to whatever it matched by number. In this case there is only one
bracketed expression, hence \1. So (*^13)\1 means any paragraph immediately
followed by an identical copy of itself.
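
The same back-reference idea can be tried outside Word. Below is a minimal Python sketch using the standard re module; it is an analogue, not Word's own syntax. Assumptions: paragraphs are represented as lines ending in "\n", the sample text is invented, and the pattern is anchored with ^ so that it can only start at the beginning of a line.

```python
import re

# ^        start of a line (because of re.MULTILINE)
# (.*\n)   group 1: one whole line, including its line break
# \1+      one or more immediate repeats of exactly what group 1 matched
ADJACENT_DUPLICATES = re.compile(r"^(.*\n)\1+", flags=re.MULTILINE)


def remove_repeated_lines(text):
    """Replace each run of identical consecutive lines with a single copy."""
    return ADJACENT_DUPLICATES.sub(r"\1", text)


sample = "Haus\thouse\nHaus\thouse\nHaus\thouse\nBaum\ttree\n"
cleaned = remove_repeated_lines(sample)
print(cleaned, end="")
```

Because of the trailing +, a run of three or more identical lines is reduced in one pass. A final line with no line break after it is never treated as a repeat.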

Word's wildcards are a simplified relative of what is technically called a
'regular expression' -- a Google search on that term will find plenty of info.
 