Logic for Finding "Near Duplicates" in Discussion Threads

Chris Hansen · Feb 13, 2004

I'm interested in finding some logic to help me
identify "near duplicates" in Outlook e-mail populations.
Basically, I'm interested in locating only the most
complete e-mail in a discussion thread chain. Now in many
cases that would be the latest e-mail, but not always,
since a user, in replying or forwarding, can alter the
contents of the foregoing chain. So a "near duplicate,"
by my definition, would contain the same or less content
than another e-mail in the thread, and no text that is
unique to it. Has anyone cracked this nut before? Thanks much!

Ken Slovak - [MVP - Outlook] · Feb 13, 2004

You can locate messages in the same conversation thread using the
ConversationTopic property. Items with the same ConversationTopic are
ordered in the conversation by the ConversationIndex property.

From there you can read Item.Body on each message and compare the
text across the thread. I don't know of a way to do what you want in
the UI, only in code.
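
A rough sketch of that comparison in Python, under a few assumptions: the messages have already been pulled out of Outlook into plain records, with `topic`, `index`, and `body` standing in for ConversationTopic, ConversationIndex (as a hex string), and Item.Body. The `Message` class and the `content_lines` and `most_complete` helpers are hypothetical names for illustration, not part of the Outlook object model. A message is treated as a near duplicate when every normalized line of its body also appears in another message of the same thread. That is one possible reading of "same or less content, nothing unique," not the only one.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    topic: str  # assumed to hold ConversationTopic
    index: str  # assumed to hold ConversationIndex as a hex string
    body: str   # assumed to hold Item.Body (plain text)


def content_lines(body):
    """Return the set of normalized, non-empty lines in a message body."""
    lines = set()
    for raw in body.splitlines():
        # Drop leading reply-quote markers such as "> " or ">> ".
        line = raw.lstrip("> \t")
        # Collapse whitespace and ignore case so quoting does not matter.
        line = " ".join(line.split()).lower()
        if line:
            lines.add(line)
    return frozenset(lines)


def most_complete(messages):
    """Keep only messages whose content is not covered by another
    message in the same thread."""
    threads = defaultdict(list)
    for msg in messages:
        threads[msg.topic].append(msg)

    keep = []
    for thread in threads.values():
        # Sorting the index strings puts replies after their parents.
        thread.sort(key=lambda m: m.index)
        contents = [content_lines(m.body) for m in thread]
        for i, ci in enumerate(contents):
            redundant = any(
                # Strict subset: message j has everything i has, and more.
                # Equal content: keep only the later copy.
                ci < cj or (ci == cj and j > i)
                for j, cj in enumerate(contents)
                if j != i
            )
            if not redundant:
                keep.append(thread[i])
    return keep
```

Because the comparison is line by line, text that a mail client re-wrapped when quoting will look unique; comparing sentences or word shingles instead would be more forgiving. Note also that a thread can end up with more than one "most complete" message when it branches into separate replies.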
