PC Review



Comparing 2 MSG files

 
 
peketi
Guest
Posts: n/a
 
      17th Jan 2006
I am writing an application that generates a message digest (MD5) for an MSG
file produced by Outlook. If I generate an MSG file from each of two
PST files, FILEA.PST and COPY_OF_FILEA.PST (COPY_OF_FILEA.PST is a copy of
FILEA.PST), giving A.MSG and A1.MSG, I get a different hash value (message
digest) for each message even though the content of the messages is the same.
What causes this, and is there a workaround?
 
 
 
 
 
Dmitry Streblechenko
Guest
Posts: n/a
 
      17th Jan 2006
Some properties can and will be different, e.g. PR_SEARCH_KEY. Most likely
neither you nor your users care about these properties, but they will
make the MSG files different.
Secondly, the order in which properties are streamed to the MSG file can be
different, so even if the message is the same, the hash will be different.
You need to use a different algorithm to determine whether two messages are
the same, such as comparing the message body, subject, recipients, etc.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
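
The two effects described in the reply above (a volatile property such as PR_SEARCH_KEY, and a different property order) can be illustrated with a toy sketch. This is not the real MSG format: the `name=value` serialization, the `search_key` name, and both helper functions are made up purely for illustration.

```python
import hashlib

def md5_of_stream(props):
    """MD5 over properties serialized in the order given (toy format, not real MSG)."""
    h = hashlib.md5()
    for name, value in props:
        h.update(name.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n")
    return h.hexdigest()

def canonical_md5(props, ignore=("search_key",)):
    """Drop properties we do not care about and sort the rest before hashing."""
    kept = sorted((n, v) for n, v in props if n not in ignore)
    return md5_of_stream(kept)

# Same subject and body, but a different property order and a different
# (hypothetical) search key, as might happen with two exports of one message.
a = [("subject", "Hello"), ("body", "See you at 10."), ("search_key", "AAA")]
b = [("search_key", "BBB"), ("body", "See you at 10."), ("subject", "Hello")]

print(md5_of_stream(a) == md5_of_stream(b))  # False: the byte streams differ
print(canonical_md5(a) == canonical_md5(b))  # True: same content after canonicalizing
```

The point is only that a hash of the raw bytes depends on details the user never sees, while a hash over a chosen, ordered set of fields does not.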
 
peketi
Guest
Posts: n/a
 
      18th Jan 2006
Thanks for the information. Initially I thought the 'Entry ID' was
the only thing that differed, but you have given me more insight into
this problem. Is there a way to get the 'Message-ID' from the message?
Maybe that would help me solve this problem. Your help in this regard would
be highly appreciated.
 
Dmitry Streblechenko
Guest
Posts: n/a
 
      18th Jan 2006
The entry ID is never stored in MSG files, since it only makes sense in the
context of the parent store, which MSG files do not have.
Which message ID do you mean? The one from the MIME headers? Note that some
messages (e.g. the ones stored in the Sent Items folder of a PST store) do
not even have MIME headers.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
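
If the Message-ID from the MIME headers is what is wanted, a minimal sketch of reading it with Python's standard `email` module might look like the following. It assumes the raw header text has already been obtained by some other means (in MAPI it is usually exposed as PR_TRANSPORT_MESSAGE_HEADERS, which the thread does not mention). The function name is hypothetical. As noted in the reply above, some messages have no headers at all, so the function returns None in that case.

```python
from email import message_from_string

def message_id_from_headers(raw_headers):
    """Return the Message-ID header value, or None if there are no headers or no such header."""
    if not raw_headers:
        return None
    msg = message_from_string(raw_headers)
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    return value.strip() if value else None

sample = (
    "From: sender@example.com\n"
    "Message-ID: <1234@example.com>\n"
    "Subject: Test\n"
    "\n"
)
print(message_id_from_headers(sample))  # <1234@example.com>
print(message_id_from_headers(""))      # None
```

Because of the None case, a Message-ID on its own is probably not enough to identify every message; it would need a fallback such as the field-based comparison suggested earlier.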
 
Diegol
Guest
Posts: n/a
 
      15th Feb 2006
Dmitry, I have a similar problem:

We've written an application that processes e-mails we receive in Outlook
and computes their MD5 hash values. So far it works alright for all file
types, except attached .msg files (note we are not saving a certain message
in the .msg format, but rather saving an attached .msg file).
To our surprise, MD5 hashes vary for attached .msg files (and only for
these). I mean, if I save an attached MSG twice, first as 1.msg and then as
2.msg, their hash values differ. Why's that? I assumed Outlook handled
attached .msgs as if they were any other file type, but now it seems not
(other file types get identical hash values when saved twice).
Any clues?
 
Dmitry Streblechenko
Guest
Posts: n/a
 
      16th Feb 2006
Again, on the MSG file level, the binary data is irrelevant.
You must decide which set of properties constitutes a message, then
calculate a hash. E.g. a hash of the concatenated body, subject, sender, sent
date, etc., separated by 0x0, should do the trick.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
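
The suggestion above can be sketched as follows. The choice of fields, their order, the UTF-8 encoding, and the function name are all assumptions; the field values would have to be read from the message by whatever API the application already uses.

```python
import hashlib

def message_fingerprint(subject, sender, sent, body):
    """MD5 over the chosen fields, joined with a 0x0 byte as separator."""
    fields = [subject, sender, sent, body]
    data = b"\x00".join(f.encode("utf-8") for f in fields)
    return hashlib.md5(data).hexdigest()

fp1 = message_fingerprint("Hello", "a@example.com", "2006-01-17 10:00", "See you at 10.")
fp2 = message_fingerprint("Hello", "a@example.com", "2006-01-17 10:00", "See you at 10.")
print(fp1 == fp2)  # True: same fields give the same fingerprint

# The separator keeps field boundaries unambiguous: ("ab", "c") and
# ("a", "bc") would hash identically if the fields were simply concatenated.
print(message_fingerprint("ab", "c", "", "") == message_fingerprint("a", "bc", "", ""))  # False
```

In practice the fields may need normalizing first (for example line endings in the body, or a fixed date format), otherwise two copies of the same message could still produce different fingerprints.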
 
Diegol
Guest
Posts: n/a
 
      16th Feb 2006
Hi Dmitry,

The problem we have is that we treat every attachment as a file.
We copy every attachment to a temporary folder and hash the FILETOSTRING()
result of that temp file.
I don't know how to retrieve those properties from a formerly attached .msg
that is now saved to a folder...

I also need some advice about this... We're using Visual FoxPro for the
utility that processes incoming mail, because we are receiving FoxPro tables,
but I'd like to switch to VB. Any recommendations/comments?

Thank you in advance
Diegol
 
Dmitry Streblechenko
Guest
Posts: n/a
 
      17th Feb 2006
You can save each attachment to a temporary folder using
Attachment.SaveAsFile, calculate the file hash (along with the filename),
then delete the file.
I can't give you any advice on VFP vs. VB: I don't use either of them, sorry.

Dmitry Streblechenko (MVP)
http://www.dimastr.com/
OutlookSpy - Outlook, CDO
and MAPI Developer Tool
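
A rough sketch of the save, hash, delete sequence described above. The `save` callback stands in for Attachment.SaveAsFile and is not a real Outlook call here; `fake_save`, the function names, and the way the filename is mixed into the hash (name, a 0x0 byte, then the file contents) are assumptions made for illustration.

```python
import hashlib
import os
import tempfile

def hash_saved_file(path, display_name):
    """MD5 over the attachment's filename, a 0x0 separator, and the file contents."""
    h = hashlib.md5()
    h.update(display_name.encode("utf-8") + b"\x00")
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def hash_attachment(save, display_name):
    """Save to a temp file via save(path), hash it, and always delete the file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)  # the save callback reopens the path itself
    try:
        save(path)
        return hash_saved_file(path, display_name)
    finally:
        os.remove(path)

def fake_save(path):
    # Stand-in for Attachment.SaveAsFile(path); writes fixed bytes for the demo.
    with open(path, "wb") as f:
        f.write(b"example attachment bytes")

digest = hash_attachment(fake_save, "report.dbf")
print(digest == hash_attachment(fake_save, "report.dbf"))  # True: same bytes, same name
```

This is stable only for attachments whose saved bytes are identical each time. For attached .msg files, which the thread reports as differing between saves, a field-based hash is still needed.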
 
 
 