Decoding MIME encoded email subject

  • Thread starter: b. dougherty

b. dougherty

Greetings all. I am trying to extract subject headers from emails that
have been saved as text files. The subject headers are in MIME UTF-8
format, and so they appear like this:

subject:
=?utf-8?B?QVVUTyBQRU9QTEUgLS0gTWFuaGVpbeKAmXMgSmVmZiBCdW5jaCBpbiBIaWdoYmVhbXM7IExlZ2VuZGFyeSBSZWQgTWNDb21iczsgV2hv4oCZcyBTaGlmdGluZyBHZWFycz87IE1vcmU=?=

What class can I use to decode the subject text?
 
That's not MIME format. MIME provides separation of message parts and
embedding of messages within other messages.

That's probably base64 or uuencode or something like that, not sure
exactly. The MIME headers should include a Content-Transfer-Encoding
line, which says what encoding is used for the rest of the message.
Most commonly, MIME messages are encoded with Quoted-Printable for text
and Base64 for binary. QP looks pretty much like regular text with a
lot of extra = signs.

If this isn't enough info, post more of the MIME message.

Sam
 
Sorry, it appears to be a message header extension, formatted as
described in section 4.1 of this:

http://tools.ietf.org/html/rfc2047

Any idea what class can decode this? Here's a larger snippet of the
mail:

--------------------------------------------------------------------

Content-Type: message/rfc822

Received: from SERVER ([x.x.x.x]) by x.com with Microsoft
SMTPSVC(6.0.3790.1830);
Wed, 13 Dec 2006 22:12:17 -0800
mime-version: 1.0
from: "User" <[email protected]>
to: (e-mail address removed)
date: 13 Dec 2006 22:12:17 -0800
subject:
=?utf-8?B?QVVUTyBQRU9QTEUgLS0gTWFuaGVpbeKAmXMgSmVmZiBCdW5jaCBpbiBIaWdoYmVhbXM7IExlZ2VuZGFyeSBSZWQgTWNDb21iczsgV2hv4oCZcyBTaGlmdGluZyBHZWFycz87IE1vcmU=?=
content-type: multipart/mixed;
boundary=--boundary_54358_dc8ddb80-9498-4b90-8e3e-3d2c411a5160
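
For the extraction step itself, here is a minimal sketch of pulling one raw header value out of a saved message. It assumes that continuation lines in the saved file start with a space or tab (standard header folding), which the forum formatting may have stripped from the snippet above. The names HeaderReader and GetHeader are made up for illustration. The sketch simply takes the first line in the file that starts with the header name, so it does not distinguish headers from body text.

```csharp
using System;
using System.IO;
using System.Text;

public static class HeaderReader // hypothetical helper, not from the thread
{
    // Returns the raw (still encoded) value of the first header with the
    // given name, or null if no such header line is found.
    public static string GetHeader(TextReader reader, string name)
    {
        string prefix = name + ":";
        StringBuilder value = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (value != null)
            {
                // A folded continuation line starts with a space or tab.
                if (line.Length > 0 && (line[0] == ' ' || line[0] == '\t'))
                {
                    value.Append(' ').Append(line.Trim());
                    continue;
                }
                break; // the next header (or a blank line) ends the value
            }
            if (line.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
                value = new StringBuilder(line.Substring(prefix.Length).Trim());
        }
        return value == null ? null : value.ToString().Trim();
    }
}
```

Something like HeaderReader.GetHeader(File.OpenText(path), "subject") would then return the =?utf-8?B?...?= text, which still has to be decoded.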
 
Try this:

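using System;
using System.Text;
using System.Text.RegularExpressions;

public static string Decode(string s)
{
    // Match one RFC 2047 encoded-word in the "B" (Base64) form:
    // =?charset?B?base64-data?=
    Match m = Regex.Match(s, @"=\?([^?]+)\?[Bb]\?([^?]*)\?=");
    if (!m.Success)
        return s; // no encoded-word found; return the input unchanged
    string charset = m.Groups[1].Value; // e.g. "utf-8"
    string data = m.Groups[2].Value;    // the Base64 payload
    byte[] b = Convert.FromBase64String(data);
    return Encoding.GetEncoding(charset).GetString(b);
}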

Arne
 
Arne, that worked perfectly. Thank you very much!
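
A note for later readers: Arne's method handles a single encoded-word in the "B" (Base64) form, which is all the subject in this thread needs. RFC 2047 also defines a "Q" form (similar to Quoted-Printable), and a header may contain several encoded-words separated by whitespace. The following is only a sketch of how both cases might be covered, not a complete RFC 2047 implementation. The names EncodedWordDecoder and DecodeQ are made up for illustration, and unknown charsets or malformed Base64 will still throw.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class EncodedWordDecoder // hypothetical helper, not from the thread
{
    // =?charset?encoding?encoded-text?= where encoding is B or Q
    private static readonly Regex EncodedWord =
        new Regex(@"=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=");

    // Whitespace between two adjacent encoded-words is not part of the text.
    private static readonly Regex GapBetweenWords =
        new Regex(@"(\?=)\s+(?==\?[^?\s]+\?[BbQq]\?)");

    public static string Decode(string header)
    {
        string joined = GapBetweenWords.Replace(header, "$1");
        return EncodedWord.Replace(joined, m =>
        {
            Encoding enc = Encoding.GetEncoding(m.Groups[1].Value);
            string text = m.Groups[3].Value;
            byte[] bytes = m.Groups[2].Value.ToUpperInvariant() == "B"
                ? Convert.FromBase64String(text)
                : DecodeQ(text);
            return enc.GetString(bytes);
        });
    }

    // "Q" form: "_" is a space, "=XX" is a hex byte, anything else is literal.
    private static byte[] DecodeQ(string text)
    {
        var bytes = new List<byte>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '_')
            {
                bytes.Add((byte)' ');
            }
            else if (c == '=' && i + 2 < text.Length
                     && Uri.IsHexDigit(text[i + 1]) && Uri.IsHexDigit(text[i + 2]))
            {
                bytes.Add(Convert.ToByte(text.Substring(i + 1, 2), 16));
                i += 2; // skip the two hex digits just consumed
            }
            else
            {
                bytes.Add((byte)c);
            }
        }
        return bytes.ToArray();
    }
}
```

Text outside encoded-words is passed through unchanged, so the whole header value can be handed to Decode. On .NET Core and later, charsets such as windows-1252 are only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).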


