convert to utf-8, part II

Albert Jan

Hi,

In my quest to properly display email messages, I have overcome the problem
of decoding strings like

"=?GB2312?B?s8q5q8u+vq3A7aGissbO8bK/w8W1xNK7t+LQxQ==?="

(which appears to be an 'encoded-word' as explained in RFC 2047) by using
Convert.FromBase64String, as Stefen kindly suggested (see my post from
yesterday).
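
For reference, that decoding step can be sketched roughly as follows. This is only a minimal sketch: the class and method names are hypothetical, it handles only the "B" (base64) form of an encoded-word and not the "Q" form, and the RegisterProvider call assumes .NET Core 3.0 or later, where GB2312 is not available by default (on the classic .NET Framework that line should not be needed).

```csharp
using System;
using System.Text;

static class EncodedWordSketch
{
    // Hypothetical helper: decodes one B-encoded word of the form
    // =?charset?B?payload?= into a .NET string.
    public static string DecodeBEncodedWord(string encodedWord)
    {
        // Assumption: running on .NET Core / .NET 5+, where code pages
        // such as GB2312 must be registered before use.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Expected parts: "=", charset, "B", payload, "="
        string[] parts = encodedWord.Split('?');
        if (parts.Length != 5 ||
            !parts[2].Equals("B", StringComparison.OrdinalIgnoreCase))
        {
            throw new FormatException("Not a B-encoded word.");
        }

        // Base64 gives back the raw bytes; the charset tells us how to read them.
        byte[] raw = Convert.FromBase64String(parts[3]);
        return Encoding.GetEncoding(parts[1]).GetString(raw);
    }

    static void Main()
    {
        string subject = DecodeBEncodedWord(
            "=?GB2312?B?s8q5q8u+vq3A7aGissbO8bK/w8W1xNK7t+LQxQ==?=");
        Console.WriteLine(subject);
    }
}
```

The point of the sketch is the order of operations: base64 first yields bytes, and only then is the charset applied to turn those bytes into text.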

The body of the same message also claims to use charset GB2312, but it is
certainly not base64 encoded. It contains text in which (what should be)
Chinese characters are mixed with ASCII characters, like:


Vz@m#:UEP!=c 0724398252418

;rE_MAIL#:[email protected]


I tried to use the conversion function suggested by Morton (thank you for
the code), but it doesn't produce Chinese-looking characters:

public static string toUTF8(string messageString, string charset)
{
    Encoding dstEnc = Encoding.UTF8;
    MessageBox.Show(Encoding.Default.ToString());

    if (charset.Length == 0)
    {
        charset = "us-ascii";
    }

    Encoding srcEnc = Encoding.GetEncoding(charset);
    byte[] srcData = srcEnc.GetBytes(messageString);

    string utf8String = dstEnc.GetString(srcData);
    return utf8String;
}


I hope someone can help me.

Regards,

Albert Jan
 
Cor Ligthert

Albert,

Are you looking for something like this, but with the appropriate
code page for your message?

System.Text.Encoding.GetEncoding(437).GetBytes(Str.ReadToEnd)

And then with the proper code page, of course.
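
To make the direction of the conversion explicit, here is a minimal sketch, assuming the raw bytes of the body are still available (for example, read from the stream before anything has turned them into a string). The function posted above goes the other way round: it encodes the string to GB2312 bytes and then reads those bytes as UTF-8, which will not give Chinese text. The class and method names below are hypothetical, and the RegisterProvider call assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Text;

static class BodyDecodingSketch
{
    // Hypothetical helper: turns the raw body bytes into a .NET string,
    // using the charset named in the message headers.
    public static string DecodeBody(byte[] rawBody, string charset)
    {
        // Assumption: .NET Core / .NET 5+, where GB2312 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        if (string.IsNullOrEmpty(charset))
        {
            charset = "us-ascii";
        }

        // Bytes -> text: GetString on the SOURCE encoding.
        return Encoding.GetEncoding(charset).GetString(rawBody);
    }

    // Only needed when UTF-8 bytes have to be written out somewhere;
    // a .NET string itself has no charset to convert.
    public static byte[] ToUtf8Bytes(string text) => Encoding.UTF8.GetBytes(text);

    static void Main()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Simulate a GB2312 body that mixes ASCII and Chinese characters.
        byte[] raw = Encoding.GetEncoding("GB2312").GetBytes("E_MAIL: \u4E2D\u6587");

        string text = DecodeBody(raw, "GB2312");
        byte[] utf8 = ToUtf8Bytes(text);
        Console.WriteLine(text.Length + " characters, " + utf8.Length + " UTF-8 bytes");
    }
}
```

If the body has already been read into a string with the wrong encoding, some bytes may be lost for good, so the decoding is best done on the original bytes.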

Cor
 
