convert to utf-8, part II

  • Thread starter: Albert Jan

Albert Jan

Hi,

in my quest to properly display email messages I have overcome the problem
of decoding strings like

"=?GB2312?B?s8q5q8u+vq3A7aGissbO8bK/w8W1xNK7t+LQxQ==?="

(which appears to be an 'encoded-word', as explained in RFC 2047) by using
Convert.FromBase64String, as Stefen kindly suggested (see my post from
yesterday).
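
For reference, that decoding step can be sketched roughly as follows. This is only a minimal sketch: the class EncodedWordSketch and the helper DecodeBase64Word are hypothetical names, it handles only the base64 ("B") form of a single encoded-word, and it assumes the named charset is available to Encoding.GetEncoding (on newer .NET runtimes GB2312 may first need the code pages encoding provider to be registered).

```csharp
using System;
using System.Text;

public static class EncodedWordSketch
{
    // Hypothetical helper: decodes one "=?charset?B?payload?=" token.
    // Anything that does not match that shape is returned unchanged.
    public static string DecodeBase64Word(string word)
    {
        // Splitting on '?' should give: "=", charset, "B", payload, "="
        string[] parts = word.Split('?');
        if (parts.Length != 5 || parts[0] != "=" || parts[4] != "="
            || !parts[2].Equals("B", StringComparison.OrdinalIgnoreCase))
        {
            return word;
        }

        // The payload is base64 text; decoding it yields the raw bytes.
        byte[] raw = Convert.FromBase64String(parts[3]);

        // Interpret those bytes using the charset named in the token.
        return Encoding.GetEncoding(parts[1]).GetString(raw);
    }
}
```

The charset is taken from the token itself rather than from the message headers, because each encoded-word names its own charset.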

The body of the same message, however, also claims to use charset GB2312
but is certainly not base64 encoded. It contains text in which (what should
be) Chinese characters and ASCII characters are mixed, like:


Vz@m#:UEP!=c 0724398252418

;rE_MAIL#:[email protected]


I tried the conversion function suggested by Morton (thank you for the
code), but it doesn't produce Chinese-looking characters:

public static string toUTF8(string messageString, string charset)
{
    Encoding dstEnc = Encoding.UTF8;
    MessageBox.Show(Encoding.Default.ToString());

    if (charset.Length == 0)
    {
        charset = "us-ascii";
    }

    Encoding srcEnc = Encoding.GetEncoding(charset);
    byte[] srcData = srcEnc.GetBytes(messageString);

    string utf8String = dstEnc.GetString(srcData);
    return utf8String;
}


I hope someone can help me.

Regards,

Albert Jan
 
Albert,

Are you looking for something like this, but with the appropriate code
page?

System.Text.Encoding.GetEncoding(437).GetBytes(Str.ReadToEnd)

And then with the proper code page.
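
A minimal sketch of that idea in C#, assuming the raw bytes of the message body are still available before anything has turned them into a string. The class BodyDecodingSketch and the helper DecodeBody are hypothetical names, and the charset value is assumed to come from the message's Content-Type header:

```csharp
using System.Text;

public static class BodyDecodingSketch
{
    // Hypothetical helper: rawBody holds the body bytes exactly as they
    // were received, before any string conversion has taken place.
    public static string DecodeBody(byte[] rawBody, string charset)
    {
        // Fall back to us-ascii when the message names no charset.
        if (string.IsNullOrEmpty(charset))
        {
            charset = "us-ascii";
        }

        // Decode the bytes with the encoding the message declares.
        // The resulting string is Unicode; no separate "convert to
        // UTF-8" step is needed until the text is written out again.
        return Encoding.GetEncoding(charset).GetString(rawBody);
    }
}
```

Calling GetBytes on a string that was already decoded with the wrong encoding generally cannot bring the original bytes back, so the conversion has to start from the bytes themselves.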

Cor
 