MultiByteToWideChar in .NET - Multibyte to Unicode conversion

groups · Nov 16, 2005

I have a C# application that needs to convert multibyte strings to
Unicode.
However, I cannot get MultiByteToWideChar to behave as expected within
.NET.
I have declared it as follows:

    [DllImport("Kernel32", CharSet = CharSet.Auto)]
    static extern Int32 MultiByteToWideChar(
        UInt32 codePage,
        UInt32 dwFlags,
        [In, MarshalAs(UnmanagedType.LPStr)] String lpMultiByteStr,
        Int32 cbMultiByte,
        [Out, MarshalAs(UnmanagedType.LPWStr)] StringBuilder lpWideCharStr,
        Int32 cchWideChar);

I am using it as follows:

    private string ConvertToUnicode(string str, uint codepage)
    {
        int l = str.Length;
        int i = 0;
        i = MultiByteToWideChar(codepage, 0, str, -1, null, 0);
        StringBuilder wideStr = new StringBuilder(i);
        i = MultiByteToWideChar(codepage, 0, str, -1, wideStr, wideStr.Capacity);
        string s = wideStr.ToString();
        return s;
    }

If I initialize a C# string with the following bytes: 43, 3A, 5C, 83,
88, 83, 45, 83, 52, 83, 5C, 00 and use the ConvertToUnicode function
above with code page 932 (Japanese), I get garbage (C:\???E?R?\).
However, using a pure .NET solution (below) I get the correct string
(C:\ヨウコソ):

    private string MultibyteToUnicodeNETOnly(string str, int codepage)
    {
        byte[] source = MCBSToByte(str);
        Encoding e1 = Encoding.GetEncoding(codepage);
        Encoding e2 = Encoding.Unicode;
        byte[] target = Encoding.Convert(e1, e2, source);
        return e2.GetString(target);
    }

    private byte[] MCBSToByte(string s)
    {
        byte[] b = new byte[s.Length];
        int i = 0;
        foreach (char c in s)
            b[i++] = (byte)c;
        return b;
    }

Any insights on a way to get MultiByteToWideChar to work, or a better
solution? Thanks in advance.

Michael (michka) Kaplan [MS] · Nov 16, 2005

What is wrong with the pure managed solution? It is even better and faster
in .NET 2.0.

--
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.

Mattias Sjögren · Nov 16, 2005

> [DllImport("Kernel32", CharSet = CharSet.Auto)]

There's no point in specifying CharSet.Auto here since there's only
one MultiByteToWideChar function.

> [In, MarshalAs(UnmanagedType.LPStr)] String lpMultiByteStr,

This should be a byte[] instead of a String.
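
For readers who do need the native call, here is a minimal sketch of the declaration with that change applied. The class and method names (NativeConvert, ToUnicode) are made up for illustration and do not come from the thread. Using a char[] output buffer together with CharSet.Unicode is one workable marshalling choice, not the only one, and the code assumes it is running on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class NativeConvert // hypothetical helper, not from the thread
{
    // CharSet.Unicode makes char[] marshal as UTF-16 code units;
    // ExactSpelling skips the A/W entry-point name probing.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, ExactSpelling = true, SetLastError = true)]
    static extern int MultiByteToWideChar(
        uint codePage,
        uint dwFlags,
        [In] byte[] lpMultiByteStr,   // raw multibyte data, not a String
        int cbMultiByte,
        [Out] char[] lpWideCharStr,
        int cchWideChar);

    public static string ToUnicode(byte[] multiByte, uint codePage)
    {
        // The API rejects a zero-length input, so handle it here.
        if (multiByte.Length == 0) return string.Empty;

        // First call with a zero-length buffer returns the required size in UTF-16 code units.
        int needed = MultiByteToWideChar(codePage, 0, multiByte, multiByte.Length, null, 0);
        if (needed == 0) throw new Win32Exception(Marshal.GetLastWin32Error());

        // Second call performs the conversion into the buffer.
        char[] wide = new char[needed];
        int written = MultiByteToWideChar(codePage, 0, multiByte, multiByte.Length, wide, wide.Length);
        if (written == 0) throw new Win32Exception(Marshal.GetLastWin32Error());

        return new string(wide, 0, written);
    }
}
```

Passing the explicit byte count instead of -1 means the input does not need a trailing 00, and the result carries no terminator either.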

> If I initialize a C# string with the following bytes: 43, 3A, 5C, 83,
> 88, 83, 45, 83, 52, 83, 5C, 00

That's your problem, you shouldn't use a string to store byte values.
A string is already Unicode in .NET so the conversion has already
taken place.
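
To make that point concrete, the sketch below widens each byte to a char, which is what initializing a string from raw byte values amounts to. The class name WidenDemo is hypothetical and exists only for this illustration.

```csharp
using System;

static class WidenDemo // hypothetical name, for illustration only
{
    // Each byte becomes one UTF-16 char with the same numeric value; no decoding happens.
    public static string Widen(byte[] bytes)
    {
        char[] chars = new char[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
            chars[i] = (char)bytes[i]; // 0x83 becomes U+0083, a control character
        return new string(chars);
    }
}
```

For the bytes in the question (without the trailing 00) the result has 11 chars, and the char at index 3 is U+0083 rather than ヨ (U+30E8). When such a string is marshalled as LPStr it is converted back to ANSI bytes first, and characters with no ANSI equivalent typically turn into '?', which would explain the garbage in the output.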

> Any insights on a way to get MultiByteToWideChar to work, or a better
> solution?

Any reason you don't want to use the Encoding classes?

Mattias
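
Following the suggestion in both replies, a managed-only version can work directly on a byte[] and skip the string-to-bytes round trip. This is a sketch under assumptions: the name ManagedConvert is hypothetical, and the RegisterProvider call is only relevant on .NET Core and later, where legacy code pages such as 932 are not available by default.

```csharp
using System;
using System.Text;

static class ManagedConvert // hypothetical helper name
{
    public static string ToUnicode(byte[] multiByte, int codePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before use. On .NET Framework, code page 932 is
        // available without this line and it should be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Decode the multibyte data straight into a UTF-16 .NET string.
        return Encoding.GetEncoding(codePage).GetString(multiByte);
    }
}
```

Called with the bytes from the question (minus the trailing 00, which would otherwise end up in the string as a NUL character) and code page 932, this returns C:\ヨウコソ.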
