XmlSerialization to string using utf-8

muesliflakes

I have the following utility function to serialize my object to XML

public static string WriteToString( object o, Type type )
{
    XmlSerializer serializer = new XmlSerializer( type );
    StringWriter output = new StringWriter( );

    serializer.Serialize( output, o );

    return output.ToString( );
}

This method works great, except for one issue. Because the underlying
data is a string (Unicode, i.e. UTF-16 in .NET), the serialization process
automatically writes <?xml version='1.0' encoding='utf-16'?> as the
declaration, and I can see no way of changing this encoding when writing
to a string.

Now if I were writing to a file, I would have no issue, as StreamWriter,
which is used for files, accepts an Encoding. But I want to pass the XML
directly to a SQL Server stored procedure, and the SP fails because it
wants to see utf-8. If I manually change the utf-16 in the declaration to
utf-8 (I know that the underlying data really is utf-16), then SQL Server
has no problems.

I want to find out how I can serialize an object to string, but change
the encoding type.

Cheers Dave
 
Nicholas Paldino [.NET/C# MVP]

muesliflakes,

You can't do this. The reason for this is that strings are always
stored in Unicode (UTF-16 in .NET). If you want to convert to UTF-8, then
you have to use the UTF8Encoding class and call the GetBytes method to get
the byte representation of the string. Of course, you run the risk of
dropping data, since not everything that is represented in Unicode can be
represented in UTF-8.
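
For anyone who wants to try the GetBytes route, here is a minimal sketch. The class and method names (Utf8Bytes, ToUtf8, FromUtf8) are made up for illustration and are not part of the framework. As far as I know, UTF-8 can encode every Unicode code point, so a well-formed string should survive the round trip unchanged. Note that this only changes the bytes; any encoding='utf-16' declaration inside the text stays as it is.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class Utf8Bytes
{
    // Encode a .NET (UTF-16) string as UTF-8 bytes.
    // Passing false means no byte order mark is associated with the encoding.
    public static byte[] ToUtf8(string xml)
    {
        return new UTF8Encoding(false).GetBytes(xml);
    }

    // Decode UTF-8 bytes back into a .NET string.
    public static string FromUtf8(byte[] bytes)
    {
        return new UTF8Encoding(false).GetString(bytes);
    }
}
```

Typical use would be something like byte[] data = Utf8Bytes.ToUtf8(xmlString); followed by writing the bytes to a stream or a binary parameter.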

Hope this helps.
 
Nicholas Paldino [.NET/C# MVP]

I should correct myself. Converting between UTF-8 and Unicode (UTF-16)
is lossless in either direction, so no data is dropped.
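
If the goal is only to get utf-8 into the XML declaration while still returning a string, one commonly used workaround is to subclass StringWriter and override its Encoding property, since the serializer takes the declared encoding from the writer it is given. The sketch below assumes that approach; Utf8StringWriter and XmlUtil are hypothetical names, not framework types, and the string held in memory is still UTF-16.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding,
// so the XML declaration written through it says utf-8.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return System.Text.Encoding.UTF8; }
    }
}

// Hypothetical container for the utility function from the question.
public static class XmlUtil
{
    public static string WriteToString(object o, Type type)
    {
        XmlSerializer serializer = new XmlSerializer(type);
        using (StringWriter output = new Utf8StringWriter())
        {
            serializer.Serialize(output, o);
            return output.ToString();
        }
    }
}
```

Calling XmlUtil.WriteToString(42, typeof(int)) should then give a document whose declaration names utf-8 instead of utf-16. Whether the stored procedure accepts it still depends on the parameter type on the SQL Server side.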


--
- Nicholas Paldino [.NET/C# MVP]

