C# ASP .NET -- UTF-16 encoding to UTF-8

davidjgonzalez

I have a web application written in ASP.NET (VS 2003) that an Adobe
Acrobat form posts XML to. I can read the XML from
Request.InputStream, but the XML is UTF-16 encoded. This means that
the byte[] I get from Request.InputStream looks like:
[0]: 255
[1]: 254
[2]: 64
[3]: 0
[4]: 56
[5]: 0
....
Essentially, every other index in the array holds the value 0.
When I try to convert the byte array to a string, I get <\0x\0m\0l\0
... (every other character is a \0). I also have the '\r\n' characters
before the closing tags in the XML.

My question is twofold:

1) How can I elegantly convert the UTF-16 formatted XML to something
more readable (UTF-8, ASCII, etc.) in Visual Studio 2003? (I can't find
any UTF-16 encoding support in VS 2003.)

2) If 1 doesn't get rid of the '\r\n's, how can I get rid of them?
string.Replace("\r\n", "") didn't seem to work.

Thanks
 
davidjgonzalez

Greg, thanks for the reply, but that didn't seem to work.

My scope has changed slightly: I no longer need UTF-8 encoding
per se. I just need to parse the values from the XML that is being sent
over, so I need to convert the byte[] to a readable string.

Here is my code:

-------------------------------------
Request.InputStream.Read(data, 0,
Convert.ToInt32(Request.InputStream.Length));

UnicodeEncoding encoding = new UnicodeEncoding( );
string decodedString = encoding.GetString(characters);
//at this point decodedString = " ????? ??????????? ?
?????????????????????????????????????????????????????????????????????????????????????????????????????????????"

DataSet ds = new DataSet();
ds = XmlToDataSet(decodedString);
-------------------------------------

When decodedString is passed to XmlToDataSet, it crashes, I assume
because XmlToDataSet does not like the encoding of
decodedString.

How do I get the byte[] that Request.InputStream yields into a "normal"
encoding?

Thanks
 
Joerg Jooss

Thus wrote (e-mail address removed),
Greg thanks for the reply but that didn't seem to work..

My scope has slightly changed -- i no longer need UTF-8 encoding
persay, just need to parse the values from the XML that is being sent
over, so i need to convert the byte[] to a readable string.

.. here is my code:

-------------------------------------
Request.InputStream.Read(data, 0,
Convert.ToInt32(Request.InputStream.Length));
UnicodeEncoding encoding = new UnicodeEncoding( );

string decodedString = encoding.GetString(characters);

//at this point decodedString = " ????? ??????????? ?

??????????????????????????????????????????????????????????????????????
???????????????????????????????????????"

Your code is not really complete. You're reading into a byte array "data",
but decode something called "characters".
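
As a rough sketch of what the manual decode could look like with the buffer mix-up fixed (an illustration only, not the original poster's code): it assumes the payload is little-endian UTF-16 with a leading FF FE byte-order mark, which matches the 255, 254 bytes shown in the first post. The class and method names (Utf16Decoding, ReadFully, DecodeUtf16) are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public class Utf16Decoding
{
    // Hypothetical helper: reads the stream to its end. A single
    // Stream.Read call may return fewer bytes than requested, so loop.
    public static byte[] ReadFully(Stream input)
    {
        MemoryStream buffer = new MemoryStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
        {
            buffer.Write(chunk, 0, read);
        }
        return buffer.ToArray();
    }

    // Hypothetical helper: decodes little-endian UTF-16 bytes,
    // skipping a leading FF FE byte-order mark if one is present.
    public static string DecodeUtf16(byte[] data)
    {
        int offset = 0;
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        {
            offset = 2;
        }
        // Encoding.Unicode is the framework's UTF-16 little-endian encoding.
        return Encoding.Unicode.GetString(data, offset, data.Length - offset);
    }
}
```

In a page this would be called as `string xml = Utf16Decoding.DecodeUtf16(Utf16Decoding.ReadFully(Request.InputStream));`. On the '\r\n' question: strings are immutable, so Replace returns a new string and the result has to be assigned, e.g. `xml = xml.Replace("\r\n", "");`.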

Note that you don't really need to perform these steps yourself, if all you
want to do is fill a DataSet.

aDataSet.ReadXml(Request.InputStream);

should do the trick. The XML infrastructure can figure out the encoding by
itself.
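
A small self-contained sketch of that approach, using a MemoryStream in place of Request.InputStream so it can run outside a web page. The sample XML, the class name ReadXmlSketch, and the method names Load and BuildSampleStream are made up for illustration; the real Acrobat payload will have a different shape.

```csharp
using System;
using System.Data;
using System.IO;
using System.Text;

public class ReadXmlSketch
{
    // Lets the XML reader behind DataSet.ReadXml detect the encoding
    // (here from the byte-order mark) instead of decoding by hand.
    public static DataSet Load(Stream input)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(input);
        return ds;
    }

    // Builds a stand-in for the posted data: a UTF-16 little-endian
    // byte stream that starts with the FF FE byte-order mark.
    public static Stream BuildSampleStream()
    {
        // Hypothetical sample document, not the real form data.
        string xml = "<form><field><name>city</name><value>Austin</value></field></form>";
        byte[] bom = Encoding.Unicode.GetPreamble();
        byte[] body = Encoding.Unicode.GetBytes(xml);

        MemoryStream stream = new MemoryStream();
        stream.Write(bom, 0, bom.Length);
        stream.Write(body, 0, body.Length);
        stream.Position = 0; // rewind so ReadXml starts at the beginning
        return stream;
    }
}
```

With this sample, `DataSet ds = ReadXmlSketch.Load(ReadXmlSketch.BuildSampleStream());` should leave the values reachable through the inferred tables, for example `ds.Tables["field"].Rows[0]["name"]`. In the page itself the call would simply be `Load(Request.InputStream)`.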

Cheers,
 
davidjgonzalez

Oops, thanks for the catch.

string decodedString = encoding.GetString(characters);
is supposed to read
string decodedString = encoding.GetString(data);

Your ReadXml(...) solution, reading straight from the input stream, is just what I needed!
Thanks
 
