Using XmlTextReader to read Unicode characters

  • Thread starter: Jordan

Jordan

I have a Unicode XML file that I am trying to read using the .NET
XmlTextReader in C#. How do I read the Unicode file? If I try to
use the XmlTextReader.Read() method, it throws an exception.

The exception reads:
The '€' character, hexadecimal value 0x80, cannot begin with a name.
Line 1, position 2.

Any suggestions? I read on Microsoft's website about writing surrogate
pairs, but I can't find any documentation that confirms the
XmlTextReader can handle surrogate pairs.
 
Jordan said:
I have a Unicode XML file that I am trying to read using the .NET
XmlTextReader in C#. How do I read the Unicode file? If I try to
use the XmlTextReader.Read() method, it throws an exception.

What Unicode encoding does that XML file have (e.g. UTF-8 or UTF-16)?
How do you know it is Unicode?
Is there an XML declaration (e.g. <?xml version="1.0"
encoding="UTF-8"?>) at the beginning? Is there a BOM (byte order mark)?
How do you create the XmlTextReader? Simply with
new XmlTextReader("file.xml")?
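To make the BOM question concrete, here is a minimal sketch in C# of how one could check for a byte order mark and then open the file with an explicit fallback encoding. The class and method names (BomSniffer, DetectBom, Open) are made up for illustration and are not part of .NET. The sketch assumes the file is UTF-8 or UTF-16, and it does not distinguish UTF-32, whose little-endian BOM also begins with FF FE.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not a framework class.
public static class BomSniffer
{
    // Returns the encoding implied by a leading byte order mark,
    // or null when the data does not start with a known BOM.
    public static Encoding DetectBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return Encoding.UTF8;              // EF BB BF
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return Encoding.Unicode;           // FF FE, UTF-16 little-endian
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return Encoding.BigEndianUnicode;  // FE FF, UTF-16 big-endian
        return null;
    }

    // Opens the XML file through a StreamReader. The StreamReader honours a
    // BOM if one is present and otherwise decodes with the supplied fallback.
    public static XmlTextReader Open(string path, Encoding fallback)
    {
        StreamReader decoded = new StreamReader(path, fallback, true);
        return new XmlTextReader(decoded);
    }
}
```

For example, BomSniffer.DetectBom(File.ReadAllBytes("file.xml")) returning null would mean there is no BOM, and BomSniffer.Open("file.xml", Encoding.Unicode) would then be one way to try reading the file as UTF-16. As far as I know, when the reader is built on a TextReader, the decoding is done by that TextReader rather than chosen from the encoding named in the XML declaration, so the fallback has to match the file's actual bytes.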
The exception reads:
The '€' character, hexadecimal value 0x80, cannot begin with a name.
Line 1, position 2.

Maybe the XML is not properly encoded? What do the first lines of the XML
file look like?
What happens when you load the file in Internet Explorer? Does that give
a parse error too?
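If it is hard to tell from a text editor what the file really starts with, dumping the first few bytes as hex can help. This is only a sketch, and HeadDump and ToHex are hypothetical names. For comparison, a UTF-8 file without a BOM that begins with an XML declaration starts with 3C 3F 78 6D 6C ("<?xml"), while a UTF-16 little-endian file with a BOM starts with FF FE 3C 00.

```csharp
using System;
using System.Text;

// Hypothetical helper for inspecting the start of a file.
public static class HeadDump
{
    // Formats up to `count` leading bytes as space-separated hex pairs.
    public static string ToHex(byte[] data, int count)
    {
        int n = Math.Min(count, data.Length);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++)
        {
            if (i > 0) sb.Append(' ');
            sb.Append(data[i].ToString("X2"));
        }
        return sb.ToString();
    }
}
```

Calling Console.WriteLine(HeadDump.ToHex(System.IO.File.ReadAllBytes("file.xml"), 16)) and posting the result would show whether there is a BOM and whether the bytes match the encoding you expect.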
 
