Invalid character returned when reading UTF-8 XML

Elliot

My XML file declares UTF-8 encoding and its content contains Chinese characters.
When I debug the following code:

string strXmlFile = "xml.xml";
XmlDocument objXml = new XmlDocument();

objXml.Load(strXmlFile);

It throws "Invalid character in the given encoding" and points to
objXml.Load(strXmlFile);.

How can I solve this?
I have found many examples that show how to handle a UTF-8 string, but none that load UTF-8 XML.
 
Jon Skeet [C# MVP]

Elliot said:
My XML file declares UTF-8 encoding and its content contains Chinese characters.
When I debug the following code:

string strXmlFile = "xml.xml";
XmlDocument objXml = new XmlDocument();

objXml.Load(strXmlFile);

It throws "Invalid character in the given encoding" and points to
objXml.Load(strXmlFile);.

How can I solve this?
I have found many examples that show how to handle a UTF-8 string, but none that load UTF-8 XML.

Sounds like your XML is actually invalid. Can you shrink it down to a
very small amount (just an element with some text in should be enough)
which still shows the problem, then base64 encode it and post it here?
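
For anyone following along, here is a minimal sketch of how the base64 step could be done, so that the exact bytes survive being pasted into a post. The class and method names (XmlSample, ToBase64) are made up for illustration and are not from the thread.

```csharp
using System;
using System.IO;

static class XmlSample
{
    // Returns the raw bytes of the file as base64 text. Reading bytes
    // (not text) matters here: decoding the file as a string first
    // could hide or alter the very bytes that cause the error.
    public static string ToBase64(string path)
    {
        byte[] raw = File.ReadAllBytes(path);
        return Convert.ToBase64String(raw);
    }
}
```

Calling XmlSample.ToBase64("xml.xml") on the trimmed-down file should give a string that others can decode back into the identical byte sequence.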
 
Arne Vajhøj

Elliot said:
My XML file declares UTF-8 encoding and its content contains Chinese characters.
When I debug the following code:

string strXmlFile = "xml.xml";
XmlDocument objXml = new XmlDocument();

objXml.Load(strXmlFile);

It throws "Invalid character in the given encoding" and points to
objXml.Load(strXmlFile);.

How can I solve this?

Chances are good that you have encoding="UTF-8" in the XML file,
but that the content is actually in another encoding like
ISO-8859-1.

Arne
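
One way to test the suspicion above (a file that declares UTF-8 but holds bytes in some other encoding) is to decode the raw bytes with a strict UTF-8 decoder. This is only a sketch; the names Utf8Check and IsValidUtf8 are hypothetical. Note that a file which passes is not proven to be UTF-8, it merely contains no byte sequence that UTF-8 forbids.

```csharp
using System.IO;
using System.Text;

static class Utf8Check
{
    // Returns true if every byte sequence in the file decodes as UTF-8.
    public static bool IsValidUtf8(string path)
    {
        // throwOnInvalidBytes: true makes the decoder throw instead of
        // silently substituting U+FFFD for malformed sequences.
        var strict = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            strict.GetString(File.ReadAllBytes(path));
            return true;
        }
        catch (DecoderFallbackException)
        {
            // At least one byte sequence is not legal UTF-8.
            return false;
        }
    }
}
```

If Utf8Check.IsValidUtf8("xml.xml") returns false, the declaration and the content disagree, which would be consistent with the error reported by objXml.Load.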
 
Bjørn Brox

Elliot said:
Solved by changing encoding to BIG5.
I would call that a workaround for an XML file that does not contain UTF-8 when
its declaration says it does. It's not a fix.
 
Arne Vajhøj

Bjørn Brox said:
Elliot said:
Solved by changing encoding to BIG5.
I would call that a workaround for an XML file that does not contain UTF-8 when
its declaration says it does. It's not a fix.

Isn't it a fix to change the declaration to match the actual content?

Arne
 
Bjørn Brox

Arne Vajhøj said:
Isn't it a fix to change the declaration to match the actual content?
It's a fix if you change this in the program that produces the XML file.

It's a workaround if you manually edit the file afterwards or force
loading it as BIG5.

Anyhow, Elliot did not give enough details to draw a conclusion, but my
first impression was that the fix was not made on the server side.
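
For reference, the "force loading it as BIG5" workaround could look roughly like the sketch below. Elliot never posted his actual change, so this is an assumption about one way to do it; XmlEncodingLoader and its Load method are hypothetical names. Because the StreamReader has already decoded the bytes into characters, the encoding named in the XML declaration is not used for decoding here.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class XmlEncodingLoader
{
    // Loads an XML file using the caller's encoding instead of the one
    // named in the file's XML declaration.
    public static XmlDocument Load(string path, Encoding encoding)
    {
        var doc = new XmlDocument();
        // BOM detection is switched off so the supplied encoding is
        // always the one used to decode the file.
        using (var reader = new StreamReader(
            path, encoding, detectEncodingFromByteOrderMarks: false))
        {
            doc.Load(reader);
        }
        return doc;
    }
}
```

Usage would be along the lines of XmlEncodingLoader.Load("xml.xml", Encoding.GetEncoding("big5")). On .NET Framework that encoding is available out of the box; on .NET Core and later, Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) has to be called first. As discussed above, this only works around the mismatch on the reading side; the producer of the file still writes a declaration that does not match its content.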
 
