XML special characters

Joe Abou Jaoude

Hi,

In my code, I need to load a string into an XML document.
However, this string contains special characters like &, >, <, " ...
I need to replace these characters with &amp;, &lt;, ...

However, this is not as easy as it seems, because I have to check, for
example, whether a > is actually closing a tag or is part of an attribute
value or description, or whether a double quote encloses an attribute or
is just a double quote in a description...

I want to know if there's any article or example that addresses this
issue, or if someone has any idea about how to solve it.

Thank you
 
Rick

If you use any of the XML components to load your string, they should escape
the input correctly without you having to do anything.

Rick
 
Göran Andersson

Joe said:
In my code, I need to load a string into an XML document.
However, this string contains special characters like &, >, <, " ...
I need to replace these characters with &amp;, &lt;, ...

However, this is not as easy as it seems, because I have to check, for
example, whether a > is actually closing a tag or is part of an attribute
value or description, or whether a double quote encloses an attribute or
is just a double quote in a description...

It's not complicated at all, as long as you encode the value before you
put it into the markup.

If you put the value into the markup first and then try to encode it,
you are screwed. You have thrown away information that cannot be
recreated. In some cases you could work out what the value was, but
there is no way to do this for every possible value.
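
To make the "encode first, then insert" order concrete, here is a minimal sketch. It uses Python's standard library only as a stand-in for whatever escaping routine your platform offers; the function name build_element is made up for illustration and is not from this thread.

```python
from xml.sax.saxutils import escape, quoteattr


def build_element(element_id: str, text: str) -> str:
    """Escape the raw values first, then place them into the markup.

    build_element is a hypothetical helper for this example.
    quoteattr() escapes the attribute value and adds the surrounding quotes;
    escape() replaces &, < and > in element content.
    """
    return f"<element1 id={quoteattr(element_id)}>{escape(text)}</element1>"


# The raw values are still separate from the markup here, so there is no
# ambiguity about which '>' or '"' is data and which is syntax.
fragment = build_element("id1", "3>1 & 2<4")
# fragment == '<element1 id="id1">3&gt;1 &amp; 2&lt;4</element1>'
```

The point of the sketch is the ordering: once the value has been concatenated into the markup unescaped, the same call can no longer tell data from tags.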
 
Joe Abou Jaoude

OK, let's say I have this string:

<root>
<element1 id="id1"> 3>1 </element1>
<root>

If I try to load this into an XML document,
I get an error saying there's an error at line 2, position 23.

In this case, line 2, position 23 is the 27th character in the
string (the '>' character).
Is there a way to know that line 2, position 23 corresponds to the 27th
character, so that I can go to position 27 and change it?
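
For the narrow question of mapping a line and position to a character index, a rough sketch follows. It assumes the parser reports 1-based line and position values and counts line breaks exactly as they appear in the string; offset_from_line_col is a hypothetical helper name, and Python is used purely for illustration.

```python
def offset_from_line_col(text: str, line: int, col: int) -> int:
    """Return the 0-based index in text for a 1-based (line, col) pair.

    Assumption: line breaks are counted exactly as they occur in text,
    so "\r\n" contributes two characters and "\n" contributes one.
    """
    lines = text.splitlines(keepends=True)
    if not 1 <= line <= len(lines):
        raise ValueError("line number is outside the text")
    # Length of all complete lines before the target line, plus the column.
    return sum(len(previous) for previous in lines[: line - 1]) + (col - 1)


sample = "<root>\n<a>x</a>\n</root>"
index = offset_from_line_col(sample, 2, 4)  # points at the 'x' in line 2
```

Note that this only locates a character; it cannot decide whether that character is data or markup.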

Regards
Joe
 
Phill W.

Joe said:
In my code, I need to load a string into an XML document.

You'd be better off using the classes/methods in the System.Xml
namespace to construct the Xml document in memory, then save the whole
thing.

Joe said:
However, this string contains special characters like &, >, <, " ...
I need to replace these characters with &amp;, &lt;, ...

However, this is not as easy as it seems, because I have to check, for
example, whether a > is actually closing a tag or is part of an attribute
value or description, or whether a double quote encloses an attribute or
is just a double quote in a description...

You /have/ to take some control over the data, here.
You /cannot/ just accept something that may or may not be Xml and try to
make sense of it - your /interpretation/ of the data may well be wrong.

Get the raw data at a lower level, escape it if required (or use the Xml
objects) and build up the [valid] Xml around it.
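
As an illustration of building the document from the raw data instead of patching a finished string, here is a small sketch. It uses Python's xml.etree.ElementTree as a stand-in for the System.Xml classes; build_document and its parameters are invented for the example.

```python
import xml.etree.ElementTree as ET


def build_document(element_id: str, text: str) -> str:
    """Build <root><element1 id="...">text</element1></root> in memory.

    build_document is a hypothetical helper. The raw values are handed to
    the XML library as plain strings, and the serializer escapes them when
    the tree is written out.
    """
    root = ET.Element("root")
    child = ET.SubElement(root, "element1", {"id": element_id})
    child.text = text
    return ET.tostring(root, encoding="unicode")


xml_text = build_document("id1", ' 3>1 & "quoted" <b> ')
# Parsing the result gives back exactly the raw text that went in.
round_trip = ET.fromstring(xml_text).find("element1").text
```

Because the data never passes through hand-written markup, the question of which quote or angle bracket belongs to the syntax does not come up.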

HTH,
Phill W.
 
Martin Honnen

Joe said:
OK, let's say I have this string:

<root>
<element1 id="id1"> 3>1 </element1>
<root>

If I try to load this into an XML document,
I get an error saying there's an error at line 2, position 23.

No, the only error is the missing </root>, so the error is in line 3,
where you have <root> instead of </root>. The '>' does not need to be
escaped inside an element; only '<' and '&' need to be escaped in
element content.
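
This is easy to check with any conforming parser. The sketch below uses Python's xml.etree.ElementTree rather than the .NET classes, so the exact error messages will differ; the helper name parses is made up for the example.

```python
import xml.etree.ElementTree as ET

# The string from the question, with the last line corrected to </root>.
fixed = '<root>\n<element1 id="id1"> 3>1 </element1>\n</root>'
# The string as posted, ending in <root> instead of </root>.
broken = '<root>\n<element1 id="id1"> 3>1 </element1>\n<root>'


def parses(text: str) -> bool:
    """Return True if text is well-formed XML according to ElementTree."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


fixed_ok = parses(fixed)      # the unescaped '>' in the content is accepted
broken_ok = parses(broken)    # fails because root is never closed
lt_ok = parses("<a>1<2</a>")  # an unescaped '<' in content is rejected
amp_ok = parses("<a>a & b</a>")  # an unescaped '&' in content is rejected
```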
 
