Escaping special chars in XML

  • Thread starter: Frank Rizzo

Frank Rizzo

Hello,

I'd like to have the following structure in my XML file

<lname, _fname, _minit>
<status>it is all good</status>
</lname, _fname, _minit>

But apparently, there is a problem with commas and underscores being in
the key name of the node. How can I escape it?

Thanks.
 
Peter said:
Frank,
This documentation page details exactly what is and is not allowed in an
XmlElement name:
http://msdn2.microsoft.com/en-us/library/35577sxd.aspx

Thank you. However, following those rules, something like this:

<order details> would be encoded as <Order_x0020_Details> (i.e. the space
is converted to its hex code, with _x before it and an underscore after).
When I load the file into IE, it doesn't convert _x0020_ back to an actual
space. So I am confused as to how I should encode spaces and commas when
they appear in the name of the tag.

Thanks.
 
In two words, you can't. Spaces and commas simply aren't allowed in XML
element names. What exactly is it that you hope to accomplish? Certainly
there must be another way to get there... If the idea is to be able to parse
the element name as if it were some CSV-style delimited format, all you would
need to do is use the underscore and/or some other "legal" character as the
delimiter.
Peter

--
Co-founder, Eggheadcafe.com developer portal:
http://www.eggheadcafe.com
UnBlog:
http://petesbloggerama.blogspot.com
 
Frank Rizzo said:
Hello,

I'd like to have the following structure in my XML file

<lname, _fname, _minit>
<status>it is all good</status>
</lname, _fname, _minit>

But apparently, there is a problem with commas and underscores being in
the key name of the node. How can I escape it?

You can't; those characters are not legal in element names. See
http://www.w3.org/TR/REC-xml/#sec-starttags for the exact rules, but
roughly:

1. The first character must be a letter or an underscore
2. The rest can be a letter, a digit, an underscore, a dot, or a dash

The BNF expansions list colons too, but nowadays colons are reserved for
expressing namespaces.

Can you use dots instead of commas?
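
To make the two rough rules above concrete, here is a small Python sketch that checks a candidate name against them. It is only an approximation: it is ASCII-only, it ignores colons, and the name is_rough_xml_name is made up for this example. The full Name production in the XML spec allows many more letters.

```python
import re

# Rough rule 1: first character is a letter or an underscore.
# Rough rule 2: the rest are letters, digits, underscores, dots, or dashes.
# ASCII-only approximation; the real XML Name production is broader.
ROUGH_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_rough_xml_name(name):
    """Return True if name satisfies the two rough rules (hypothetical helper)."""
    return ROUGH_NAME.fullmatch(name) is not None

for candidate in ("lname, _fname, _minit", "order details",
                  "lname._fname._minit", "Order_x0020_Details"):
    print(candidate, is_rough_xml_name(candidate))
```

The comma and space versions fail the check, while the dotted version and the _x0020_ version pass. Note that underscores on their own are not the problem.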
 
Element names are not meant to store data, only to identify the data.

It might be better to do something like

<person>
<fname>xxx</fname>
<lname>xxx</lname>
<status>xxx</status>
</person>
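
As an illustration of that layout, here is a minimal Python sketch that builds it with the standard xml.etree.ElementTree module. The function build_person and the sample values are assumptions for the example, not something from the original post. The point is that commas and spaces are fine in element text, just not in element names.

```python
import xml.etree.ElementTree as ET

def build_person(fname, lname, status):
    """Build a <person> element; the data lives in text, not in tag names."""
    person = ET.Element("person")
    for tag, text in (("fname", fname), ("lname", lname), ("status", status)):
        ET.SubElement(person, tag).text = text
    return person

# Sample values are made up; note the comma and spaces in the text content.
xml_text = ET.tostring(
    build_person("Frank", "Rizzo, Jr.", "it is all good"),
    encoding="unicode",
)
print(xml_text)
```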
 
But I can still escape spaces as _x0020_ and commas as _x002c_.
Can't I use those in element names?
 
Frank Rizzo said:
But I can still escape spaces as _x0020_ and commas as _x002c_.
Can't I use those in element names?

You can, but you'll have to unescape them yourself.

As Peter said, you shouldn't really be doing this - it very much feels
like you're using the element name to store data rather than just to
identify data.
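
If you do go that way, "unescape them yourself" could look roughly like the Python sketch below. The names encode_name and decode_name are hypothetical, and the sketch is deliberately simplified: it treats only ASCII letters, digits, underscores, dots, and dashes as legal, it does not handle characters beyond U+FFFF, and a name that already contains a literal sequence such as _x0020_ will not survive the round trip.

```python
import re

# Matches the _xHHHH_ convention: "_x", four hex digits, "_".
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")

def _allowed(ch, first):
    """ASCII-only approximation of which characters may appear unescaped."""
    if ch.isascii() and (ch.isalpha() or ch == "_"):
        return True
    return not first and ch.isascii() and (ch.isdigit() or ch in ".-")

def encode_name(name):
    """Replace each disallowed character with _xHHHH_ (hypothetical helper)."""
    return "".join(
        ch if _allowed(ch, i == 0) else "_x%04X_" % ord(ch)
        for i, ch in enumerate(name)
    )

def decode_name(name):
    """Turn every _xHHHH_ sequence back into the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)

encoded = encode_name("order details")
print(encoded)               # order_x0020_details
print(decode_name(encoded))  # order details
```

Both the writer and every reader of the file have to agree on this convention, because nothing in XML itself knows about it.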
 
Don't let the web page quoted above mislead you. If you do that, it's not
"escaping" in the sense that C# escapes a backslash as \\. It's not part of
the XML standard, and no program, other than code you write to handle it,
will see any relationship between a space and _x0020_.
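
A quick way to see this for yourself, sketched here in Python with the standard xml.etree.ElementTree parser (the helper name parses is made up for the example): the parser hands back the element name exactly as written, and a name with a real space in it does not parse at all.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if text is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

root = ET.fromstring(
    "<Order_x0020_Details><status>it is all good</status></Order_x0020_Details>"
)
print(root.tag)                      # Order_x0020_Details, no space restored
print(parses("<order details/>"))    # False
```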
 
