PC Review



How to decode 'safe' html back to original raw text?

 
 
Richard Lewis Haggard
 
      12th Feb 2007
Is it possible to use features from XmlDocument to unescape text back to its
original raw text format after it has been escaped to handle characters that
are not legal in XML?

I have code that serializes text to an XML file and then deserializes it back
to text. If the user enters XML-illegal text like "<Actor1>", the code
properly escapes it to "&lt;Actor1&gt;", which doesn't interfere
with the XML syntax, and writes it into the XML document, but the text
extracted from the node is not being 'unescaped' back to its original form. Is
there some way to use the built-in features to restore the text data without
having to write yet another XML decoder/parser?

Here's how the serialization works - Assuming that an XmlDocument object has
been created and it has some node already associated with it named
nodeParent, this is how a text node will be appended to that node.

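// Requires: using System.Xml;
public static XmlNode AppendText( XmlNode nodeParent, string nodeName,
    string nodeValue )
{
    // Create the new element and attach it to the parent.
    XmlNode nodeText = nodeParent.OwnerDocument.CreateElement( nodeName );
    nodeParent.AppendChild( nodeText );
    // Attach the text to the new element (not to the parent); the DOM
    // escapes characters such as '<' and '>' when the document is saved.
    nodeText.AppendChild( nodeParent.OwnerDocument.CreateTextNode(
        nodeValue ) );
    return nodeText;
}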

Assume that the text string "<Actor0>" is saved to node "Label". The result
in the XML file is

<Label>&lt;Actor0&gt;</Label>

To get the data out, I'm (incorrectly) using the XmlNode's InnerText
property, which simply returns the serialized text as it was written to the
file instead of converting it back to the original text. Is there an XmlNode
function that will unescape the text, thus returning the original text?
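
For reference, the write path described above can be reproduced in a few lines. This is only a sketch under stated assumptions: the class name EscapeDemo, the method SerializeLabel, and the "Root" element are made up for illustration and are not from the original code. It shows that a text node created with CreateTextNode is escaped when the element is serialized.

```csharp
using System;
using System.Xml;

public static class EscapeDemo
{
    // Hypothetical helper: builds <Root><Label>text</Label></Root>
    // and returns the serialized markup of the Label element.
    public static string SerializeLabel(string text)
    {
        XmlDocument doc = new XmlDocument();
        XmlNode root = doc.AppendChild(doc.CreateElement("Root"));
        XmlNode label = root.AppendChild(doc.CreateElement("Label"));
        // The raw string goes in unescaped; escaping happens on output.
        label.AppendChild(doc.CreateTextNode(text));
        return label.OuterXml;
    }

    public static void Main()
    {
        // prints <Label>&lt;Actor0&gt;</Label>
        Console.WriteLine(SerializeLabel("<Actor0>"));
    }
}
```

The in-memory text node still holds the raw string; only the serialized form carries the entity references.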
--
Richard Lewis Haggard
www.Haggard-And-Associates.com


 
 
 
 
 
Chan Ming Man
 
      13th Feb 2007
check this out:
http://msdn.microsoft.com/msdnmag/issues/01/01/xml/

chanmm
 
Richard Lewis Haggard
 
      13th Feb 2007
XmlNode.InnerText does unescape text. XmlNode.InnerXml does not, because it
returns the node's markup with the entity references left intact. Some of the
code branches were incorrectly using node.InnerXml. Once those were changed
to use InnerText, everything worked in the desired manner.
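
A small sketch of the difference, for anyone hitting the same thing. The class and method names (UnescapeDemo, ReadText, ReadMarkup) are invented for this example; only InnerText, InnerXml, LoadXml and DocumentElement are actual System.Xml members.

```csharp
using System;
using System.Xml;

public static class UnescapeDemo
{
    // Returns the decoded character data of the document element.
    public static string ReadText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    // Returns the child markup of the document element, entities intact.
    public static string ReadMarkup(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerXml;
    }

    public static void Main()
    {
        string xml = "<Label>&lt;Actor0&gt;</Label>";
        Console.WriteLine(ReadText(xml));   // prints <Actor0>
        Console.WriteLine(ReadMarkup(xml)); // prints &lt;Actor0&gt;
    }
}
```

So no separate decoder is needed as long as the value is read through InnerText (or the Value of the text node) rather than InnerXml.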
--
Richard Lewis Haggard
www.Haggard-And-Associates.com



 