problem with '<' character

Bob · Apr 10, 2009

Hi,

The user has to enter text in a textbox. They must also be able to enter HTML
code like <a href="..."> or ... . They may also need the plain characters <
(less than) and > (greater than). The page has ValidateRequest="false".

The problem is that the textbox automatically removes the < and the > and
the text between them (e.g. <this will be erased> does not appear on the page).
One possibility is this:
txt = TextBox1.Text
txt = txt.Replace("<", "&lt;")

This works: the < and > are now visible on the page, but then the HTML tags
don't work anymore: I get e.g. <a href="..."> as literal text on the page, and
it's not clickable anymore.

So my problem is: how do I get, at the same time and in the same textbox, the
non-HTML characters < and > together with working HTML tags?

Thanks for help.
Bob

Scott M. · Apr 10, 2009

Use HtmlEncode and HtmlDecode to allow the user to send the data and for you
to render it.

http://msdn.microsoft.com/en-us/library/system.web.httpserverutility.htmlencode.aspx
http://msdn.microsoft.com/en-us/library/system.web.httpserverutility.htmldecode.aspx

-Scott

Bob · Apr 10, 2009

Hi, thanks for replying.
There is still a problem, because I need both cases together.

I tried this:

txt="<aaaa>"
tit="<aaaa>"
txt = Server.HtmlEncode(txt)
tit = Server.HtmlDecode(tit)

I get:
txt=<aaaa>
tit= nothing (empty): here I also need <aaaa>

Then I tried this:
txt="<b>aaaa</b>"
tit="<b>aaaa</b>"
txt = Server.HtmlEncode(txt)
tit = Server.HtmlDecode(tit)

I get:
txt=<b>aaaa</b> (as literal text): here I need aaaa in bold
tit=aaaa (in bold)

So in my example 1) the HtmlDecode is wrong, and in my example 2) it's the HtmlEncode that is wrong.

Scott M. · Apr 11, 2009

You aren't using decode properly. Decode is meant to take already encoded data and decode it, but your example is trying to decode data that was never encoded.
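
To make the round trip concrete, here is a minimal sketch. It assumes a console project that references System.Web and uses HttpUtility instead of the page's Server object, purely so it can run outside a page; the module name is made up for illustration.

```vb
Imports System
Imports System.Web ' HttpUtility lives in System.Web.dll (assumed to be referenced)

Module EncodeDecodeDemo ' hypothetical name, for illustration only
    Sub Main()
        Dim raw As String = "<aaaa>"

        ' Encoding turns the markup characters into entities,
        ' so the browser shows them instead of treating them as a tag.
        Dim encoded As String = HttpUtility.HtmlEncode(raw)

        ' Decoding reverses the encoding and returns the original string.
        Dim decoded As String = HttpUtility.HtmlDecode(encoded)

        ' Decoding text that contains no entities leaves it unchanged.
        Dim unchanged As String = HttpUtility.HtmlDecode(raw)

        Console.WriteLine(encoded)   ' &lt;aaaa&gt;
        Console.WriteLine(decoded)   ' <aaaa>
        Console.WriteLine(unchanged) ' <aaaa>
    End Sub
End Module
```

Written to the page, the encoded string is the one that shows up as visible <aaaa>; the decoded one is raw markup again, which the browser treats as a tag.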

-Scott

Bob · Apr 11, 2009

Scott,

I probably am using it wrong.
Could you then explain how I can avoid losing both the non-HTML characters (like <xxx>) and the HTML tags (like <b>) in the same textbox?

Suppose the user enters this text:
"this is <b>bold text</b> and this is a <non-html tag>."

I want to get (in all browsers):
this is bold text (shown in bold) and this is a <non-html tag>.

Thanks

JM · Apr 11, 2009

Suppose the user enters this text:

"this is <b>bold text</b> and this is a <non-html tag>."
I want to get (in all browsers): this is bold text (shown in bold) and this is a
<non-html tag>.

If a browser doesn't know what to do with a tag, it generally ignores it,
so that shouldn't be an issue.

Also, double-check your work:

tit="<aaaa>"
tit = Server.HtmlDecode(tit)

You seem to be decoding unencoded text. Try something like:

txt="<b>this should be bold</b>"
encoded=Server.HTMLEncode(txt)
decoded=Server.HTMLDecode(encoded)

John

Bob · Apr 11, 2009

What you suggest works, but is not necessary.
This gives the same result without using encode/decode:
txt="<b>this should be bold</b>"

But when doing this:
txt="<aaaa>"
encoded=Server.HTMLEncode(txt)
decoded=Server.HTMLDecode(encoded)

it doesn't work (gives nothing).

So there is no solution for this.

Scott M. · Apr 11, 2009

Actually, it's working perfectly. It's just that (as was mentioned already)
browsers will ignore markup (anything between < and >) that they don't
recognize as proper HTML markup.

What you are essentially asking for is a way for the server to be able to
distinguish between HTML markup and non-HTML markup. If it is non-HTML
markup, you'll need to ENcode the markup characters and with HTML markup,
you should not alter the markup at all.

I am not aware of an automatic method for determining what is and isn't HTML
on the server, but it would not be extremely difficult to check the input
against a known list of HTML tags.
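
A rough sketch of that known-list idea, not a finished implementation. The tag list, the module name and the function name EncodeUnknownMarkup are all hypothetical, and HttpUtility.HtmlEncode stands in for Server.HtmlEncode so the code can run outside a page. It leaves tags whose name is on the list untouched and encodes everything else, including the text between tags.

```vb
Imports System
Imports System.Text
Imports System.Text.RegularExpressions
Imports System.Web ' assumes a reference to System.Web.dll

Module SelectiveEncoder ' hypothetical name
    ' Hypothetical whitelist; extend it with the tags you want to allow.
    Private ReadOnly AllowedTags As String() = {"a", "b", "i", "u", "em", "strong", "p", "br"}

    ' Matches anything shaped like a tag: "<", optional "/", a name, then the rest up to ">".
    Private ReadOnly TagPattern As New Regex("</?([A-Za-z][A-Za-z0-9]*)[^<>]*>")

    Public Function EncodeUnknownMarkup(ByVal input As String) As String
        Dim result As New StringBuilder()
        Dim position As Integer = 0

        For Each m As Match In TagPattern.Matches(input)
            ' Text between tags is always encoded, so a stray "<" or ">" stays visible.
            result.Append(HttpUtility.HtmlEncode(input.Substring(position, m.Index - position)))

            Dim tagName As String = m.Groups(1).Value.ToLowerInvariant()
            If Array.IndexOf(AllowedTags, tagName) >= 0 Then
                ' Known tag: pass it through unchanged (attributes are NOT checked here).
                result.Append(m.Value)
            Else
                ' Unknown tag: encode it so the browser shows it as text.
                result.Append(HttpUtility.HtmlEncode(m.Value))
            End If

            position = m.Index + m.Length
        Next

        ' Encode whatever text follows the last tag.
        result.Append(HttpUtility.HtmlEncode(input.Substring(position)))
        Return result.ToString()
    End Function
End Module
```

For example, EncodeUnknownMarkup("<b>bold</b> and <aaaa>") gives "<b>bold</b> and &lt;aaaa&gt;", so the browser renders the bold text and shows <aaaa> literally. Two caveats with this sketch: attributes on allowed tags pass through unchecked (an onclick on an allowed tag would survive), and text that merely starts like an allowed tag, such as "<b and c>", is treated as one.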

-Scott

Bob · Apr 11, 2009

Maybe we all have it wrong, but your solution doesn't work either, Cowboy.

I typed this in your textbox:
<b>bold</b><aaaaa>

I get this:
<b>bold</b><aaaaa>
bold (in bold)

The point is still what Scott wrote:
What you are essentially asking for is a way for the server to be able to
distinguish between HTML markup and non-HTML markup. If it is non-HTML
markup, you'll need to ENcode the markup characters and with HTML markup,
you should not alter the markup at all.

Cowboy (Gregory A. Beamer) wrote:
You have it all wrong. The point is encoding to store in a database and then decoding when you bring it back out. Attached is a sample app that displays how encode and decode work.

Note that there is a real danger with allowing this type of input, which is why you have to set ValidateRequest to false on the page. Make sure you do basic input checking, as a malicious user could put <script> tags in and possibly hack your site.

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Blog:
http://feeds.feedburner.com/GregoryBeamer

*********************************************
| Think outside the box |
*********************************************

Cowboy (Gregory A. Beamer) · Apr 13, 2009

I missed that part. If you want to use tag delimiters in your text that are not really tags, alongside valid HTML, you have a bit more of a problem and will have to write your own Encode/Decode methods. I can see no way around this.

Fortunately, the rules are not that extensive, as far as escaped characters go.

No matter what you do, you will have to check for <script> tags, as they are security risks.
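
As a starting point only, a check like the following would catch the obvious case. The module and function names are hypothetical. It is a blacklist test, and blacklists are easy to get around (event-handler attributes and javascript: URLs can also run script), so treat it as a sketch rather than real protection.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ScriptCheck ' hypothetical name
    ' Returns True when the input contains an opening <script ...> tag,
    ' in any letter case and with optional whitespace after the "<".
    Public Function ContainsScriptTag(ByVal input As String) As Boolean
        Return Regex.IsMatch(input, "<\s*script\b", RegexOptions.IgnoreCase)
    End Function
End Module
```

A page could call ContainsScriptTag(TextBox1.Text) before storing the input and reject it when the result is True.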

Have you looked into any of the Rich Text controls you can put on your page? One of them might have the facility to escape bogus tags.

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Blog:
http://feeds.feedburner.com/GregoryBeamer

*********************************************
| Think outside the box |
*********************************************

JM · Apr 13, 2009

I am not aware of an automatic method for determining what is and isn't
HTML on the server, but it would not be extremely difficult to check the
input against a known list of HTML tags.

That would be the only way to be 100% sure.

John
