Convert text encoded with character references (&#123;) to Unicode or UTF-8


Daniel Köster

Does anyone have tips on how to convert text encoded with numeric
character references (&#123;) to Unicode or UTF-8 using VB.NET? Is
there a function or something similar that can help with the conversion?

A simple replace of "this" with "that" is not an option, since there are
also some Asian texts that I need to convert (Chinese, Thai and
Japanese; the replacement list would be too large to handle).

What I want to do is compare a file encoded with character
references (e.g. &#123;) with a file encoded with normal Unicode characters
(e.g. ö, ä, å).

Best regards
Daniel
 

Jon Skeet [C# MVP]

Daniel Köster said:
Does anyone have tips on how to convert text encoded with numeric
character references (&#123;) to Unicode or UTF-8 using VB.NET? Is
there a function or something similar that can help with the conversion?

A simple replace of "this" with "that" is not an option, since there are
also some Asian texts that I need to convert (Chinese, Thai and
Japanese; the replacement list would be too large to handle).

What I want to do is compare a file encoded with character
references (e.g. &#123;) with a file encoded with normal Unicode characters
(e.g. ö, ä, å).

Just do "normal" parsing to find the &#xxx; to start with, then use
Substring (or whatever) to get the xxx bit, parse it as an integer
(Int32.Parse or Convert.ToInt32) and cast the result to a character.
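
The steps above could be sketched roughly as follows in VB.NET. This is only an illustration, not code from the thread: the module name CharRefDecoder and the function DecodeCharRefs are made up for the example, and it uses a regular expression instead of manual Substring calls. It also assumes .NET 2.0 or later, because it uses Char.ConvertFromUtf32 rather than a plain cast so that code points above U+FFFF come out as surrogate pairs.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module CharRefDecoder
    ' Matches decimal (&#123;) and hexadecimal (&#x7B;) numeric character references.
    ' Group 1 captures hex digits, group 2 captures decimal digits.
    Private ReadOnly CharRefPattern As New Regex("&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

    ' Hypothetical helper: replaces each numeric character reference with the
    ' character it names. Malformed or out-of-range references are left as-is.
    Public Function DecodeCharRefs(ByVal input As String) As String
        Return CharRefPattern.Replace(input, AddressOf ReplaceMatch)
    End Function

    Private Function ReplaceMatch(ByVal m As Match) As String
        Dim codePoint As Integer
        Dim parsed As Boolean

        If m.Groups(1).Success Then
            parsed = Integer.TryParse(m.Groups(1).Value, NumberStyles.AllowHexSpecifier, _
                                      CultureInfo.InvariantCulture, codePoint)
        Else
            parsed = Integer.TryParse(m.Groups(2).Value, NumberStyles.None, _
                                      CultureInfo.InvariantCulture, codePoint)
        End If

        ' Reject parse failures, negative values (hex overflow wraps around),
        ' values beyond U+10FFFF, and lone surrogates.
        If Not parsed OrElse codePoint < 0 OrElse codePoint > &H10FFFF _
           OrElse (codePoint >= &HD800 AndAlso codePoint <= &HDFFF) Then
            Return m.Value
        End If

        Return Char.ConvertFromUtf32(codePoint)
    End Function

    Sub Main()
        ' &#246; is ö; &#x4E2D; and &#25991; are two CJK characters.
        Console.WriteLine(DecodeCharRefs("K&#246;ster &#x4E2D;&#25991;"))
    End Sub
End Module
```

Once decoded, the text is an ordinary .NET string (UTF-16 internally), so it can be written out as UTF-8 with any writer that takes an Encoding. Note that this sketch handles only numeric references; named entities such as &amp;amp; are left untouched.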
 

Mihai N.

Jon Skeet [C# MVP] said:
Just do "normal" parsing to find the &#xxx; to start with, then use
Substring (or whatever) to get the xxx bit, parse it as an integer
(Int32.Parse or Convert.ToInt32) and cast the result to a character.

HttpUtility.HtmlDecode
HttpUtility.HtmlEncode
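
A minimal sketch of how HttpUtility.HtmlDecode might be used for the comparison the original poster describes. The assumptions here are not from the thread: the project references System.Web.dll, both files are stored as UTF-8, and the two file paths are passed on the command line. The module name CompareFiles is made up for the example.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Web   ' Assumption: the project references System.Web.dll

Module CompareFiles
    Sub Main(ByVal args As String())
        ' Hypothetical inputs: args(0) is the file containing character
        ' references, args(1) is the file containing plain Unicode text.
        Dim encodedText As String = File.ReadAllText(args(0), Encoding.UTF8)
        Dim plainText As String = File.ReadAllText(args(1), Encoding.UTF8)

        ' HtmlDecode converts numeric references (&#246;) as well as
        ' named entities (&amp;, &lt;) into the characters they stand for.
        Dim decoded As String = HttpUtility.HtmlDecode(encodedText)

        ' Ordinal comparison checks the decoded text character by character.
        If String.Equals(decoded, plainText, StringComparison.Ordinal) Then
            Console.WriteLine("Files match after decoding.")
        Else
            Console.WriteLine("Files differ after decoding.")
        End If
    End Sub
End Module
```

Because HtmlDecode also converts named entities, a file that is supposed to contain a literal &amp;amp; will come out differently than it would with a decoder that handles numeric references only. On .NET 4.0 and later, System.Net.WebUtility.HtmlDecode offers the same kind of decoding without a System.Web reference.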
 
