Daniel Köster wrote:
> Does anyone have tips on how to convert text encoded with numeric
> character references (&#123;) to Unicode or UTF-8 using VB.NET? Is
> there a function or something that can help with the conversion?
>
> A simple replace of "this" with "that" is not an option, since there are
> some Asian texts that I need to convert as well (Chinese, Thai and
> Japanese; the replace list would be too large to handle).
>
> What I want to do is compare a file encoded with character
> references (e.g. &#123;) with a file encoded with normal Unicode characters
> (e.g. ö, ä, å).
Just do "normal" parsing to find each &#xxx; reference, then use
Substring (or whatever) to get the xxx part, parse it as an integer
(Int32.Parse or Convert.ToInt32) and convert the result to a character
(ChrW in VB.NET, which covers code points up to 65535).
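
The steps above could be sketched roughly as follows. This is only an
illustration, not a tested library routine: the names CharRefDemo,
DecodeCharRefs and ReplaceOne are made up for the example, it uses a
regular expression instead of hand-written Substring parsing, and it
assumes .NET 2.0 or later so that Char.ConvertFromUtf32 is available
for code points above 65535. It also accepts the hexadecimal form
(&#xF6;), which the original question did not mention.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module CharRefDemo
    ' Matches decimal (&#246;) and hexadecimal (&#xF6;) numeric character references.
    ' Group 1 holds the hex digits, group 2 the decimal digits.
    Private ReadOnly CharRef As New Regex("&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

    ' Hypothetical helper: replaces every numeric reference in the input
    ' with the character it stands for.
    Function DecodeCharRefs(input As String) As String
        Return CharRef.Replace(input, AddressOf ReplaceOne)
    End Function

    Private Function ReplaceOne(m As Match) As String
        Dim codePoint As Integer
        Dim ok As Boolean
        If m.Groups(1).Success Then
            ok = Integer.TryParse(m.Groups(1).Value, NumberStyles.AllowHexSpecifier,
                                  CultureInfo.InvariantCulture, codePoint)
        Else
            ok = Integer.TryParse(m.Groups(2).Value, NumberStyles.None,
                                  CultureInfo.InvariantCulture, codePoint)
        End If

        ' Leave the reference as it is when the number does not parse, is
        ' outside the Unicode range, or falls in the surrogate range.
        If Not ok OrElse codePoint < 0 OrElse codePoint > &H10FFFF OrElse
           (codePoint >= &HD800 AndAlso codePoint <= &HDFFF) Then
            Return m.Value
        End If

        ' Returns one Char for the BMP and a surrogate pair above U+FFFF.
        Return Char.ConvertFromUtf32(codePoint)
    End Function

    Sub Main()
        Console.WriteLine(DecodeCharRefs("K&#246;ster &#x4E2D;&#25991;"))
    End Sub
End Module
```

Once both files have been decoded this way they can be compared as
ordinary strings. If the text also contains named entities such as
&ouml;, the framework's own HtmlDecode (System.Web.HttpUtility, or
System.Net.WebUtility from .NET 4.0 on) may be a simpler choice.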
--
Jon Skeet
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too