Convert international characters

  • Thread starter: Peter K

Peter K

Hi

I am writing an application that needs to process text read from
files and write information to a content management system (CMS).

Unfortunately, the CMS does not accept international characters in some text
strings (e.g. names of entities), such as the Danish Å, Ø and Æ.

Is there a built-in, or recognised, method of converting these characters
to "ASCII" text? (To be honest, I don't know if "ASCII" is the correct
technical term; I mean "ordinary text" as you would type it on a U.S.
keyboard, for example.)


/Peter
 
There are several ways of encoding these characters. For example, UTF-7
encodes all Unicode characters so that the result is pure ASCII. But the
question is what the CMS will do with it. It will probably read it as plain
ASCII and show the letters as strange character sequences.
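
To make the UTF-7 point concrete, here is a minimal sketch. It is in Python only because the thread doesn't name a language, and the helper name `to_utf7_ascii` is made up for illustration. It shows that the encoded form contains only ASCII characters, and that you get the original text back only if the reader decodes it as UTF-7 again.

```python
# Sample text using the Danish letters from the question.
text = "Å, Ø, Æ"


def to_utf7_ascii(s: str) -> str:
    """Encode s as UTF-7 and return the result as an ASCII-only string.

    Hypothetical helper name; it only wraps Python's built-in "utf-7" codec.
    """
    return s.encode("utf-7").decode("ascii")


encoded = to_utf7_ascii(text)

# A reader that knows about UTF-7 can recover the original text.
decoded = encoded.encode("ascii").decode("utf-7")

print(encoded)   # ASCII-only, but not human-readable for the Danish letters
print(decoded)   # the original text again
```

A CMS that does not decode UTF-7 would store and display the `encoded` string as-is, which is exactly the "strange character sequences" problem.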

The CMS should have some way to handle non-ASCII characters. Maybe you
only need to know the right encoding; maybe it's even configurable. This all
depends on the CMS.
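
If it turns out that the CMS really cannot take anything but ASCII in those fields, a lossy fallback is to transliterate before writing. The sketch below is an assumption on my part, not something the CMS or any particular framework prescribes: the mapping table `DANISH_MAP` and the function `to_ascii` are hypothetical names, and the Aa/Oe/Ae spellings are just the common Danish convention. Å would decompose to a plain A under Unicode normalisation, but Ø and Æ have no decomposition, so they need an explicit mapping.

```python
import unicodedata

# Hypothetical mapping using the conventional Danish ASCII spellings.
# Applied before normalisation so that Å becomes "Aa" rather than "A".
DANISH_MAP = {
    "Å": "Aa", "å": "aa",
    "Ø": "Oe", "ø": "oe",
    "Æ": "Ae", "æ": "ae",
}


def to_ascii(text: str) -> str:
    """Return a lossy ASCII-only approximation of text."""
    # Step 1: replace the letters we have an explicit spelling for.
    replaced = "".join(DANISH_MAP.get(ch, ch) for ch in text)
    # Step 2: split remaining accented letters into base letter + accent.
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Step 3: drop everything that is still not ASCII (accents, symbols).
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Ærøskøbing"))  # Aeroeskoebing
print(to_ascii("Århus"))       # Aarhus
```

Note that this cannot be reversed, and characters without a mapping or a decomposition are silently dropped, so it only makes sense for things like entity names where the CMS leaves no other choice.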

Christof
 