language translator

peyush

I am making a Windows-based language translator, and I am using fonts
for this. But when I try to translate text from English to Hindi, the
result is not very accurate (I am using the Shusha font). I am not sure
what algorithm the Google people use, but they are able to translate
the words perfectly. Does anyone have any idea about their algorithm? I
found some sample code, which simply uses the Google web translation
service:
http://code.google.com/p/gtapi/
But when I try to convert a word from English to Hindi, it doesn't
show me Hindi characters. The output looks like this:
&#2350;&#2375;&#2352;&#2366; . I am not sure what the exact problem
is. Do I need to install a font or libraries, or configure some
settings, or do anything else in order to get this working? Do I
need to configure my Visual Studio as well? Any help will be
appreciated.

Thanks

Peyush Goel
 
Alberto Poblacion

peyush said:
[...]
But when I try to convert a word from English to Hindi, it doesn't
show me Hindi characters. The output looks like this:
&#2350;&#2375;&#2352;&#2366; . I am not sure what the exact problem
is. Do I need to install a font or libraries, or configure some
settings, or do anything else in order to get this working?

Those codes look like the Unicode code points of the Hindi characters,
encoded as numeric character references within the HTML that you are
receiving. For instance, "&#2375;" means "the character whose code point
is 2375". An easy way to convert the HTML into a Unicode string is provided
by the HtmlDecode method in the System.Web.HttpServerUtility class.

Of course, even if you do get a System.String with the correct Unicode
characters, the string will only display correctly on screen if you are
using a font that contains those characters. You can verify this by
examining your font with the "Character Map" utility that Windows provides
in Programs -> Accessories -> System Tools. For instance, my Windows 2008
contains a font named "Mangal" which has the characters in your sample.
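
To make the decoding step concrete, here is a minimal sketch of what such a decoder does with the sample output. It is written in Python purely as an illustration and is not the .NET HtmlDecode implementation; the helper name decode_numeric_entities is made up for this example, and it only handles decimal references of the form &#NNNN;. The last line compares it against html.unescape from the Python standard library.

```python
import html
import re


def decode_numeric_entities(text: str) -> str:
    """Replace decimal references such as &#2375; with the character they name.

    Hypothetical helper for illustration only: it ignores hexadecimal
    references (&#x947;) and named entities (&amp;).
    """
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


# The sample output from the question, as it arrives inside the HTML.
encoded = "&#2350;&#2375;&#2352;&#2366;"

decoded = decode_numeric_entities(encoded)

# Show each decoded character's code point in hexadecimal.
print([hex(ord(ch)) for ch in decoded])  # ['0x92e', '0x947', '0x930', '0x93e']

# The standard-library decoder gives the same string for this input.
print(decoded == html.unescape(encoded))  # True
```

All four code points fall in the Devanagari block (U+0900 to U+097F), so the decoded string still needs a font that covers that block to display properly.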
 
peyush

Thanks for the feedback. I am still not sure which font Google uses, but
I now know that what I was using was Google translation. I am now looking
into Google transliterate, which is what I actually want. Can it be used
in Windows-based applications? And if so, the same Unicode problem might
occur again. How do I get rid of it?
 
