How to tell if a char is ASCII or Unicode?

  • Thread starter: Nick

Nick

Hi,

I want to know if a string contains any Unicode character, and what I
did is like this:
for (int i = 0; i < str.Length; i++)
{
    char ch = str[i];
    // Then, right here, how do I tell whether this char is ASCII or Unicode?
}

Thanks.
 
First, you'll have to rephrase the question you're asking your code to
answer.

Unicode is a superset of ASCII, so the question is really, "Is there
any character in this Unicode string that cannot be represented in
ASCII?"

One way to find out is to use an ASCIIEncoding to transform the string
into an array of ASCII bytes, then back into a Unicode string, and
compare the strings. If any character has changed, then that character
couldn't be represented as ASCII.
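
The round trip described above could be sketched roughly like this. The class and method names (AsciiRoundTrip, SurvivesAsciiRoundTrip) are made up for illustration, and the sketch assumes the default behaviour of Encoding.ASCII:

```csharp
using System;
using System.Text;

public static class AsciiRoundTrip
{
    // Hypothetical helper: returns true when the string is unchanged after
    // being encoded to ASCII bytes and decoded back into a string.
    public static bool SurvivesAsciiRoundTrip(string text)
    {
        Encoding ascii = Encoding.ASCII;
        byte[] asciiBytes = ascii.GetBytes(text);      // string -> ASCII bytes
        string decoded = ascii.GetString(asciiBytes);  // ASCII bytes -> string

        // Ordinal comparison: any difference means at least one character
        // could not be represented in ASCII.
        return string.Equals(text, decoded, StringComparison.Ordinal);
    }
}
```

Calling AsciiRoundTrip.SurvivesAsciiRoundTrip(str) on the original poster's string should then return false whenever str holds a character that ASCII cannot represent.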
 
Bruce Wood said:
First, you'll have to rephrase the question you're asking your code to
answer.

Unicode is a superset of ASCII, so the question is really, "Is there
any character in this Unicode string that cannot be represented in
ASCII?"

One way to find out is to use an ASCIIEncoding to transform the string
into an array of ASCII bytes, then back into a Unicode string, and
compare the strings. If any character has changed, then that character
couldn't be represented as ASCII.

An easier way, though, is to check whether each character's numeric value is > 127.
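
A minimal sketch of that check, applied to a whole string. The names AsciiCheck and ContainsNonAscii are only placeholders, not anything from the framework:

```csharp
using System;

public static class AsciiCheck
{
    // Hypothetical helper: returns true as soon as a character with a
    // numeric value above 127 (outside the ASCII range) is found.
    public static bool ContainsNonAscii(string text)
    {
        foreach (char ch in text)
        {
            if (ch > 127)
            {
                return true;
            }
        }
        return false;
    }
}
```

This stops at the first non-ASCII character, so it never allocates a byte array or a second string.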
 
How about calling the ReadXml method without the XmlReadMode argument?

dsLog.ReadXml("pi_feedback_log.xml");
 
Hi,


yogeshprabhu said:
How about calling the ReadXml method without the XmlReadMode argument?

dsLog.ReadXml("pi_feedback_log.xml");


And how will this tell you about Unicode/ASCII characters?
 
Yes. I initially rejected that out of concerns that ASCIIEncoding might
transform some characters above 127 into valid ASCII characters, but
then the transform back would transform these into Unicode characters
below 128, and so cases like this would fail my test, as well.

So, yes, checking whether the numeric value is > 127 is safe, and much
quicker than doing the ASCIIEncoding thing.
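
To illustrate the concern: as far as I know, Encoding.ASCII with its default settings substitutes '?' for anything it cannot encode, so a character above 127 comes back from the round trip as a different character below 128. A small sketch (AsciiFallbackDemo and RoundTrip are illustrative names only):

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Hypothetical helper: encodes a single char to ASCII bytes and decodes
    // it again, returning whatever character comes back.
    public static char RoundTrip(char ch)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(new[] { ch });
        return Encoding.ASCII.GetString(bytes)[0];
    }
}
```

With the default settings, RoundTrip('A') gives back 'A', while RoundTrip('\u00e9') gives back '?', which is why the string comparison still catches the change.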
 
My bad, apparently I wanted to reply to two different posts, and they got
intermixed. This was meant for one of the XmlReader questions. Thanks for
pointing that out.
 
Bruce Wood said:
Yes. I initially rejected that out of concerns that ASCIIEncoding might
transform some characters above 127 into valid ASCII characters, but
then the transform back would transform these into Unicode characters
below 128, and so cases like this would fail my test, as well.

So, yes, checking whether the numeric value is > 127 is safe, and much
quicker than doing the ASCIIEncoding thing.

Just being pedantic here, but the subject is misleading: a char is ALWAYS
Unicode by definition. The subject should be "How to tell if a char is a
valid ASCII character (as well as Unicode)".
 
Thank you, guys.
I knew I could check whether the value is greater than 127, but I was not
sure if that was safe. Now I think it should be safe.
In C/C++ there is the isascii() function. I remember seeing a similar
function in .NET, but I just could not remember exactly what it is. This is
why I asked.
Thanks again.
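
For reference, a per-character equivalent of C's isascii() is a one-liner. The sketch below uses made-up names (CharAscii, IsAscii). Newer .NET versions (6 and later, as far as I can tell) also ship a built-in char.IsAscii, which would make a helper like this unnecessary there:

```csharp
using System;

public static class CharAscii
{
    // Hypothetical helper mirroring C's isascii(): true when the char's
    // numeric value is in the range 0 through 127.
    public static bool IsAscii(char ch)
    {
        return ch <= '\u007f';
    }
}
```

Inside the loop from the first post, this would be used as: if (!CharAscii.IsAscii(ch)) { /* non-ASCII character found */ }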
 