ASCII Character Conversion

Phanidhar

Hi,
I'm developing a WinForms application in C# (.NET 2.0). I have a dialog box
where the user can input text, and that text is sent to another machine
using sockets.

When the user enters a non-printable ASCII character such as ASCII 20
(using ALT+20), the character is converted to ASCII value 194 (or something
like that). What should be done to preserve the original ASCII value of 20?

Thanking you in advance.
Phani
 
Jon Skeet [C# MVP]

Phanidhar said:
I'm developing a WinForms application in C# (.NET 2.0). I have a dialog box
where the user can input text, and that text is sent to another machine
using sockets.

When the user enters a non-printable ASCII character such as ASCII 20
(using ALT+20), the character is converted to ASCII value 194 (or something
like that). What should be done to preserve the original ASCII value of 20?

I think your use of the term "ASCII" is pretty loose here. What actual
character are you talking about? ASCII 20 (well, pseudo-ASCII - I
believe true, pedantic, standard ASCII only starts at 32) is a control
code. All .NET characters are stored as Unicode - what's the Unicode
code point for the character you're trying to represent?

If, fundamentally, you're trying to send non-text data you shouldn't
try to pretend it's text. Separate out the text data from the binary
data and be very careful about how you use each of them.

Jon
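
To make the code point question concrete, here is a minimal sketch of how the byte values depend on which character is really in the string and which encoding is used. It rests on an assumption that nobody in this thread confirms: that ALT+20 inserts the pilcrow sign U+00B6 rather than the control code U+0014, and that the text is sent as UTF-8. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class ControlCharDemo
{
    // Encodes one character and returns its bytes as space-separated decimal values.
    public static string EncodeToDecimal(char c, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(new char[] { c });
        string[] parts = new string[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
        {
            parts[i] = bytes[i].ToString();
        }
        return string.Join(" ", parts);
    }

    public static void Main()
    {
        char dc4 = '\u0014';      // the control code with decimal value 20
        char pilcrow = '\u00B6';  // ASSUMPTION: what ALT+20 may actually insert

        Console.WriteLine(EncodeToDecimal(dc4, Encoding.UTF8));      // prints 20
        Console.WriteLine(EncodeToDecimal(pilcrow, Encoding.UTF8));  // prints 194 182
        Console.WriteLine(EncodeToDecimal(pilcrow, Encoding.ASCII)); // prints 63, i.e. '?'
    }
}
```

If the assumption holds, the 194 seen on the receiving side would be the first byte of a two-byte UTF-8 sequence, not a converted 20. Checking `(int)text[0]` on the string taken from the dialog box would show which character is actually there.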
 
Jeroen Mostert

Jon said:
ASCII 20 (well, pseudo-ASCII - I believe true, pedantic, standard ASCII
only starts at 32)

No, the control codes are part of the standard; 20 is DLE (Data Link
Escape), though good luck quizzing people on what that was used for. The
*printable* ASCII characters are in the range 32-126, 127 is the control
code DEL, and the remainder is, popular misconceptions to the contrary, not
part of ASCII (and there's no single "extended ASCII" character set, let
alone a standard).
 
Jeroen Mostert

Jeroen said:
No, the control codes are part of the standard; 20 is DLE (Data Link
Escape), though good luck quizzing people on what that was used for.

Hum. 20 is DLE in octal. In decimal, 16 is DLE; 20 is Device Control 4,
which is obviously *quite* different. I mean, imagine sending DLE when the
other side expects DC4, or vice versa! It doesn't take a scientist to
realize the potential for disaster, or something.

Anyway, moving swiftly on...
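
The base mix-up is easy to reproduce. The sketch below parses the digits "20" as octal, decimal and hexadecimal, then classifies each result using the ranges given earlier in the thread (0-31 and 127 control, 32-126 printable). The class and method names are made up for illustration.

```csharp
using System;

public static class AsciiCodes
{
    // Classifies a code: 0-31 and 127 are control codes, 32-126 are printable,
    // anything else is outside ASCII.
    public static string Classify(int code)
    {
        if (code < 0 || code > 127) return "not ASCII";
        if (code < 32 || code == 127) return "control";
        return "printable";
    }

    // Parses the digits "20" in the given base (8, 10 or 16).
    public static int ParseTwenty(int numberBase)
    {
        return Convert.ToInt32("20", numberBase);
    }

    public static void Main()
    {
        foreach (int numberBase in new int[] { 8, 10, 16 })
        {
            int code = ParseTwenty(numberBase);
            Console.WriteLine("\"20\" in base {0} = {1} decimal ({2})",
                numberBase, code, Classify(code));
        }
    }
}
```

This gives 16 (DLE) for octal, 20 (DC4) for decimal and 32 (space) for hexadecimal, so it matters which base a "20" was written in.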
 
Jon Skeet [C# MVP]

Jeroen said:
No, the control codes are part of the standard; 20 is DLE (Data Link
Escape), though good luck quizzing people on what that was used for. The
*printable* ASCII characters are in the range 32-126, 127 is the control
code DEL

For some reason I had the impression that the "full" ISO standard for
ASCII didn't include either 0-31 or 127 itself. However, I can't
remember any source for that, and certainly it's not the commonly used
idea of ASCII.

Jeroen said:
and the remainder is, popular misconceptions to the contrary, not
part of ASCII (and there's no single "extended ASCII" character set, let
alone a standard).

Heartily agreed :)

Jon
 
Alain Boss

Jon said:
For some reason I had the impression that the "full" ISO standard for
ASCII didn't include either 0-31 or 127 itself. However, I can't
remember any source for that, and certainly it's not the commonly used
idea of ASCII.

0x20 = 32 decimal = 'space' in ASCII

regards
Alain
 
Arne Vajhøj

Jon said:
For some reason I had the impression that the "full" ISO standard for
ASCII didn't include either 0-31 or 127 itself. However, I can't
remember any source for that, and certainly it's not the commonly used
idea of ASCII.

If http://en.wikipedia.org/wiki/ASCII is correct then the
non-printable characters are part of ASCII.

And lots of non-printable characters are or were widely used
in both files and communication.

CR and LF are still used.

XON, XOFF were used a lot in terminals (the real ones, not
terminal emulators with a buffer of 2000 lines).

Arne
 
Jon Skeet [C# MVP]

Arne said:
If http://en.wikipedia.org/wiki/ASCII is correct then the
non-printable characters are part of ASCII.

And lots of non-printable characters are or were widely used
in both files and communication.

CR and LF are still used.

Oh absolutely - that's why I was surprised when I was first told that
they weren't part of "official" ASCII.

I just wish I could remember where I heard it from. I suspect it was
in a discussion which included a debate about whether ISO-8859-1 has a
"hole" between 128 and 159, or whether it includes other control
characters. (I argued from the Unicode documentation which states - or
at least stated - that the first 256 characters of Unicode were the
same as in ISO-8859-1; others argued from other sources.)

I'm happy to just assume I'm wrong on this one though - certainly
everyone realistically includes 0-31 as part of ASCII.

Jon
 
Arne Vajhøj

Jon said:
I just wish I could remember where I heard it from. I suspect it was
in a discussion which included a debate about whether ISO-8859-1 has a
"hole" between 128 and 159, or whether it includes other control
characters. (I argued from the Unicode documentation which states - or
at least stated - that the first 256 characters of Unicode were the
same as in ISO-8859-1; others argued from other sources.)

Unfortunately ISO standards are not freely (as in beer, not speech)
available.

The control characters are in the Unicode Basic Latin chart
http://www.unicode.org/charts/PDF/U0000.pdf and the
Latin-1 chart http://www.unicode.org/charts/PDF/U0080.pdf !

I know that they are defined in DEC MCS, the predecessor
of ISO-8859-1.

Considering that they are in DEC MCS and in Unicode
under the Latin-1 name (which is a known synonym
for ISO-8859-1), there are very strong
indications that they are in ISO-8859-1.

Arne
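
For what it's worth, .NET's own "iso-8859-1" encoding can be checked directly. The sketch below decodes every byte value 0-255 and tests whether each one lands on the Unicode code point of the same number, which would include 128-159. It only shows what the .NET implementation does, not what the text of the ISO standard says, and the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class Latin1Demo
{
    // Returns true if every byte value 0..255 decodes to the Unicode
    // code point with the same numeric value.
    public static bool MapsOneToOne(Encoding encoding)
    {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++)
        {
            all[i] = (byte)i;
        }

        char[] chars = encoding.GetChars(all);
        if (chars.Length != 256) return false;

        for (int i = 0; i < 256; i++)
        {
            if (chars[i] != (char)i) return false;
        }
        return true;
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Console.WriteLine(MapsOneToOne(latin1));         // prints True
        Console.WriteLine(MapsOneToOne(Encoding.ASCII)); // prints False
    }
}
```

The ASCII result is False because bytes above 127 have no ASCII meaning and are decoded to '?'.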
 
