Convert a string to ASCII codes and then back to a string again

Kai Bohli

Hi all !

I need to translate a string to Ascii and return a string again. The code below doesn't work for
Ascii (Superset) codes above 127. Any help is greatly appreciated.


protected internal string StringToAscii(string S)
{
    byte[] strArray = Encoding.UTF7.GetBytes(S);
    string NewString = Encoding.UTF7.GetString(strArray);
    return NewString;
}

TIA


Best wishes
Kai Bohli
(e-mail address removed)
Norway
 
Jon Skeet [C# MVP]

Kai Bohli said:
I need to translate a string to Ascii and return a string again. The
code below doesn't work for Ascii (Superset) codes above 127.

There are no ASCII codes above 127. You can't translate a string with
characters > Unicode 127 into ASCII without losing data.
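
The data loss described here can be seen with a small round trip. This is only a sketch: the class and method names are made up for illustration, and it relies on the default behaviour of `Encoding.ASCII`, which substitutes `?` for any character it cannot represent.

```csharp
using System;
using System.Text;

public static class AsciiLossDemo
{
    // Hypothetical helper: encodes the text as ASCII and decodes it again,
    // so the caller can see which characters survive the trip.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }
}
```

With this helper, `AsciiLossDemo.RoundTripThroughAscii("Malm\u00F6")` returns `"Malm?"`: the `ö` is gone and cannot be recovered from the bytes.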

Now, what do you *really* want your method to do?
 
Kai Bohli

Hi John !

Jon Skeet said:
There are no ASCII codes above 127. You can't translate a string with
characters > Unicode 127 into ASCII without losing data.

I know - I've read your faq, but didn't quite know how to put it :)

Jon Skeet said:
Now, what do you *really* want your method to do?

My users enter some data into the app, such as the receiver of a package, addresses etc.
Say that this package is going to Sweden, where "Ascii" code 132 (see Ascii Character Codes Chart 2
in Visual Studio) is common.

Now, I don't know in which strings the characters (that must be translated) will be, and I don't
know where in the strings they will appear.

The label printers support just standard text out of the box. Any special characters have to be sent
as "Ascii" codes.

In order to keep it simple, I would like to send the original string (that the user entered) into a
function and return the "Ascii" codes so I can write them directly to the printer port.

TIA

Best wishes
Kai Bohli
(e-mail address removed)
Norway
 
Jon Skeet [C# MVP]

Kai Bohli said:
I know - I've read your faq, but didn't quite know how to put it :)


My users enter some data into the app, such as the receiver of a package, addresses etc.
Say that this package is going to Sweden, where "Ascii" code 132
(see Ascii Character Codes Chart 2 in Visual Studio) is common.

It's unfortunate that the MSDN still has pages which call it ASCII :(

Kai Bohli said:
Now, I don't know in which strings the characters (that must be
translated) will be, and I don't know where in the strings they
will appear.

The label printers support just standard text out of the box. Any
special characters have to be sent as "Ascii" codes.

You need to know *exactly* what encoding to use. ASCII itself doesn't
contain the character you're interested in, so you need to know which
encoding you *do* want - and what the printer will support. Hopefully
your printer manual will tell you.

Kai Bohli said:
In order to keep it simple, I would like to send the original string
(that the user entered) into a function and return the "Ascii" codes
so I can write them directly to the printer port.

Well, the first thing to realise is that if you want the text in an
encoded form, you shouldn't be returning a string to start with - you
should be returning an array of bytes. After that, you just need to
know which encoding your printer is expecting, and use that.
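
As a rough sketch of that advice, the method below turns the text into bytes with whatever encoding the printer expects and writes those bytes straight to an output stream, never going back through a string. The names `LabelWriter` and `WriteEncoded` are hypothetical, and the stream stands in for however the printer port is actually opened.

```csharp
using System;
using System.IO;
using System.Text;

public static class LabelWriter
{
    // Hypothetical helper: encodes the text with the printer's encoding and
    // writes the raw bytes to the output stream.
    // Returns the number of bytes written.
    public static int WriteEncoded(Stream output, string text, Encoding printerEncoding)
    {
        byte[] payload = printerEncoding.GetBytes(text);
        output.Write(payload, 0, payload.Length);
        return payload.Length;
    }
}
```

For a quick check without a printer, a `MemoryStream` can take the place of the port, for example `LabelWriter.WriteEncoded(new MemoryStream(), "Malm\u00F6", Encoding.GetEncoding("iso-8859-1"))`. ISO-8859-1 is used there only as a stand-in; the right encoding is whatever the printer manual specifies.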
 
Kai Bohli

Hi again John.

Jon Skeet said:
Now, what do you *really* want your method to do?

Below is my old Delphi code which does what I want. It replaces the character with the Ascii
equivalent. If possible, I would like to translate the whole string in one pass, so I could catch
characters that are not already "coded" in the loop.

TIA


function TLabelPrint.LabelTegn(S: string): string;
var L, i : Integer;
begin
  L := Length(S);
  i := 1;
  while i <= L do
  begin
    if (S[i] = 'æ') then
      S[i] := #145
    else if S[i] = 'Æ' then
      S[i] := #146
    else if S[i] = 'ø' then
      S[i] := #155
    else if S[i] = 'Ø' then
      S[i] := #157
    else if S[i] = 'å' then
      S[i] := #134
    else if S[i] = 'Å' then
      S[i] := #143
    else if S[i] = 'Ã' then
      S[i] := #196
    else if S[i] = 'Ö' then
      S[i] := #153
    else if S[i] = 'ö' then
      S[i] := #148
    else if S[i] = 'Ü' then
      S[i] := #154
    else if S[i] = 'ü' then
      S[i] := #129
    else if S[i] = 'È' then
      S[i] := #200
    else if S[i] = 'è' then
      S[i] := #232
    else if S[i] = 'É' then
      S[i] := #144
    else if S[i] = 'é' then
      S[i] := #130
    else if S[i] = 'Ê' then
      S[i] := #202
    else if S[i] = 'ê' then
      S[i] := #136
    else if S[i] = 'Ä' then
      S[i] := #142
    else if S[i] = 'ä' then
      S[i] := #132;

    inc(i);
  end; // while
  result := S;
end;


Best wishes
Kai Bohli
(e-mail address removed)
Norway
 
Jon Skeet [C# MVP]

Kai Bohli said:
Below is my old Delphi code which does what I want. It replaces the
character with the Ascii equivalent.

No, it replaces it with an equivalent in some other character encoding,
*not* ASCII.

Kai Bohli said:
If possible, I would like to translate the whole string in one pass,
so I could catch characters that are not already "coded" in the loop.

I think you need to fundamentally re-evaluate the difference between
text data and binary data. I don't know anything about how Delphi works
in terms of character encodings, so I don't know exactly what your
Delphi code is doing. I still believe that using a suitable encoding is
likely to be your best bet - and you should consult the manual of your
printer for which encoding to use.
 
Kai Bohli

Hi again John.

Ok, thanks for your help on this matter. I'll dig deeper into the manual. BTW - the printer manual
also has the same Ascii chart as VS's chart 2.

Jon Skeet said:
I think you need to fundamentally re-evaluate the difference between
text data and binary data. I don't know anything about how Delphi works
in terms of character encodings, so I don't know exactly what your
Delphi code is doing. I still believe that using a suitable encoding is
likely to be your best bet - and you should consult the manual of your
printer for which encoding to use.

Best wishes
Kai Bohli
(e-mail address removed)
Norway
 
Jon Skeet [C# MVP]

Kai Bohli said:
Ok, thanks for your help on this matter. I'll dig deeper into the
manual. BTW - the printer manual also has the same Ascii chart as
VS's chart 2.

Does it just call it ASCII though, or does it have a code page number?
If you've got a code page, you're away...
 
Kai Bohli

Hi again Jon (sorry for calling you John in the previous postings)

Jon Skeet said:
Does it just call it ASCII though, or does it have a code page number?
If you've got a code page, you're away...

It's called just ASCII, but I have about 60 different code pages to choose among. There are a few
problems with choosing these code pages though.

1) A code page for one label printer type might not be available on another label printer.
2) If I choose the code page for Swedish, and the user wants to send a package to France, I won't have
access to the French accented characters.
3) If I use a code page, I must find out what the character code means for that code page.

As you say Jon, Ascii codes above 127 may not exist in a strict sense, but they're still used this way
a lot in the hardware and software industry.

For instance this is from the Delphi help file:
<snip>
A character string, also called a string literal or string constant, consists of a quoted string, a
control string, or a combination of quoted and control strings. Separators can occur only within
quoted strings. A quoted string is a sequence of up to 255 characters from the extended ASCII
character set, written on one line and enclosed by apostrophes. A quoted string with nothing between
the apostrophes is a null string. Two sequential apostrophes in a quoted string denote a single
character, namely an apostrophe. For example,
'BORLAND' { BORLAND }

A control string is a sequence of one or more control characters, each of which consists of the #
symbol followed by an unsigned integer constant from 0 to 255 (decimal or hexadecimal) and denotes
the corresponding ASCII character. The control string
#89#111#117 is equivalent to the quoted string 'You'
</snip>

I know from previous experience that if I send the "Ascii" code of 132 it will be displayed
correctly on any label printer used in any country. Probably because I don't specify a code page, a
standard code page will be used which relates to the Ascii chart 2.



Best wishes
Kai Bohli
(e-mail address removed)
Norway
 
Jon Skeet [C# MVP]

Kai Bohli said:
Hi again Jon (sorry for calling you John in the previous postings)

No worries :)

Kai Bohli said:
It's called just ASCII, but I have about 60 different code pages to
choose among. There are a few problems with choosing these code pages
though.

If the printer supports many different code pages, how do you tell it
which one you're using?

Kai Bohli said:
1) A code page for one label printer type might not be available on
another label printer.

So allow the user to select the code page used by their printer.

Kai Bohli said:
2) If I choose the code page for Swedish, and the user wants to send a
package to France, I won't have access to the French accented
characters.

Ditto.

Kai Bohli said:
3) If I use a code page, I must find out what the character code
means for that code page.

No you don't - the Encoding class knows all that for you. You give the
Encoding instance a string, and it converts that into the appropriate
byte sequence for you.
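
A small sketch of what that looks like in practice: the helper below asks an `Encoding` for the byte a given code page uses for one character. The names `CodePageLookup` and `ByteFor` are hypothetical. The `RegisterProvider` call is an assumption about the runtime: it is needed on .NET Core and .NET 5+, where code pages such as 850 are not registered by default, and it can be dropped on the classic .NET Framework.

```csharp
using System;
using System.Text;

public static class CodePageLookup
{
    // Hypothetical helper: returns the single byte that the given
    // single-byte code page uses for the character.
    public static byte ByteFor(char c, int codePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code
        // pages must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding encoding = Encoding.GetEncoding(codePage);
        byte[] bytes = encoding.GetBytes(new[] { c });
        return bytes[0];
    }
}
```

For code page 850, `ByteFor('\u00E4', 850)` (ä) gives 132, `ByteFor('\u00E6', 850)` (æ) gives 145 and `ByteFor('\u00F8', 850)` (ø) gives 155, the same numbers as the corresponding entries in the hand-written Delphi table earlier in the thread.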

Kai Bohli said:
As you say Jon, Ascii codes above 127 may not exist in a strict
sense, but they're still used this way a lot in the hardware and
software industry.

And that's a problem in itself. Different code pages are used as if
they're compatible when they're not.

Kai Bohli said:
For instance this is from the Delphi help file:
<snip>
A character string, also called a string literal or string constant,
consists of a quoted string, a control string, or a combination of
quoted and control strings. Separators can occur only within quoted
strings. A quoted string is a sequence of up to 255 characters from
the extended ASCII character set, written on one line and enclosed by
apostrophes. A quoted string with nothing between the apostrophes is
a null string. Two sequential apostrophes in a quoted string denote a
single character, namely an apostrophe. For example,
'BORLAND' { BORLAND }

Ah great - "the extended ASCII character set"... as if there's only
one. It sounds like a string is basically just a sequence of bytes as
far as Delphi is concerned. Blech :(

Kai Bohli said:
A control string is a sequence of one or more control characters,
each of which consists of the # symbol followed by an unsigned
integer constant from 0 to 255 (decimal or hexadecimal) and denotes
the corresponding ASCII character. The control string
#89#111#117 is equivalent to the quoted string 'You'
</snip>

I know from previous experience that if I send the "Ascii" code of
132 it will be displayed correctly on any label printer used in any
country. Probably because I don't specify a code page, a standard
code page will be used which relates to the Ascii chart 2.

No. It may *happen* to work for 132 (if that's common amongst all code
pages), but it won't work for all characters. Try the infinity sign
(236) for instance - in some code pages it'll work, in others it won't.
(In code page 850 for example, it'll be a y-acute.)

I believe the code page shown by MSDN looks like code page 437, by the
way.
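
The difference between code pages can be checked by decoding the same byte twice. This is a minimal sketch with hypothetical names (`CodePageDecode`, `CharFor`); as an assumption about the runtime, it registers the code page provider, which is required on .NET Core and .NET 5+ and unnecessary on the classic .NET Framework.

```csharp
using System;
using System.Text;

public static class CodePageDecode
{
    // Hypothetical helper: returns the character that a single byte
    // represents in the given single-byte code page.
    public static char CharFor(byte value, int codePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code
        // pages must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        return Encoding.GetEncoding(codePage).GetString(new[] { value })[0];
    }
}
```

Byte 132 decodes to `ä` (U+00E4) under both 437 and 850, which is why it appears to "just work". Byte 236 decodes to `∞` (U+221E) under 437 but to `ý` (U+00FD) under 850.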
 
Kai Bohli

Hi again Jon !

You've been right all along, but I guess that you're used to that :)

I've found that both printers support codepage 850 multilingual (among others). I can set this in
code and send it to the printer.

So in my code I have this:

protected internal string StringToAscii(string S)
{
    Encoding.GetEncoding(850);
    byte[] strArray = Encoding.UTF7.GetBytes(S);
    string NewString = Encoding.UTF7.GetString(strArray);
    return NewString;
}

It's not working properly yet, probably because there's no codepage called just 850. Do you know
where I can find a list of Codepage names ?

TIA again.

Best wishes
Kai Bohli
(e-mail address removed)
Norway
 
Jon Skeet [C# MVP]

Kai Bohli said:
You've been right all along, but I guess that you're used to that :)

Far from it - but I've done a certain amount of pushing against people's
assumptions when it comes to encodings :)

Kai Bohli said:
I've found that both printers support codepage 850 multilingual
(among others). I can set this in code and send it to the printer.

Excellent.

Kai Bohli said:
So in my code I have this:

protected internal string StringToAscii(string S)
{
    Encoding.GetEncoding(850);
    byte[] strArray = Encoding.UTF7.GetBytes(S);
    string NewString = Encoding.UTF7.GetString(strArray);
    return NewString;
}

It's not working properly yet, probably because there's no codepage
called just 850. Do you know where I can find a list of Codepage
names ?

I still don't think you should be returning it as a string. Instead,
your method should be:

protected byte[] StringToCodepage(string s)
{
    Encoding encoding = Encoding.GetEncoding(850);
    return encoding.GetBytes(s);
}
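
One possible variation on this method, offered as a sketch rather than as part of the answer above: asking for the encoding with an exception fallback makes unencodable characters fail loudly instead of being silently substituted, which is one way to catch characters the printer's code page cannot represent. The names `StrictCodepage` and `StringToCodepage850` are hypothetical, and the `RegisterProvider` line is an assumption that applies to .NET Core and .NET 5+ only.

```csharp
using System;
using System.Text;

public static class StrictCodepage
{
    // Hypothetical helper: returns the code page 850 bytes for the text,
    // or throws EncoderFallbackException if any character has no
    // code page 850 equivalent.
    public static byte[] StringToCodepage850(string s)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code
        // pages must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding strict = Encoding.GetEncoding(
            850,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        return strict.GetBytes(s);
    }
}
```

Calling `StrictCodepage.StringToCodepage850("\u00E4")` returns a single byte, 132. Passing a string containing a euro sign (`"\u20AC"`), which code page 850 lacks, throws an `EncoderFallbackException` that the application can catch and report to the user.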
 
