How do I create a one line text file with control codes? e.g.: 144 = 0x90 and 147 = 0x93?


Dan V.

How do I create a one line text file with these control codes? e.g.: 144 =
0x90 and 147 = 0x93?

I am trying to create a one-line text file with these characters all on one
row with no spaces.

1. 144 = 0x90
2. 147 = 0x93
3. STX = (^B = 2 = 0x2)
4. NUL = (^@ = 0 = 0x0)
5. DC4 = (^T = 20 = 0x14)
6. SOH = (^A = 1 = 0x1)
7. GS = (^] = 29 = 0x1d)
8. RS = (^^ = 30 = 0x1e)


I tried StreamWriter, BinaryWriter... but it can't seem to handle the 147 =
0x93 without putting a strange 'Â' character in front of it.
This is for use with a legacy program and that 'Â' character cannot be in
the text file.

I am using TextPad 4 to view the control characters.
The code below is close, but for 144 and 147 the strange 'Â' character
appears before the control code.

// BUG?: Do not specify encoding 'System.Text.Encoding.UTF8' else 147 will
// not work (even though the documentation says UTF8 is the default).
StreamWriter sw = new StreamWriter(@"C:\Text2.txt", false);

sw.WriteLine("Encoding: {0}", sw.Encoding.ToString());

sw.Write( (char) 144 );
sw.Write( (char) 147 );
sw.Write( (char) 2 );
sw.Write( (char) 0 );
sw.Write( (char) 20 );
sw.Write( (char) 1 );
sw.Write( (char) 29 );
sw.Write( (char) 30 );

sw.Close();
 
Niki Estner

Just curious: if you don't want codepage-translation, why do you use a
TextWriter class?
I think you should use a FileStream.

Niki
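
To illustrate the translation being referred to: a StreamWriter passes every char through its Encoding before anything reaches the file. The sketch below (class and method names are only illustrative) prints the bytes that UTF-8 and ISO-8859-1 produce for characters 144 and 147. Under UTF-8 each one becomes two bytes starting with 0xC2, and an editor that reads the file as a single-byte Western code page will typically show that 0xC2 as 'Â'.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Returns the bytes the given encoding produces for a single character.
    public static byte[] Encode(Encoding encoding, int codePoint)
    {
        return encoding.GetBytes(new[] { (char)codePoint });
    }

    public static void Main()
    {
        // ISO-8859-1 (Latin-1) maps U+0000..U+00FF to the same byte value.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        foreach (int code in new[] { 144, 147 })
        {
            Console.WriteLine("{0}: UTF-8 = {1}, Latin-1 = {2}",
                code,
                BitConverter.ToString(Encode(Encoding.UTF8, code)),
                BitConverter.ToString(Encode(latin1, code)));
        }
        // Prints:
        // 144: UTF-8 = C2-90, Latin-1 = 90
        // 147: UTF-8 = C2-93, Latin-1 = 93
    }
}
```

A text writer is only safe here if its encoding happens to map each char to the single byte you want; writing raw bytes avoids the question entirely.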
 
Dan V.

What is codepage-translation?

Tried FileStream in this way with similar results (I must be missing
something - though I am still using StreamWriter...)

// Create a 'FileStream' object.
FileStream myFileStream = new FileStream(@"C:\Text3.txt", FileMode.Create, FileAccess.Write);

// Create a 'StreamWriter' to write the data into the file.
StreamWriter myStreamWriter = new StreamWriter(myFileStream, System.Text.Encoding.UTF8);

myStreamWriter.WriteLine("Encoding: {0}", myStreamWriter.Encoding.ToString());

myStreamWriter.Write( (char) 144 );
myStreamWriter.Write( (char) 147 );
myStreamWriter.Write( (char) 2 );
myStreamWriter.Write( (char) 0 );
myStreamWriter.Write( (char) 20 );
myStreamWriter.Write( (char) 1 );
myStreamWriter.Write( (char) 29 );
myStreamWriter.Write( (char) 30 );

// Update the 'StreamWriter'.
myStreamWriter.Flush();

// Close the 'StreamWriter' and FileStream.
myStreamWriter.Close();
myFileStream.Close();

Austin Ehlers

<snip>

For straight byte reading/writing, use the FileStream class. Ex:

FileStream fs = new FileStream(@"C:\Text3.txt", FileMode.Create,
FileAccess.Write);

fs.WriteByte(144);
fs.WriteByte(147);
fs.WriteByte(2);
fs.WriteByte(0);
fs.WriteByte(20);
fs.WriteByte(1);
fs.WriteByte(29);
fs.WriteByte(30);
fs.Close();
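
An equivalent sketch, assuming a framework version that provides File.WriteAllBytes and File.ReadAllBytes (.NET 2.0 or later): put the values in a byte array, write them in one call, and read the file back to confirm that exactly eight bytes are on disk. The class name and the temp-file path are illustrative, not part of the original answer.

```csharp
using System;
using System.IO;

public static class RawBytesDemo
{
    // The eight control codes from the original question, in order.
    public static readonly byte[] ControlCodes = { 144, 147, 2, 0, 20, 1, 29, 30 };

    // Writes the bytes unchanged (no encoding is involved) and reads them back.
    public static byte[] WriteAndReadBack(string path)
    {
        File.WriteAllBytes(path, ControlCodes);
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "Text3.txt");
        byte[] onDisk = WriteAndReadBack(path);
        Console.WriteLine(BitConverter.ToString(onDisk)); // 90-93-02-00-14-01-1D-1E
    }
}
```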
 
Dan V.

Thank you, it worked beautifully.

How would I write a string using WriteByte?
Like:

fs.WriteByte(0);
// I want to add a control char before and after this string, but this
// does not work; I am not sure how to use all the parameters.
fs.Write(@"c:\TextToAdd.txt");
fs.WriteByte(29);
 
Jon Skeet [C# MVP]

Rather than using WriteByte, you'd probably just use Write that takes a
byte array, after converting the string to a byte array using the
appropriate Encoding.

See http://www.pobox.com/~skeet/csharp/unicode.html for more info about
encodings.
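
A minimal sketch of that approach, using illustrative names and assuming UTF-8 is the encoding the legacy program expects for the text (the thread does not say): the string is converted with Encoding.GetBytes and written with Write(byte[], int, int), with a WriteByte call on either side.

```csharp
using System;
using System.IO;
using System.Text;

public static class FramedStringDemo
{
    // Writes: prefix byte, the encoded text, suffix byte.
    public static void WriteFramed(Stream stream, byte prefix, string text,
                                   byte suffix, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        stream.WriteByte(prefix);
        // Use bytes.Length, not text.Length: the two differ for non-ASCII text.
        stream.Write(bytes, 0, bytes.Length);
        stream.WriteByte(suffix);
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "Framed.txt");
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            // NUL before the string and GS after it, as in the question.
            WriteFramed(fs, 0, @"c:\TextToAdd.txt", 29, Encoding.UTF8);
        }
        Console.WriteLine("Bytes written: {0}", new FileInfo(path).Length);
    }
}
```

Taking a Stream rather than a FileStream is a design choice for the sketch; it lets the same helper be exercised against a MemoryStream.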
 
Jon Skeet [C# MVP]

Dan V. said:
Thanks again, I read your article and this one:
http://www.joelonsoftware.com/articles/Unicode.html

Very helpful.

Goodo.

Dan V. said:
This code was what I needed.
s = (@"c:\TextToAdd.txt");

fs.Write(System.Text.Encoding.UTF8.GetBytes(s), 0, s.Length);

That's not quite the code you need. You'll end up writing s.Length
bytes, even though GetBytes may well have returned more bytes than
that.

Use

byte[] bytes = Encoding.UTF8.GetBytes(s);
fs.Write(bytes, 0, bytes.Length);

Can't say I like that page much, given the code it espouses. There's no
need to create a new instance of UnicodeEncoding etc each time. Why
bother with a whole extra method in the first place when you can just
do:

string x = Encoding.Unicode.GetString(bytes);

rather than

string x = FromUnicodeByteArray(bytes);
 
Dan V.

You must be thinking - "did he really understand those articles..."
Is this the point then?

string.Length may not equal bytes.Length because:
1) some characters may not equal exactly one byte (even though in my case
they did)
2) some characters (if I used non-'English' ones) may be more than one
byte - I recall that UTF8, ANSI and ASCII use only one byte for most
'English' characters and 2 or more bytes for the rest...


Jon Skeet [C# MVP]

Dan V. said:
You must be thinking - "did he really understand those articles..."

Only a *little* bit :)

Dan V. said:
Is this the point then?

string.Length may not equal bytes.length because:
1) some characters may not equal exactly one byte (even though in my case
it did)

Indeed. It will depend on the encoding and the characters being
encoded. For instance, using Encoding.Unicode you will always get twice
as many bytes as characters. Using Encoding.ASCII you'll always get
exactly the same number of bytes as characters (but you can only
properly encode characters 0-127). Using Encoding.UTF8 you'll get a
variable number depending on the character - ASCII values still end up
as one byte, but the number of bytes grows depending on the characters.
In fact, UTF-8 has a further wrinkle: a surrogate pair should be encoded
as a single 4-byte UTF-8 sequence, rather than as two separate 3-byte
sequences. Nasty stuff!

Dan V. said:
2) some characters (if I used non 'English' ones) may be more than one
byte - I am recalling that in UTF8, ANSI and ASCII, they use only one byte
for most 'English' characters and 2 or more bytes for the rest...

"ANSI" isn't a single character set - but UTF8 and ASCII are certainly
one byte per character for all ASCII characters.
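
The byte counts discussed above can be checked with Encoding.GetByteCount. A small sketch follows; the sample strings are arbitrary, and the last one is a character outside the Basic Multilingual Plane, which a .NET string stores as a surrogate pair (two chars).

```csharp
using System;
using System.Text;

public static class ByteCountDemo
{
    // Prints the char count of s and its encoded size in UTF-16 and UTF-8.
    public static void Report(string label, string s)
    {
        Console.WriteLine("{0}: chars={1}, UTF-16={2}, UTF-8={3}",
            label, s.Length,
            Encoding.Unicode.GetByteCount(s),
            Encoding.UTF8.GetByteCount(s));
    }

    public static void Main()
    {
        Report("ASCII only", "abc");                 // chars=3, UTF-16=6, UTF-8=3
        Report("U+00E9", "\u00E9");                  // chars=1, UTF-16=2, UTF-8=2
        Report("U+20AC", "\u20AC");                  // chars=1, UTF-16=2, UTF-8=3
        Report("U+1D11E (pair)", "\uD834\uDD1E");    // chars=2, UTF-16=4, UTF-8=4
    }
}
```

Only the first line has a UTF-8 byte count equal to the string length, which is why passing s.Length as the count to Write is unsafe in general.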
 
Dan V.

Now I am trying to get rid of the '@' symbol from a string that I used
previously.
I don't want to write that '@' symbol to the file, and WriteByte seems to try
to do this.

s = @"c:\text1.txt";

bytes = Encoding.UTF8.GetBytes(s);

fs.Write(bytes, 0, bytes.Length);
 
Dan V.

Never mind, the problem was elsewhere; WriteByte can handle it.

Dan V.

Sorry this was my problem. It works as advertised.
