How do I create a one line text file with control codes? e.g.: 144 = 0x90 and 147 = 0x93?

Dan V. · May 12, 2004

How do I create a one line text file with these control codes? e.g.: 144 =
0x90 and 147 = 0x93?

I am trying to create a one line text file with these characters all on one
row with no spaces.

1. 144 = 0x90
2. 147 = 0x93
3. STX = (^B = 2 = 0x2)
4. NUL = (^@ = 0 = 0x0)
5. DC4 = (^T = 20 = 0x14)
6. SOH = (^A = 1 = 0x1)
7. GS = (^] = 29 = 0x1d)
8. RS = (^^ = 30 = 0x1e)

I tried StreamWriter, BinaryWriter... but it can't seem to handle the 147 =
0x93 without putting a strange 'A' character.
This is for use with a legacy program and that 'A' character can not be in
the text file.

I am using TextPad 4 to view the control characters.
The code below is close, but for 144 and 147 the strange 'A' character
exists prior to the Control code.

// BUG?: Do not specify encoding 'System.Text.Encoding.UTF8' else 147 will
// not work (even though documentation says UTF8 is default).
StreamWriter sw = new StreamWriter(@"C:\Text2.txt", false);

sw.WriteLine("Encoding: {0}", sw.Encoding.ToString());

sw.Write( (char) 144 );
sw.Write( (char) 147 );
sw.Write( (char) 2 );
sw.Write( (char) 0 );
sw.Write( (char) 20 );
sw.Write( (char) 1 );
sw.Write( (char) 29 );
sw.Write( (char) 30 );

sw.Close();

Niki Estner · May 12, 2004

Just curious: if you don't want codepage-translation, why do you use a
TextWriter class?
I think you should use a FileStream.

Niki

Dan V. · May 12, 2004

What is codepage-translation?

Tried FileStream in this way with similar results (I must be missing
something - though I am still using StreamWriter...)

// Create a 'FileStream' object.
FileStream myFileStream = new FileStream(@"C:\Text3.txt", FileMode.Create,
    FileAccess.Write);

// Create a 'StreamWriter' to write the data into the file.
StreamWriter myStreamWriter = new StreamWriter(myFileStream,
    System.Text.Encoding.UTF8);

myStreamWriter.WriteLine("Encoding: {0}",
    myStreamWriter.Encoding.ToString());

myStreamWriter.Write( (char) 144 );
myStreamWriter.Write( (char) 147 );
myStreamWriter.Write( (char) 2 );
myStreamWriter.Write( (char) 0 );
myStreamWriter.Write( (char) 20 );
myStreamWriter.Write( (char) 1 );
myStreamWriter.Write( (char) 29 );
myStreamWriter.Write( (char) 30 );

// Update the 'StreamWriter'.
myStreamWriter.Flush();

// Close the 'StreamWriter' and FileStream.
myStreamWriter.Close();
myFileStream.Close();
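
On the "codepage-translation" question: a StreamWriter turns each char into bytes through an Encoding, so the bytes on disk need not equal the char values. The sketch below is only an illustration (the EncodingDemo class and HexBytes helper are made-up names, not from the thread) of what UTF-8 does to (char) 147. The extra 0xC2 byte is presumably the strange 'A' reported above, since an editor displaying the file as Windows-1252 would likely show it as 'Â'.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: hex dump of the bytes an encoding produces for one char.
    public static string HexBytes(Encoding encoding, char c)
    {
        return BitConverter.ToString(encoding.GetBytes(new[] { c }));
    }

    static void Main()
    {
        char c = (char)147; // U+0093, outside the ASCII range

        // UTF-8 needs two bytes for any char in the range U+0080..U+07FF.
        Console.WriteLine(HexBytes(Encoding.UTF8, c)); // C2-93

        // ISO-8859-1 maps U+0000..U+00FF straight to single bytes.
        Console.WriteLine(HexBytes(Encoding.GetEncoding("ISO-8859-1"), c)); // 93
    }
}
```

So a text writer can only produce the single byte 0x93 if its encoding happens to map the char that way; writing raw bytes avoids the question entirely.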

Austin Ehlers · May 13, 2004

<snip>

For straight byte reading/writing, use the FileStream class. Ex:

FileStream fs = new FileStream(@"C:\Text3.txt", FileMode.Create,
FileAccess.Write);

fs.WriteByte(144);
fs.WriteByte(147);
fs.WriteByte(2);
fs.WriteByte(0);
fs.WriteByte(20);
fs.WriteByte(1);
fs.WriteByte(29);
fs.WriteByte(30);
fs.Close();
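
A variation on the same idea, as a rough sketch rather than a replacement for the code above: the eight codes are kept in a byte array, written with one Write call inside a using block (so the stream is closed even if an exception is thrown), and read back to check that nothing was translated. RawBytesDemo, WriteRaw and the temp-folder path are assumptions made for the example; the thread itself writes to C:\Text3.txt.

```csharp
using System;
using System.IO;

static class RawBytesDemo
{
    // The eight codes from the original question, in order.
    public static readonly byte[] ControlCodes = { 144, 147, 2, 0, 20, 1, 29, 30 };

    // Writes the bytes unchanged; no text encoding is involved.
    public static void WriteRaw(string path, byte[] data)
    {
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            fs.Write(data, 0, data.Length);
        }
    }

    static void Main()
    {
        // Hypothetical location chosen so the sketch runs anywhere.
        string path = Path.Combine(Path.GetTempPath(), "Text3.txt");
        WriteRaw(path, ControlCodes);

        // Read the file back and dump it as hex.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path)));
        // 90-93-02-00-14-01-1D-1E
    }
}
```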

Dan V. · May 13, 2004

Thank you it worked beautifully.

How would I write a string using WriteByte?
Like:

fs.WriteByte(0);
fs.Write(@"c:\TextToAdd.txt"); // want to add a control char before and
// after this string, but this does not work, not sure how to use all
// parameters

fs.WriteByte(29);

Jon Skeet [C# MVP] · May 13, 2004

Rather than using WriteByte, you'd probably just use Write that takes a
byte array, after converting the string to a byte array using the
appropriate Encoding.

See http://www.pobox.com/~skeet/csharp/unicode.html for more info about
encodings.

Dan V. · May 13, 2004

Thanks again, I read your article and this one:
http://www.joelonsoftware.com/articles/Unicode.html

Very helpful.

This code was what I needed.
s = (@"c:\TextToAdd.txt");

fs.Write(System.Text.Encoding.UTF8.GetBytes(s), 0, s.Length);

This page got me on the write track also:

http://www.ondotnet.com/pub/a/dotnet/excerpt/csharpckbk_chap01/

Jon Skeet [C# MVP] · May 13, 2004

Dan V. said:
Thanks again, I read your article and this one:
http://www.joelonsoftware.com/articles/Unicode.html

Very helpful.

Goodo.

This code was what I needed.
s = (@"c:\TextToAdd.txt");

fs.Write(System.Text.Encoding.UTF8.GetBytes(s), 0, s.Length);

That's not quite the code you need. You'll end up writing s.Length
bytes, even though GetBytes may well have returned more bytes than
that.

Use

byte[] bytes = Encoding.UTF8.GetBytes(s);
fs.Write(bytes, 0, bytes.Length);

This page got me on the write track also:

http://www.ondotnet.com/pub/a/dotnet/excerpt/csharpckbk_chap01/

Can't say I like that page much, given the code it espouses. There's no
need to create a new instance of UnicodeEncoding etc each time. Why
bother with a whole extra method in the first place when you can just
do:

string x = Encoding.Unicode.GetString(bytes);

rather than

string x = FromUnicodeByteArray(bytes);
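
To make the s.Length versus bytes.Length point concrete, here is a small sketch of framing a string between two control bytes. It uses a MemoryStream instead of the FileStream from the thread so that it runs without touching the disk. The Frame helper and the "café" sample are invented for the illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class FramedStringDemo
{
    // Returns: one leading control byte, the UTF-8 bytes of text, one trailing control byte.
    public static byte[] Frame(byte before, string text, byte after)
    {
        byte[] body = Encoding.UTF8.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            ms.WriteByte(before);
            ms.Write(body, 0, body.Length); // body.Length, not text.Length
            ms.WriteByte(after);
            return ms.ToArray();
        }
    }

    static void Main()
    {
        string ascii = @"c:\TextToAdd.txt";
        string accented = "caf\u00e9"; // hypothetical non-ASCII input

        // For pure ASCII text the two lengths happen to agree.
        Console.WriteLine(ascii.Length + " chars, " + Encoding.UTF8.GetByteCount(ascii) + " bytes");
        // 16 chars, 16 bytes

        // One accented character is enough to make them differ.
        Console.WriteLine(accented.Length + " chars, " + Encoding.UTF8.GetByteCount(accented) + " bytes");
        // 4 chars, 5 bytes

        Console.WriteLine(BitConverter.ToString(Frame(0, accented, 29)));
        // 00-63-61-66-C3-A9-1D
    }
}
```

Passing text.Length as the count would have dropped the final 0xA9 byte of the accented sample, leaving a truncated UTF-8 sequence in the output.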

Dan V. · May 13, 2004

You must be thinking - "did he really understand those articles..."
Is this the point then?

string.Length may not equal bytes.Length because:
1) some characters may not encode to exactly one byte (even though in my case
they did)
2) some characters (if I used non-'English' ones) may be more than one
byte. As I recall, UTF8, ANSI and ASCII use only one byte for most
'English' characters and 2 or more bytes for the rest...

Jon Skeet [C# MVP] · May 13, 2004

Dan V. said:
You must be thinking - "did he really understand those articles..."

Only a *little* bit

Is this the point then?

string.Length may not equal bytes.Length because:
1) some characters may not encode to exactly one byte (even though in my case
they did)

Indeed. It will depend on the encoding and the characters being
encoded. For instance, using Encoding.Unicode you will always get twice
as many bytes as characters. Using Encoding.ASCII you'll always get
exactly the same number of bytes as characters (but you can only
properly encode characters 0-127). Using Encoding.UTF8 you'll get a
variable number depending on the character - ASCII values still end up
as one byte, but the number of bytes grows depending on the characters.
In fact, UTF-8 is even more complicated, because surrogate pairs should
be encoded into a single 4-byte UTF-8 sequence, rather than two 3-byte
sequences. Nasty stuff!

2) some characters (if I used non-'English' ones) may be more than one
byte. As I recall, UTF8, ANSI and ASCII use only one byte for most
'English' characters and 2 or more bytes for the rest...

"ANSI" isn't a single character set - but UTF8 and ASCII are certainly
one byte per character for all ASCII characters.
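
The byte counts described above can be checked with a short sketch. ByteCountDemo, Counts and the sample strings are made up for the illustration. The last sample is a character outside the Basic Multilingual Plane, so it occupies two chars (a surrogate pair) in a .NET string.

```csharp
using System;
using System.Text;

static class ByteCountDemo
{
    // Returns { chars, UTF-16 bytes, UTF-8 bytes } for a string.
    public static int[] Counts(string s)
    {
        return new[]
        {
            s.Length,
            Encoding.Unicode.GetByteCount(s),
            Encoding.UTF8.GetByteCount(s)
        };
    }

    static void Main()
    {
        // ASCII text, an accented letter, the euro sign, and a surrogate pair (U+1D11E).
        string[] samples = { "abc", "caf\u00e9", "\u20ac", "\U0001D11E" };
        foreach (string s in samples)
        {
            int[] c = Counts(s);
            Console.WriteLine("chars={0} utf16={1} utf8={2}", c[0], c[1], c[2]);
        }
        // chars=3 utf16=6 utf8=3
        // chars=4 utf16=8 utf8=5
        // chars=1 utf16=2 utf8=3
        // chars=2 utf16=4 utf8=4

        // Encoding.ASCII cannot represent the accented letter and substitutes '?'.
        Console.WriteLine(Encoding.ASCII.GetString(Encoding.ASCII.GetBytes("caf\u00e9")));
        // caf?
    }
}
```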

Dan V. · May 13, 2004

Now I am trying to get rid of that '@' symbol from a string, which I used
previously.
I don't want to write that '@' symbol to the file and WriteByte seems to try
to do this.

s = @"c:\text1.txt";

bytes = Encoding.UTF8.GetBytes(s);

fs.Write(bytes, 0, bytes.Length);

Dan V. · May 13, 2004

Never mind,
the problem was elsewhere, WriteByte can handle it.

Jon Skeet [C# MVP] · May 14, 2004

Dan V. said:
Never mind,
the problem was elsewhere, WriteByte can handle it.

That's because the @ isn't in the string at all - it just means it's a
verbatim string literal.

See http://www.pobox.com/~skeet/csharp/faq/#verbatim.literals
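
A quick way to convince yourself of this is the sketch below. The VerbatimDemo class and its two constants are invented names for the example.

```csharp
using System;

static class VerbatimDemo
{
    // The same path written as a verbatim literal and as a regular literal.
    public const string Verbatim = @"c:\text1.txt";
    public const string Regular = "c:\\text1.txt";

    static void Main()
    {
        // Both literals produce the identical string.
        Console.WriteLine(Verbatim == Regular);   // True

        // The '@' is part of the source syntax only, never of the string.
        Console.WriteLine(Verbatim.IndexOf('@')); // -1
        Console.WriteLine(Verbatim.Length);       // 12
    }
}
```

Without the '@' (and without doubling the backslash), the \t in the path would be read as a tab character, which is why verbatim literals are handy for Windows paths.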

Dan V. · Jun 10, 2004

Sorry this was my problem. It works as advertised.
