Problem Using TextReader and BinaryReader on the Same File


Guest

I am rewriting a C++ application in C#. The file it reads contains a
combination of text and binary data.

I used CFile before to read the text. If I hit a certain string that
denotes that the following data is binary, I used the current position in
the file and another stream to read the binary data.

All text data ends with a carriage return/line feed, while the binary
data is actually an image file stored byte by byte. Preceding the binary
data is a text value, ended by a CR/LF, giving the actual number of bytes
of binary data.

The problem is that the position given by TextReader.BaseStream seems to
be twice the actual position in bytes. Therefore positioning a
BinaryReader based on the position in a TextReader is incorrect.

This worked with CFile.

Any ideas?

Thank You,
Jeff
 

Joshua Flanagan

It sounds like you are using a 16-bit encoding with the TextReader. I
wonder if specifying an 8-bit encoding (System.Text.Encoding.ASCII)
would solve your problem. It's worth a try.
Otherwise, if you can consistently see that the TextReader position is
always twice the binary position... well, divide by 2 (right shift 1)!
 

Guest

Joshua,

I am considering that, except I did not want to do anything that might
fail when converting to a 64-bit OS.

I did try the ASCII encoding, etc., and the same problem occurred. I think
I am just going to use the BinaryReader and create my own ReadLine
functionality.

I also just read the entire file into memory, and it did not bog down the
machine too much. I am not sure about the deployment machine yet. The
FileStream/BinaryReader classes may provide some behind-the-scenes
buffering to help with this.

This does give me the capability to scan the contents quickly and create
some file-offset arrays for quickly locating the data I need.
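For what it's worth, that offset-array approach could be sketched roughly as follows. This is a minimal sketch assuming single-byte ASCII text; the names `LineScanner`, `FindLineOffsets`, and `ReadLineAt` are hypothetical, and the buffer would come from something like `File.ReadAllBytes`:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

class LineScanner
{
    // Scan an in-memory buffer for CR/LF-terminated lines, recording the
    // byte offset where each line starts. Assumes single-byte (ASCII) text,
    // so byte offsets and character offsets agree.
    public static List<int> FindLineOffsets(byte[] data)
    {
        var offsets = new List<int>();
        int start = 0;
        for (int i = 0; i + 1 < data.Length; i++)
        {
            if (data[i] == 0x0D && data[i + 1] == 0x0A) // CR LF
            {
                offsets.Add(start);
                start = i + 2;
            }
        }
        if (start < data.Length)
            offsets.Add(start); // trailing data with no terminator
        return offsets;
    }

    // Decode one line starting at a recorded offset, stopping at CR/LF
    // (or at end of buffer if no terminator is found).
    public static string ReadLineAt(byte[] data, int offset)
    {
        int end = offset;
        while (end + 1 < data.Length && !(data[end] == 0x0D && data[end + 1] == 0x0A))
            end++;
        if (end + 1 >= data.Length)
            end = data.Length; // no CR/LF: take everything to the end
        return Encoding.ASCII.GetString(data, offset, end - offset);
    }
}
```

Because the offsets are plain byte positions into the same buffer, they can be handed straight to whatever code parses the binary image sections, with no reader-position mismatch.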

Thanks for the input,
Jeff
 

Jon Skeet [C# MVP]

Guest said:
I am rewriting a C++ application in C#. The file it reads contains a
combination of text and binary data.

I used CFile before to read the text. If I hit a certain string that
denotes that the following data is binary, I used the current position in
the file and another stream to read the binary data.

All text data ends with a carriage return/line feed, while the binary
data is actually an image file stored byte by byte. Preceding the binary
data is a text value, ended by a CR/LF, giving the actual number of bytes
of binary data.

The problem is that the position given by TextReader.BaseStream seems to
be twice the actual position in bytes. Therefore positioning a
BinaryReader based on the position in a TextReader is incorrect.

This is an unfortunate problem with TextReader. You *may* find that
creating a StreamReader with a buffer size of 1 sorts the problem, but
it'll be very inefficient.

If your file is in ASCII, you could spot the CRLF while reading it as
binary data, and then convert the binary data to text when you find the
CRLF, all the while knowing where you are.

If you're in control of the file format, however, I'd suggest that you
prefix any text section with the number of bytes in that text section.
You can then read that amount of data and convert it to text
separately, without overrunning.
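A minimal sketch of the first idea, spotting the CR/LF yourself while reading the stream as bytes so the reported position never lies (assuming ASCII text; `MixedReader` and `ReadAsciiLine` are hypothetical names):

```csharp
using System;
using System.IO;
using System.Text;

class MixedReader
{
    // Read one CR/LF-terminated ASCII line directly from the stream.
    // Because nothing is buffered beyond the CR/LF, stream.Position is
    // left exactly at the first byte after the line, i.e. at the start
    // of any binary data that follows.
    public static string ReadAsciiLine(Stream stream)
    {
        var bytes = new MemoryStream();
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == 0x0D)
            {
                int next = stream.ReadByte(); // consume the expected LF
                if (next != 0x0A && next != -1)
                    stream.Position--; // lone CR: push the byte back
                break;
            }
            bytes.WriteByte((byte)b);
        }
        return Encoding.ASCII.GetString(bytes.ToArray());
    }
}
```

With something like this, the byte-count line can be read, parsed with `int.Parse`, and the image then pulled with `new BinaryReader(stream).ReadBytes(count)` from a position that is actually correct.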
 

Jon Skeet [C# MVP]

Joshua Flanagan said:
It sounds like you are using a 16-bit encoding with the TextReader. I
wonder if specifying an 8-bit encoding (System.Text.Encoding.ASCII)
would solve your problem. It's worth a try.
Otherwise, if you can consistently see that the TextReader position is
always twice the binary position... well, divide by 2 (right shift 1)!

No, the problem is that StreamReader reads more than it has returned so
far into an internal buffer. The stream's position is then "inaccurate"
in terms of how much appears to have been read.
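That buffering effect can be seen with a small test along these lines (a hypothetical demonstration; the exact position reported depends on the framework's internal buffer size):

```csharp
using System;
using System.IO;
using System.Text;

class BufferDemo
{
    public static void Main()
    {
        // 100 short ASCII lines in an in-memory stream.
        var sb = new StringBuilder();
        for (int i = 0; i < 100; i++)
            sb.Append("line ").Append(i).Append("\r\n");
        var stream = new MemoryStream(Encoding.ASCII.GetBytes(sb.ToString()));

        var reader = new StreamReader(stream, Encoding.ASCII);
        string first = reader.ReadLine(); // "line 0": 8 bytes including CR/LF

        // The reader filled its internal buffer on the first read, so the
        // underlying stream has advanced far past the 8 bytes consumed.
        Console.WriteLine("Bytes in first line: {0}", first.Length + 2);
        Console.WriteLine("BaseStream.Position: {0}", reader.BaseStream.Position);
    }
}
```

So BaseStream.Position tells you where the buffer-fill got to, not where the reader's logical position is, which is why seeking a BinaryReader from it goes wrong.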
 

Guest

Jon,

I appreciate your response, and it makes sense now, but it does not help
much in migrating the code.

However, is there any reason I cannot just access the file via
BinaryReader, load the entire file into a byte[] buffer, and parse it
instead? It seems to load the large file quickly and efficiently. The
files are 35 to 60 megabytes. It is so quick that it seems it is buffered
or cached behind the scenes anyway.

I am doing it this way now, so I just have to deal with parsing the byte
buffer rather than potentially dealing with network/buffering issues when
hitting the stream directly.

Thank You,
Jeff
 

Jon Skeet [C# MVP]

Guest said:
I appreciate your response, and it makes sense now, but it does not help
much in migrating the code.

However, is there any reason I cannot just access the file via
BinaryReader, load the entire file into a byte[] buffer, and parse it
instead? It seems to load the large file quickly and efficiently. The
files are 35 to 60 megabytes. It is so quick that it seems it is buffered
or cached behind the scenes anyway.

If you've got enough memory, and if the text is in a simple encoding
(such as ASCII) where you don't need to worry about detecting multi-byte
characters, that may well be the easiest way of doing things.

Guest said:
I am doing it this way now, so I just have to deal with parsing the byte
buffer rather than potentially dealing with network/buffering issues when
hitting the stream directly.

Right.
 
