A CryptoStream puzzler...

pseudonym

To keep this short, assume the function working_test() works perfectly
(because it does), while failing_test() does not. My GetCodec function
returns a Stream (CryptoStream) object wrapped around the Stream it is passed.

private void working_test()
{
    string f = "c:\\test.dat";
    StreamWriter w;
    StreamReader r;

    w = new StreamWriter(GetCodec(File.Create(f), CryptoStreamMode.Write));
    w.WriteLine("shhh.. it's a secret");
    w.Close();

    r = new StreamReader(GetCodec(File.OpenRead(f), CryptoStreamMode.Read));
    MessageBox.Show(r.ReadLine());
    r.Close();
}

private void failing_test()
{
    string f = "c:\\test.dat";
    Stream s;
    StreamWriter w;
    StreamReader r;

    // write an unencrypted tag line, then the encrypted XML
    s = File.Create(f);
    w = new StreamWriter(s);
    w.WriteLine("MyFileTag");

    w.Flush(); // **** see extra note about this ****

    this.dataset.WriteXml(GetCodec(s, CryptoStreamMode.Write),
        XmlWriteMode.WriteSchema);
    s.Close();

    // read the tag line back, then try to decrypt the rest
    s = File.OpenRead(f);
    r = new StreamReader(s);
    if ("MyFileTag" != r.ReadLine())
    {
        throw new System.IO.IOException("not my file!");
    }
    this.dataset.Clear();
    this.dataset.ReadXml(GetCodec(s, CryptoStreamMode.Read),
        XmlReadMode.Auto);
    s.Close();
}

In the second test, writing the file succeeds. When reading, however, the
call to this.dataset.ReadXml throws the following exception:

System.Security.Cryptography.CryptographicException: Length of the data to decrypt is invalid.
   at System.Security.Cryptography.CryptoAPITransform.TransformFinalBlock(Byte[] inputBuffer, Int32 inputOffset, Int32 inputCount)
   at System.Security.Cryptography.CryptoStream.FlushFinalBlock()
   at System.Security.Cryptography.CryptoStream.Close()
   at failing_test()

Opening c:\test.dat in WordPad shows the file as you'd expect:

MyFileTag
[ a large block of encrypted xml data]

Note: Before I added the explicit call to w.Flush(), the file was written out
backwards like this:

[ a large block of encrypted xml data]
MyFileTag

Odd that. Anyway if anyone can tell me what I need to do to get my xml to
read properly, I'd appreciate it!

Nick
 
Rob Teixeira [MVP]

You need to call FlushFinalBlock.
Block ciphers typically run their algorithms on blocks larger than 1 byte,
so the last block (however big it is) needs special consideration and
possibly padding.
FlushFinalBlock allows the cipher to deal with the specifics of the last
block of encryption.

-Rob Teixeira [MVP]
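
To illustrate the point about the final block, here is a minimal sketch, assuming AES with a randomly generated key and IV and in-memory streams (none of which come from the original post, where GetCodec is not shown). It only demonstrates that FlushFinalBlock writes the padded last block so that the ciphertext can be decrypted again:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class FlushFinalBlockDemo
{
    static void Main()
    {
        // Assumption: AES with a random key/IV, used purely for illustration.
        using (Aes aes = Aes.Create())
        {
            byte[] plain = Encoding.UTF8.GetBytes("shhh.. it's a secret");

            MemoryStream cipher = new MemoryStream();
            CryptoStream enc = new CryptoStream(
                cipher, aes.CreateEncryptor(), CryptoStreamMode.Write);
            enc.Write(plain, 0, plain.Length);

            // Pads and writes the last block. Closing or disposing the
            // CryptoStream would do the same, but would also close 'cipher'.
            enc.FlushFinalBlock();

            byte[] encrypted = cipher.ToArray();
            // The ciphertext is now a whole number of cipher blocks.
            Console.WriteLine(encrypted.Length % (aes.BlockSize / 8)); // 0

            using (CryptoStream dec = new CryptoStream(
                new MemoryStream(encrypted), aes.CreateDecryptor(), CryptoStreamMode.Read))
            using (StreamReader reader = new StreamReader(dec, Encoding.UTF8))
            {
                Console.WriteLine(reader.ReadToEnd()); // shhh.. it's a secret
            }
        }
    }
}
```

Note that in failing_test() the CryptoStream returned by GetCodec is never closed or flushed on the write side, so this is worth checking even though, as the rest of the thread shows, it is not the only problem.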
 
pseudonym

Rob,

I'd thought of that, but it didn't seem to help. It looks like I'm missing
something fundamental when it comes to mixing encrypted data and
non-encrypted data.

I've attached a console application that demonstrates the problem.

When I write just unencrypted it reads in fine (duh)
When I write just encrypted it reads fine
When I write both, the unencrypted comes in ok, but the encrypted comes in
as blank strings.

Whassup with that? Help!

Nick

 
BMermuys

Hi,

[snipped]
MyFileTag
[ a large block of encrypted xml data]

Note: Before I added the explicit call to w.Flush(), the file was written out
backwards like this:

[ a large block of encrypted xml data]
MyFileTag

Odd that. Anyway if anyone can tell me what I need to do to get my xml to
read properly, I'd appreciate it!

Not so odd. You have two objects connected to the FileStream, a
StreamWriter and a CryptoStream. If they both have their own cache and
one flushes before the other (maybe DataSet.WriteXml flushes after writing),
the order can be reversed.
So it would be good practice to flush before switching from encrypted to
unencrypted and vice versa.

====

Now about the reading error. The problem is the StreamReader. It has a
cache too. If you do one ReadLine, it may (and usually will) have read more
than just that line, so the position of the underlying stream is past the
end of the line.
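
The over-read described above can be seen with a small sketch. The file contents are a stand-in (a plain string pretending to be the tag line followed by encrypted data), and a MemoryStream is assumed in place of the file:

```csharp
using System;
using System.IO;
using System.Text;

static class OverReadDemo
{
    static void Main()
    {
        // Stand-in for the file: a tag line followed by other data.
        byte[] data = Encoding.ASCII.GetBytes(
            "MyFileTag\r\n[pretend this is encrypted data]");

        using (MemoryStream s = new MemoryStream(data))
        {
            StreamReader r = new StreamReader(s, Encoding.ASCII);
            string tag = r.ReadLine();

            Console.WriteLine(tag);                      // MyFileTag
            // The tag line itself ends after 9 characters plus "\r\n".
            Console.WriteLine(tag.Length + 2);           // 11
            // The reader filled its buffer, so for input this small the
            // underlying stream is already at the end.
            Console.WriteLine(s.Position == s.Length);   // True
        }
    }
}
```

Anything that reads from the underlying stream after this point, such as a CryptoStream, starts at the wrong offset, which would explain the "Length of the data to decrypt is invalid" exception.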

Solution:
Use a fixed number of bytes to store your unencrypted data.

After flushing the unencrypted data, move to a fixed position; the position
should be greater than the length of the string already written:

if (i == 0 || i == 2)
{
    textWriter = new StreamWriter(fileStream);
    Write(textWriter, "Unencrypted Data");
    textWriter.Flush();
    fileStream.Position = 20;
}

Now after reading (unencrypted) do the same:

if (i == 0 || i == 2)
{
    textReader = new StreamReader(fileStream);
    Read(textReader);
    fileStream.Position = 20;
}

-or-
Create your own StreamReader which does not cache; this will probably be bad
for performance.
Note: StreamReader has a method DiscardBufferedData, which discards the
cached data, but it does not restore the position.

Hope this helps,
Greetings
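
The fixed-size header idea above could be sketched end to end roughly as follows. This is an illustration under assumptions, not the code from the attached console application: the 20-byte header size, the temp file, and AES with a random key are all made up for the example. The header is read as raw bytes rather than through a StreamReader, so nothing over-reads:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class FixedHeaderDemo
{
    const int HeaderSize = 20; // hypothetical; must be larger than the tag

    static void Main()
    {
        string path = Path.GetTempFileName();
        using (Aes aes = Aes.Create()) // assumption: random key/IV for the demo
        {
            // Write: zero-padded plain header, then the encrypted payload.
            using (FileStream fs = File.Create(path))
            {
                byte[] header = new byte[HeaderSize];
                byte[] tag = Encoding.ASCII.GetBytes("MyFileTag");
                Array.Copy(tag, header, tag.Length);
                fs.Write(header, 0, header.Length);

                using (CryptoStream cs = new CryptoStream(
                    fs, aes.CreateEncryptor(), CryptoStreamMode.Write))
                using (StreamWriter w = new StreamWriter(cs))
                {
                    w.WriteLine("shhh.. it's a secret");
                } // disposing flushes the writer and writes the final block
            }

            // Read: exactly HeaderSize raw bytes, then decrypt the rest.
            using (FileStream fs = File.OpenRead(path))
            {
                byte[] header = new byte[HeaderSize];
                int read = 0;
                while (read < HeaderSize)
                {
                    int n = fs.Read(header, read, HeaderSize - read);
                    if (n == 0) throw new EndOfStreamException();
                    read += n;
                }
                string tag = Encoding.ASCII.GetString(header).TrimEnd('\0');
                if (tag != "MyFileTag")
                    throw new IOException("not my file!");
                Console.WriteLine("tag: " + tag); // tag: MyFileTag

                using (CryptoStream cs = new CryptoStream(
                    fs, aes.CreateDecryptor(), CryptoStreamMode.Read))
                using (StreamReader r = new StreamReader(cs))
                {
                    Console.WriteLine(r.ReadLine()); // shhh.. it's a secret
                }
            }
        }
        File.Delete(path);
    }
}
```

Using a StreamReader on top of the CryptoStream is fine here because nothing else needs to read from the file after it.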
 
pseudonym

BMermuys,

Thanks for your explanation... I still find it odd that decorators like
StreamReader do not defer to the BaseStream object for things
like position and buffering. It really limits their flexibility, especially
when there is already a BufferedStream decorator that can
be added when needed.

I guess this is what I was really hoping for:

// warning dream code

// file access - unbuffered
Stream stream = new FileStream("myfile.dat", FileMode.Open);
// add a 1k buffer
BufferedStream buffered = new BufferedStream(stream, 1024);
// unencrypted reader
StreamReader reader1 = new StreamReader(buffered);
// encryption scheme
CryptoStream crypto = new CryptoStream(buffered, ...);
// decrypting reader
StreamReader reader2 = new StreamReader(crypto);

If all of the decorating wrappers were to defer to the base stream (like
they should)

stream.Position == buffered.Position == reader1.Position == crypto.Position
== reader2.Position

Then mixing and matching data would work fine, even if it isn't practical
:)

reader1.ReadLine(); // read unencrypted
reader2.ReadLine(); // read encrypted
reader1.ReadLine(); // read unencrypted
reader2.ReadLine(); // read encrypted
stream.Close();

BMermuys

Hi,
inline

pseudonym said:
BMermuys,

Thanks for your explanation... I still find it odd that decorators like
StreamReader do not defer to the BaseStream object for things
like position and buffering. It really limits their flexibility, especially
when there is already a BufferedStream decorator that can
be added when needed.

AFAIK the decorator pattern is about adding extra functionality to a base
object without inheriting from or changing the base class, extending
functionality in a straight line, as in:
FileStream<-BufferedStream<-CryptoStream<-StreamReader

Usually one kind of data is written to a stream, and if different kinds of
data are written, a multiplexer and a demultiplexer are used.
A multiplexer could be a stream decorator which has methods like
AppendStream(....) or maybe GetNewWrittableStream(), etc. And a
demultiplexer is used to retrieve the different streams.

I don't think .NET has classes for this. If it's really important, you might
consider building it yourself. You can surely find something on the internet
about mux/demux streams to give you some ideas.

pseudonym said:
[snipped]
If all of the decorating wrappers were to defer to the base stream (like
they should)

stream.Position == buffered.Position == reader1.Position == crypto.Position
== reader2.Position

I did not say this wasn't true (except for the reader, which doesn't have a
position); what I said was that ReadLine() reads more than the line it
returns (because of its internal buffer) and therefore the position of the
underlying stream is advanced too far.

greetings
 
pseudonym

BMermuys said:
Hi,
inline


I did not say this wasn't true (except for the reader, which doesn't have a
position); what I said was that ReadLine() reads more than the line it
returns (because of its internal buffer) and therefore the position of the
underlying stream is advanced too far.

Except that I meant to say reader1.BaseStream.Position ;-) etc.

Here's a quick little console app that illustrates my real problem with
this kind of file handling:

[STAThread]
static void Main(string[] args)
{
    FileStream fs = File.OpenRead("c:\\test.txt");
    StreamReader r1 = new StreamReader(fs);
    StreamReader r2 = new StreamReader(fs);

    Console.WriteLine("r1: " + r1.ReadLine());
    Console.WriteLine("fs: " + fs.Position);
    Console.WriteLine("r1: " + r1.BaseStream.Position);
    Console.WriteLine("r2: " + r2.BaseStream.Position);
    Console.WriteLine("r2: " + r2.ReadLine());
    Console.WriteLine("fs: " + fs.Position);
    Console.WriteLine("r1: " + r1.BaseStream.Position);
    Console.WriteLine("r2: " + r2.BaseStream.Position);

    Console.WriteLine("r1: " + r1.ReadLine());
    Console.WriteLine("fs: " + fs.Position);
    Console.WriteLine("r1: " + r1.BaseStream.Position);
    Console.WriteLine("r2: " + r2.BaseStream.Position);
    Console.WriteLine("r2: " + r2.ReadLine());
    Console.WriteLine("fs: " + fs.Position);
    Console.WriteLine("r1: " + r1.BaseStream.Position);
    Console.WriteLine("r2: " + r2.BaseStream.Position);
    r1.Close();
    r2.Close();
    fs.Close();
}

Assume that c:\test.txt is a text file with the numbers 1 - 100 spelled
out, one per line...
INPUT FILE (abridged)

one
two
three
....
NINETY
NINETY-ONE
NINETY-TWO
....

Console OUTPUT:

r1: one
fs: 1024
r1: 1024
r2: 1024
r2: ONE
fs: 1153
r1: 1153
r2: 1153
r1: two
fs: 1153
r1: 1153
r2: 1153
r2: NINETY-TWO
fs: 1153
r1: 1153
r2: 1153
Press any key to continue

Ok, from the output it's clear that when there isn't any data in the buffer,
or when the buffer doesn't contain enough data, StreamReader.ReadLine reads
starting from the current BaseStream.Position value. For the first
r1.ReadLine this is the first byte of the file, and the line "one" is read.

Now when r2.ReadLine is called, the position value is already in the middle
of the line that reads "NINETY-ONE", so the first line of text read by that
object is "ONE".

Later, when r1.ReadLine is called again, the line is pulled from the internal
buffer and "two" is returned. This is inconsistent behavior. Sometimes it's
using an internal offset, and sometimes it's using the BaseStream.Position
value. If the program continues to call r1.ReadLine, the last line read by
r1 will be "NINETY-" because the call to r2.ReadLine reached the end of
the file, but the last line read by r2 will be "one hundred", which is the
last line of my test file.

IMHO the behavior should be one of the following:

1. Each reader reads sequentially: "r1: one", "r2: two", "r1: three",
"r2: four". This makes the most sense, since both readers in my sample
are hooked up to the same underlying FileStream instance.

2. Each reader reads independently: "r1: one", "r2: one", "r1: two", "r2:
two". This makes less sense, and in any case can be easily implemented by
using two different instances of FileStream.
 
BMermuys

Hi,
[inline]

I agree with almost everything, but I do think that the behaviour is
consistent.

You have to see it from the point of view of performance, at least I think
so. The StreamReader needs to return the data line by line. But how does it
know where a line ends? By scanning each byte to see if it's a \n. However,
reading byte by byte from disk to check for a \n is probably slower than
reading 1024 bytes at once into memory and then scanning the memory for \n
each time a line is requested.
When it reaches the last line in memory it will try to read another 1024
bytes, and so on...

You can read/write strings with BinaryWriter and BinaryReader too. They
don't "overread", because each string is prefixed with its length, so the
reader doesn't need to scan for a line ending.
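
As a rough illustration of the length-prefixed approach, the sketch below writes a tag with BinaryWriter followed by a few stand-in payload bytes, then checks where the stream is positioned after BinaryReader.ReadString. The in-memory stream and the payload bytes are assumptions made for the example:

```csharp
using System;
using System.IO;

static class LengthPrefixDemo
{
    static void Main()
    {
        MemoryStream s = new MemoryStream();

        BinaryWriter w = new BinaryWriter(s);
        w.Write("MyFileTag");             // 1 length byte + 9 UTF-8 bytes
        w.Write(new byte[] { 1, 2, 3 });  // stand-in for the encrypted payload
        w.Flush();

        s.Position = 0;
        BinaryReader r = new BinaryReader(s);
        Console.WriteLine(r.ReadString()); // MyFileTag
        // The reader consumed only the prefixed string, nothing more.
        Console.WriteLine(s.Position);     // 10
        Console.WriteLine(r.ReadByte());   // 1
    }
}
```

Because the stream is left exactly at the first payload byte, a CryptoStream in read mode could be wrapped around it at this point without any repositioning.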

IMHO the behavior should be either.

Wouldn't it be better if all three situations were possible?

HTH
greetz
 
