StreamReader

Bossie · Sep 16, 2008

Hi All,

I am having a little trouble with a StreamReader. I am currently reading a
pipe-delimited file with around 1.8 million records (total size 150MB) and,
based on a flag, inserting, updating or deleting a record in a SQL Server DB
using System.Data.SqlClient.

So I do a ReadLine() on the StreamReader, do my action and then step to the
next line. (Note I am using SqlBulkCopy to do the DB side, but I don't think
that really affects my problem).

Every once in a while (say every second or third complete run), the
ReadLine() pointer gets confused and jumps back somewhere between 30 and 60
lines in memory and reprocesses the records it has jumped back over. It does
this until it gets to where it was before the backwards jump and then jumps
forwards to where it was meant to be originally. So for a simple example,
let's say I have records as follows:

1|Jack
2|John
3|Steve
4|Mary
5|Sarah
6|Michelle
7|Brian
8|Sally
9|David

The process starts and processes 1, then 2, then 3, then 4, then 5, then
jumps back (2 places, for simplicity) to 3 again (instead of 6) and
reprocesses it, then it advances to 4 and reprocesses it (instead of 7). When
it gets back to 5 it realizes that it should have been at 8 by now, so it
jumps to 8 and continues merrily.

So essentially, 3 and 4 were processed twice and 6 and 7 were skipped. The
problem is that this is intermittent: it doesn't happen all the time and it
doesn't happen at the same place.

A little additional information that may be relevant:
- I'm running the process from within a thread
- The process is running on a VM with 2GB of memory
- Error happens both from running it as debug inside VS2008 and as a
standalone exe in release build

I can only assume it is either a memory leak or an issue with threading, but
any advice will help.

Bossie · Sep 16, 2008

Hi Peter,

Thanks for the quick reply. Below is an extract of the code. The only time I
touch the StreamReader is in the while statement. I am "fairly" sure the code
is fine because it is not doing anything too complex. It's the inconsistency
that is getting me.

string line = "";
TextReader tr = new StreamReader(sFilename);
while ((line = tr.ReadLine()) != null)
{
    val++;
    if (linenum != realLineCount) //skip the last line
    {
        if (line.Split("|".ToCharArray())[0] == "D")
        {
            deleted++;
        }
        else
        {
            inserted++;
            record = new DeliveryAddressLine(line.Split('|'), Pivotal_User_Id);

            new_id_value = Common.IntegerToPivotalByteArray(last_id_value, val);

            ImportTable.Rows.Add(((PivotalRecord)record).GetDataArray((byte[])new_id_value.Clone()));

            // Flush the buffered rows to the server every 1000 inserts
            if (inserted != 0 && inserted % 1000 == 0)
            {
                using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn))
                {
                    bulkCopy.DestinationTableName = ((PivotalRecord)record).Pivotal_Table_Name;
                    List<Pair> ColumnMappings = ((PivotalRecord)record).ColumnMappings;
                    foreach (Pair c in ColumnMappings)
                        bulkCopy.ColumnMappings.Add(new SqlBulkCopyColumnMapping(c.index, c.value.ToString()));
                    try
                    {
                        bulkCopy.WriteToServer(ImportTable);
                    }
                    catch (Exception ex)
                    {
                        MessageBox.Show(ex.Message);
                    }
                    ImportTable.Rows.Clear();
                }
            }
        }
        linenum++;
    }
}
--
Cheers

Paul Reyneke
CRM Blog - http://paulreyneke.blogspot.com/

Peter Duniho said:
[...]
Every once in a while (say every second or third complete run), the
ReadLine() pointer gets confused and jumps back somewhere between 30 and 60
lines in memory and reprocesses the records it has jumped back over. It does
this until it gets to where it was before the backwards jump and then jumps
forwards to where it was meant to be originally.
[...]
I can only assume it is either a memory leak or an issue with threading, but
any advice will help.


I think it is practically certain you have a bug somewhere in your code.
Probably in the form of some code meddling with the underlying stream
while the StreamReader is using it. Alternatively, somewhere in the data
flow after the StreamReader returns a specific line you might have a
problem. A "memory leak" or "an issue with threading" seem highly
unlikely to be a primary cause.

But, without a concise-but-complete code sample that reliably reproduces
the problem, it's not possible to find and describe the actual error in
your code. If you actually want an answer, you need to post code.

Pete

KH · Sep 16, 2008

How are you determining that the StreamReader is jumping around? I mean, can
you actually see it while debugging or are you seeing it in the results in
the database? If you're looking at the db there could be a problem in your
conditionals deciding what data to copy and when.

I would suggest removing the database stuff and just calling ReadLine()
through the file, using a StreamWriter to write the exact text to another
file, and then comparing the two. If they're the same size then it's probably
not StreamReader's fault.
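
A minimal sketch of that check, with all the database code taken out. The class name, method name and paths are placeholders and not part of the original program; the only thing it exercises is ReadLine() followed by WriteLine().

```csharp
using System;
using System.IO;

static class RoundTripCheck
{
    // Copies every line from sourcePath to copyPath using only
    // ReadLine() and WriteLine(). Returns the number of lines copied.
    public static int CopyLines(string sourcePath, string copyPath)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(sourcePath))
        using (StreamWriter writer = new StreamWriter(copyPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                count++;
            }
        }
        return count;
    }
}
```

Note that ReadLine() strips the line terminator and WriteLine() writes Environment.NewLine, so the byte sizes can differ harmlessly if the source file uses different line endings or has no final newline. The line count and the line contents are the safer things to compare.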

As an aside you don't need to create a new SqlBulkCopy every time or buffer
it yourself -- you can create one outside the loop, set how many rows you
want it to copy at a time, and it will take care of all that for you.

HTH

Anthony Jones · Sep 16, 2008

Peter Duniho said:
Please see http://www.yoda.arachsys.com/csharp/complete.html

The code you posted was neither concise nor complete, both being
requirements for effective posting of programming-related questions.

I don't see anything obvious in the code you posted that would explain the
issue. But all that means is that whatever the problem, it's somewhere
other than in the code you posted.

Obviously "the code" is not fine. If it were, you wouldn't have the
problem. While there's a very tiny possibility that the part of "the
code" that's wrong is in .NET itself, the most likely explanation is that
there's something wrong in your actual program.

But either way, there's no way to identify the problem without a
concise-but-complete code sample that reliably reproduces the problem.
The code sample should require no effort other than to be compiled and
executed in order to reproduce the problem, and it should include no code
that isn't directly related to and needed in order to reproduce the
problem.

IOW, take out all the DB malarkey and instead write the read content to a
StreamWriter. Put the pipes back into the output file such that the
generated file ought to be identical to the original. Then use something
like WinMerge to compare the files.

If you find the files are different, post your test code along with a
description of how they are different.
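
If a diff tool is not to hand, a rough sketch like the following can report where two files first diverge. The names LineDiff, FirstDifference and SplitAndRejoin are made up for illustration; SplitAndRejoin only mimics the "split on pipes, then put the pipes back" step described above.

```csharp
using System;
using System.IO;

static class LineDiff
{
    // Splits a record on '|' and joins it again, so the output line
    // ought to be identical to the input line.
    public static string SplitAndRejoin(string line)
    {
        return string.Join("|", line.Split('|'));
    }

    // Returns the 1-based number of the first line that differs between
    // the two files, or 0 if they match line for line. A file that ends
    // early counts as a difference at the first missing line.
    public static int FirstDifference(string pathA, string pathB)
    {
        using (StreamReader a = new StreamReader(pathA))
        using (StreamReader b = new StreamReader(pathB))
        {
            int lineNumber = 0;
            while (true)
            {
                string lineA = a.ReadLine();
                string lineB = b.ReadLine();
                lineNumber++;
                if (lineA == null && lineB == null) return 0;
                if (lineA != lineB) return lineNumber;
            }
        }
    }
}
```

With the sample records from the first post, a copy in which record 3 is written again where 6 should be would give a first difference at line 6.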

BTW, what is it that invokes this code? I can't find any documentation on
the lock that StreamReader puts on the internal stream it creates when
passed a filename in its constructor. Could it be something is still
writing to the file?
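
One way to take the guesswork out of the sharing question is to open the FileStream yourself and state the sharing mode explicitly, then hand that stream to the StreamReader. This is only a sketch: the class and method names are invented, and the behaviour described in the comments is what I would expect on Windows, where a conflicting open normally fails with an IOException.

```csharp
using System;
using System.IO;

static class DenyWriteRead
{
    // Opens the file for reading and asks that other handles be allowed
    // to read but not write while this one is open. On Windows, if
    // another process already has the file open for writing, this open
    // is expected to throw an IOException rather than read a moving file.
    public static StreamReader Open(string path)
    {
        FileStream stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read);
        return new StreamReader(stream);
    }

    // Reads the whole file through the reader above and counts the lines.
    public static int CountLines(string path)
    {
        int count = 0;
        using (StreamReader reader = Open(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }
}
```

Disposing the StreamReader also closes the underlying FileStream, so the using block releases the file when the loop finishes.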
