StreamReader

Bossie

Hi All,

I am having a little trouble with a StreamReader. I am currently reading a
pipe-delimited file with around 1.8 million records (total size 150MB) and,
based on a flag, inserting, updating or deleting a record in a SQL Server DB
using System.Data.SqlClient.

So I do a ReadLine() on the StreamReader, do my action and then step to the
next line. (Note I am using SqlBulkCopy to do the DB side, but I don't think
that really affects my problem).

Every once in a while (say every second or third complete run), the
ReadLine() pointer gets confused, jumps back somewhere between 30-60 lines
in memory and reprocesses the records it has jumped back over. It does this
until it gets to where it was before the backwards jump and then jumps
forwards to where it was meant to be originally. So for a simple example,
let's say I have records as follows:

1|Jack
2|John
3|Steve
4|Mary
5|Sarah
6|Michelle
7|Brian
8|Sally
9|David

The process starts and processes 1, then 2, then 3, then 4, then 5, then
jumps back (2 places for simplicity) to 3 again (instead of 6) and
reprocesses it, then it advances to 4 and reprocesses it (instead of 7). When
it gets back to 5 it realizes that it should have been at 8 by now, so it
jumps to 8 and continues merrily.

So essentially, 3 and 4 were processed twice and 6 and 7 were skipped. The
problem is that this is intermittent: it doesn't happen all the time and it
doesn't happen at the same place.

A little additional information that may be relevant:
- I'm running the process from within a thread
- The process is running on a VM with 2GB of memory
- The error happens both when running under the debugger inside VS2008 and
as a standalone exe in release build

I can only assume it is either a memory leak or an issue with threading, but
any advice will help.
 
Bossie

Hi Peter,

Thanks for the quick reply, below is an extract of the code. The only time I
touch the StreamReader is in the while statement. I am "fairly" sure the code
is fine because it is not doing anything too complex. It's the inconsistency
that is getting me.


string line = "";
TextReader tr = new StreamReader(sFilename);
while ((line = tr.ReadLine()) != null)
{
    val++;
    if (linenum != realLineCount) // skip the last line
    {
        if (line.Split("|".ToCharArray())[0] == "D")
        {
            deleted++;
        }
        else
        {
            inserted++;
            record = new DeliveryAddressLine(line.Split('|'), Pivotal_User_Id);

            new_id_value = Common.IntegerToPivotalByteArray(last_id_value, val);

            ImportTable.Rows.Add(((PivotalRecord)record).GetDataArray((byte[])new_id_value.Clone()));

            // flush the buffered rows to the server every 1000 inserts
            if (inserted != 0 && inserted % 1000 == 0)
            {
                using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn))
                {
                    bulkCopy.DestinationTableName =
                        ((PivotalRecord)record).Pivotal_Table_Name;
                    List<Pair> ColumnMappings = ((PivotalRecord)record).ColumnMappings;
                    foreach (Pair c in ColumnMappings)
                        bulkCopy.ColumnMappings.Add(new
                            SqlBulkCopyColumnMapping(c.index, c.value.ToString()));
                    try
                    {
                        bulkCopy.WriteToServer(ImportTable);
                    }
                    catch (Exception ex)
                    {
                        MessageBox.Show(ex.Message);
                    }
                    ImportTable.Rows.Clear();
                }
            }
        }
    } // closing brace missing from the posted extract; placement assumed
    linenum++;
}
--
Cheers

Paul Reyneke
CRM Blog - http://paulreyneke.blogspot.com/


Peter Duniho said:
[...]
Every once in a while (say every second or third complete run), the
ReadLine() pointer gets confused and jumps back somewhere between 30-60
lines in memory and reprocess the records it has jumped back over. It does
this until it gets to where it was before the backwards jump and then jump
forwards to where it was meant to be originally.
[...]
I can only assume it is either a memory leak or an issue with threading,
but any advice will help.

I think it is practically certain you have a bug somewhere in your code.
Probably in the form of some code meddling with the underlying stream
while the StreamReader is using it. Alternatively, somewhere in the data
flow after the StreamReader returns a specific line you might have a
problem. A "memory leak" or "an issue with threading" seem highly
unlikely to be a primary cause.

But, without a concise-but-complete code sample that reliably reproduces
the problem, it's not possible to find and describe the actual error in
your code. If you actually want an answer, you need to post code.

Pete
 
KH

How are you determining that the StreamReader is jumping around? I mean, can
you actually see it while debugging or are you seeing it in the results in
the database? If you're looking at the db there could be a problem in your
conditionals deciding what data to copy and when.

I would suggest removing the database stuff and just calling ReadLine()
through the whole file, using a StreamWriter to write the exact text to
another file, and then comparing the two - if they're the same size then it's
probably not StreamReader's fault.
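
A minimal sketch of that round-trip test, assuming the input is plain text that StreamReader's default encoding detection handles. The names RoundTripCheck, CopyLines and FirstDifference are made up for illustration. It compares line by line rather than by size, because WriteLine appends Environment.NewLine, so the copy can differ in size from the original purely through line endings.

```csharp
using System;
using System.IO;

static class RoundTripCheck
{
    // Copies every line of sourcePath to copyPath using only
    // ReadLine/WriteLine, with no database work in between.
    // Returns the number of lines copied.
    public static long CopyLines(string sourcePath, string copyPath)
    {
        long count = 0;
        using (StreamReader reader = new StreamReader(sourcePath))
        using (StreamWriter writer = new StreamWriter(copyPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                count++;
            }
        }
        return count;
    }

    // Returns the 1-based number of the first line that differs between
    // the two files, or 0 if they contain the same lines in the same order.
    public static long FirstDifference(string pathA, string pathB)
    {
        using (StreamReader a = new StreamReader(pathA))
        using (StreamReader b = new StreamReader(pathB))
        {
            long lineNumber = 0;
            while (true)
            {
                string lineA = a.ReadLine();
                string lineB = b.ReadLine();
                lineNumber++;
                if (lineA == null && lineB == null) return 0;
                if (lineA != lineB) return lineNumber;
            }
        }
    }
}
```

If FirstDifference reports 0 over several complete runs, the repeated and skipped records are more likely coming from the processing after ReadLine() than from the reader itself.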

As an aside you don't need to create a new SqlBulkCopy every time or buffer
it yourself -- you can create one outside the loop, set how many rows you
want it to copy at a time, and it will take care of all that for you.
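
A rough sketch of that idea, under some assumptions: the connection is already open, the buffer DataTable's columns line up by position with the destination table (so no explicit column mappings are shown), and the batch and buffer sizes are arbitrary. BulkLoadSketch, Load and LoadIntoSqlServer are hypothetical names. The SqlBulkCopy is created once and BatchSize controls how many rows go to the server per batch. The DataTable is still flushed every so often so that 1.8 million rows are not held in memory, and once more after the loop for the remainder.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

static class BulkLoadSketch
{
    // Buffers pipe-delimited lines in 'buffer' and hands the table to 'write'
    // every 'flushEvery' rows, plus once at the end for any remainder.
    // Returns the total number of rows handed to 'write'.
    public static int Load(IEnumerable<string> lines, DataTable buffer,
                           int flushEvery, Action<DataTable> write)
    {
        int written = 0;
        foreach (string line in lines)
        {
            buffer.Rows.Add(line.Split('|'));
            if (buffer.Rows.Count == flushEvery)
            {
                write(buffer);
                written += buffer.Rows.Count;
                buffer.Rows.Clear();
            }
        }
        if (buffer.Rows.Count > 0) // final partial batch
        {
            write(buffer);
            written += buffer.Rows.Count;
            buffer.Rows.Clear();
        }
        return written;
    }

    // One SqlBulkCopy for the whole run instead of one per 1000 rows.
    public static int LoadIntoSqlServer(IEnumerable<string> lines, DataTable buffer,
                                        SqlConnection openConnection, string tableName)
    {
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(openConnection))
        {
            bulkCopy.DestinationTableName = tableName;
            bulkCopy.BatchSize = 1000; // rows per batch sent to the server (assumed value)
            return Load(lines, buffer, 10000, table => bulkCopy.WriteToServer(table));
        }
    }
}
```

The write step is passed in as a delegate only so the batching logic can be tried without a server. If column mappings are needed, they could be added to bulkCopy.ColumnMappings once, right after the object is constructed.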

HTH


 
Anthony Jones

Peter Duniho said:
Please see http://www.yoda.arachsys.com/csharp/complete.html

The code you posted was neither concise nor complete, both being
requirements for effective posting of programming-related questions.

I don't see anything obvious in the code you posted that would explain the
issue. But all that means is that whatever the problem, it's somewhere
other than in the code you posted.


Obviously "the code" is not fine. If it were, you wouldn't have the
problem. While there's a very tiny possibility that the part of "the
code" that's wrong is in .NET itself, the most likely explanation is that
there's something wrong in your actual program.

But either way, there's no way to identify the problem without a
concise-but-complete code sample that reliably reproduces the problem.
The code sample should require no effort other than to be compiled and
executed in order to reproduce the problem, and it should include no code
that isn't directly related to and needed in order to reproduce the
problem.

IOW, take out all the DB malarkey and instead write the read content to a
StreamWriter. Put the pipes back into the output file such that the
generated file ought to be identical to the original. Then use something
like WinMerge to compare the files.

If you find the files are different, post your test code here along with a
description of how they are different.

BTW, what is it that invokes this code? I can't find any documentation on
the lock that StreamReader puts on the internal stream it creates when
passed a filename in its constructor. Could it be something is still
writing to the file?
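
One way to take the guesswork out of the locking question is to open the FileStream yourself and state the sharing mode explicitly, rather than relying on whatever the filename constructor chooses. The sketch below (ExclusiveRead and OpenExclusive are hypothetical names) opens the file with FileShare.None, so that while the reader is open any other attempt to open the file should fail with an IOException instead of quietly changing the data underneath the reader. If something already has the file open, this open call should itself fail, which at least makes the conflict visible.

```csharp
using System;
using System.IO;

static class ExclusiveRead
{
    // Opens the file read-only and denies all other access while it is open.
    // Disposing the returned StreamReader also closes the underlying stream.
    public static StreamReader OpenExclusive(string path)
    {
        FileStream stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.None);
        return new StreamReader(stream);
    }
}
```

In the posted extract this would stand in for new StreamReader(sFilename), ideally wrapped in a using block so the file is released when the loop ends.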
 
