Performance issue with filestream.write on huge file


Guy

I have 6,235,099 records in my db table. Each record needs to be
converted to an XML element and output to a file. Since each XML
element is about 329 bytes, the resulting file will be about
2.05 GB.

I started my application yesterday at 5pm, and today at 12pm, I got a
"System.OutOfMemoryException".

I read a few posts and someone wrote that the memory leaks are caused
by the byte arrays written, which are not released by the GC. So I did
this:

public override void Write(byte[] array, int offset, int count)
{
    base.Write(array, offset, count);
    array = null;
}

Now I am not sure whether I should call Flush(), or
GC.Collect(<something>), or use any other trick to improve performance.
There this also the mention of not passing the file path to the
FileStream ctor, but the handle? I did not get that one....

Any help would be appreciated. Thanks

public class TransferXmlFile : FileStream
{
    private string m_strFileNameAndPath = null;
    private int m_iApplicationCount = 0;

    public TransferXmlFile(string strFileNameAndPath)
        : base(strFileNameAndPath, FileMode.Create)
    {
        m_strFileNameAndPath = strFileNameAndPath;
    }
    ....etc
 

Peter Duniho

> I have 6,235,099 records in my db table. Each record needs to be
> converted to an XML element and output to a file. Since each XML
> element is about 329 bytes, the resulting file will be about
> 2.05 GB.
>
> I started my application yesterday at 5pm, and today at 12pm, I got a
> "System.OutOfMemoryException".

Why is it taking you 19 hours to fail? A 2GB file should be able to be
written completely in a fraction of that time. Is it possible that
your memory consumption bug is causing excessive memory swapping as
well, killing performance?

> I read a few posts and someone wrote that the memory leaks are caused
> by the byte arrays written, which are not released by the GC. So I did
> this:
>
> public override void Write(byte[] array, int offset, int count)
> {
>     base.Write(array, offset, count);
>     array = null;
> }

Setting your parameter "array" to null is pointless. The parameter is
only a local copy of the reference, and it stops keeping the array
alive as soon as the method returns anyway. The caller, of course,
may still have a reference, and if so the byte[] won't be eligible for
garbage collection no matter what you do in that Write() method.
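
To illustrate the point, here is a minimal sketch. The class and method names are made up for the illustration and are not from the original code. It shows that assigning null to a parameter changes only the method's local copy of the reference, so the caller's variable still refers to the array and the array stays reachable.

```csharp
using System;

public static class ParameterDemo
{
    // Mimics the Write() override from the question: the parameter is a
    // local copy of the caller's reference, so nulling it has no effect
    // outside this method.
    public static void WriteAndNull(byte[] array)
    {
        array = null;
    }

    // Returns true if the caller's variable still refers to the array
    // after the call.
    public static bool CallerStillHoldsArray()
    {
        byte[] buffer = new byte[329];
        WriteAndNull(buffer);
        return buffer != null && buffer.Length == 329;
    }

    public static void Main()
    {
        Console.WriteLine(CallerStillHoldsArray()); // prints True
    }
}
```

Whether the array can be collected therefore depends on what the caller does with "buffer", not on anything inside the method.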

> Now I am not sure whether I should call Flush(), or
> GC.Collect(<something>), or use any other trick to improve performance.

The only "trick" is to not hang on to references that you no longer
need. It's highly unlikely that there's a bug in the framework, and
very likely that you're doing something wrong in your own code.

But you didn't post your own code, so no one could possibly say what that is.

You should post a concise-but-complete sample of code that reliably
demonstrates the problem. Only then can someone point out the error.
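
For reference, "not hanging on to references" in an export loop could look roughly like the sketch below. This is only an assumption about the general shape of such a program, not the poster's actual code: ReadRecords is a hypothetical stand-in for the database reader, and the element and attribute names are invented. Each record is written as soon as it is read and nothing is accumulated in a list, so memory use should stay flat regardless of the record count.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingExport
{
    // Hypothetical stand-in for the database reader: yields one record at
    // a time instead of building a list of all records.
    public static IEnumerable<KeyValuePair<int, string>> ReadRecords(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return new KeyValuePair<int, string>(i, "name" + i);
        }
    }

    // Writes each record as one XML element and returns the number written.
    public static int Export(string path, int count)
    {
        int written = 0;
        using (FileStream stream = new FileStream(path, FileMode.Create))
        using (XmlWriter writer = XmlWriter.Create(stream))
        {
            writer.WriteStartElement("records");
            foreach (KeyValuePair<int, string> record in ReadRecords(count))
            {
                writer.WriteStartElement("record");
                writer.WriteAttributeString("id", record.Key.ToString());
                writer.WriteString(record.Value);
                writer.WriteEndElement();
                written++;
                // No reference to the record survives this iteration.
            }
            writer.WriteEndElement();
        }
        return written;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "export-sketch.xml");
        Console.WriteLine(Export(path, 1000));
        File.Delete(path);
    }
}
```

If the real code differs from this shape, for example by collecting elements into a list or an in-memory document before writing, that difference would be the first place to look.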

> There this also the mention of not passing the file path to the
> FileStream ctor, but the handle? I did not got that one....

I don't understand this part of your question. Where is "this also the
mention of [sic]" of which you write? What are you having trouble
with? What handle are you trying to pass?

Pete
 

Guy

I had a similar problem but in another processing section of the
software. The problem was that I was not calling Dispose after using a
db reader or db command. It was pretty easy to see the memory
consumption going up with Task Manager. But this time it is not that
obvious.
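
For the record, the fix in that other section amounted to wrapping each reader and command in a using block. A minimal sketch of the pattern follows. FakeReader is a hypothetical stand-in so the example runs without a database. Real readers and commands (IDataReader, IDbCommand) implement IDisposable in the same way.

```csharp
using System;

// Hypothetical stand-in for a data reader or command.
public sealed class FakeReader : IDisposable
{
    // Number of readers created but not yet disposed.
    public static int OpenCount;

    public FakeReader() { OpenCount++; }

    public void Dispose() { OpenCount--; }
}

public static class DisposeDemo
{
    // Opens and releases `iterations` readers, then returns how many are
    // still open afterwards.
    public static int Run(int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            // Dispose() is called when the block exits, even if the body
            // throws an exception.
            using (FakeReader reader = new FakeReader())
            {
                // ... read and process rows here ...
            }
        }
        return FakeReader.OpenCount;
    }

    public static void Main()
    {
        Console.WriteLine(Run(1000)); // prints 0
    }
}
```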

The main loop is one db reader, reading my 6M records, then each
record gets some processing and is output as XML.
I have a memory leak somewhere. Note that one of my CPUs is at 100% for
hours processing this output thread. Is it possible the GC does not
get any time to collect?
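
One way to test that hypothesis would be to log GC.CollectionCount while a busy loop runs. The sketch below is only a stand-in for the real output thread, and the names are invented. It allocates short-lived buffers in a tight loop and reports how many generation 0 collections took place in the meantime. A count above zero would suggest the GC still runs while the CPU is pegged.

```csharp
using System;

public static class GcUnderLoadDemo
{
    // Keeps the most recent buffer reachable so the allocation cannot be
    // optimized away.
    private static byte[] s_last;

    // Allocates short-lived arrays in a tight loop and returns how many
    // generation 0 collections happened while the loop was running.
    public static int CountCollectionsDuringBusyLoop(int iterations)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < iterations; i++)
        {
            byte[] temp = new byte[1024];
            temp[0] = (byte)i;
            s_last = temp; // the previous buffer becomes garbage here
        }
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        int collections = CountCollectionsDuringBusyLoop(2000000);
        Console.WriteLine("Gen 0 collections during loop: " + collections);
    }
}
```

Logging the same counter every few thousand records in the real loop, together with GC.GetTotalMemory(false), would show whether collections happen and whether the managed heap keeps growing anyway.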
 
