Slow first-time access


ahmedmaarouf2002

Hello,

I use some Forth code to access the contents of many documents for an
analysis. A folder typically contains 20,000 or more small (a few KB
each) documents. I open each document read-only, then use read-file
to get its contents. Here is the problem: the first time this runs,
the reads take a very long time. The CPU load is very small (with a
very noisy HDD head). If I close my application and launch it again
for a new run, the reading is much, MUCH faster. Subsequent runs show
the high performance of the second one. I tried the same code with the
same documents on two machines, a desktop with 2 GB of RAM and a
180 GB NTFS drive, and a laptop with 768 MB of RAM and a 30 GB NTFS
drive, both running XP. The same behavior occurs on both. I also tried
smaller folders (~1,000 docs), and I still got the same story. I
defragmented both drives, but it didn't help. I have the indexing
service turned off, as well as System Restore.

I have recently moved to XP. I do not recall this being a problem on
2000. Is there a fix to overcome such terribly slow first-time
performance?

Help is most appreciated...

Thanks,
Ahmed
 

Andrew Haley

ahmedmaarouf2002 said:
Here is the problem: the first time this runs it takes so long to do
the reads. If I close my application and launch it again for a new
run, the reading is much MUCH faster. Subsequent runs seem to have
the high performance of the second one.

I don't think there's anything wrong with your computer, or your OS.
The problem is that it takes a long time to actually read all your
data from the disk; the second time around, everything is already in
the filesystem cache.

The classic way to solve this is to use read-ahead. But the operating
system's automatic read-ahead probably won't work if the next data to
be read are in a different file, so using many tiny files slows things
down a great deal. One simple trick would be to have another task
that reads the data before it's needed by your main task, the one
that does the analysis.
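That prefetch idea could be sketched as follows (in Python rather than Forth, purely for illustration): a background thread reads files into a bounded queue while the main task analyzes contents already in memory, so disk I/O overlaps with computation.

```python
import queue
import threading

def prefetch_reader(paths, max_buffered=64):
    """Yield (path, contents) pairs, reading ahead in a background thread.

    The queue bound keeps memory use limited: the reader thread blocks
    when it gets max_buffered files ahead of the consumer.
    """
    buf = queue.Queue(maxsize=max_buffered)
    sentinel = object()  # marks end of input

    def reader():
        for p in paths:
            with open(p, "rb") as f:
                buf.put((p, f.read()))
        buf.put(sentinel)

    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = buf.get()
        if item is sentinel:
            break
        yield item

# Usage sketch (analyze() is a stand-in for the actual analysis code):
# for path, data in prefetch_reader(document_paths):
#     analyze(data)
```

With only one physical disk this mainly helps when the analysis itself takes time per document; the OS can then service the next read while the CPU is busy.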

Andrew.
 

Dmitry Ponyatov

Here is the problem: the first time this runs it
takes so long to do the reads. If I close my application and launch
it again for a new run, the reading is much MUCH faster.

RTFM your OS disk cache manager

Maybe you should move to better technology -- use a database, or are
you limited to using only the file system?
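One way to act on that suggestion, sketched here in Python and assuming SQLite is an option, is to pack the thousands of tiny documents into a single database file. Reading one large file sequentially avoids the per-file open/seek overhead that makes the cold-cache first run so slow.

```python
import sqlite3

def pack_documents(db_path, docs):
    """Store (name, contents) pairs in one SQLite file.

    One big file read sequentially is far cheaper than ~20,000 tiny
    files, each needing its own open, metadata lookup, and seek.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS docs (name TEXT PRIMARY KEY, body BLOB)"
    )
    con.executemany("INSERT OR REPLACE INTO docs VALUES (?, ?)", docs)
    con.commit()
    con.close()

def read_documents(db_path):
    """Yield (name, contents) pairs back out of the database."""
    con = sqlite3.connect(db_path)
    try:
        for name, body in con.execute("SELECT name, body FROM docs"):
            yield name, body
    finally:
        con.close()
```

The packing step is a one-time cost; every later analysis run reads one file instead of thousands, on a cold cache or a warm one.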
 
