Fast Index a File System?

james

Hi Everybody,

I am trying to build a tree of my entire file system. The tree
includes attributes like file size and last modified. My current
implementation is multi threaded using the threadpool, but it takes
around 10 minutes to complete. I don't understand why it would take
so long because I think everything I want is in the Master File
Table. Is there a faster implementation?

Thanks,
James
 
Alberto Poblacion

james said:
I am trying to build a tree of my entire file system. The tree
includes attributes like file size and last modified. My current
implementation is multi threaded using the threadpool, but it takes
around 10 minutes to complete. I don't understand why it would take
so long because I think everything I want is in the Master File
Table. Is there a faster implementation?

No, everything you need is not in the MFT. It is in various directories
scattered around your disk. Your bottleneck here is going to be the disk
access (look at the disk access light when you run the program), so I don't
think that you'll get any improvement from a multithreaded application
(versus a single-threaded one).
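
For comparison, here is a minimal single-threaded sketch of the kind of tree described above, written in Python rather than .NET purely for illustration. The function name `build_tree` and the dict layout are assumptions, not anything from the original program. It uses `os.scandir`, which on Windows returns size and last-modified time together with each directory entry, so regular files need no extra system call.

```python
import os

def build_tree(root):
    """Return a nested dict for `root`: name, total size, mtime, children."""
    node = {"name": os.path.basename(root) or root,
            "size": 0, "mtime": None, "children": []}
    try:
        node["mtime"] = os.stat(root).st_mtime
        with os.scandir(root) as entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        # Recurse into real directories only (no symlinks).
                        child = build_tree(entry.path)
                    else:
                        st = entry.stat(follow_symlinks=False)
                        child = {"name": entry.name, "size": st.st_size,
                                 "mtime": st.st_mtime, "children": []}
                except OSError:
                    continue  # skip entries we cannot read
                node["children"].append(child)
                node["size"] += child["size"]  # roll sizes up to the parent
    except OSError:
        pass  # unreadable directory: keep the empty node
    return node
```

Calling `build_tree("C:\\")` would walk the whole drive; the elapsed time is then dominated by how fast the disk can deliver directory contents, which is the bottleneck described above.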
 
Eps

james said:
Hi Everybody,

I am trying to build a tree of my entire file system. The tree
includes attributes like file size and last modified. My current
implementation is multi threaded using the threadpool, but it takes
around 10 minutes to complete. I don't understand why it would take
so long because I think everything I want is in the Master File
Table. Is there a faster implementation?

Thanks,
James

I am kinda doing a similar thing, indexing my MP3 collection. The
approach I have gone for is to store all the meta information in a
SQLite database. All the program does when it starts up is check whether
each file is still there (disabling or deleting the record from the db
if it is not). It's very quick to load (I am using it on 10,000-plus
files).

The downside is that you need to populate the db to begin with, which in
my case only took about five minutes. I think I should be able to run
that in a separate thread and still have my app usable; maybe you could
do the same?

Anybody else's thoughts on how to deal with this type of situation would
be most welcome.
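
The database approach described above might look roughly like the following sketch. It is written in Python with the standard `sqlite3` module for illustration only; the table name `files`, its columns, and the function names `open_index`, `populate`, and `prune_missing` are all made up for this example and are not taken from the actual program.

```python
import os
import sqlite3

def open_index(db_path):
    """Open (or create) the metadata database."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS files "
                 "(path TEXT PRIMARY KEY, size INTEGER, mtime REAL)")
    return conn

def populate(conn, root):
    """Slow first pass: walk `root` and record every file found."""
    rows = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable
            rows.append((path, st.st_size, st.st_mtime))
    with conn:  # commit all rows in one transaction
        conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", rows)
    return len(rows)

def prune_missing(conn):
    """Startup check: delete records whose file no longer exists."""
    missing = [(p,) for (p,) in conn.execute("SELECT path FROM files")
               if not os.path.exists(p)]
    with conn:
        conn.executemany("DELETE FROM files WHERE path = ?", missing)
    return len(missing)
```

The slow `populate` pass runs once (possibly on a background thread); later startups only call `prune_missing`, which still touches the disk once per record but does not re-read any metadata.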
 
Nicholas Paldino [.NET/C# MVP]

Have you considered using the Windows Desktop Search API? It indexes
your files for you (and you can write custom filters for files if there
isn't a preexisting one) and you can query the results programmatically.
 
james

Hi Alberto,

I figured multithreading would keep the queue full to the disk drive,
but I never tested if it actually sped things up. It sounds like
there isn't much I can do... nowadays 10 minutes feels like an
eternity :p

Thanks for the responses,
James
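
One way to actually test whether the extra threads help is to time the same walk with different worker counts. The sketch below is only an assumption about how such a test could be set up, in Python for illustration; `scan_dir`, `count_files`, and `timed` are hypothetical names. It walks the tree level by level and lists each level's directories on a thread pool of the given size.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

def scan_dir(path):
    """List one directory; return (file_count, subdirectory_paths)."""
    files, subdirs = 0, []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        subdirs.append(entry.path)
                    else:
                        entry.stat(follow_symlinks=False)  # fetch size/mtime
                        files += 1
                except OSError:
                    continue  # skip unreadable entries
    except OSError:
        pass  # skip unreadable directories
    return files, subdirs

def count_files(root, workers=1):
    """Walk breadth-first, scanning each level with `workers` threads."""
    total, level = 0, [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while level:
            next_level = []
            for files, subdirs in pool.map(scan_dir, level):
                total += files
                next_level.extend(subdirs)
            level = next_level
    return total

def timed(root, workers):
    """Return (file_count, elapsed_seconds) for one walk."""
    start = time.perf_counter()
    count = count_files(root, workers)
    return count, time.perf_counter() - start
```

Comparing `timed(root, 1)` with `timed(root, 8)` gives a rough answer, with one caveat: the second run over the same tree usually benefits from the operating system's file cache, so the runs should be done in both orders, or after a reboot, before drawing conclusions.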
 
Hilton

Multi-threading has a pretty good chance of slowing it down significantly.
Imagine just two threads, one reading data from the inside of the drive
(physically), and the other reading from the outside of the drive. The
head would spend most of its time seeking back and forth between them.

Hilton
 
