GetDirectories Performance

  • Thread starter: Tom Scales

Tom Scales

I'm writing a VB.NET 2003 program that uses a TreeView to display the drive
structure on the computer. I am having a major problem with performance.
There are many files on one drive (over a million) and it is killing me. For
example, one directory has a structure:

Main

------ Sub directory

-------------- 10 Sub directories, each with roughly 30,000 files.



When I execute this code on the Sub directory (i.e. strPath = SubDirectory)

'------------------------------------------------------------------------
' Requires: Imports System.IO
Dim strPath As String = tn.FullPath ' Get the parent's path
Dim diDirectory As New DirectoryInfo(strPath)
Dim adiDirectories() As DirectoryInfo

Try
    ' Get an array of all sub-directories as DirectoryInfo objects.
    adiDirectories = diDirectory.GetDirectories()
Catch exp As Exception
    ' Give up on this node if the folder cannot be read.
    Exit Sub
End Try
'------------------------------------------------------------------------

it takes over 15 minutes to complete. This is on a P4-2.66 running XP Pro.

Any suggestions for improvement?

Thanks,

Tom
 
Tom Scales said:
I'm writing a VB.NET 2003 program that uses a TreeView to display the drive
structure on the computer. I am having a major problem with performance.
There are many files on one drive (over a million) and it is killing me. For
example, one directory has a structure:

Instead of populating the whole treeview control on startup, only add the
nodes on the first level and check if the folders contain subfolders. If
the latter is the case, add a dummy subnode to the node representing the
folder. Then catch the node expand event and replace the dummy node with
nodes for the actual files and folders contained in the folder. This should
lead to acceptable performance and memory usage would be much lower than
populating the whole control. In addition, in many cases it's very unlikely
that the user will expand every single node and thus much less memory will
be occupied by your application in total.
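The placeholder approach described above could be sketched roughly as follows. This is a minimal sketch, not anyone's actual code from the thread: the class name LazyFolderTree, the "*dummy*" placeholder text, and the use of the Tag property to hold the full path are all assumptions made for illustration. This variant shows folders only and skips the "does it contain subfolders" check entirely, always adding the placeholder, so an empty folder keeps its + sign until it is first expanded.

```vbnet
Imports System
Imports System.IO
Imports System.Windows.Forms

Public Class LazyFolderTree
    ' Hypothetical placeholder text; it only has to differ from any real folder name.
    Private Const DummyText As String = "*dummy*"

    ' Adds one node per sub-folder of folderPath. Each node gets a single
    ' placeholder child so the + sign appears without scanning the sub-folder.
    Public Shared Sub AddFolderNodes(ByVal nodes As TreeNodeCollection, ByVal folderPath As String)
        Dim names() As String
        Try
            names = Directory.GetDirectories(folderPath)
        Catch ex As UnauthorizedAccessException
            Return
        Catch ex As IOException
            Return
        End Try

        For Each fullName As String In names
            Dim node As New TreeNode(Path.GetFileName(fullName))
            node.Tag = fullName          ' remember the full path for later expansion
            node.Nodes.Add(DummyText)    ' placeholder child
            nodes.Add(node)
        Next
    End Sub

    ' Handler for the TreeView's BeforeExpand event: swaps the placeholder
    ' for the real sub-folders the first time a node is opened.
    Public Shared Sub OnBeforeExpand(ByVal sender As Object, ByVal e As TreeViewCancelEventArgs)
        Dim node As TreeNode = e.Node
        If node.Nodes.Count = 1 AndAlso node.Nodes(0).Text = DummyText Then
            node.Nodes.Clear()
            AddFolderNodes(node.Nodes, CStr(node.Tag))
        End If
    End Sub
End Class
```

Assuming a form with a TreeView named TreeView1 (a hypothetical name), it would be wired up with AddHandler TreeView1.BeforeExpand, AddressOf LazyFolderTree.OnBeforeExpand followed by LazyFolderTree.AddFolderNodes(TreeView1.Nodes, "C:\").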
 
Herfried K. Wagner said:
Instead of populating the whole treeview control on startup, only add the
nodes on the first level and check if the folders contain subfolders. If
the latter is the case, add a dummy subnode to the node representing the
folder. Then catch the node expand event and replace the dummy node with
nodes for the actual files and folders contained in the folder. This
should lead to acceptable performance and memory usage would be much lower
than populating the whole control. In addition, in many cases it's very
unlikely that the user will expand every single node and thus much less
memory will be occupied by your application in total.

Unfortunately, that is essentially what I am doing. I am adding enough
nodes to tell me if I need to add the + sign. That's where I get bitten,
because the directory UNDER the one I am working with has 30,000+ files.
GetDirectories must search every file to see if it is a directory. Very
inefficient code on MS's part.
 
Tom said:
Unfortunately, that is essentially what I am doing. I am adding enough
nodes to tell me if I need to add the + sign. That's where I get bitten,
because the directory UNDER the one I am working with has 30,000+ files.
GetDirectories must search every file to see if it is a directory. Very
inefficient code on MS's part.


Tom, it sounds like you would be better served by using P/Invoke to call the
FindFirstFile and FindNextFile API functions. They work through a directory
one entry at a time, so you'll have much more control over the operation.
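For reference, a P/Invoke check along these lines might look like the sketch below. The class name FolderProbe and the function name HasSubdirectory are made up for illustration; the Win32 declarations follow the documented FindFirstFile, FindNextFile and FindClose signatures. It stops at the first sub-folder it meets, but in the worst case (a folder holding only files) it still has to walk every entry.

```vbnet
Imports System
Imports System.IO
Imports System.Runtime.InteropServices

Public Class FolderProbe
    ' Managed layout of the Win32 WIN32_FIND_DATA structure.
    <StructLayout(LayoutKind.Sequential, CharSet:=CharSet.Auto)> _
    Private Structure WIN32_FIND_DATA
        Public dwFileAttributes As Integer
        Public ftCreationTimeLow As Integer
        Public ftCreationTimeHigh As Integer
        Public ftLastAccessTimeLow As Integer
        Public ftLastAccessTimeHigh As Integer
        Public ftLastWriteTimeLow As Integer
        Public ftLastWriteTimeHigh As Integer
        Public nFileSizeHigh As Integer
        Public nFileSizeLow As Integer
        Public dwReserved0 As Integer
        Public dwReserved1 As Integer
        <MarshalAs(UnmanagedType.ByValTStr, SizeConst:=260)> Public cFileName As String
        <MarshalAs(UnmanagedType.ByValTStr, SizeConst:=14)> Public cAlternateFileName As String
    End Structure

    Private Declare Auto Function FindFirstFile Lib "kernel32.dll" _
        (ByVal lpFileName As String, ByRef lpFindFileData As WIN32_FIND_DATA) As IntPtr
    Private Declare Auto Function FindNextFile Lib "kernel32.dll" _
        (ByVal hFindFile As IntPtr, ByRef lpFindFileData As WIN32_FIND_DATA) As Boolean
    Private Declare Function FindClose Lib "kernel32.dll" _
        (ByVal hFindFile As IntPtr) As Boolean

    Private Shared ReadOnly InvalidHandle As New IntPtr(-1)  ' INVALID_HANDLE_VALUE
    Private Const DirectoryAttribute As Integer = &H10       ' FILE_ATTRIBUTE_DIRECTORY

    ' Returns True as soon as one sub-folder (other than "." and "..") is found.
    Public Shared Function HasSubdirectory(ByVal folderPath As String) As Boolean
        Dim data As New WIN32_FIND_DATA()
        Dim handle As IntPtr = FindFirstFile(Path.Combine(folderPath, "*"), data)
        If handle.Equals(InvalidHandle) Then Return False  ' unreadable or missing folder

        Try
            Do
                If (data.dwFileAttributes And DirectoryAttribute) <> 0 _
                   AndAlso data.cFileName <> "." AndAlso data.cFileName <> ".." Then
                    Return True
                End If
            Loop While FindNextFile(handle, data)
            Return False
        Finally
            FindClose(handle)  ' always release the search handle
        End Try
    End Function
End Class
```

The result could then decide whether a node gets its + sign, for example If FolderProbe.HasSubdirectory(strPath) Then ... using the strPath variable from the original post.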

In addition, another thing that may be slowing you down is the actual
TreeView (which is also very inefficient). I'd advise you to devise
a quick test that loads 30,000 nodes and see how fast it is.
You'll be surprised at how slow it will be. To get around this problem
I went with a 3rd party tree list control from
http://www.bennet-tec.com/ called TList. Their claim to fame is
speed, and based on my usage it is not idle talk. It truly is pedal to
the metal.
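A quick test of the kind suggested here might be as simple as the following sketch. The class and function names are hypothetical, and the batch switch (wrapping the loop in BeginUpdate/EndUpdate, which suppresses repainting while nodes are added) is an addition not mentioned in the thread, included only so the two cases can be compared. No timings are claimed; the point is to measure on your own machine.

```vbnet
Imports System
Imports System.Windows.Forms

Public Class TreeLoadTest
    ' Adds count top-level nodes to the tree and returns the elapsed
    ' time in milliseconds. When batch is True, repainting is suspended
    ' for the duration of the load.
    Public Shared Function TimeLoad(ByVal tree As TreeView, ByVal count As Integer, ByVal batch As Boolean) As Integer
        tree.Nodes.Clear()
        Dim started As Integer = Environment.TickCount

        If batch Then tree.BeginUpdate()
        For i As Integer = 1 To count
            tree.Nodes.Add("Node " & i.ToString())
        Next
        If batch Then tree.EndUpdate()

        Return Environment.TickCount - started
    End Function
End Class
```

Calling TreeLoadTest.TimeLoad(TreeView1, 30000, False) and then again with True (TreeView1 being a hypothetical control name) gives two numbers to compare.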

Regards
 
Tom Scales said:
Unfortunately, that is essentially what I am doing. I am adding enough
nodes to tell me if I need to add the + sign. That's where I get bitten,
because the directory UNDER the one I am working with has 30,000+ files.
GetDirectories must search every file to see if it is a directory. Very
inefficient code on MS's part.

I think it doesn't really matter whether there are files or folders in the
folder, because if there is at least one entry, you simply add a single
dummy node. You can use 'Directory.GetFileSystemEntries' for this purpose.
No need to deal with 'FileInfo' and 'DirectoryInfo'. Instead you can use
the 'Directory' class.
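As a rough sketch of that check (the class and function names are invented for illustration), something like this would do. Note that 'Directory.GetFileSystemEntries' still returns the complete list of names as a string array, so a folder with 30,000 files is still read in full; what it avoids is creating a 'FileInfo' or 'DirectoryInfo' object per entry.

```vbnet
Imports System
Imports System.IO

Public Class EntryCheck
    ' True if the folder contains at least one file or sub-folder.
    ' Folders that cannot be read are treated as empty.
    Public Shared Function HasAnyEntry(ByVal folderPath As String) As Boolean
        Try
            Return Directory.GetFileSystemEntries(folderPath).Length > 0
        Catch ex As UnauthorizedAccessException
            Return False
        Catch ex As IOException
            Return False
        End Try
    End Function
End Class
```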
 
Herfried K. Wagner said:
I think it doesn't really matter whether there are files or folders in the
folder, because if there is at least one entry, you simply add a single
dummy node. You can use 'Directory.GetFileSystemEntries' for this
purpose. No need to deal with 'FileInfo' and 'DirectoryInfo'. Instead you
can use the 'Directory' class.

OK, I understand. Let me play around with it some more. I'm not adding the
30,000 entries to the treeview, of course, as they are not directories.

Good advice from all.

Thanks!

Tom
 