API FindFirstFile\FindNextFile vs GetFiles

Lance

Hi All,

I'm working on a program that requires searching multiple drives for multiple file types
and cataloging them based on certain geospatial attributes. Altogether, there are
hundreds of thousands of files on the drives. As part of the process, I'm currently using
the GetFiles method of the FileSystem object to retrieve a collection of strings
representing all files of a particular type (for example, tif files). The problem
is that the GetFiles method doesn't seem to make any callbacks that would allow me to
show some sort of meaningful progress, and the process can take a very long time. I know
I could fudge it with a marquee-style progress bar and/or a busy mouse icon, but this
isn't ideal.

So, I was looking into using the API to do the grunt work. This would allow me to at
least display each filename in a label control as the search is being performed. I'm
trying to port Randy Birch's VB6 code available on his website at
http://vbnet.mvps.org/code/fileapi/recursivefiles_minimal_multiple.htm . It's working
somewhat, except for the FindNextFile function which I've declared like so:

/////
Private Declare Function FindNextFile Lib "kernel32" _
Alias "FindNextFileA" _
(ByVal hFindFile As Int32, _
ByVal lpFindFileData As WIN32_FIND_DATA) As IntPtr
/////

As currently implemented, the function is not moving the search to the next file. It is
always returning the first file in the directory and it returns all the properties of that
file correctly. And it returns this first file for as many times as there are files in
the directory. I'm thinking it's because the WIN32_FIND_DATA - which is a UDT in the VB6
code - is defined as an object in my code and Dim'd as

/////
Dim WFD As WIN32_FIND_DATA = New WIN32_FIND_DATA
/////

at the beginning of the SearchForFiles Sub. For reference, in my code WIN32_FIND_DATA
is defined like so:

/////
<StructLayout(LayoutKind.Sequential, _
CharSet:=CharSet.Auto)> _
Friend Class WIN32_FIND_DATA
Friend sfileAttributes As Int32 = 0
Friend creationTime_lowDateTime As Int32 = 0
Friend creationTime_highDateTime As Int32 = 0
Friend lastAccessTime_lowDateTime As Int32 = 0
Friend lastAccessTime_highDateTime As Int32 = 0
Friend lastWriteTime_lowDateTime As Int32 = 0
Friend lastWriteTime_highDateTime As Int32 = 0
Friend nFileSizeHigh As Int32 = 0
Friend nFileSizeLow As Int32 = 0
Friend dwReserved0 As Int32 = 0
Friend dwReserved1 As Int32 = 0
<MarshalAs(UnmanagedType.ByValTStr, SizeConst:=MAX_PATH)> _
Friend fileName As String = Nothing
<MarshalAs(UnmanagedType.ByValTStr, SizeConst:=14)> _
Friend alternateFileName As String = Nothing
End Class
/////

How can I make it so the FindNextFile call actually moves on to the next file in the
directory?

Lance
 
Lance

What the heck? I didn't define WIN32_FIND_DATA as an object. It's a structure. There
goes my theory. Ugh, not enough coffee this morning. I'd still be interested in your
thoughts as to why the FindNextFile API function isn't working correctly.

Lance
 
Mike McIntyre

You could call the GetFiles method from a BackgroundWorker. In your file-retrieval
method you could call the BackgroundWorker's ReportProgress method each
time a file is found/processed, and in a handler for the resulting
ProgressChanged event you could update the label control
with the filename.
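
As a rough illustration of that idea, here is a minimal sketch, assuming a WinForms form with a single label. The names `SearchForm`, `Walk`, `ReportFile`, the `*.tif` pattern, and the `C:\Data` root are all hypothetical. Note that `Directory.GetFiles` still returns a whole directory's matches at once, so in this sketch progress can only be reported after each directory has been read, not while it is being read.

```vb
Imports System
Imports System.Collections.Generic
Imports System.ComponentModel
Imports System.IO
Imports System.Windows.Forms

Public Class SearchForm
    Inherits Form

    Private ReadOnly statusLabel As New Label()
    Private WithEvents worker As New BackgroundWorker()

    Public Sub New()
        statusLabel.Dock = DockStyle.Top
        Controls.Add(statusLabel)
        worker.WorkerReportsProgress = True
    End Sub

    Protected Overrides Sub OnLoad(ByVal e As EventArgs)
        MyBase.OnLoad(e)
        worker.RunWorkerAsync("C:\Data")   ' hypothetical root folder
    End Sub

    ' Walks the tree one directory at a time and hands each match to "report".
    Public Shared Sub Walk(ByVal root As String, ByVal pattern As String, _
                           ByVal report As Action(Of String))
        Dim pending As New Stack(Of String)()
        pending.Push(root)
        While pending.Count > 0
            Dim folder As String = pending.Pop()
            Try
                ' GetFiles blocks until the whole directory has been read.
                For Each f As String In Directory.GetFiles(folder, pattern)
                    report(f)
                Next
                For Each d As String In Directory.GetDirectories(folder)
                    pending.Push(d)
                Next
            Catch ex As UnauthorizedAccessException
                ' Skip folders we are not allowed to read.
            End Try
        End While
    End Sub

    ' Runs on the background thread.
    Private Sub worker_DoWork(ByVal sender As Object, ByVal e As DoWorkEventArgs) _
        Handles worker.DoWork
        Walk(CStr(e.Argument), "*.tif", AddressOf ReportFile)
    End Sub

    Private Sub ReportFile(ByVal fullPath As String)
        ' The percentage is unknown here, so only the user state is meaningful.
        worker.ReportProgress(0, fullPath)
    End Sub

    ' Raised on the UI thread, so touching the label is safe here.
    Private Sub worker_ProgressChanged(ByVal sender As Object, _
        ByVal e As ProgressChangedEventArgs) Handles worker.ProgressChanged
        statusLabel.Text = CStr(e.UserState)
    End Sub
End Class
```

Reporting every single file can flood the UI with updates when there are hundreds of thousands of files; reporting once per directory, or every Nth file, may be a reasonable compromise.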

For a free Visual Studio 2005 solution that provides source code for using
the BackgroundWorker class visit ->
http://www.getdotnetcode.com/GdncSt...hreadWithVisualBasicBackgroundWorkerClass.htm
 
Lance

I thought about that but assumed the GetFiles method would still not report any progress
back to the BackgroundWorker. Admittedly, I'm a little hazy on how the BackgroundWorker
receives messages anyway. It was my understanding that if the GetFiles method didn't make
callbacks, then well, it didn't make callbacks.

Another thing that is attractive about the API method, though, is that I can search for
multiple file types at once. GetFiles seems to be limited to one type at a time. I would
be searching for and cataloging about a dozen different file types.
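
To make the "multiple file types at once" idea concrete, here is a small sketch of an extension check that could be applied to each file name as it is found, whichever enumeration method produces the names. The module name, function name, and the list of extensions are hypothetical examples, not taken from the actual program.

```vb
Imports System
Imports System.IO

Module ExtensionFilter
    ' Hypothetical list of wanted file types; adjust as needed.
    Private ReadOnly Wanted As String() = {".tif", ".jpg", ".sid"}

    ' Returns True when the file name ends in one of the wanted extensions,
    ' ignoring case (so "AREA1.TIF" matches ".tif").
    Public Function IsWantedType(ByVal fileName As String) As Boolean
        Dim ext As String = Path.GetExtension(fileName)
        For Each candidate As String In Wanted
            If String.Equals(ext, candidate, StringComparison.OrdinalIgnoreCase) Then
                Return True
            End If
        Next
        Return False
    End Function
End Module
```

With a check like this, a single pass over each directory can sort files into a dozen types instead of running one search per type.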

Lance
 
Tom Shelton

Lance said:
/////
Private Declare Function FindNextFile Lib "kernel32" _
Alias "FindNextFileA" _
(ByVal hFindFile As Int32, _
ByVal lpFindFileData As WIN32_FIND_DATA) As IntPtr
/////

The default for parameters in VB.NET is now ByVal rather than ByRef. If you
look at the original, your structure should be passed ByRef. Also, I
would not alias this function to the ANSI ("A") entry point - it will cause you
serious performance issues on NT-based systems (most current OSes :), because
those systems are Unicode internally and have to convert strings on every call.

Your declares should probably look more like:

Private Declare Auto Function FindFirstFile Lib "kernel32" _
(ByVal lpFileName As String, _
ByRef lpFindData As WIN32_FIND_DATA) As IntPtr

Private Declare Auto Function FindNextFile Lib "kernel32" _
(ByVal hFindFile As IntPtr, _
ByRef lpFindData As WIN32_FIND_DATA) As IntPtr

Private Declare Function FindClose Lib "kernel32" _
(ByVal hFindFile As IntPtr) As Boolean
 
Tom Shelton

Tom said:
Private Declare Auto Function FindNextFile Lib "kernel32" _
(ByVal hFindFile As IntPtr, _
ByRef lpFindData As WIN32_FIND_DATA) As IntPtr

Crap, I messed up FindNextFile - it should return Boolean rather than
IntPtr.
 
Lance

Ok. Initially I pasted your API declares into my code and ran it. I was presented with a
"FatalExecutionEngineError was detected" message from the MDA. So I changed
WIN32_FIND_DATA from a class, as originally posted, to a structure. (It *was* a class, not a
structure, in my original post. My follow-up reply saying it was a structure was wrong;
please ignore it.) The structure now looks like this:

////
Public Structure WIN32_FIND_DATA
Public sfileAttributes As Int32
Public creationTime_lowDateTime As Int32
Public creationTime_highDateTime As Int32
Public lastAccessTime_lowDateTime As Int32
Public lastAccessTime_highDateTime As Int32
Public lastWriteTime_lowDateTime As Int32
Public lastWriteTime_highDateTime As Int32
Public nFileSizeHigh As Int32
Public nFileSizeLow As Int32
Public dwReserved0 As Int32
Public dwReserved1 As Int32
<MarshalAs(UnmanagedType.ByValTStr, SizeConst:=MAX_PATH)> _
Public fileName As String
<MarshalAs(UnmanagedType.ByValTStr, SizeConst:=14)> _
Public alternateFileName As String
End Structure
////

and while now the FindNextFile API call works (yeah!), I don't think the structure is
correct. The filename member is always just the first character of the actual file name
(i.e., ReadMe.Txt is shown as "R") and the lowDates and highDates are too high for a
date-representing value.
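
For context on the date values: a Win32 FILETIME is a 64-bit count of 100-nanosecond intervals since January 1, 1601 (UTC), split into two 32-bit halves, so large low and high values are expected. The following is a minimal sketch of how the halves could be combined and converted; the module and function names are hypothetical.

```vb
Imports System

Module FindDataHelpers
    ' Combine two signed 32-bit halves into one 64-bit value.
    ' The low half is masked so a negative Int32 does not sign-extend
    ' into the high 32 bits.
    Public Function CombineParts(ByVal high As Integer, ByVal low As Integer) As Long
        Return (CLng(high) << 32) Or (CLng(low) And &HFFFFFFFFL)
    End Function

    ' Convert the high/low halves of a FILETIME to a UTC DateTime.
    Public Function FileTimeToDate(ByVal high As Integer, ByVal low As Integer) As DateTime
        Return DateTime.FromFileTimeUtc(CombineParts(high, low))
    End Function
End Module
```

With the structure above this could be called as `FileTimeToDate(WFD.lastWriteTime_highDateTime, WFD.lastWriteTime_lowDateTime).ToLocalTime()`, and `CombineParts` works the same way for nFileSizeHigh and nFileSizeLow. As for the one-character file name, that symptom is consistent with a character-set mismatch: a `Declare Auto` function fills the buffer with Unicode text on NT-based systems, while a structure without `CharSet:=CharSet.Auto` in a StructLayout attribute is marshaled as ANSI and stops at the first zero byte. That is an assumption based on the code shown, not something confirmed in the thread.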

Lance
 
Tom Shelton

Lance said:
Ok. Initially I pasted your API declares in my code and ran it. I was presented with a
"FatalExecutionEngineError was detected" message from the MDA. So I changed the
WIN32_FIND_DATA from how it was originally posted as a class (and it *was* a class, not a
structure in my original post. I sent a reply to my original message saying it was a
structure <--wrong...ignore my own reply to the original please) to a structure as such:

Aaah! I didn't notice that you were using a class instead of a
structure. I will take a stab at doing an actual conversion of the
code this evening (no time right now) - unless someone else beats me to
it :)
 
Lance

Thanks Tom. Remember, WIN32_FIND_DATA as a class caused the "FatalExecutionEngineError"
using your declarations, but as a structure, it was running (although the output of the
members didn't seem to jibe).

Lance
 
Herfried K. Wagner [MVP]

Lance said:
I'm working on a program that requires searching multiple drives for
multiple file types and cataloging them based on certain geospatial
attributes. All together, there are hundreds of thousands of files on the
drives. As part of the process, I'm currently using the GetFiles method
of the FileSystem object to retrieve a collection of strings representing a
collection of a particular file type (for example, tif files). The
problem is that the GetFiles method doesn't seem to make any callbacks
that would allow for me to show some sort of meaningful progress, and the
process can take a very long time.

<URL:http://dotnet.mvps.org/dotnet/samples/filesystem/FileSystemEnumerator.zip>
 
Tom Shelton

Lance said:
Thanks Tom. Remember, WIN32_FIND_DATA as a class caused the "FatalExecutionEngineError"
using your declarations, but as a structure, it was running (although the output of the
members didn't seem to jibe).

Private Const MAX_PATH As Integer = 260

' If you are using VB 2005, you can actually use
' System.Runtime.InteropServices.ComTypes.FILETIME instead.
<StructLayout(LayoutKind.Sequential)> _
Private Structure FILETIME
Public dwLowDateTime As Integer
Public dwHighDateTime As Integer
End Structure

<StructLayout(LayoutKind.Sequential, CharSet:=CharSet.Auto)> _
Private Structure WIN32_FIND_DATA
Public dwFileAttributes As Integer
Public ftCreationTime As FILETIME
Public ftLastAccessTime As FILETIME
Public ftLastWriteTime As FILETIME
Public nFileSizeHigh As Integer
Public nFileSizeLow As Integer
Public dwReserved0 As Integer
Public dwReserved1 As Integer

<MarshalAs(UnmanagedType.ByValTStr, SizeConst:=MAX_PATH)> _
Public cFileName As String

<MarshalAs(UnmanagedType.ByValTStr, SizeConst:=14)> _
Public cAlternate As String
End Structure

Private Declare Function FindClose Lib "kernel32" ( _
ByVal hFindFile As IntPtr) As Boolean

Private Declare Auto Function FindFirstFile Lib "kernel32" ( _
ByVal lpFileName As String, _
ByRef lpFindFileData As WIN32_FIND_DATA) As IntPtr

Private Declare Auto Function FindNextFile Lib "kernel32" ( _
ByVal hFindFile As IntPtr, _
ByRef lpFindFileData As WIN32_FIND_DATA) As Boolean

This set of declares seems to work for me. Try them out and see what
you get...
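
For illustration, here is a minimal sketch of how declarations like these might be used to walk a folder tree one file at a time. It repeats the declarations so it is self-contained; `FindFilesSketch`, `FileFoundHandler`, and `SearchForFiles` are hypothetical names, and all three string-handling declarations use Auto so they agree with the structure's CharSet. It is Windows-only and does not guard against reparse points (junctions) that loop back on themselves.

```vb
Imports System
Imports System.IO
Imports System.Runtime.InteropServices

Module FindFilesSketch
    Private Const MAX_PATH As Integer = 260
    Private Const FILE_ATTRIBUTE_DIRECTORY As Integer = &H10
    Private ReadOnly INVALID_HANDLE_VALUE As New IntPtr(-1)

    <StructLayout(LayoutKind.Sequential)> _
    Private Structure FILETIME
        Public dwLowDateTime As Integer
        Public dwHighDateTime As Integer
    End Structure

    <StructLayout(LayoutKind.Sequential, CharSet:=CharSet.Auto)> _
    Private Structure WIN32_FIND_DATA
        Public dwFileAttributes As Integer
        Public ftCreationTime As FILETIME
        Public ftLastAccessTime As FILETIME
        Public ftLastWriteTime As FILETIME
        Public nFileSizeHigh As Integer
        Public nFileSizeLow As Integer
        Public dwReserved0 As Integer
        Public dwReserved1 As Integer
        <MarshalAs(UnmanagedType.ByValTStr, SizeConst:=MAX_PATH)> _
        Public cFileName As String
        <MarshalAs(UnmanagedType.ByValTStr, SizeConst:=14)> _
        Public cAlternate As String
    End Structure

    Private Declare Auto Function FindFirstFile Lib "kernel32" ( _
        ByVal lpFileName As String, _
        ByRef lpFindFileData As WIN32_FIND_DATA) As IntPtr

    Private Declare Auto Function FindNextFile Lib "kernel32" ( _
        ByVal hFindFile As IntPtr, _
        ByRef lpFindFileData As WIN32_FIND_DATA) As Boolean

    Private Declare Function FindClose Lib "kernel32" ( _
        ByVal hFindFile As IntPtr) As Boolean

    ' Called once for every file found (hypothetical callback type).
    Public Delegate Sub FileFoundHandler(ByVal fullPath As String)

    Public Sub SearchForFiles(ByVal folder As String, ByVal onFile As FileFoundHandler)
        Dim wfd As New WIN32_FIND_DATA()
        Dim hFind As IntPtr = FindFirstFile(Path.Combine(folder, "*"), wfd)
        ' Folder is empty, missing, or not readable: nothing to do.
        If hFind.Equals(INVALID_HANDLE_VALUE) Then Return
        Try
            Do
                Dim name As String = wfd.cFileName
                If (wfd.dwFileAttributes And FILE_ATTRIBUTE_DIRECTORY) <> 0 Then
                    ' Skip the "." and ".." entries, recurse into real subfolders.
                    If name <> "." AndAlso name <> ".." Then
                        SearchForFiles(Path.Combine(folder, name), onFile)
                    End If
                Else
                    onFile(Path.Combine(folder, name))
                End If
            Loop While FindNextFile(hFind, wfd)
        Finally
            ' Always release the search handle.
            FindClose(hFind)
        End Try
    End Sub
End Module
```

The callback receives each full path as soon as it is found, so the caller can update a label, filter by extension, or stop early by throwing from the handler; running the search on a background thread would keep the form responsive.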
 
Lance

Herfried,

Thanks for that sample. I downloaded it last night and while it worked, I couldn't
initially wrap my head around it. It seems far, far, far different than any tutorial,
help file, and example that I have ever seen before. Because it seems to work so well, it
really makes me scratch my head as to why nothing like it is presented anyplace else.
They're always pushing the use of the GetFiles method. That's sort of frustrating. I
thought I was thinking outside the .Net box by going with the API, but this native .Net
method that you suggest is even further outside the box, IMO.

For now, I think I'm going to stick with the API method, as I can understand most of it.
But I'm going to play around with the code you linked to. Once I can figure out how to
debug it and then step through the code, I'm sure I will have learned a lot.

Thanks,
Lance
 
Tom Shelton

Herfried said:

Herfried, that is a very nice (and clever) example of using async
delegates. You are to be commended. But it still doesn't really
solve the problem, simply because at its heart it still calls
GetFiles. And as much as I love GetFiles, it still returns all the
files in the directory in one chunk. Normally, this isn't an issue.
But if you have a directory with thousands of files in it, then you
still have to wait for GetFiles to return before you can cancel your
operation, present any progress, or do any real filtering.

The only way to really get "real time" progress that I know of is to
use the API calls, since you iterate one file at a time, not a whole
directory at a time...
 
