Regex Question


Tony!

I've got a directory that contains many files with the filespec
filename#stringof14chars.wav

e.g.
"011420091557_ABC00097_400_5193760521_0008A17F_631122_002#20090114155411.wav"

In a database table, I have the filename minus the # and everything after it,
e.g. "011420091557_ABC00097_400_5193760521_0008A17F_631122_002"


I'm trying to check if the filename pulled from the database table
exists in the directory of files.

I've got
if (File.Exists(RemoteDir + filename))
{
    // process the file
}

left over from when I was pulling the filenames by looping through the
directory and checking for a match, but this dir is gonna grow by several
thousand files every day (they back up files daily, but there are always
gonna be many thousands of files there).

Figured it would be faster to get file list from table instead of
directory to speed things up, but I can't figure out the Regex syntax
to check for the filename with the full #...wav name while only having
the first part of the filename.

I ended up using this:

String[] Found = Directory.GetFiles(RemoteDir, filename + "*.wav");

which I thought would take just about as long as

String[] Filenames = Directory.GetFiles(RemoteDir, "*.wav");

but it comes back in about a second.
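
A minimal sketch of that wildcard lookup wrapped as an existence check. The class and method names (WildcardLookup, BuildPattern, RecordingExists) are made up for illustration, and it assumes .NET 4.0 or later for Directory.EnumerateFiles. Putting the '#' into the pattern is an addition to the call above: it keeps a name ending in "_002" from also matching a file whose name continues as "_0021#...".

```csharp
using System.IO;
using System.Linq;

public static class WildcardLookup
{
    // Builds the search pattern for one database name,
    // e.g. "abc_002" -> "abc_002#*.wav".
    public static string BuildPattern(string baseName)
    {
        return baseName + "#*.wav";
    }

    // Returns true if at least one file in remoteDir matches the pattern.
    // EnumerateFiles is lazy, so Any() can stop at the first hit.
    public static bool RecordingExists(string remoteDir, string baseName)
    {
        return Directory.EnumerateFiles(remoteDir, BuildPattern(baseName)).Any();
    }
}
```

With this, the original check would become if (WildcardLookup.RecordingExists(RemoteDir, filename)) { ... }.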


But I still wonder, for future reference: if I did do it using
Regex, how would I do it? Can I even do it using Regex?

Thanks,

Tony!
 
but I can't figure out the Regex syntax
to check for the filename with the full #...wav name while only having
the first part of the filename.

I'm not a Regex expert (seems like it takes a long while to do stuff
using Regex) but you lost me there...what exactly are you trying to
do?

RL
 
But I still wonder, for future reference: if I did do it using
Regex, how would I do it? Can I even do it using Regex?

There are no standard filesystem search functions that operate with
regexes, so you'd have to enumerate the entire directory within your
program and match your regex against each file name. I guess that the
stock pattern matching (with ? and *) is somehow optimized by the system,
at least for the more trivial cases (e.g. for your case, where the "*"
follows some prefix string, I'd expect NTFS to use its B-tree to
look up that prefix).
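
For reference, the enumerate-and-match approach could look roughly like the sketch below. RegexLookup, BuildRegex and FindMatches are hypothetical names, and the pattern assumes the 14 characters after the '#' are digits, as in the example file name; if they can be anything, .{14} would replace [0-9]{14}.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class RegexLookup
{
    // Builds a regex for "<baseName>#<14 digits>.wav", anchored at both ends.
    public static Regex BuildRegex(string baseName)
    {
        // Regex.Escape protects against regex metacharacters in the database value.
        return new Regex("^" + Regex.Escape(baseName) + @"#[0-9]{14}\.wav$",
                         RegexOptions.IgnoreCase);
    }

    // Walks every file in the directory and keeps those whose name matches.
    public static List<string> FindMatches(string remoteDir, string baseName)
    {
        Regex pattern = BuildRegex(baseName);
        var matches = new List<string>();
        foreach (string path in Directory.EnumerateFiles(remoteDir))
        {
            if (pattern.IsMatch(Path.GetFileName(path)))
                matches.Add(path);
        }
        return matches;
    }
}
```

Note that this still reads the whole directory for every lookup, so it does not gain anything over the wildcard call; it only shows that the match itself can be expressed as a regex.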
 
You want to confirm that files exist in a directory on a Windows machine
containing many, many files.

In my experience, directories with a huge number of files in them are very,
very slow on Windows. If you can read the entire directory listing into
memory once and then work out whether each file exists from that list, you
will find that the whole process is significantly faster.

Regex won't help; a dictionary of filenames with a direct lookup will.
Load the dictionary by reading the whole directory once, then as you loop
over your database rows you can check for presence in the dictionary very
easily.

As a bonus, you can flag every file that was matched against the database
and then do something with the list of files that were never matched.

Ken
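
A rough sketch of the dictionary idea, under a few assumptions: DirectoryIndex, BaseNameOf and Build are illustrative names, the key is the part of the file name before the first '#', and each base name is assumed to occur only once (if it repeats, the last file seen wins).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirectoryIndex
{
    // Returns the part of a file name before the first '#', or null if there is none.
    public static string BaseNameOf(string fileName)
    {
        int hash = fileName.IndexOf('#');
        return hash < 0 ? null : fileName.Substring(0, hash);
    }

    // Reads the directory once and maps each base name to its full path.
    public static Dictionary<string, string> Build(string remoteDir)
    {
        // Case-insensitive keys, since Windows file names are case-insensitive.
        var index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in Directory.EnumerateFiles(remoteDir, "*.wav"))
        {
            string baseName = BaseNameOf(Path.GetFileName(path));
            if (baseName != null)
                index[baseName] = path; // a repeated base name overwrites the earlier entry
        }
        return index;
    }
}
```

Each database row then costs one index.TryGetValue(filename, out fullPath) call. Removing an entry once it has been matched leaves the dictionary holding exactly the files that no database row referred to.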
 