LINQ - Picking the most recently created file

AA2e72E

Given a list of files, how can I specify the 'where' clause in a LINQ query to
return the name of the most recently created file?

e.g.

var queryLatestFile = from file in Directory.GetFiles(@"c:\mypath", "*.JPG")
                      where file.CreationTime == ??
                      select file.FullName;

Thanks for your help.
 
Peter Duniho

Your question doesn't make much sense to me. The GetFiles() method
returns a string[]. There's no "CreationTime" or "FullName" property
for string.

Even if I make the assumption that you really mean to use
DirectoryInfo.GetFiles() instead of Directory.GetFiles(), the "where"
clause isn't really a way to get an individual element with a specific
relationship to the other elements. Instead, it seems to me you should
use "orderby":

// DirectoryInfo.GetFiles() is an instance method, so create a DirectoryInfo first.
var query = from file in new DirectoryInfo(@"c:\mypath").GetFiles("*.JPG")
            orderby file.CreationTime descending
            select file.FullName;

string str = query.FirstOrDefault();

Then "str" will either be null (if there are no files), or will contain
the full path of the most recently created file.
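
If you do want to stay with Directory.GetFiles() and its string[] result, the same "orderby" idea still applies, as long as the creation time is looked up per path. The following is only a sketch: the class and method names (NewestFile, NewestPath) are made up for illustration, and it uses the static File.GetCreationTime() method to get the timestamp for each path.

```csharp
using System;
using System.IO;
using System.Linq;

static class NewestFile
{
    // Hypothetical helper: returns the full path of the most recently
    // created file matching the pattern, or null when nothing matches.
    public static string NewestPath(string directory, string pattern)
    {
        return Directory.GetFiles(directory, pattern)               // string[] of paths
            .OrderByDescending(path => File.GetCreationTime(path))  // newest first
            .FirstOrDefault();                                      // null if empty
    }
}
```

Usage would be something like: string latest = NewestFile.NewestPath(@"c:\mypath", "*.JPG");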

Pete
 
AA2e72E

Thanks. However, I don't think I asked the right question. I'll try again:

Given:

System.IO.DirectoryInfo dir = new System.IO.DirectoryInfo(@"c:\test");
IEnumerable<System.IO.FileInfo> fileList =
    dir.GetFiles("*.txt", System.IO.SearchOption.AllDirectories);

fileList will contain the files in the directory tree under c:\test, and files
with the same name may exist in several places, e.g. c:\test\one\myfile.txt and
c:\test\two\myfile.txt.

I would like to be able to pick the myfile.txt that has the latest creation
time; obviously, when a file exists uniquely, i.e. in only one subdirectory in
the tree, it has the latest creation time by default and will get picked.

I hope I have explained this adequately: thanks for your help.
 
Peter Duniho

Yes, I think that clarifies things a little more. But I'm still not
entirely sure I understand.

Do you already have a specific filename in mind when you execute this
code? Or are you looking to select _all_ of the files in the
enumeration, but only the most recent for any given filename?

That is, is the operation something like "look for any file named X, and
return the one file named X that is the most recent"? Or is it "for each
distinct file name, return only the path of the most recent file with
that name"?

The former seems to me to be best solved simply by providing file name
"X" as the search pattern to the GetFiles() method. Then you can use
the code I posted previously to get the most recent from the beginning
of an ordered enumeration of that search result.
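
For the former case, a minimal sketch might look like the following. The names NamedFileSearch and NewestNamed are hypothetical, and I am assuming you want to search the whole tree (SearchOption.AllDirectories) as in your example.

```csharp
using System.IO;
using System.Linq;

static class NamedFileSearch
{
    // Hypothetical helper: finds every file called fileName under root and
    // returns the full path of the most recently created one, or null.
    public static string NewestNamed(string root, string fileName)
    {
        var query = from file in new DirectoryInfo(root)
                        .GetFiles(fileName, SearchOption.AllDirectories)
                    orderby file.CreationTime descending
                    select file.FullName;

        return query.FirstOrDefault();
    }
}
```

For example, NamedFileSearch.NewestNamed(@"c:\test", "myfile.txt") would consider both c:\test\one\myfile.txt and c:\test\two\myfile.txt.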

The latter is more complicated, and I'm not sure that a good solution
will use only LINQ. You can do something like this:

var query = from file in fileList
            orderby file.Name, file.CreationTime descending
            select file;

And then you can run through the list, picking the first unique name as
you go:

string previousFilename = null;
List<string> latest = new List<string>();

foreach (var file in query)
{
    // The first file seen for each name is the newest, because of the sort order.
    if (previousFilename == null || previousFilename != file.Name)
    {
        latest.Add(file.FullName);
        previousFilename = file.Name;
    }
}

Then at the end, you'll have a list of the paths to each file within
your original directory search, where any given filename appears only
once, and is the path to the file with the most recent creation
timestamp for that given filename.
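
If the sort turns out to be expensive for a large tree, one alternative is a single pass that keeps only the newest FileInfo seen so far for each name. This is just a sketch under a few assumptions: the names LatestPerName and Scan are made up, names are compared case-insensitively (as Windows does), and DirectoryInfo.EnumerateFiles() is available, which requires .NET 4 or later (with GetFiles() the loop is the same, but the whole array is built first).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LatestPerName
{
    // Hypothetical helper: maps each file name to the most recently
    // created file with that name anywhere under root.
    public static Dictionary<string, FileInfo> Scan(string root, string pattern)
    {
        // Assumption: names differing only in case count as the same file name.
        var latest = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);

        foreach (FileInfo file in new DirectoryInfo(root)
                     .EnumerateFiles(pattern, SearchOption.AllDirectories))
        {
            FileInfo current;
            // Keep the file if its name is new, or if it is newer than the one stored.
            if (!latest.TryGetValue(file.Name, out current)
                || file.CreationTimeUtc > current.CreationTimeUtc)
            {
                latest[file.Name] = file;
            }
        }

        return latest;
    }
}
```

The paths are then available as latest.Values, via FullName on each entry.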

If that doesn't answer your question, perhaps you should provide a
specific example of what the input might be (that is, the list of files
after you've called GetFiles()), and what output you want to get.

Pete
 
AA2e72E

Thanks. I am after

'That is, is the operation something like "look for any file named X, return
the one file named X that is the most recent"?'

I am inclined to agree that there may not be a LINQ-only solution, although I
thought Linq87 (Max - Grouped) in the 101 LINQ Samples might provide a suitable
basis for one.

I have a solution, but it is very time consuming: I use DirectoryInfo to
extract the path, file name, and file creation date into three columns, which I
write to a SQL Server table, and then use SQL to get the file names. This
takes just under three hours for 650,000 files spread across 6,200
subfolders. I wanted a LINQ solution to see if I could get it done quicker; I
expected it to be quicker (had it been possible!) as it would avoid the
repeated calls to SQL Server.

Thanks for looking into this.
 
Bobby C. Jones

AA2e72E said:
Thanks. However, I don't think I asked the right question. I'll try again:

Given:

System.IO.DirectoryInfo dir = new System.IO.DirectoryInfo(@"c:\test");
IEnumerable<System.IO.FileInfo> fileList =
    dir.GetFiles("*.txt", System.IO.SearchOption.AllDirectories);

fileList will contain files in the directory tree c:\test and files by the
same name will exist in c:\test\one\myfile.txt and c:\test\two\myfile.txt etc.

I would like to be able to pick the myfile.txt that has the latest creation
time; obviously when a file exists uniquely i.e. in one sub directory in the
tree, it will have the latest creation time (by default) and will get picked.

I hope I have explained this adequately: thanks for your help.

Perhaps something like this

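DirectoryInfo dir = new DirectoryInfo(@"C:\test");

// Sort newest first, then group by name; the first element of each
// group is then the newest file with that name.
// Note: this orders by LastWriteTime; substitute CreationTime if the
// creation timestamp is what matters.
var newestFiles = from file in dir.GetFiles("*.txt", SearchOption.AllDirectories)
                  orderby file.LastWriteTime descending
                  group file by file.Name.ToUpper() into files
                  select files.ElementAt(0);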
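
The query above works with LINQ to Objects because Enumerable.GroupBy yields the elements of each group in the order they appear in the source. If you would rather not depend on that, a variation is to group first and pick the newest file inside each group. This is only a sketch: GroupedNewest and NewestPaths are hypothetical names, and it uses CreationTime to match the original question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class GroupedNewest
{
    // Hypothetical helper: for each distinct file name (ignoring case),
    // returns the full path of the most recently created file.
    public static List<string> NewestPaths(IEnumerable<FileInfo> files)
    {
        var query = from file in files
                    group file by file.Name.ToUpperInvariant() into sameName
                    select sameName
                        .OrderByDescending(f => f.CreationTime) // newest first within the group
                        .First()                                // a group is never empty
                        .FullName;

        return query.ToList();
    }
}
```

It could be called as GroupedNewest.NewestPaths(dir.GetFiles("*.txt", SearchOption.AllDirectories)).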
 
AA2e72E

A 'quick' test seems to indicate that this will work; I'll do some more
testing. Thanks.
 
Peter Duniho

AA2e72E said:
Thanks. I am after

'That is, is the operation something like "look for any file named X, return
the one file named X that is the most recent"?'

I am inclined to agree that there may not be a LINQ only solution although I
thought Linq87 (Max - Grouped) in the 101 Linq samples may provide a suitable
basis for a solution.

As I wrote before, if that's what you are trying to do, the solution is
simple. Just pass the filename as the search pattern for GetFiles(),
and then use the LINQ example I posted first.

Pete
 
