DIR and COPY misbehaving with lots of files

PhilHibbs

I have an XP server with a directory with over half a million files in
it, with long numerical names such as 123456_234567.TIF. I wanted to
get a sample of them, so I ran a command prompt and executed "dir
12345*". It returned a list of files that mostly - but not entirely -
began with "12345". One might have been, for example,
"124679_124680.PCX". I tried "dir 12345*.*" but that did the same.

I then ran a "copy" command to copy the files to another empty
directory on a different drive. The command ran for a while, then
stopped with an "overwrite Yes, No, or All" prompt. The destination
directory was empty at the start, yet it was clearly trying to copy
the file a second time. I replied 'N', and it carried on for a while
and stopped again with the same prompt. After a couple of these I
pressed 'A' because I couldn't wait around and nurse it all night.

Is there a bug with doing directory listings or copying from
directories with a large number of files?

The files are in a TrueCrypt container file on an external USB drive,
might that be a factor?

Phil Hibbs.
 

Pegasus (MVP)

PhilHibbs said:
I have an XP server with a directory with over half a million files in
it, with long numerical names such as 123456_234567.TIF...
Is there a bug with doing directory listings or copying from
directories with a large number of files?

This is not a bug but a limitation in the LFN/SFN naming system
that manifests itself when you have lots of similar file names
whose SFNs are generated by the operating system. Try this
little experiment:
- Click Start / Run / cmd {OK}
- Type these commands:
md \SFNTest {Enter}
cd \SFNTest {Enter}
for /L %a in (1,1,1000) do @echo. > "ABCDEFGH %a.txt" {Enter}
dir abb*.* {Enter}

You will see a number of files, none of which start with the
letters "ABB". Why? Try this command:

dir /x /p {Enter}

Having half a million files in the one folder is IMHO not a good
idea. Not only are you likely to run into LFN/SFN problems but
the performance of your machine will suffer badly. I think that
5,000 files is a reasonable limit.
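In other words, the wildcard is tested against both names. A minimal sketch of that double matching, in Python for illustration (the short names below are hypothetical examples of the hash-style 8.3 names NTFS generates after a few collisions):

```python
from fnmatch import fnmatch

# Each entry pairs a long file name (LFN) with a hypothetical
# auto-generated 8.3 short name (SFN). After a few collisions NTFS
# switches to hash-style SFNs built from the first two characters of
# the name plus four hex digits, so "124679_124680.PCX" could
# plausibly end up as "12345A~1.PCX".
files = [
    ("123451_123452.TIF", "123451~1.TIF"),
    ("124679_124680.PCX", "12345A~1.PCX"),  # hypothetical hash-style SFN
]

def dir_match(pattern, entries):
    """Return the long names whose LFN *or* SFN matches the pattern,
    which is how DIR and COPY evaluate wildcards on Windows."""
    pattern = pattern.lower()
    return [lfn for lfn, sfn in entries
            if fnmatch(lfn.lower(), pattern) or fnmatch(sfn.lower(), pattern)]

print(dir_match("12345*", files))
# Lists both files: the second one matches only through its short name,
# even though its long name does not begin with "12345".
```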
 

PhilHibbs

Pegasus said:
This is not a bug but a limitation in the LFN/SFN naming system...

You say potato, I say potato.
Having half a million files in the one folder is IMHO not a good
idea... I think that 5,000 files is a reasonable limit.

We've been thinking of partitioning the files based on the prefix of
the file name already. If I create subdirectories based on the first
two digits, that should give us 99 directories with just under 6,000
files each. Thanks for the info, nice to know why these things happen.
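That partitioning scheme can be scripted; here is a sketch in Python (the function name is my own, and the two-character prefix matches the naming pattern described above):

```python
import os
import shutil

def partition_by_prefix(src_dir, prefix_len=2):
    """Move every file in src_dir into a subdirectory named after the
    first prefix_len characters of its name, creating the subdirectory
    on demand, e.g. 123456_234567.TIF ends up in subdirectory "12"."""
    for name in os.listdir(src_dir):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories created by earlier iterations
        bucket = os.path.join(src_dir, name[:prefix_len])
        os.makedirs(bucket, exist_ok=True)
        shutil.move(path, os.path.join(bucket, name))
```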

Phil Hibbs.
 

G. Peppard

Pegasus said:
This is not a bug but a limitation in the LFN/SFN naming system
that manifests itself when you have lots of similar file names
whose SFNs are generated by the operating system...
Having half a million files in the one folder is IMHO not a good
idea... I think that 5,000 files is a reasonable limit.


Such "limitations" really should be explicit in the manuals. They are
not. I have wasted a lot of time on this problem, first thinking I was
just not getting it, and then thinking my directory structures might
be corrupted (lots of fun waiting for chkdsk to process 2x250 GB and
1x750 GB). Then I spent a lot of time trying to find documentation.
Finally I found this thread and picked up a couple more search terms
that turned up a few sparse entries from Google.

Any idea where we can look for solutions, workarounds and
documentation? Do any of the command line programs ignore the hidden
autogenerated SFN and just do what we expect them to do with the LFN?

Does the "limitation" apply to xcopy and robocopy as well?

Is there a list of the utilities besides DIR and COPY that apply
wildcard patterns to both SFN and LFN?

What about third-party products? I just downloaded xDir and it does
not seem to be subject to the same problem. At least it does not fail
the simple tests I have applied so far.
 

Pegasus (MVP)

G. Peppard said:
Such "limitations" really should be explicit in the manuals. They are
not...
Any idea where we can look for solutions, workarounds and
documentation? Do any of the command line programs ignore the hidden
autogenerated SFN and just do what we expect them to do with the LFN?

Simple: in the registry, under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem,
set NtfsDisable8dot3NameCreation to 1.
 
G

G. Peppard

Pegasus said:
Simple: HKLM\SYSTEM\CurrentControlSet\Control\FileSystem
set NtfsDisable8dot3NameCreation to 1.
Thanks. I did that, and I see that it will stop future generation of
SFNs. Is there any way to ditch the SFNs that have already been
generated?

Thanks,

Alan
 

Pegasus (MVP)

G. Peppard said:
Thanks. Did that and I see that it will stop future generation of
SFN. Is there any way to ditch the SFNs that have already been
generated?

Equally simple: rename the files, e.g. by temporarily giving them
a new extension and then changing it back:
ren *.txt *.xyz
ren *.xyz *.txt
With 8.3 name creation disabled, the re-created entries no longer
carry short names.
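That round-trip rename can also be scripted; a sketch in Python (the function name and temporary extension are my own, and the SFN-stripping effect only applies on an NTFS volume after 8.3 name creation has been disabled):

```python
import os

def strip_short_names(dirpath, ext=".tif"):
    """Rename each matching file to a temporary name and back again.
    On NTFS, once 8.3-name creation has been disabled in the registry,
    the re-created directory entry no longer carries a short name; on
    other file systems this is just a harmless round-trip rename."""
    for name in os.listdir(dirpath):
        if not name.lower().endswith(ext):
            continue
        src = os.path.join(dirpath, name)
        tmp = src + ".sfn_tmp"
        os.rename(src, tmp)   # like: ren *.tif *.xyz
        os.rename(tmp, src)   # ...and back again
```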
 
