Reliability of FileSystemWatcher over networked storage?

Guest

We have a small farm of servers, with common resource data shared between
them. The resource data is sitting in a moderately complex directory
structure on a shared Network Storage device. Each server has several
processes that care about the resources, each of which monitors the resources
directory using a FileSystemWatcher. They use this monitor to maintain a
lazily-initialized cache of the resources.

This architecture works fine on a single machine, and seems to *generally*
work on multiple machines. But we've noticed that the multiple-machine case
is considerably less stable in a couple of respects:

First, we get a lot more FileSystemWatcher.Error events being raised. I
haven't ever managed to cause this buffer overflow to happen on a local
machine, but it seems to happen occasionally in the networked environment,
sometimes without apparent cause. My best guess is that the
FileSystemWatcher is intolerant of network glitches, and that these tend to
lead to Errors, but that's just a guess.

More seriously, we now seem to have gotten into a situation where the file
watching is just plain dead. On some of the servers, some of the processes
apparently aren't seeing changes at all any more. It's hard to pin down
exactly why: the processes have been running for months, and there was
definitely some network glitchiness in the middle there. But we were
disturbed to find that the FileSystemWatcher has gone totally silent on us:
no reports, no errors, no nothing.

So this is a general request for insights. Has anyone been using
FileSystemWatcher in this sort of hardcore way, with lots of processes
examining a network storage? If so, have you observed any reliability
issues? In general, does anyone have any *deep* information on exactly how
the FileSystemWatcher works under the hood? I know about the underlying
Windows calls -- I'm talking about below *those*, trying to understand how
the servers are communicating with each other, and particularly how they deal
with communication interruptions, so we can better understand what we need to
be watching for, and whether FileSystemWatcher is really appropriate for the
situation.

(Oh, and note that this is all with .NET Framework 1.1. We'll be upgrading
to 2.0 eventually, but not until VS 2005 is officially released and we're at
a convenient point in a release cycle. I suspect that that isn't relevant to
this problem, but I mention it in case anything has changed...)
 
Peter Huang [MSFT]

Hi

If you want to use FileSystemWatcher to monitor a potentially large volume
of directory changes on a network share, you need to handle the Error
event, which is raised when the internal buffer overflows and when other
network or I/O errors occur. For more information on the Error event, please
see the documentation reference below:
FileSystemWatcher.Error Event (.NET Framework Class Library)
<http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemiofilesystemwatcherclasserrortopic.asp>

To keep the buffer from overflowing, use the FileSystemWatcher.NotifyFilter
and FileSystemWatcher.IncludeSubdirectories properties to filter out
unwanted change notifications. You can also increase the size of the
internal buffer through the FileSystemWatcher.InternalBufferSize property.
However, when you use FileSystemWatcher (ReadDirectoryChangesW) on a
network path, the buffer size is limited by the SMB packet size, which in
our current implementation is 64K.

FileSystemWatcher internally calls CreateFile() to get a handle to the
directory being watched and then calls ReadDirectoryChangesW() to monitor
the changes. When there is a network disruption, that handle is likely to
become invalid. This means that once there has been a network disconnection,
your application may no longer receive directory changes on the same file
system handle.
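
As a rough illustration of the filtering and buffer settings described
above, here is a minimal C# sketch. The class name WatcherSetup, the choice
of notify filters, and the 64 KB buffer are assumptions made for the
example, not recommendations for any particular workload; it also uses
newer C# syntax than .NET 1.1 supports.

```csharp
using System;
using System.IO;

// Hypothetical helper; the name and the specific settings are illustrative.
static class WatcherSetup
{
    // Builds a watcher that reports only the change types a cache is
    // likely to care about. The caller still has to attach handlers and
    // set EnableRaisingEvents = true.
    public static FileSystemWatcher Create(string directory)
    {
        var watcher = new FileSystemWatcher(directory);

        // Fewer notification types means fewer entries in the internal buffer.
        watcher.NotifyFilter = NotifyFilters.FileName
                             | NotifyFilters.DirectoryName
                             | NotifyFilters.LastWrite;

        // Set this to false if only the top-level directory matters.
        watcher.IncludeSubdirectories = true;

        // The default is 8 KB. 64 KB matches the network-path limit
        // mentioned above, so going higher is unlikely to help on a share.
        watcher.InternalBufferSize = 64 * 1024;

        return watcher;
    }
}
```

A larger buffer only delays an overflow; the Error event still has to be
handled.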

Speaking of network I/O, there is another limitation: the maximum number
of concurrent outstanding network requests between a Server Message Block
(SMB) client and server. Please refer to the following article, which has
more details on this:
Q271148 MaxMpxCt and MaxCmds Limits in Windows 2000
<http://support.microsoft.com/?id=271148>

In essence, any application that monitors directory changes should be
designed to address all of the above. We recommend monitoring the
directories locally, as this eliminates the complexity caused by network
disconnections and by the SMB packet size and outstanding request limits.

You may also want to take a look at the link below. Try restarting the
FileSystemWatcher after an error occurs.
http://groups.google.com/group/microsoft.public.dotnet.framework/browse_thread/thread/16a17580d467b4b5/9f7e3d088c77e7eb?lnk=st&q=FileSystemWatcher+%22Network+Share%22&rnum=2&hl=zh-CN#9f7e3d088c77e7eb
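
One way the restart-after-error idea might look in C# is sketched below.
The RestartingWatcher class, its retry count and delay, and the
onRestarted callback are all hypothetical names and choices made for this
example. The sketch assumes it is acceptable to block the thread that
raised the Error event while retrying, and that disposing the watcher from
inside its own Error handler is safe in your runtime.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical wrapper; the name and retry policy are illustrative assumptions.
sealed class RestartingWatcher : IDisposable
{
    private readonly string _path;
    private readonly FileSystemEventHandler _onChange;
    private readonly Action _onRestarted;
    private readonly object _gate = new object();
    private FileSystemWatcher _watcher;

    public int RestartCount { get; private set; }

    public RestartingWatcher(string path, FileSystemEventHandler onChange,
                             Action onRestarted)
    {
        _path = path;
        _onChange = onChange;
        _onRestarted = onRestarted;
        _watcher = CreateWatcher();
    }

    // Opens a fresh directory handle by constructing a new watcher.
    private FileSystemWatcher CreateWatcher()
    {
        var w = new FileSystemWatcher(_path);
        w.Changed += _onChange;
        w.Created += _onChange;
        w.Deleted += _onChange;
        w.Renamed += (s, e) => _onChange(s, e);
        w.Error += OnError;
        w.EnableRaisingEvents = true;
        return w;
    }

    private void OnError(object sender, ErrorEventArgs e)
    {
        // e.GetException() is an InternalBufferOverflowException on overflow;
        // other exception types indicate I/O or network failures.
        Restart();
    }

    // Discards the old watcher and tries to create a new one.
    // Returns false if the directory stayed unreachable for every attempt.
    public bool Restart(int maxAttempts = 5, int delayMs = 1000)
    {
        lock (_gate)
        {
            _watcher.Dispose();
            for (int attempt = 1; attempt <= maxAttempts; attempt++)
            {
                FileSystemWatcher fresh;
                try
                {
                    fresh = CreateWatcher();
                }
                catch (Exception ex) when (ex is ArgumentException || ex is IOException)
                {
                    Thread.Sleep(delayMs); // share not reachable yet
                    continue;
                }
                _watcher = fresh;
                RestartCount++;
                // Changes made during the outage were never reported, so let
                // the owner drop or rebuild whatever it cached.
                _onRestarted();
                return true;
            }
            return false;
        }
    }

    public void Dispose()
    {
        lock (_gate) { _watcher.Dispose(); }
    }
}
```

Restarting only helps when an Error event is actually raised. It does not
cover the case where the watcher goes silent without reporting anything,
so a caller may still want some independent check that changes are
arriving.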

Best regards,

Peter Huang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.
 
Guest

Okay, fine. I knew about the Error event, and we're handling it in at least
a rudimentary way, but the info about the packet limits is new, and fits in
with what I had been hypothesizing. I'd already begun to suspect that I need
to be doing the monitoring locally and implementing our own scheme for
publishing the changes; this seems to confirm that that's the best course of
action. Thanks...
 
Peter Huang [MSFT]

Hi

Thanks for your quick reply!
You're welcome!

Best regards,

Peter Huang
Microsoft Online Partner Support

 
