Windows Service works for years/months/weeks, then chokes

Trevor

This is driving us mad - please help!

Back in 2003, I coded a Windows Service in VB.NET for framework v1.1.4322.
I deployed it in Nov. 2003, and it worked fine until the end of May 2005,
when it choked (see below). We restarted it, and it worked fine for another
7 months until it choked again at the end of Dec. 2005. It has now failed
again (mid Feb. 2006). So it's failing more frequently now?

OnStart, the service reads a number of settings from the configuration file.
Among those settings are:
- a path for a FileSystemWatcher
- a time when the file is expected to arrive, used to set a timer

FileSystemWatcher is set to watch the specified path for the creation
(arrival) of a *.CSV file from another system. When it arrives, its
contents are read and loaded into a database. In return, other data is
collected from the database and put in an output file (*.OUT). Then, a
global variable called LastExecuted is set to Now(). Declaration is at the
top of the code: Private LastExecuted As Date.

If the time value is 02:00, then we expect file arrival at 2:00AM.
Therefore, the timer is set to the millisecond interval between now and the
next 3:00AM that comes around, i.e. one hour after the expected arrival
(whether that's today or tomorrow, depending on the current time). On
Timer_Elapsed, I check that LastExecuted is within the last 75 minutes. If
not, the file failed to arrive today, so I make an entry in the EventLog and
SmtpMail.Send an error message. Then I re-read the configuration file (in
case the expected arrival time was changed) and reset and restart the Timer
(which typically results in a 24-hour period).

When the process chokes, this is what happens:
- the FileSystemWatcher worked as expected that day.
- the timer interval suddenly changes from 24 hours to less than a second.
Therefore, 75 minutes after the file was processed (75 minutes after
LastExecuted) several EventLog entries and e-mails are generated EVERY
SECOND.
- the process fails to update itself from the configuration file, even
though the call to that function is the next line of code to execute after
the SmtpMail.Send.

Result: By 8AM, when people arrive for work, I get a call saying that the
server has sent them 5000+ error e-mails, and to please make it stop. I
restart the service, and everything is fine. I even have a debug mode which
tells me what the timer calculates for its intervals and what that
translates into in hours. The values are always correct, until it's been
running for a while. Then the thing chokes.

Did something change in the framework around May 2005? Am I not doing some
necessary memory cleanup (I have Dim MyLog As New EventLog in every
function/sub - do I need to release that)? The Timer and LastExecuted ar
global variable used only once, right - there's nothing to clean up there,
is there? I don't know if this is relevant, but there is another Timer in
my code - one that starts when the FileSystemWatcher is triggered, waits 20
seconds until FTP is done transferring the file, then stops. I can't image
that interfereing - or is it? There's nothing wrong with my interval
calculation, is there?

Private Function ProperTimerInterval() As Double
    Dim MyLog As New EventLog
    MyLog.Source = "MyCompany"

    Dim NextCheckTime As Date
    If CInt(Time.Text.Substring(0, 2)) < CInt(Date.Now.Hour) Then
        NextCheckTime = Date.Parse(Date.Now.AddDays(1).Date & " " & Time.Text)
    Else
        NextCheckTime = Date.Parse(Date.Now.Date & " " & Time.Text)
    End If
    NextCheckTime = NextCheckTime.AddHours(1)

    Dim IntervalToReturn As Double = _
        NextCheckTime.Subtract(Date.Now).TotalMilliseconds

    If Debug.Text.ToLower = "true" Then
        MyLog.WriteEntry("The daily timer is set to expire in " & _
            IntervalToReturn & " milliseconds, which is " & _
            IntervalToReturn / 1000 / 60 / 60 & " hours.", _
            EventLogEntryType.Information)
    End If

    Return IntervalToReturn
End Function
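
For comparison, here is a sketch of the same calculation that compares complete timestamps rather than just the hour. In the function above, a call made fractionally before the target (say 02:59:59.950 with a 02:00 setting) still selects today's 03:00 and returns an interval of a few milliseconds. Whether that ever happens in the service is not established here. NextCheckInterval, its parameters and the one-minute floor are all hypothetical, chosen only for illustration.

```vbnet
Imports System

Module IntervalSketch
    ' Hypothetical helper, not the service's actual code.
    ' currentTime:     the moment the calculation is made.
    ' expectedArrival: time of day the file is expected (e.g. 02:00).
    ' graceHours:      how long after arrival the check runs (1 in the post).
    ' minimumMs:       floor on the result, so a call made just before the
    '                  target can never yield a sub-second interval
    '                  (the value used below is an arbitrary assumption).
    Function NextCheckInterval(ByVal currentTime As Date, _
                               ByVal expectedArrival As TimeSpan, _
                               ByVal graceHours As Double, _
                               ByVal minimumMs As Double) As Double
        ' Today's check time, e.g. today at 03:00 for a 02:00 arrival.
        Dim nextCheck As Date = _
            currentTime.Date.Add(expectedArrival).AddHours(graceHours)

        ' Compare full timestamps: if today's check time is not strictly
        ' in the future, use tomorrow's.
        If nextCheck <= currentTime Then
            nextCheck = nextCheck.AddDays(1)
        End If

        Dim interval As Double = _
            nextCheck.Subtract(currentTime).TotalMilliseconds

        ' Never return less than the floor.
        Return Math.Max(interval, minimumMs)
    End Function

    Sub Main()
        Dim arrival As TimeSpan = TimeSpan.Parse("02:00")

        ' 50 ms before the 03:00 target: the floor applies.
        Dim justBefore As Date = New Date(2006, 2, 15, 2, 59, 59, 950)
        Console.WriteLine(NextCheckInterval(justBefore, arrival, 1, 60000))

        ' 10 ms after the 03:00 target: roughly 24 hours.
        Dim justAfter As Date = New Date(2006, 2, 15, 3, 0, 0, 10)
        Console.WriteLine(NextCheckInterval(justAfter, arrival, 1, 60000))
    End Sub
End Module
```

With the floor in place, an early call simply schedules one more check a minute later, and that check then picks tomorrow's time.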
 
tomb

This sounds like a very complex app - and it sounds nicely done. The
difficulty with something like this is that you are relying on an
outside source to provide a file.

How often does the file not arrive as expected? Is the expected arrival
time always modified? If it is not modified, what is the result within
the application? Does it affect the timer interval?
This may be a shot in the dark, but it's all I can think of.

Tom
 
Stephany Young

You say that once the FileSystemWatcher is triggered, you wait another 20
seconds until FTP finishes transferring the file.

The types of things I would be looking at include:

- How do you determine if the inwards FTP operation has completed?

- What size is a 'normal' inward file and how long does it normally take
to transfer?

- What should happen if the inwards transfer does not complete within
20 seconds?

- On the days it 'choked', was the transferred file significantly larger
than normal?

- On the days it 'choked', what does the FTP log show?
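
On the first of those questions, one common alternative to a fixed 20-second wait is to poll until the file can be opened exclusively. This is a minimal sketch, and it assumes the FTP server keeps the file open for writing until the transfer finishes. Not every server does, so that is worth verifying. CanOpenExclusively and WaitUntilReady are hypothetical names, and the timeout and polling values are arbitrary.

```vbnet
Imports System
Imports System.IO
Imports System.Threading

Module FileReadySketch
    ' Returns True if the file can be opened with exclusive access right now.
    ' An IOException (sharing violation, or file not found) is taken to mean
    ' "not ready yet". Other exceptions, such as access denied, propagate.
    Function CanOpenExclusively(ByVal filePath As String) As Boolean
        Try
            Dim stream As New FileStream(filePath, FileMode.Open, _
                                         FileAccess.Read, FileShare.None)
            stream.Close()
            Return True
        Catch ex As IOException
            Return False
        End Try
    End Function

    ' Polls until the file can be opened exclusively or the timeout passes.
    ' Returns False on timeout so the caller can log and report the problem.
    Function WaitUntilReady(ByVal filePath As String, _
                            ByVal timeout As TimeSpan, _
                            ByVal pollInterval As TimeSpan) As Boolean
        Dim deadline As Date = Date.UtcNow.Add(timeout)
        Do
            If CanOpenExclusively(filePath) Then Return True
            Thread.Sleep(pollInterval)
        Loop While Date.UtcNow < deadline
        Return False
    End Function

    Sub Main()
        ' Demonstration with a temporary file that nothing else has open.
        Dim sample As String = Path.GetTempFileName()
        Console.WriteLine(WaitUntilReady(sample, TimeSpan.FromSeconds(5), _
                                         TimeSpan.FromMilliseconds(250)))
        File.Delete(sample)
    End Sub
End Module
```

A timeout result also gives a defined answer to the third question: the transfer did not complete in time, so report it rather than process a partial file.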

I get the impression that this is a file that normally arrives sometime
during the early hours of the morning and is processed ready for the day
ahead. If this is the case, then what is the latest time the file could
arrive before one could assume that it is not going to arrive today, and how
long does the processing of the file take? Also, is it a requirement that
the processing of the file is finished before a certain time?

If the latest time for arrival was within a relatively short time of the
normal arrival time (say up to 1 hour after 2:00 AM) and the processing time
was relatively short (say 1 hour at the most), I would be inclined to use
Task Scheduler rather than a service. With the job scheduled for, say,
4:00 AM, it would be all over by 5:00 AM.
The job would simply check to see if the file exists, process it if it does
and report appropriately if it doesn't.
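
A rough sketch of what such a scheduled job could look like as a console program. The folder argument, the exit codes and the ProcessFile and ReportMissingFile routines are hypothetical placeholders. The real versions would do the database load and the EventLog/e-mail reporting described in the original post.

```vbnet
Imports System
Imports System.IO

Module ScheduledJobSketch
    ' Exit codes (an assumption for this sketch):
    ' 0 = files processed, 1 = no file found, 2 = bad arguments.
    Function Main(ByVal args() As String) As Integer
        If args.Length < 1 OrElse Not Directory.Exists(args(0)) Then
            Console.Error.WriteLine("Usage: job <existing-folder>")
            Return 2
        End If

        ' Look for the inbound CSV files in the folder given on the command line.
        Dim csvPaths() As String = Directory.GetFiles(args(0), "*.csv")

        If csvPaths.Length = 0 Then
            ReportMissingFile(args(0))
            Return 1
        End If

        For Each csvPath As String In csvPaths
            ProcessFile(csvPath)
        Next
        Return 0
    End Function

    ' Placeholder: the real job would load the file into the database
    ' and write the *.OUT file.
    Sub ProcessFile(ByVal csvPath As String)
        Console.WriteLine("Processing " & csvPath)
    End Sub

    ' Placeholder: the real job would write to the EventLog and send mail.
    Sub ReportMissingFile(ByVal folder As String)
        Console.WriteLine("No CSV file found in " & folder)
    End Sub
End Module
```

Because the process starts fresh each day and exits when done, there is no long-lived timer state to go wrong.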
 
Rick

My guess is that you have an overflow error in a variable used in the
timer. Older versions of Windows had a bug where the machine would hang
after running continuously for a long stretch (Windows 95/98 after roughly
49.7 days, as I recall). A 32-bit integer tracked the number of
milliseconds since the last reboot and eventually overflowed. If
you have something that tracks the milliseconds since startup, you may
be facing a similar issue.
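
As a rough check on the overflow idea, here is the arithmetic for a counter that holds milliseconds in 32 bits. That is an assumption: the thread does not establish which counter, if any, the service depends on. An unsigned 32-bit counter wraps after about 49.7 days and a signed one after about 24.9 days. A 24-hour interval, by contrast, is far below the signed 32-bit limit.

```vbnet
Imports System

Module OverflowSketch
    Const MillisecondsPerDay As Double = 24 * 60 * 60 * 1000

    Sub Main()
        ' Days until an unsigned 32-bit millisecond counter wraps.
        Console.WriteLine(CDbl(UInt32.MaxValue) / MillisecondsPerDay)

        ' Days until a signed 32-bit millisecond counter overflows
        ' (Environment.TickCount is a signed 32-bit value of this kind).
        Console.WriteLine(CDbl(Int32.MaxValue) / MillisecondsPerDay)

        ' A 24-hour timer interval in milliseconds, compared with the
        ' signed 32-bit maximum.
        Dim daily As Double = TimeSpan.FromHours(24).TotalMilliseconds
        Console.WriteLine(daily <= Int32.MaxValue)
    End Sub
End Module
```

So the daily interval itself should not overflow. Any elapsed-time arithmetic built on a 32-bit tick count would be the place to look.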
 
