I wrote a similar service for an enterprise application that dealt with the
same kinds of problems you are facing here: power outages, lost
connections, etc. Here are a few things to keep in mind while you design
your service:
1) Make your service fail-safe: if the network is not available, it should
just idle and periodically check for network availability. Your service
should not "hang" or "crash" if the network goes down. This means wrapping
all of your network calls in error handlers so that you never use a socket
that is disconnected or in an invalid state.
2) Logs are really useful. If possible, log all activity to a local
database: for example, service start time, service shutdown time, and, if
the network goes up and down, the time it became unavailable and the time
it was detected up again. Log messages are extremely useful when you have
to trace a problem that lies not in your code or design but in the network
connected to the server.
3) Assign a priority to log entries, such as Notice, Caution, and Critical,
depending on the kinds of problems you foresee encountering during operation
of the server. In the product I worked on, the logs were available through a
web interface and color coded accordingly, so it was very easy to go
straight to a critical entry and see what happened leading up to it.
4) If your service is going to be multithreaded, you add another dimension
of complexity. This is not a bad thing, but you must make sure that your
service's main thread is able to recycle threads that crash or terminate
unexpectedly. You should be able to gracefully release the memory held by a
crashed thread and respawn it to continue what it had been doing previously.
This is not trivial to do.
5) Lastly, don't leave any connections open in an environment like this.
A crash caused by a power outage may corrupt the files or databases you
are writing to when it happens. So when making entries into a database or
file, if it is not a performance hit, go with the open->write->close
sequence and try to make each transaction as atomic as it can be.
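To make points 1, 2, 3 and 5 a bit more concrete, here is a minimal Python
sketch. It is not the code from the product I described. It assumes a local
SQLite file as the log database, and every name in it (log_event,
network_available, watch_network, the log table, the numeric priority values)
is made up for illustration. "Network available" is approximated here by a
TCP connection to a host and port of your choosing, which may or may not
match what your service needs.

```python
import socket
import sqlite3
import time

# Priority names come from the post; the numeric values are an assumption.
NOTICE, CAUTION, CRITICAL = 0, 1, 2


def log_event(db_path, priority, message):
    """Open the log database, write one entry as one transaction, close it."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "CREATE TABLE IF NOT EXISTS log "
                "(ts REAL, priority INTEGER, message TEXT)"
            )
            conn.execute(
                "INSERT INTO log (ts, priority, message) VALUES (?, ?, ?)",
                (time.time(), priority, message),
            )
    finally:
        conn.close()  # never leave the connection open between writes


def network_available(host, port, timeout=3.0):
    """Return True if a TCP connection succeeds; never raise on network errors."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


def watch_network(db_path, host, port, checks, interval=5.0,
                  probe=network_available):
    """Poll the network `checks` times and log every up/down transition."""
    was_up = None  # unknown until the first probe
    for _ in range(checks):
        is_up = probe(host, port)
        if is_up != was_up:
            if is_up:
                log_event(db_path, NOTICE, "network available")
            else:
                log_event(db_path, CRITICAL, "network unavailable")
            was_up = is_up
        time.sleep(interval)  # idle instead of hanging or crashing
```

A real service would loop forever rather than a fixed number of times. The
`checks` and `probe` parameters are only there so the loop can be exercised
without a real network.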
The project I worked on ran a multithreaded service in a fairly hostile
environment. It received data from one end, processed it, and sent it to
another end point. It had to detect whether the data received was valid
(not garbled) and whether the receiving system was available (not crashed);
if not, it had to queue the data until the system became available (each
transaction carried a $$$ amount, so no transactions could be lost). In
addition, the locations were often hostile at the very least, and prone to
lightning strikes and other hazards. Go figure: systems running in closets
at air force bases.
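The "queue the data until the other end is available" idea can be sketched
roughly like this in Python. Again, this is an illustration under
assumptions, not what the project actually ran: the outbox table, the
function names, and the use of SQLite for the on-disk queue are all
hypothetical, and `send` stands for whatever function delivers one payload
and raises OSError when the receiver is unreachable.

```python
import sqlite3
from contextlib import closing


def _connect(db_path):
    """Open the queue database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(db_path)
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS outbox "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB NOT NULL)"
        )
    return conn


def enqueue(db_path, payload):
    """Store one payload on disk before any attempt to send it."""
    with closing(_connect(db_path)) as conn, conn:
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))


def pending(db_path):
    """Return how many payloads are still waiting to be delivered."""
    with closing(_connect(db_path)) as conn:
        return conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]


def flush(db_path, send):
    """Send queued payloads oldest first; stop at the first failure.

    A payload is deleted only after `send` returns without raising, so a
    crash or outage leaves it in the queue for the next attempt.
    """
    delivered = 0
    with closing(_connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT id, payload FROM outbox ORDER BY id"
        ).fetchall()
        for row_id, payload in rows:
            try:
                send(payload)
            except OSError:
                break  # receiver unavailable: keep the rest queued
            with conn:  # one small transaction per delivered payload
                conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            delivered += 1
    return delivered
```

One consequence of deleting only after a successful send: if the service
dies between the send and the delete, that payload goes out again on the
next flush. So the receiving side needs some way to recognize duplicates,
for example a transaction ID.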
This is really long, but I hope it will give you a few things to consider as
you design your system.
Alex