Forcing Resource Hog To Play Nice

Smithers

I'm writing a console app that will run periodically on busy Web
servers. The app is responsible for backing up files uploaded to the
Web server (copying the files out to a backup server). The Windows Task
Scheduler will run the utility at least once per day. This backup
utility zips files and goes through hundreds of directories as it does its
job. While I don't have objective performance data yet, there is every
reason to believe that this utility could be a resource hog while it's
zipping files, etc.

I am considering putting a call to System.Threading.Thread.Sleep() in a few
of the loops (e.g., between each Web site)... perhaps causing it to sleep
for a second or two here and there. My thinking is that this would let the
Web server perform live production tasks with a substantially reduced impact
from this backup utility.

What do you think? Is that a reasonable way to force the utility to play
nice? Or are there other, perhaps better, alternatives, like reducing
its priority?

Thanks.
 
Peter Duniho

Smithers said:
[...]
What do you think? Is that a reasonable way to force the utility to play
nice? Or are there other, perhaps better, alternatives, like reducing
its priority?

IMHO, adjusting the priority is a better way to manage it.
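For illustration, here is a minimal sketch of what lowering the priority could look like in C#. The class name BackupPriority and the choice of BelowNormal are assumptions made for the example, not something specified in this thread; Process.PriorityClass and Thread.Priority are the standard .NET properties for this.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for the backup utility (name is illustrative only).
public static class BackupPriority
{
    // Lowers the CPU scheduling priority of the current process and of the
    // calling thread. This affects CPU scheduling only, not disk or network i/o.
    public static void Lower()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            // Assumption: BelowNormal is low enough; Idle is the more drastic option.
            self.PriorityClass = ProcessPriorityClass.BelowNormal;
        }

        Thread.CurrentThread.Priority = ThreadPriority.BelowNormal;
    }
}
```

Calling BackupPriority.Lower() once at the start of Main would be enough for a single-threaded utility; any worker threads created later would need their own Priority set.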

In some cases, using Sleep() or similar would make sense. But I think
the interval should be at least a second for a mechanism like that to be
worth bothering with. That is, an interval so long that basic
priority management alone would still let your code execute too often.

One thing to consider is that i/o is likely to be a significant amount
of your "processing". Simply adjusting the thread priority may not
affect your application's impact on system performance that much. The
calculations needed for compressing files are easily handled as compared
to the time it takes to move the data around.

However, if you are deploying this on Vista, it includes the new ability
to adjust your i/o priority in addition to your CPU priority. If you're
not currently planning to deploy on Vista (or the server equivalent...I
forget what they're calling that), this might be a reason to change your
mind about that. :)

Pete
 
Michael A. Covington

However, if you are deploying this on Vista, it includes the new ability
to adjust your i/o priority in addition to your CPU priority. If you're
not currently planning to deploy on Vista (or the server equivalent...I
forget what they're calling that), this might be a reason to change your
mind about that. :)

Hear, hear! This is one of the best things about Vista -- it at least
provides the *possibility* for things like virus checkers to *stay out of
the way*!
 
Smithers

Hmmm, hadn't thought about CPU vs file I/O across the wire. I think the backup
I/O goes through a different NIC than the traffic going out to the public
Internet (it better be!). Not sure if that would make much difference. Also,
this is going onto Windows Server 2003, where it will need to run for at least a
couple of years (guaranteed we're NOT upgrading to the latest Windows Server OS
until it's been out for a while).

So, in your opinion, it sounds like Sleep() for 2 seconds or so would be a
perfectly reasonable thing to do in my scenario. That, coupled with low
thread priority. Anything wrong with that approach?

FWIW, this backup process can take its sweet time (within reason) - it's
not like we need it to complete within x minutes, so if sleeping adds a
total of 15 minutes to the process that's fine (and at 2 seconds per sleep,
there's no way we'd ever approach 15 minutes), depending of course on where
I place the Sleep() calls.

-S


 
Peter Duniho

Smithers said:
Hmmm, hadn't thought about CPU vs file I/O across the wire. I think the backup
I/O goes through a different NIC than the traffic going out to the public
Internet (it better be!).

One hopes so. However, your network i/o isn't the only issue (and in
fact, it's not the one I was thinking of...I did assume that you
wouldn't use the same network adapter for the backup as for the web
server itself).

Presumably the web server will need to access the disk that is being
backed up (at least it sounded like that from your description), and if
the backup is reading from the disk when the web server is trying to serve
up some data to a client, there's a conflict that changing your i/o
priority could help with.

Smithers said:
Not sure if that would make much difference. Also, this is
going onto Windows Server 2003, where it will need to run for at least a
couple of years (guaranteed we're NOT upgrading to the latest Windows Server OS
until it's been out for a while).

In that case, the thread i/o priority is obviously a non-starter. :)

Smithers said:
So, in your opinion, it sounds like Sleep() for 2 seconds or so would be a
perfectly reasonable thing to do in my scenario. That, coupled with low
thread priority. Anything wrong with that approach?

Nothing, really. Absent a way to control i/o priority, I'm not sure
there's any other practical way, and that method is basically fine.

Your biggest issue doing it that way will be to ensure not only that
the interval is long enough (seems like 2 seconds ought to be), but also
that the time you spend executing is short. With a 2 second interval,
if you spend 500 ms processing, then you're still using as much as
25% of the system resources you would use otherwise. If you want the
process to be truly "background", I might aim for 50 ms or less.
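As a rough sketch of the "short burst, long sleep" idea, the loop below runs small work items until a time budget is used up, then sleeps. ThrottledWorker, its parameters, and the notion of a "work item" are all hypothetical names chosen for the example; how the real utility splits its work is not described in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (illustrative only).
public static class ThrottledWorker
{
    // Runs the work items in bursts. Before each item, checks whether the
    // current burst has already used up 'budget'; if so, sleeps for 'pause'
    // and starts a new burst. Returns the number of sleeps taken.
    // Note: the budget is only checked between items, so a single slow item
    // can still overrun it.
    public static int Run(IEnumerable<Action> workItems, TimeSpan budget, TimeSpan pause)
    {
        int sleeps = 0;
        Stopwatch burst = Stopwatch.StartNew();

        foreach (Action item in workItems)
        {
            if (burst.Elapsed >= budget)
            {
                Thread.Sleep(pause);
                sleeps++;
                burst.Restart();
            }

            item();
        }

        return sleeps;
    }
}
```

With the figures discussed here, the call would look something like ThrottledWorker.Run(items, TimeSpan.FromMilliseconds(50), TimeSpan.FromSeconds(2)). The smaller each work item is, the closer the actual burst length stays to the budget.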

Heck, for that matter, you might just break your work into single i/o
calls. When you wake up from the sleep timer, submit a new i/o request,
process the previous one, then sleep again. That way, you might even be
able to keep your processing within a single timeslice, depending on
what exactly you're doing. A thread consuming a single timeslice once
every two seconds ought to be practically unnoticeable, and doing it
that way has the added advantage that practically all of your i/o should
happen while you're sleeping anyway. No blocking on the i/o necessary,
since your thread's asleep anyway. :)
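One way the wake, submit, process, sleep cycle might be sketched in C# is shown below. It assumes a modern .NET runtime with Stream.ReadAsync (which postdates the Windows Server 2003 era discussed here) and uses GZipStream as a stand-in for whatever compression the utility really uses. PipelinedBackup and CopyCompressed are made-up names.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (illustrative only).
public static class PipelinedBackup
{
    // Copies 'source' to 'destination' as gzip data, one block per wake-up.
    // The read for the next block is already in flight while the thread sleeps.
    // Returns the number of uncompressed bytes copied.
    public static long CopyCompressed(Stream source, Stream destination,
                                      int blockSize, TimeSpan pause)
    {
        long total = 0;
        byte[] current = new byte[blockSize];
        byte[] next = new byte[blockSize];

        // leaveOpen: true so the caller keeps ownership of 'destination'.
        using (var gzip = new GZipStream(destination, CompressionMode.Compress, true))
        {
            // Submit the first read before the first sleep.
            Task<int> pending = source.ReadAsync(current, 0, current.Length);

            while (true)
            {
                Thread.Sleep(pause);           // the i/o can proceed while we sleep
                int read = pending.Result;     // usually complete by now
                if (read == 0) break;          // end of stream

                // Submit the next request into the other buffer...
                pending = source.ReadAsync(next, 0, next.Length);

                // ...then process the block that has already arrived.
                gzip.Write(current, 0, read);
                total += read;

                // Swap buffers so 'current' is the one the pending read fills.
                byte[] swap = current;
                current = next;
                next = swap;
            }
        }

        return total;
    }
}
```

The two buffers are what make it safe to process one block while the next read is outstanding. The write side is still synchronous in this sketch, so a slow network destination would lengthen each burst.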

What constitutes an "i/o request" depends on how you've structured the
program of course. You said you're doing some compression, so depending
on how you've broken that work up, it might be something like reading a
block of data, compressing it, then writing it over the network. In
that case, you might find it most useful to order the operations so that
when your thread wakes up, there's a block of data ready for it to
process, so the last thing it would do before calling sleep would be to
read the next block to process.

Smithers said:
FWIW, this backup process can take its sweet time (within reason) - it's
not like we need it to complete within x minutes, so if sleeping adds a
total of 15 minutes to the process that's fine (and at 2 seconds per sleep,
there's no way we'd ever approach 15 minutes), depending of course on where
I place the Sleep() calls.

At 2 seconds per sleep, it would add 15 minutes if you called Sleep()
450 times. How many times you actually call Sleep() depends of course
on how you determine the intervals in your processing to call it. But
I'd guess that if you keep the processing time per interval very short,
you could easily wind up calling Sleep() 450 times or even more.

The quick-and-dirty way to figure this out, of course, is to see how
long the process would take without any delay, calculate a non-idle
percentage based on the sleep interval and your target processing time
per interval, and then divide the no-delay time by that percentage.

So, if the process takes 5 minutes normally, you are sleeping for 2
seconds, and you limit each interval to 50 ms of processing time, that's
a 2.5% non-idle time, which gives you a total processing time of about
200 minutes (5 minutes divided by 0.025). It's not an exact calculation,
because it ignores the difference between having to wait for i/o and not,
among other things. But it should be a good ballpark number.
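The same back-of-the-envelope arithmetic can be written down as a small helper. The names below are invented for the example, and the formula follows the simplification used in this post (processing time divided by the sleep interval, ignoring i/o waits).

```csharp
using System;

// Hypothetical helper (illustrative only).
public static class ThrottleEstimate
{
    // Fraction of each interval spent working, e.g. 50 ms per 2000 ms = 0.025.
    public static double NonIdleFraction(double workMs, double intervalMs)
    {
        return workMs / intervalMs;
    }

    // Estimated total run time: the no-delay time divided by the non-idle fraction.
    public static double ThrottledMinutes(double noDelayMinutes, double workMs, double intervalMs)
    {
        return noDelayMinutes / NonIdleFraction(workMs, intervalMs);
    }

    // Minutes added by the sleeps alone, e.g. 450 sleeps of 2 seconds each.
    public static double AddedMinutes(int sleepCalls, double sleepSeconds)
    {
        return sleepCalls * sleepSeconds / 60.0;
    }
}
```

For a 5 minute job throttled to 50 ms of work per 2 second interval, ThrottledMinutes(5, 50, 2000) comes out at roughly 200 minutes, and AddedMinutes(450, 2) gives the 15 minutes mentioned earlier.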

If it's literally true that you don't need the process to complete in
any specific amount of time, that sort of inflation might be fine.
Otherwise, you're going to have to compromise between getting the
process done fast enough and minimizing its effect on web server
performance.

All that said, you don't say what sort of computer this is running on,
but if you've got (for example) an 8-core box with a huge striped RAID
array with 10,000 RPM disks, all of this might just be a moot point.
With enough performance to spare, it's pointless to waste developer time
on trying to minimize the effects of something like this that isn't
going to run but once a day and for a very short period of time.

Pete
 
Smithers

Thanks for the additional discussion... much appreciated! I'll be adding
many more test scenarios before finalizing the Sleep() strategy.

Part of the difficulty here is predicting how much data will be
backed up on any given run of the utility, as it is being designed to back
up (and possibly compress) only new or modified files - and most of our
customers cause such activity only a couple of times per month. So if
customers don't upload any new files or otherwise cause them to be generated
on the server, then there would be very little work for the utility to do
beyond comparing dates/times between production and the backup NAS/Server
thingie. Of course there would be a new IIS log file per site per day.

I think I'll shoot for a middle-of-the-road strategy, then collect
objective data over an extended period, and adjust accordingly.

I'd love to throw hardware at this project - but for now it's going onto an
old dual PIII 800 (or so), mirrored 10K RPM disks, and 2GB RAM. In 2004 we
took a "let's see how well we wrote the ASP.NET app" approach, thinking, if
it works well on that clunker, then our efforts were good. The app actually
smokes on that thing! It has even handled increased workload nicely over
the years. So we just left it there.

-S
 
