Funny file permission problems.

pjp

Getting odd problems once in a while; all seem related.

Word will occasionally not allow me to save a document, insisting the
original is read-only. I have to save under a new name, delete the original
and rename the update.

Can't delete/move/rename a file after a program has created it, usually Any
Video Converter. Reopen and close the program and then the file's OK.

Get very high but very sporadic DPC latencies. System seems fine after a
reboot, with a random period before the problem re-emerges. The same
sequence of events seldom leads to a repetition.

Chkdsk occasionally needs boot option, e.g. Chkdsk /f.

Discs burn fine initially after a reboot, but the errors that occur later
seem related to DPC at the time.

PC is loaded with things. Dual-head video and TV all connected, Hauppauge TV
input, printer, onboard Ethernet, two USB hubs (both self-powered, one USB 1,
the other USB 2), webcam, wheel, flight stick and joypad all connected, a
MIDI USB piano, MIDI USB electronic drums, USB mic, three external hard
drives, multi-card reader (4 drives). Also has an add-on 5.1 sound card with
onboard sound disabled in BIOS. Keyboard is PS/2, mouse is USB.

Any ideas where to start and/or how to proceed? :)
 
philo

pjp said:
<snip>

Chkdsk occasionally needs boot option, e.g. Chkdsk /f.

<snip>

That's not a good sign

Before you do anything else...make sure your hardware is OK

run a RAM test

and run the mfg's HD diagnostic
 
Bob Willard

pjp said:
<snip>

Sounds like a PC you've had for a while that has developed sporadic
problems. If that's the case, then I suggest cleaning the PC (if you
are comfortable poking around inside): clean the filters (if that PC
has any); clean the fans if they are clogged with dust; and, carefully
blow out the heatsinks.

Since you have a lot of USB add-on stuff, check the USB power
configuration. If you have any USB widgets that are powered from the
PC, try re-connecting them via powered USB hubs. USB KBDs and mice are
usually OK since they draw tiny amounts of power.

Next, worry about the PC power supply. Three approaches: (1) add up
all the currents drawn at each voltage from all those power sinks, and
compare the totals with the nameplate data on your PS; (2) alternatively --
and better -- if you have an equivalent or stronger PS lying around, swap
it in place of your current PS and see if your problems go away; (3)
disconnect a bunch of power loads and see if the PC becomes stable.
Note that PSs do age and (like old humans) become feeble.
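
Approach (1) is just addition per voltage rail. A minimal sketch in Python,
where every device name and current figure is a made-up placeholder (read
the real numbers off each device's label and the PS nameplate), and the 80%
headroom factor is only an assumption for illustration:

```python
from collections import defaultdict

# Hypothetical loads: (device, rail voltage, amps drawn). Placeholder numbers only.
LOADS = [
    ("motherboard + CPU", 12.0, 9.0),
    ("video card", 12.0, 6.0),
    ("hard drive", 12.0, 0.6),
    ("hard drive", 5.0, 0.7),
    ("TV tuner card", 3.3, 1.0),
    ("bus-powered USB devices", 5.0, 2.0),
]

# Hypothetical nameplate ratings: rail voltage -> maximum amps.
NAMEPLATE = {12.0: 18.0, 5.0: 20.0, 3.3: 20.0}


def rail_budget(loads, nameplate, headroom=0.8):
    """Sum the current per rail and compare it with the derated nameplate limit."""
    drawn = defaultdict(float)
    for _device, volts, amps in loads:
        drawn[volts] += amps
    report = {}
    for volts, limit in nameplate.items():
        total = drawn.get(volts, 0.0)
        report[volts] = {
            "amps": total,
            "limit": limit,
            "ok": total <= limit * headroom,  # leave some margin for an aged PS
        }
    return report


if __name__ == "__main__":
    for volts, row in sorted(rail_budget(LOADS, NAMEPLATE).items(), reverse=True):
        status = "OK" if row["ok"] else "OVER BUDGET"
        print(f"{volts:>4} V: {row['amps']:.1f} A of {row['limit']:.1f} A -> {status}")
```

With these placeholder figures the 12 V rail comes out over the derated
budget while the 5 V and 3.3 V rails have plenty of room, which is the kind
of lopsided result worth looking for on a heavily loaded box.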

And, as philo suggested, run a RAM test (Memtest86+ at least overnight)
and run the HD vendor's diagnostic. I also suggest running Prime95 as a
stress test on the CPU; it really pounds the FPU(s) and heats up the
whole CPU.
 
pjp

philo said:
<snip>

That's not a good sign

Before you do anything else...make sure your hardware is OK

run a RAM test

and run the mfg's HD diagnostic

Both pass. Long-time tech here, since the early 80s, so familiar with a lot.
Knowing the hardware and the way the system acts leads me to suspect a
software problem, in that software is putting a piece of hardware into some
state that's causing side issues.

Think I'm going to have to strip it back to basic hardware and try each
piece by itself, then couple them together, then more etc. etc. and see when
problems arise. Before doing that though, my suspicion is there's a problem
with the disabled on-board sound and the add-on card. Easy enough to test,
but it means I either lose audio input capabilities, which means composite
off TV-in doesn't work, or lose surround sound, which means I need another
set of speakers because the ones I have don't work in just stereo mode but
need all three outputs from the PC.

Think there's likely a new PC coming at the end of Dec. anyway, as this one
is getting kinda long in the tooth. Likely won't resolve the problem until
it's a give-away and I strip it clean etc.
 
pjp

Bob Willard said:
<snip>

Power supply may be an issue. Don't have a spare to test with. Well aware of
USB power issues, hence the external USB hubs with their own power supplies.
RAM and HD tests always pass.

Yes, admit it likely needs cleaning, but experience tells me that's not the
problem this time.

See other reply for my suspicions, e.g. disabled on-board and add-on sound
card conflict causing side issues under certain unknown circumstances.
 
Paul

pjp said:
<snip>

The problems don't seem related to me. The DPC one is the one that stands alone.

DPC is part of the response to interrupts. When there is a hardware interrupt,
the system runs at interrupt level while servicing it. Only a small portion of
the servicing is done at that interrupt level. To keep the computer responsive,
a DPC (Deferred Procedure Call) is scheduled to handle any of the "heavy
lifting" required, as part of the hardware interrupt servicing. So if an
interrupt needed 1 millisecond of total processing, perhaps 0.1 millisecond is
spent at interrupt level, and the other 0.9 millisecond might be a DPC
serviced at a lower level (DISPATCH_LEVEL, which is still kernel mode and
still above any ordinary thread). Less time spent at interrupt level means
another hardware interrupt coming in sees a relatively low latency to get
serviced (as now, a new hardware interrupt will preempt a DPC if needed, and
get serviced).

DPCs sit in a queue. The kernel drains the queue at DISPATCH_LEVEL before it
goes back to running threads, so user programs wait until the queued DPCs
are done. (Exactly how the queue is ordered, I don't know.)

Using a tool such as the "DPC Latency" tool, you can check the difference
between arrival time in the queue, and when it's serviced. Normally, the queue
service time is relatively low. DPCs are serviced in a timely manner, and the
latency is in the hundreds of microseconds.

If you see "DPC latency spikes", it implies something is keeping the system
at interrupt or DPC level for too long, and activities at user level aren't
getting any processor time. And that isn't good for any software that has
real time processing requirements (like multimedia movie playback or sound
recording).

An example of a "normal" spike, is when a 3D game is entering 3D mode. That
seems to make the system unresponsive for a significant time, until 3D starts
rendering.

In the case of a few Gigabyte motherboards, the spikes in the DPC Latency tool
graphs, are caused by SMM code in the BIOS. The BIOS is able to interrupt the
OS at regular intervals (many times a second) and the SMM code mechanism
allows the OS to be completely pre-empted. (The OS cannot even tell it is
happening. There is no log.) Normally, the SMM code has a short runtime, and
then the computer performance isn't compromised. But if the SMM code takes too
long, some regular latency spikes will be seen in the DPC Service Latency graph.
A BIOS update will fix that, in cases where Gigabyte has been alerted to the fact,
and figured it out. Sometimes, the SMM code is used to configure the VCore voltage
converter, and turn on or turn off converter phases as required (as part of some
cheesy "green" power conversion strategy). Asus also does VCore converters like
that (dynamic phases, with some phases being turned off during moments when the
system is idle).

You could use the Performance snap-in (perfmon), and look for an interrupt
counter such as Interrupts/sec, and see if the hardware is generating a high
rate of interrupts.

You could use Process Explorer from Sysinternals, which has an entry for DPC
activity. The number of DPCs per second should have some relationship to the
interrupt count in the Performance snap-in.

With tools in place like that, including the DPC Latency tool, remove some
hardware and retest.
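
If you'd rather log the numbers than watch a graph while pulling hardware,
the same counters can be sampled from a script. A rough sketch in Python,
assuming an English-language Windows install with typeperf on the PATH (the
counter paths are localized on other installs); the function names are my
own, not part of any tool:

```python
import csv
import io
import subprocess

# Counter paths as they appear on an English-language Windows install.
COUNTERS = [
    r"\Processor(_Total)\Interrupts/sec",
    r"\Processor(_Total)\DPCs Queued/sec",
]


def parse_typeperf(csv_text):
    """Return a dict of counter name -> list of float samples from typeperf CSV."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if len(row) > 1]
    if not rows:
        return {}
    header, data = rows[0], rows[1:]
    samples = {name: [] for name in header[1:]}
    for row in data:
        for name, value in zip(header[1:], row[1:]):
            try:
                samples[name].append(float(value))
            except ValueError:
                pass  # skip status lines and samples typeperf could not read
    return samples


def collect(seconds=10):
    """Sample the counters once a second (Windows only; runs typeperf)."""
    cmd = ["typeperf", *COUNTERS, "-si", "1", "-sc", str(seconds)]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_typeperf(out)

# Usage on the affected PC, once per hardware configuration:
#   for name, values in collect(30).items():
#       print(name, max(values))
```

Run it once with everything plugged in and again after each piece of
hardware is removed; a counter that drops sharply points at the device (or
its driver) that was generating the interrupts.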

Just going from my poor memory, I might see hundreds of interrupts a second
on an "idle" system. If I alt-tab out of a 3D game, the video card continues
to run in the background, and I might see two thousand interrupts per second.
And if I use a cheap GbE LAN card and do link rate testing, I've seen as
high as around 20,000 interrupts per second. And the computer was still able
to operate when that was going on. 20,000 is too high a number for that
activity, and there are too many interrupts per packet on that card. An
Intel LAN chip by comparison, only had a fraction of that level of interrupt
activity, when running at 117MB/sec packet transfer rate. The level of interrupts
is high enough, that the max link rate you can transmit/receive with the
cheap LAN card, is CPU limited, and would require a 4GHz Core2 to run flat out.
Which is ridiculous, as a design. Anyway, that's to give some idea of the spectrum
of values you might see. Interrupts and DPCs should never really drop to
zero, because there is always a small amount of regular interrupt activity.

By removing hardware, you might be able to isolate a high interrupt issue.
But identifying "DPC Latency spikes", is much more difficult, because
you have absolutely no control over SMM (short of changing BIOS versions
and praying something good happens).

*******

Your other symptoms are pretty strange. I don't see anything wrong with
testing memory, and it's a good suggestion even on an otherwise working
computer.

I've seen some pretty strange things here, when it comes to
RAM errors, such as a RAM problem popping up out of the blue after
running VirtualBox. (Moving RAM sticks to alternate slots, setting
command latency from 1 to 2, no adjustment to voltage, and it was
all fixed again. Very strange. The RAM had been stable and error free,
for a year or a year and a half before that. The memory was vetted
with memtest86+, the errors were visible after the VirtualBox runs,
and the errors disappeared after the slight tweaks. Since memtest86+
boots the computer, VirtualBox or its drivers cannot be running
at that time.) The reason I was running a RAM test, is I was
having problems installing a guest OS in VirtualBox, and out of
frustration, checked memory, and was shocked to see errors on
what is normally, rock solid memory.

In terms of RAM test coverage, no memory program can test the BIOS
reserved area. To fix that, and to make it possible to isolate errors
to a single stick, requires running in a special configuration. If
you have a dual channel RAM motherboard, you arrange two memory sticks
in single channel configuration (i.e. two sticks sitting on the one
channel, none on the other channel). That means one of the two
sticks will provide the BIOS storage area, while the second stick can
be fully tested. Then, if you shut down, and rotate those two sticks
in their slots, the stick that was doing the BIOS storage, gets
moved to high memory, where it's fully exposed to memtest86+. By doing
two memtest86+ runs in single channel mode, with two sticks of RAM,
it's possible to completely test both sticks. If errors show up,
then due to the usage of single channel, the addresses shown can be
easily correlated with a particular stick of the two. (No tricky
address calculation, and guessing which stick it might be.) If you
own four sticks of memory, this means there will be testing as
two groups of two tests each, for a total of four memtest86+ runs,
to cover completely the four sticks.
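
The bookkeeping for those runs can be written down as a small plan. A
sketch in Python, assuming the sticks are just labelled A, B, C, D
(hypothetical labels) and that the second position of each pair is the one
that ends up in high memory; which physical slot that is depends on the
motherboard:

```python
def memtest_plan(sticks):
    """Return (low_position, high_position) stick pairs, one per memtest86+ run.

    Sticks are tested two at a time on a single channel. Each pair gets two
    runs with the positions swapped, so every stick spends one run in the
    high position where the test can reach all of it.
    """
    if len(sticks) % 2:
        raise ValueError("this scheme pairs sticks, so it needs an even number")
    plan = []
    for i in range(0, len(sticks), 2):
        a, b = sticks[i], sticks[i + 1]
        plan.append((a, b))  # a holds the BIOS-reserved area, b is fully tested
        plan.append((b, a))  # swapped: now a is fully tested
    return plan


if __name__ == "__main__":
    for run, (low, high) in enumerate(memtest_plan(["A", "B", "C", "D"]), start=1):
        print(f"run {run}: {low} low, {high} high (fully tested: {high})")
```

Four sticks give four runs, and each stick shows up exactly once in the
fully tested position.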

You would think, that if the RAM was bad, and bad data was being
written into the file system, the computer would be bricked in no
time at all. Or a chkdsk run, would be showing "spaghetti" if
something like that was going on. (Using chkdsk to "repair errors",
in a system with flaky hardware, can absolutely ruin a file system.)
If you're experiencing some things going "read-only" on you, that's
too specific for a simple explanation, at least to my way of thinking.
Something like that, requires more intelligence, a more specific
interference of some sort (like a software issue). I tried Googling
on that, but didn't see any good candidate matches for a cause.
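
One thing that might narrow it down: when a file "goes read-only", check
whether the read-only attribute is really set, or whether the file simply
can't be opened for writing because something still has it open. A small
sketch in Python (the function name and the returned keys are my own
invention, and this only separates the two cases, it doesn't say which
program holds the file):

```python
import os
import stat


def diagnose(path):
    """Report whether the read-only attribute is set and whether the file opens for writing."""
    mode = os.stat(path).st_mode
    result = {
        "read_only_attribute": not (mode & stat.S_IWRITE),
        "opens_for_write": True,
        "error": None,
    }
    try:
        # "r+b" opens for update without truncating, so the file is not modified.
        with open(path, "r+b"):
            pass
    except OSError as exc:  # PermissionError (lock, ACL or attribute) is a subclass
        result["opens_for_write"] = False
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result

# Usage: print(diagnose(r"C:\path\to\the\stuck\file.doc"))  # hypothetical path
```

If the attribute is not set but the open still fails, that points towards
a program (or a filter driver such as a virus scanner) holding the file,
rather than the permission bits themselves getting flipped.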

Paul
 
philo

pjp said:
<snip>


I use removable drives here
so just pop in another installation to confirm whether or not it's a h/w
vs software issue.

I'm going to bow out now as it's obvious that Paul is way more
experienced in this area than I am.
 
Paul

philo said:
I use removable drives here
so just pop in another installation to confirm whether or not it's a h/w
vs software issue.

I'm going to bow out now as it's obvious that Paul is way more
experienced in this area than I am.

Hey, it's my turn to bow out :)

I've never seen any "random permissions" problems here. I'm kinda
interested in what can do that. If it was plain corruption (in either
RAM or disk I/O), it should just as easily cause the file system
layer to "fall over", as just toggle a few permission bits. There's
nothing worse than "orderly chaos".

Differential testing is a great idea. At least, as long as you've got
the bits and pieces to set it up. If a "clean OS" does it, or another
OS behaves strangely, then that's a hint it's hardware.

When my old 440BX based box fell over, a test in Linux showed a similar behavior
(crash in desktop, with hardly any compute activity at all). And when I was
doing some overclock testing on another machine, it was fun to watch icons
disappear from the Linux desktop, one at a time, as stuff crashed :) Trying
another OS disk can give you some little hints, even if the software isn't
exactly the same. The only thing I can't do well in Linux, is set up
a "heavy graphics load". I've been trying to do that for ages, and
virtually any useful setup, takes me a week to do, so is hardly the subject
of a one post "recipe". Whenever I figure out a good test case, it's got
to be something that is easy to explain (like something that installs
direct from a package manager).

The same would apply for a clean Windows install on another disk. The only
thing that wouldn't be covering, is the behavior of the original disk, and
you can still attempt to create and edit files on it, as a test case.

Paul
 
pjp

Paul said:
<snip>

I don't seem to get problems using a live Linux cd but truth be told can't
say I've given it enough time to call it a fair shot.

Machine was originally bought 2nd hand and I've always had suspicion guy had
overclocked it and when it started becoming unstable for the games he wanted
to play like that he set everything back to default. As said earlier, time
taken to try and find problem with current hardware likely not worth it.
Likely just get a new one and strip this back before starting clean and
seeing what happens.
 
pjp

Paul said:
<snip>

Yeah, it is odd behavior. DPC does go through the roof and it does seem
related to after the sound card is used, e.g. playing an mp3 file. But it
doesn't seem to act consistently in a direct cause/effect scenario. Gonna
have to do it the hard way methinks, i.e. basic hardware, add, test etc. It
may turn out the MB is going, as it's 2nd hand and the original guy seemed
to be a gamer who didn't mind overclocking.
 
philo

pjp said:
<snip>


At this point it looks like a lot of trial and error
 
