XP sometimes balks on boot-up

Tocapet

About every fourth time I boot my machine, I get a sudden blue screen saying
that the system has been shut down. It says to
uninstall any new hardware (I haven't installed any) and software (ditto)
and to run the diagnostic software that came with my computer (I built it). The
technical information reads: STOP: 0x0000007F (0x0000000D 0x00000000
0x00000000)
It then does a memory dump, reboots, and scans drives D and C. After
that it's OK until the next time.

This has been going on for the last 2 weeks. It only happens in WinXP
PRO. I also have the 64-bit WinXP evaluation and Mandrake Linux 64-bit in a
triple-boot setup, and neither of those has the problem. I have had SP2
running since it first came out, with no problems until now.

This is something new. I have scanned for spyware and viruses using
McAfee, Trend Micro, Avast, Spybot and Ad-Aware. I have defragged all 4
partitions. I have unseated and re-seated the memory and the expansion
cards. The motherboard is an Asus K8VSE running an Athlon 64 3200 and 512MB of
PC3200 RAM (two 256MB sticks). The video card is an Nvidia GeForce FX 5200 128MB.

This is just an annoying problem that I would like to fix. Does anyone have
an idea before I do a repair install of XP PRO? I have an install disk
integrated with SP2 on hand, but I hesitate, thinking I will have to
reinstall all my other stuff.

(e-mail address removed)
 
David Candy

Driver Development Tools: Windows DDK

Bug Check 0x7F: UNEXPECTED_KERNEL_MODE_TRAP
The UNEXPECTED_KERNEL_MODE_TRAP bug check has a value of 0x0000007F. This indicates that a trap was generated by the Intel CPU and the kernel failed to catch this trap.

This could be either a bound trap (a trap the kernel is not permitted to catch) or a double fault (a fault that occurred while processing an earlier fault, which always results in a system crash).

Parameters
The first parameter displayed on the blue screen specifies the trap number.

Here are some of the most common trap codes:

a.. 0x00000000, or Divide by Zero Error, is caused when a DIV instruction is executed and the divisor is zero. Memory corruption, other hardware problems, or software failures can cause this error.
b.. 0x00000004, or Overflow, occurs when the processor executes a call to an interrupt handler when the overflow (OF) flag is set.
c.. 0x00000005, or Bounds Check Fault, is generated when the processor, while executing a BOUND instruction, finds the operand exceeds the specified limits. A BOUND instruction is used to ensure that a signed array index is within a certain range.
d.. 0x00000006, or Invalid Opcode, is generated when the processor attempts to execute an invalid instruction. This is generally caused when the instruction pointer has become corrupted and is pointing to the wrong location. The most common cause of this is hardware memory corruption.
e.. 0x00000008, or Double Fault, is when an exception occurs while trying to call the handler for a prior exception. Normally, the two exceptions can be handled serially. However, there are several exceptions that cannot be handled serially, and in this situation the processor signals a double fault. There are two common causes of a double fault:
   1.. A kernel stack overflow. This occurs when a guard page is hit, and then the kernel tries to push a trap frame. Since there is no stack left, a stack overflow results, causing the double fault. If you suspect this has occurred, use the !thread debugger extension to determine the stack limits, and then use the KB (Display Stack Backtrace) debugger command with a large parameter (for example, kb 100) to display the full stack.
   2.. A hardware problem.

The less-common trap codes include:

a.. 0x00000001 — A system-debugger call
b.. 0x00000003 — A debugger breakpoint
c.. 0x00000007 — A hardware coprocessor instruction with no coprocessor present
d.. 0x0000000A — A corrupted Task State Segment
e.. 0x0000000B — An access to a memory segment that was not present
f.. 0x0000000C — An access to memory beyond the limits of a stack
g.. 0x0000000D — An exception not covered by some other exception; a protection fault that pertains to access violations for applications
For other trap numbers, consult an Intel architecture manual.
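
The two lists above amount to a lookup table keyed on the first STOP parameter. As a rough illustration only, here is a minimal Python sketch that pulls the hex values out of a STOP line and looks up the trap number. The function name `describe_stop` and the shortened descriptions are made up for this example; the trap numbers themselves come from the lists above.

```python
import re

# Trap numbers and (shortened) descriptions taken from the lists above.
TRAP_CODES = {
    0x00: "Divide by Zero Error",
    0x01: "System-debugger call",
    0x03: "Debugger breakpoint",
    0x04: "Overflow",
    0x05: "Bounds Check Fault",
    0x06: "Invalid Opcode",
    0x07: "Coprocessor instruction with no coprocessor present",
    0x08: "Double Fault",
    0x0A: "Corrupted Task State Segment",
    0x0B: "Access to a memory segment that was not present",
    0x0C: "Access to memory beyond the limits of a stack",
    0x0D: "Exception not covered by some other exception (protection fault)",
}

HEX_VALUE = re.compile(r"0x[0-9a-fA-F]+")


def describe_stop(line):
    """Return (bug_check, trap_number, description) for a STOP line.

    Hypothetical helper: it only reads the bug check code and the first
    parameter, which is all the table above needs.
    """
    values = [int(token, 16) for token in HEX_VALUE.findall(line)]
    if len(values) < 2:
        raise ValueError("expected a bug check code and at least one parameter")
    bug_check, trap = values[0], values[1]
    if bug_check != 0x7F:
        return bug_check, trap, "not bug check 0x7F; this table does not apply"
    description = TRAP_CODES.get(trap, "not listed; consult an architecture manual")
    return bug_check, trap, description


if __name__ == "__main__":
    # The STOP line reported in the first post of this thread.
    print(describe_stop("Stop: 0x0000007f (0x0000000d 0x00000000 0x00000000)"))
```

For the STOP line in the original post, the first parameter is 0x0000000D, so the lookup lands on the protection fault entry rather than the double fault.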

Cause
Bug check 0x7F usually occurs after the installation of faulty or mismatched hardware (especially memory) or in the event that installed hardware fails.

A double fault can occur when the kernel stack overflows. This can happen if multiple drivers are attached to the same stack. For example, two file system filter drivers can be attached to the same stack and then the file system can recurse back in, overflowing the stack.

Resolving the Problem
Debugging: Always begin with the !analyze debugger extension.

If this is not sufficient, use the KV (Display Stack Backtrace) debugger command.

a.. If KV shows a taskGate, then use the .tss (Display Task State Segment) command on the part before the colon.
b.. If KV shows a trap frame, then use the .trap (Display Trap Frame) command to format the frame.
c.. Otherwise, use the .trap (Display Trap Frame) command on the appropriate frame. (On x86 platforms, this frame is associated with the procedure NT!KiTrap.)
After this, use KV again to display the new stack.
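
The KV decision steps above can be sketched as a small helper that looks at the text KV printed and suggests the next command to type. This is only a sketch under assumptions: it assumes KV marks such frames with text like `TaskGate 28:0` or `TrapFrame @ <address>`, which is not spelled out in the DDK text quoted here, and the function name `next_command` and the sample lines are invented for illustration.

```python
import re

# Assumed annotation formats in KV output (not confirmed by the text above).
TASK_GATE = re.compile(r"TaskGate\s+([0-9a-fA-F]+):", re.IGNORECASE)
TRAP_FRAME = re.compile(r"TrapFrame\s*@\s*([0-9a-fA-F]+)", re.IGNORECASE)


def next_command(kv_output):
    """Suggest the next debugger command for a chunk of KV output.

    Returns a command string, or None when neither marker is present and
    the frame has to be picked by hand (step c above).
    """
    match = TASK_GATE.search(kv_output)
    if match:
        # Step a: .tss takes the part of the task gate before the colon.
        return ".tss " + match.group(1)
    match = TRAP_FRAME.search(kv_output)
    if match:
        # Step b: .trap formats the trap frame at the given address.
        return ".trap " + match.group(1)
    return None


if __name__ == "__main__":
    # Made-up sample lines, only to exercise the helper.
    print(next_command("00000000 00000000 nt!KiTrap08+0x48 (FPO: TaskGate 28:0)"))
    print(next_command("f8950dd0 804e1234 nt!KiTrap0D+0x10 (FPO: [0,0] TrapFrame @ f8950dd0)"))
```

Whatever command it suggests, the step after that is the same as in the text: run KV again to see the new stack.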

Troubleshooting: If hardware was recently added to the system, remove it to see if the error recurs. If existing hardware has failed, remove or replace the faulty component. Run hardware diagnostics supplied by the system manufacturer, to determine which hardware component has failed. The memory scanner is especially important; faulty or mismatched memory can cause this bug check. For details on these procedures, see the owner’s manual for your computer. Check that all adapter cards in the computer are properly seated. Use an ink eraser or an electrical contact treatment, available at electronics supply stores, to ensure adapter card contacts are clean.

If the error appears on a newly installed system, check the availability of updates for the BIOS, the SCSI controller or network cards. Updates of this kind are typically available on the Web site or BBS of the hardware manufacturer.

Confirm that all hard disks, hard disk controllers, and SCSI adapters are listed on the Microsoft Windows Hardware Compatibility List (HCL).

If the error occurred after the installation of a new or updated device driver, the driver should be removed or replaced. If, under this circumstance, the error occurs during the startup sequence and the system partition is formatted with NTFS, you might be able to use Safe Mode to rename or delete the faulty driver. If the driver is used as part of the system startup process in Safe Mode, you need to start the computer using the Recovery Console in order to access the file. Also try restarting your computer, and press F8 at the character-based menu that displays the operating system choices. At the resulting Windows Advanced Options menu, choose the Last Known Good Configuration option. This option is most effective when only one driver or service is added at a time.

Overclocking (setting the CPU to run at speeds above the rated specification) can cause this error. If this has been done to the computer experiencing the error, return the CPU to the default clock speed setting.

Check the System Log in Event Viewer for additional error messages that might help pinpoint the device or driver that is causing the error. Disabling memory caching of the BIOS might also resolve it.

If you encountered this error while upgrading to a new version of Windows, it might be caused by a device driver, a system service, a virus scanner, or a backup tool that is incompatible with the new version. If possible, remove all third-party device drivers and system services and disable any virus scanners prior to upgrading. Contact the software manufacturer to obtain updates of these tools. Also make sure that you have installed the latest Windows Service Pack.

Finally, if all the above steps fail to resolve the error, take the system motherboard to a repair facility for diagnostic testing. A crack, a scratched trace, or a defective component on the motherboard can also cause this error.

Built on Thursday, February 13, 2003

A memory tester is available at
http://oca.microsoft.com/en/windiag.asp
 
Tocapet

I ran the MS memory diagnostic and it came up clean. I'm working in 64-bit WinXP
right now and it works fine. No problem in Linux either. It's only XP Pro
that's doing it. I guess I'm gonna try a repair install. 0x7F? My memory
is a matched pair, and it ran fine for many months up until recently.
I suspect either WinXP itself, or maybe I need to run SpinRite on drive D.

(e-mail address removed)



Tocapet

I did a repair install and it's good so far. No more kernel mode trap
errors. But I'm interested in the solution you were giving me. Are you
referring to the Debug command? Also, you mention an Intel CPU; mine is an
AMD 64. Please elaborate. I would like to learn more about the debug sequence
you mentioned.

(e-mail address removed)

 
