
I've been trying to rehabilitate an HP Pavilion a6130n desktop pc.


It would not power up--no motherboard LEDs lit, no reaction on the CPU
fan, nothing.  Replacing the PSU with known good supply immediately got
the machine to boot.

However, after a few minutes of light use on the desktop, the machine
would BSOD.  I decided to reset the software to factory settings.

The initial part of the factory reset went well, but the subsequent
"first run" procedures would produce BSODs.  Eventually I got through
it, but the desktop would still BSOD after a few minutes use.

I broke down the system, ultimately, to the motherboard outside of the
case, with the original CPU, but everything else (RAM, optical drive,
hard drive) replaced with known good components.

Attempting to install Vista or Windows 7 from one of several generic
installers would produce a BSOD early on in the process--just before
the part where you can manage/view partitions.  I tried other hard
drives--same thing.

At this point, I felt like it had to be either the motherboard or CPU,
or possibly both.  It seemed like an odd failure for the CPU, so I
ordered a replacement motherboard.


I'm seeing the same thing today with this new motherboard.  Any
combination of components and installers produces a BSOD or spontaneous
restart at a specific point in the installation process.

For Vista and Windows 7, it's early on, for Windows XP it's at the point
where it really starts installing in earnest, after it has copied the
files to the hard drive and rebooted.

It's hard for me to escape the conclusion that there's a problem with
the CPU, but it seems like an odd failure.  I was under the impression
that a bad CPU should result in no POST, but this one can do quite a bit.

Any ideas as to what might be going on here?

Re: Flummoxed.

Grinder wrote:
[quoted text not shown]

Too bad the BIOS doesn't have a VCore adjustment. You could try
increasing VCore a notch or two, and rerun the test. It could be
that the processor has electromigration failure, and is no longer
capable of running at full speed.

Athlon 64 X2 (B) 5000+ 2.6 GHz (65W)

So your options might be, to either drop the core speed by adjusting
the multiplier, or bump the VCore, to change the performance of the
silicon a bit (improve it enough to stop crashing).

Cool'n'Quiet adjusts both multiplier and VCore dynamically. To
make your testing more consistent, (again, as a function of
what the BIOS will allow), you'd disable CNQ and keep the processor
running at full speed, while you do what effectively is "overclock
testing". You're trying to find a value of VCore, that makes the
processor work at its current speed. When that processor leaves
the factory, it has several hundred MHz of headroom. If the processor
was abused by someone else (if it was bought off Ebay), it could
be electromigration damaged, and might no longer have the regular
amount of headroom. Bumping VCore, is to see whether you can
make it work or not.
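
To put rough numbers on the multiplier option, here is a minimal sketch. It assumes the usual 200 MHz reference clock for this processor family, which would make 2.6 GHz a 13x multiplier; the function name and the list of lower multipliers are only illustrative, and what the BIOS actually offers may differ.

```python
# Sketch only: core clock = reference clock x multiplier.
# REFERENCE_MHZ = 200 is an assumption about this platform, not something
# read from the machine in question.
REFERENCE_MHZ = 200


def core_clock_mhz(reference_mhz, multiplier):
    """Return the core clock in MHz for a given reference clock and multiplier."""
    return reference_mhz * multiplier


if __name__ == "__main__":
    # Hypothetical multiplier steps to try while hunting for a stable setting.
    for multiplier in (13, 12.5, 12, 11):
        mhz = core_clock_mhz(REFERENCE_MHZ, multiplier)
        print(f"{multiplier:>4}x -> {mhz:.0f} MHz")
```

Each step down gives the silicon a bit more timing margin at the same VCore, which is the same effect a VCore bump is meant to buy at the original speed.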

I think I noticed my Intel Northwood on my second backup computer,
doesn't have the same headroom it used to have, so these things

I've seen evidence that some of the AMD processors released, can be
damaged by overclocking them (show symptoms of electromigration
failure), and that implies it could happen to regular un-abused
processors, but just take longer to happen. The design rules
for the silicon, normally provide wire dimensions for a higher
clock speed, so that doesn't happen. (It's a function of current
density - when you overclock, the level of current flow in the
wires on the silicon die, goes up, and that can enhance the
electromigration process.) Picture here, shows how the wires
get chewed up.
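
The current-density dependence is usually summarized with Black's equation, given here as a general rule of thumb rather than anything measured on this processor:

```latex
\mathrm{MTTF} = \frac{A}{J^{\,n}} \exp\!\left(\frac{E_a}{k_B T}\right)
```

Here MTTF is the mean time to failure of a wire, J is the current density, T is the absolute temperature, E_a is an activation energy, k_B is Boltzmann's constant, and A and n are empirical constants (n is often quoted as being between 1 and 2). Higher current density and higher temperature both shorten the expected life, which is why overclocking and overvolting can speed the process up.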


Another potential failure mechanism, would be a processor that
leaves the factory with design bugs. (They all do, so don't panic :-) )
The errata list usually consists of at least 100 bugs in digital design.
The processor receives a microcode patch, both in the BIOS, but
also potentially when the OS is running. Windows has a microcode
loader, whose job is to load a microcode patch and then exit.
I don't know the details of how that works for AMD, but on
Intel processors, you can find the patches in the BIOS file (for
when the BIOS does its patching). If, somehow, a bug wasn't getting
patched, and the bug affected booting, then that's a far fetched way
to explain the processor taking a dump. On Asus motherboards,
sometimes you can see comments like "supports new processors"
for some BIOS updates, and that can include the latest version
of microcode patch. It means as well as fix other things, they
also have to include a different set of microcode patches, so
more processor models can be supported.

On an Intel processor, when you use the "processor identification"
utility, it lists a "version" for the processor. If the "version"
is zero, then for whatever reason, no microcode patch is installed.

If my processor showed "version 023", that number could be coming
from whatever was used to patch it. (The patch is stored in a block
of RAM inside the processor.)  Either the BIOS applied that
patch (and by extracting the BIOS I could verify that). Or Windows
loaded that patch, when its microcode loader ran.

That's a pretty far fetched failure mechanism - my only point in
explaining it, is to show that the processors are seldom totally
correctly designed when they leave the factory, and sometimes that patch
can be crucial to stable operation. For the early Phenoms, AMD
actually had to disable some operation in the processor (something
to do with TLB ?), and they actually added code to the BIOS for that.
That one seemed to be a bit more than a regular microcode patch.

The TLB thing is mentioned here.



Re: Flummoxed.

On 03/21/2012 11:58 PM, Grinder wrote:
[quoted text not shown]

Run a RAM test

if that's ok

run the mfg's HD diagnostic

Re: Flummoxed.

On 3/22/2012 6:06 PM, philo wrote:
[quoted text not shown]

Done and done.  Also, I've swapped out RAM and HD with known good

Re: Flummoxed.

On 03/22/2012 06:44 PM, Grinder wrote:
[quoted text not shown]

I guess I did not read your original post well enough.
Is there anything you did not replace or substitute???  ie: try another
optical device

It is extremely unlikely that the cpu is bad...they are usually all or

I did once have a cpu heat sink that caused problems by radiating excess
rf...but that was a really odd fluke


Re: Flummoxed.

On 3/23/2012 5:13 AM, philo wrote:
[quoted text not shown]

That's exactly where I am.  I've replaced every component, including the
motherboard now, and still the problem persists.  It really has to be
the CPU.

As per Paul's suggestions, I have bumped up the Processor Voltage a bit.
Also, I have stepped the clock down.  That seems to have given me a
longer mean time between failures, but the system is still unstable.

[quoted text not shown]

I suppose that could be happening here.  The only original parts are the
CPU *and* the heatsink/fan.  I've ordered a replacement CPU.

Re: Flummoxed.

On 03/23/2012 10:24 AM, Grinder wrote:
[quoted text not shown]

Just remembered.
I recently was working on a machine with a similar problem
and got it to work by turning off ACPI in the bios


Re: Flummoxed.

On 3/23/2012 2:15 PM, philo wrote:
[quoted text not shown]

Motherboard #2 has an ACPI Settings section in the Advanced Tab.  Its
default settings are:

      Suspend to RAM             [Disabled]
      Away Mode Support          [Disabled]

      Restore on AC/Power Loss   [Power Off]
      Ring-In Power On           [Disabled]
      PCI Devices Power On       [Disabled]
      PS/2 Keyboard Power On     [Disabled]
      RTC Alarm Power On         [By OS]

      ACPI HPET Table            [Disabled]
      OSC Control                [Auto]

RTC Alarm and OSC Control both have Enabled and Disabled options.  I set
them both to Disabled, and am currently stress testing.

Re: Flummoxed.

On 3/23/2012 3:49 PM, Grinder wrote:
[quoted text not shown]

It's still sporadically crashing with WHEA_UNCORRECTABLE_ERROR.  There's
a replacement CPU on the way, so I'll see how that does.

Re: Flummoxed.

Grinder wrote:
[quoted text not shown]

Cool error. Did you record the parameters as well ? It could be a machine check.



Re: Flummoxed.

On 3/23/2012 10:21 PM, Paul wrote:
[quoted text not shown]

Here are the parameters for the last 9 crashes:

0x0, 0xFFFFFFFF9C240028, 0xFFFFFFFFF65AC000, 0x135
0x0, 0xFFFFFFFF9C240028, 0xFFFFFFFFF65AC000, 0x135
0x0, 0xFFFFFFFF83BE03F0, 0xFFFFFFFFF66E4000, 0x135
0x0, 0xFFFFFFFF84CCB3F0, 0xFFFFFFFFB66E4000, 0x145
0x0, 0xFFFFFFFF841CD0A8, 0xFFFFFFFFB66E4000, 0x135
0x0, 0xFFFFFFFF83B353F0, 0xFFFFFFFFF66E4000, 0x135
0x0, 0xFFFFFFFF812E2028, 0xFFFFFFFFB2584000, 0x175
0x0, 0xFFFFFFFF8BCB4028, 0xFFFFFFFFB2584000, 0x175
0x0, 0xFFFFFFFF83FF03F0, 0xFFFFFFFFF665C000, 0x135

I'm getting this reportage from resplendence.com's WhoCrashed.  Is the
entire WHEA_ERROR_RECORD structure in the minidumps?  What can be used
to lay it all out postmortem?

Re: Flummoxed.

Grinder wrote:
[quoted text not shown]

Your guess is as good as mine :-)

I found a picture of a STOP 0x124 here. One thing that bothers me a bit,
is the numbers don't look to exactly match the Microsoft description.

http://www.evga.com/forums/tm.aspx?m=435991&mpage=1

It's my guess, that 9C240028 and F65AC000 are 32 bit registers in the
processor. Machine Specific Registers of some sort perhaps.

I decided I'd start with AMD's 26094.pdf, where chapter
five covers "Machine Check Architecture". I'm having trouble
aligning what Microsoft is giving us there, versus the contents
of 26094.pdf. I'm not really sure I'm lining those numbers up with
the correct register definition.

There are going to be more MSRs than are listed in the Microsoft
error report. It would be nice if they were captured somewhere,
to trace what the CPU is complaining about.
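
For anyone who wants to try lining the numbers up, here is a rough sketch rather than a definitive decoder. It assumes Microsoft's documented layout for bug check 0x124 when the first parameter is 0x0 (a machine check exception): parameter 2 is the address of the WHEA_ERROR_RECORD, and parameters 3 and 4 are the high and low 32 bits of the MCi_STATUS register for the bank that reported the error. It also assumes the crash tool printed those 32-bit values sign-extended to 64 bits, and that the flag bits and the "memory hierarchy" error code follow the standard x86 machine-check layout. The function and table names are made up for illustration.

```python
# Sketch only: decode bug check 0x124 parameters 3 and 4 as MCi_STATUS.
# Assumptions (not confirmed against this particular CPU's documentation):
#   - param3/param4 are the high/low 32 bits of MCi_STATUS, sign-extended
#   - flag bits and error-code fields follow the standard x86 MCA layout

FLAG_BITS = (
    ("VAL", 63),    # status register contents are valid
    ("OVER", 62),   # an earlier error was overwritten
    ("UC", 61),     # uncorrected error
    ("EN", 60),     # error reporting was enabled
    ("MISCV", 59),  # MCi_MISC holds valid data
    ("ADDRV", 58),  # MCi_ADDR holds a valid address
    ("PCC", 57),    # processor context may be corrupt
)

# Fields of a "memory hierarchy" error code: 0000 0001 RRRR TTLL
REQUEST = {0: "GEN", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TRANSACTION = {0: "I", 1: "D", 2: "G"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_mci_status(param3, param4):
    """Combine the two bug check parameters and pick apart the status word."""
    high = param3 & 0xFFFFFFFF   # drop the sign extension
    low = param4 & 0xFFFFFFFF
    status = (high << 32) | low
    flags = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS}
    code = status & 0xFFFF       # MCA error code lives in bits 15:0
    result = {"status": status, "flags": flags, "mca_code": code,
              "memory_hierarchy": None}
    if (code >> 8) == 0x01:      # matches 0000 0001 RRRR TTLL
        rrrr = (code >> 4) & 0xF
        tt = (code >> 2) & 0x3
        ll = code & 0x3
        result["memory_hierarchy"] = (
            REQUEST.get(rrrr, "reserved"),
            TRANSACTION.get(tt, "reserved"),
            LEVEL[ll],
        )
    return result


if __name__ == "__main__":
    # First row of the parameter list posted earlier in the thread.
    info = decode_mci_status(0xFFFFFFFFF65AC000, 0x135)
    print(hex(info["status"]), info["flags"], info["memory_hierarchy"])
```

Under those assumptions, 0x135 reads as a data read (DRD) against the level 1 data cache, and 0x145 and 0x175 read as a data write and an eviction at the same level. If that reading is right, the complaints are coming from inside the processor itself, but treat it as a hint rather than proof until it has been checked against the register definitions in 26094.pdf.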


Re: Flummoxed.

On 22/03/2012 04:58, Grinder wrote:
[quoted text not shown]

Check your BIOS version and see if an update is required for your
particular CPU.  Paul's reply tells you why that may be necessary/
worth trying.
One other easy thing worth doing is to replace hard & optical drive
data cables, if you haven't tried that yet.  It's rare for those to
be faulty, but it does happen.

Re: Flummoxed.

On 3/23/2012 5:54 AM, Rob wrote:
[quoted text not shown]

The version of BIOS in the new motherboard explicitly supports my CPU.

[quoted text not shown]

Yep, changed 'em out.

Re: Flummoxed.

On 23/03/2012 15:25, Grinder wrote:
[quoted text not shown]

Must be the CPU then.  I've only seen one partially working CPU before.
It was a Pentium IV which ran XP just fine, but gave BSODs in W2k.
Swapping the CPU and nothing else fixed it (I did that a couple of
times to check as I couldn't quite believe it would run XP but not 2k..)

Re: Flummoxed. [Update]

On 3/21/2012 11:58 PM, Grinder wrote:
[quoted text not shown]

I've replaced the CPU with one of the same spec (Thank you Goodwill of
Austin, TX) and the machine has been stable, under stress, for about 50
consecutive hours.

This mode of failure is still perplexing, but I cannot deny what has
fixed it.

Re: Flummoxed. [Update]

Grinder wrote:

[quoted text not shown]

I guess that's one expensive repair, in terms of the time you
put into it. I hope the owner is appreciative.


Re: Flummoxed. [Update]

On 3/27/2012 10:38 PM, Paul wrote:
[quoted text not shown]

It's really undermined my diagnostic method.  Lots of times I get down
to claiming that it has to be the motherboard or processor, by a process
of elimination, but it's been easy to say: "I've never seen it be the
processor--there are just too many ways for the motherboard to go
wrong."  I can't say that anymore.

Re: Flummoxed. [Update]


[quoted text not shown]
I know the feeling.

One of my own computers has a Biostar P4M800CE-8237 motherboard. A couple
of years ago it started freezing up on me occasionally. Like you I tried
everything and finally concluded it was the motherboard.

I found one on eBay that included a Pentium 4 2.93 GHz processor.
Mine had a 3.2 GHz processor so when I got the motherboard I put the faster
processor in it. Same symptoms.

So I took the 3.2 out and put the 2.93 that came with it back in and have
never had a problem with it since. Of course in changing times it has been
demoted to my "shop" computer but it is dependable.

I too had never seen a processor be the culprit in such a situation.

I think I finally tossed the old (but good) motherboard away.

            -- I'm out of white ink --

Re: Flummoxed. [Update]

On 03/27/2012 08:39 PM, Grinder wrote:
[quoted text not shown]

Looks like the cpu was bad...or maybe a pin was not making good contact.

