CPU FRUprom missing
yh

2004-12-07, 5:52 pm

I have a Sun Blade 1000 with two 750 MHz UltraSPARC-III processors and the
following problem:

First, after running for a while (in this case anywhere from 5 minutes to
over a day) it freezes. Nothing responds; only a hard reset helps.
In the messages log I found some errors:

[ID 406394 kern.info] NOTICE: [AFT0] WDC Event detected by CPU0 at TL=0,
errID 0x00000163.a08630f8
[ID 846041 kern.info] [AFT0] errID 0x00000163.a08630f8 Data Bit 97 was in
error and corrected
[ID 152727 kern.info] NOTICE: [AFT0] EDC Event detected by CPU0 at TL=0,
errID 0x00000163.a9148d00
0x00000010<EDC>.00000121 AFAR 0x00000000.1b070020
0x10033e30 Esynd 0x0121
[ID 730578 kern.info] [AFT0] errID 0x00000163.a9148d00 Data Bit 97 was in
error and corrected
[ID 143566 kern.info] [AFT2] errID 0x00000163.a9148d00 PA=0x00000000.1b070000
[ID 511436 kern.info] NOTICE: [AFT0] EDC Event detected by CPU0 at TL=0,
errID 0x00000166.55577490
0x00000010<EDC>.00000121 AFAR 0x00000000.1c87d840
[ID 149807 kern.info] [AFT0] errID 0x00000166.55577490 Data Bit 97 was in
error and corrected

[ID 270833 kern.info] cpu0: UltraSPARC-III (portid 0 impl 0x14
ver 0x34 clock 750 MHz)
083kern.info] cpu1: UltraSPARC-III (portid 1 impl 0x14 ver 0x34 clock 750 MHz)
Dec 1 09:31:54 xxxxxxxxxx [ID 721127 kern.info] cpu 1
initialization complete - online


The events repeated more and more often, so I took CPU0 offline via psradm.
But after 6 hours the machine froze again with a WDC error.
OK... but now there is no reboot and no hard reset, just two beeps after
starting the machine.
No output on the monitor. I managed to get a serial connection to show me
some diag information.
Here it is (this appears when a CPU is in slot 0; with a CPU only in slot 1
it shows a tem misconfiguration). It doesn't matter whether I
put cpu0 or cpu1 in slot 0.

CPU FRU access failed, Using ver@
CPU seeprom format: 0000.0000.0000.0001. .ERROR:
SYSTEM FRUprom missing?
Powering OFF System

Our guess is that slot 0 is destroyed. But maybe there is hope...

bye,
yvonne


Fredrik Lundholm

2004-12-07, 5:52 pm

In article <slrncrb7u6.5fi.yvonne@cayenne.imtek.uni-freiburg.de>,
yh <yhaller@imtek.de> wrote:
>I have a Sun Blade 1000 with two 750 MHz UltraSPARC-III processors and the
> following problem:



>[ID 511436 kern.info] NOTICE: [AFT0] EDC Event detected by CPU0 at TL=0,
> errID 0x00000166.55577490
>0x00000010<EDC>.00000121 AFAR 0x00000000.1c87d840
>[ID 149807 kern.info] [AFT0] errID 0x00000166.55577490 Data Bit 97 was in
> error and corrected


Seems like CPU0 is bad.
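
A quick way to check whether the reports really cluster on one CPU and one data bit is to tally them. The following is only a minimal sketch in Python: it assumes the messages file uses the same wording as the excerpt quoted in this thread, and the function name tally_aft and the SAMPLE text are illustrative, not part of any Solaris tool.

```python
import re
from collections import Counter

# Sample fragments copied from the messages excerpt quoted in this thread.
SAMPLE = """\
NOTICE: [AFT0] WDC Event detected by CPU0 at TL=0,
[AFT0] errID 0x00000163.a08630f8 Data Bit 97 was in
NOTICE: [AFT0] EDC Event detected by CPU0 at TL=0,
[AFT0] errID 0x00000163.a9148d00 Data Bit 97 was in
NOTICE: [AFT0] EDC Event detected by CPU0 at TL=0,
[AFT0] errID 0x00000166.55577490 Data Bit 97 was in
"""

# Patterns are assumptions based only on the wording seen in the excerpt.
EVENT_RE = re.compile(r"\[AFT0\] (\w+) Event detected by CPU(\d+)")
BIT_RE = re.compile(r"Data Bit (\d+)")


def tally_aft(text):
    """Count (event type, CPU) pairs and corrected data bits in log text."""
    events = Counter()
    bits = Counter()
    for line in text.splitlines():
        event = EVENT_RE.search(line)
        if event:
            events[(event.group(1), int(event.group(2)))] += 1
        bit = BIT_RE.search(line)
        if bit:
            bits[int(bit.group(1))] += 1
    return events, bits


if __name__ == "__main__":
    events, bits = tally_aft(SAMPLE)
    for (kind, cpu), count in sorted(events.items()):
        print(f"{kind} on CPU{cpu}: {count}")
    for bit, count in sorted(bits.items()):
        print(f"data bit {bit}: {count}")
```

To run it against a real log, read the file contents into a string and pass that to tally_aft instead of SAMPLE. If every event names the same CPU and the same data bit, that points at one component rather than at random faults.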

>The events repeated more and more often, so I took CPU0 offline via psradm.


Good idea; however, the first CPU is still used to connect to all
the available memory. So even if no jobs run on it, the memory
controller and the system bus of that CPU are still in use.

>Here it is (this appears when a CPU is in slot 0; with a CPU only in slot 1
>it shows a tem misconfiguration). It doesn't matter whether I
>put cpu0 or cpu1 in slot 0.


I would first try to put the old CPU1 in slot 0 and make sure it
is properly seated using the correct torque. If this doesn't help, I would
try to take out the first 4 memory DIMMs and put the other 4 in their
place.

In this procedure CPU0 is always taken out, and later the first 4 memory
DIMMs are taken out as well.

/wfr
FREDRIK

--
Fredrik Lundholm
dol @ ce.chalmers.se

yh

2004-12-08, 7:50 am

Fredrik Lundholm <dol@ce.chalmers.se> wrote:
> In article <slrncrb7u6.5fi.yvonne@cayenne.imtek.uni-freiburg.de>,
> yh <yhaller@imtek.de> wrote:
>
>
>
> Seems like CPU0 is bad.
>
>
> Good idea; however, the first CPU is still used to connect to all
> the available memory. So even if no jobs run on it, the memory
> controller and the system bus of that CPU are still in use.
>
>
> I would first try to put the old CPU1 in slot 0 and make sure it
> is properly seated using the correct torque. If this doesn't help, I would
> try to take out the first 4 memory DIMMs and put the other 4 in their
> place.

OK, I did as you told me, but with 2 memory DIMMs (I have only four in total),
and... same as before: the FRUprom (whatever this is... I didn't find a
useful description) cannot be found.

bye,
Yvonne