| SRDB ID | Synopsis | Date |
| 48370 | Sun Fire[TM] 12K/15K: Dstop: Slot1 dtransid input parity error | 26 Nov 2002 |
| Status | Issued |
| Description |
- Problem Statement/Title:
Dstop: Slot1 dtransid input parity error
- Symptoms:
'wfail' output reports something similar to the following:
01 redxl> dumpf load dsmd.dstop.020513.1102.14
02 Created Mon May 13 11:02:17 2002
03 By hpost v. 1.2 Generic 112488-04 Mar 18 2002 14:43:00 executing as pid=7485
04 On ssc name = rasputin-sc0.SD_RASCAL.West.Sun.COM
05 Domain = 0=A Platform = rasputin
06 Boards in dump: master SC CPs/CSBs[1:0]: 3
07 EXB[17:0]: 12100
08 Slot0[17:0]: 12100
09 Slot1[17:0]: 12100
10 -D option, -d
11 "DSMD DomainStop Dump"
12 0 errors occurred while creating this dump.
13 redxl> wfail
14 SDI EX08/S0 Master_Stop_Status0[31:0] = 2004000F
15 MStop0[3:0]: All SDI logic is DStopped + Recordstopped.
16 SDI EX08/S0 Dstop0[31:0] = 01018100
17 Dstop0[16]: D DARB texp requests all Dstop (M)
18 Dstop0[24]: D 1E SDI internal Slot1 port requested Dstop
19 SDI EX08/S0 Slot1_Error1[31:0] = 08008800 Mask = 31404EBF
20 S1Err1[27]: D 1E Slot1 dtransid input parity error (M)
21 slt1_{datidbusp,datareqin,datid_vld,datareqout} = 4
22 slt1_datidbus[11:0] = 000
23 FAIL Slot IO8: Dstop/Rstop detected by SDI
24 Primary service FRU is Slot IO8.
25 Secondary service FRU is EXB EX8.
26 SDI EX13/S0: All SDI is DStopped and RStopped, requested by DARB.
27 SDI EX16/S0: All SDI is DStopped and RStopped, requested by DARB.
28 DARB C0: enabled ports (expanders) [17:0]: 16100
29 DARB C0: other darb req Dstop+Rstop for exps[17:0]: 00100
30 DARB C1: enabled ports (expanders) [17:0]: 16100
31 DARB C1: other darb req Dstop+Rstop for exps[17:0]: 00100
SOLUTION SUMMARY:
- Troubleshooting:
The dump header shows that this Dstop was generated by dsmd (lines 10,11)
while a domain was active. This is also evident from the dumpf file name:
dsmd.dstop files are created by dsmd as part of an ASR. Walking the
error chain:
- The master SDI on EX8 requests the Dstop itself, through its internal
Slot 1 port (line 18).
- The master SDI on EX8 detects a DTransID input parity error for
Slot 1 (line 20).
- IO8 is FAILed from the configuration (line 23).
- IO8 and EX8 are named as primary and secondary FRUs (lines 24,25).
The SDC on an L1 board and the SDI exchange DTransID and DTarg information.
The data exchange is parity protected. Since these pathways cross an
interconnect, the fault cannot be isolated to a single FRU. Suspect
components are IO8 and EX8.
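
The register values in the wfail output can be cross-checked against the
decoded lines by listing which bits are set. The following is a minimal
sketch, not part of redx or hpost: the helper name set_bits is hypothetical,
and it assumes the values are hexadecimal with bit 0 as the least
significant bit. The meaning of any set bit that wfail does not decode is
not stated in this article.

```python
def set_bits(hex_value):
    """Return the positions of the 1 bits in a hexadecimal register value."""
    value = int(hex_value, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]

# Values copied from the wfail output above.
print(set_bits("01018100"))  # Dstop0: includes bits 16 and 24 (lines 17,18)
print(set_bits("08008800"))  # Slot1_Error1: includes bit 27 (line 20)

# Assumption: in the [17:0] board masks, bit n stands for expander n.
print(set_bits("12100"))     # boards in dump (line 07)
print(set_bits("00100"))     # DARB Dstop+Rstop request (lines 29,31)
```

Under that assumption, the board mask 12100 decodes to expanders 8, 13
and 16, which matches the EX08, EX13 and EX16 entries in the dump, and the
DARB request mask 00100 decodes to expander 8 only.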
- Resolution:
Replace IO8. If errors persist, replace EX8.
- Summary of part numbers and patch IDs:
http://infoserver.central/data/syshbk/Devices/I_O/IO_SunFire_15K_hsPCI_IO_Board.html
http://infoserver.central/data/syshbk/Systems/SunFire15K/component.centerplane.html
- References and bug IDs
SunSolve Article 48122
SDI ASIC Specification
- Additional background information:
DTransIDs are used for arbitration, deriving data path steering, and
data identification (tags). The SDC must request access to use the data
bus to the SDI. When granted, the SDC supplies the target information
and generates parity. The SDI receives the information and checks
parity. A parity error results in a Dstop.
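
The generate-and-check scheme described above can be illustrated with a
short sketch. It is a simplified model only: the function names are
hypothetical, even parity over a 12-bit tag is an assumption (this article
does not state the parity convention or the exact set of covered signals),
and the double-pumped transfer is not modeled.

```python
def parity_bit(value, width=12):
    """Parity over the low `width` bits (even parity is an assumption)."""
    return bin(value & ((1 << width) - 1)).count("1") & 1

def sdc_send(tag):
    """Hypothetical sender side: supply the tag plus generated parity."""
    return tag, parity_bit(tag)

def sdi_check(tag, parity):
    """Hypothetical receiver side: recompute parity and compare."""
    return parity_bit(tag) == parity

tag, parity = sdc_send(0x0A5)
print(sdi_check(tag, parity))          # clean transfer passes the check
print(sdi_check(tag ^ 0x004, parity))  # one flipped bit fails the check
```

A single-bit parity check of this kind detects any odd number of flipped
bits but cannot tell which side of the interconnect corrupted them, which
is consistent with both IO8 and EX8 being named as suspect FRUs.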
The SDC<-->SDI Slot 1 interface signals are double-pumped across a
half-width path. The arbitration connections between the SDI and SDC on a
Slot 1 board are as follows:
slt1_datareqin_l        Unidirectional SDC request to use the bidirectional path.
slt1_datareqout_l       High-priority request to the SDC to get off the data path
                        in 2 cycles.
slt1_dataid_vld_l       Only valid for an L1 hPCI board. Unidirectional phase signal
                        generated by the SDC and sampled by the SDI. It is monitored
                        as a paranoid phase check.
slt1_dataidbus_l<11:0>  Bidirectional data tag and arbitration steering information.
                        Only bits [5:0] are valid on an hPCI board, and they have
                        different meanings across cycles. For an MCPU, all bits
                        [11:0] are used and repeated across cycles.
slt1_dataibusp_l        Unidirectional parity on the above signals. Generated only by
                        the SDC and sent to the SDI for checking.
In the case of a parity error, the wfail output gives an indication of the
faulty signal (line 21), but the overall diagnosis remains as above.
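
As an illustration of reading line 21, the grouped value can be split into
one bit per named signal. This is a sketch under stated assumptions, not a
description of how redx formats its output: the helper split_fields is
hypothetical, the value is assumed to be hexadecimal, and the first name
in the braces is assumed to be the most significant bit. The signals carry
an _l suffix in the list above, so whether a 1 means asserted is also not
stated here.

```python
import re

def split_fields(line):
    """Split a '{a,b,c} = value' group into one bit per name, MSB first."""
    match = re.search(r"\{([^}]*)\}\s*=\s*([0-9A-Fa-f]+)", line)
    if match is None:
        raise ValueError("no {names} = value group found")
    names = match.group(1).split(",")
    value = int(match.group(2), 16)
    width = len(names)
    # Assumption: the first listed name maps to the highest bit.
    return {name: (value >> (width - 1 - i)) & 1
            for i, name in enumerate(names)}

line21 = "slt1_{datidbusp,datareqin,datid_vld,datareqout} = 4"
for name, bit in split_fields(line21).items():
    print(name, bit)
```

With these assumptions, the value 4 sets only the datareqin position.
Whatever the bit ordering, the service action is unchanged: IO8 first,
then EX8.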
- Keywords
15K, 12K, SF15K, SF12K, Sun Fire 15K, Enterprise, Server, Sun Fire 12K,
Slot1 dtransid input parity error, dstop
INTERNAL SUMMARY:
SUBMITTER: Scott Davenport
APPLIES TO: Hardware/Sun Fire /15000, Hardware/Sun Fire /12000
ATTACHMENTS: