| SRDB ID | Synopsis | Date |
| 48483 | Sun Fire[TM] 12K/15K: POST: IBIST Failures | 21 Nov 2002 |
| Status | Issued |
| Description |
- Problem Statement:
POST: IBIST Failures
- Symptoms:
POST reports an IBIST failure. Some examples:
#1:
stage ibist: Interconnect BIST...
AXQ-RMX IBIST...
ERR: IBIST error: AXQ EX3 RMX C0 Exp 0x0aaaaaaaa Obs 0x03c345555 XOR 0x0969effff.
FAIL EXB EX3: IBIST failure
Primary service FRU is EXB EX3.
Secondary service FRU is CSB C0 or the logic centerplane.
#2:
stage ibist: Interconnect BIST...
ERR: IBIST error: DMX C1/D0 SDI EX4/S3 Error bits = 0x1555554.
FAIL EXB EX4: IBIST failure
SOLUTION SUMMARY:
- Troubleshooting:
IBIST is the interconnect built-in self-test between two ASICs.
One of the ASICs acts as the master driving preset/programmable
bit patterns, and the other ASIC receives the patterns and then
echoes them back. If the echoed pattern received by the master
does not match the original pattern, the test fails.
In the first example above, AXQ EX3 is the master and RMX0 is the
slave. The AXQ EX3 is expecting the pattern 0x0aaaaaaaa, but
0x03c345555 is received. 0x0aaaaaaaa XOR 0x03c345555 = 0x0969effff
shows the bits in error.
However, note that example #1 matches bug 4704614, which is
corrected in SMS 1.2 patch 112488-10 (or higher).
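The XOR arithmetic in the first example can be reproduced with a
few lines of Python. This is a minimal sketch for illustration only:
the function name ibist_error_bits is hypothetical and is not part
of POST, SMS, or redx, and the mapping from a bit position to a
physical signal is not given here.

```python
from typing import List, Tuple


def ibist_error_bits(expected: int, observed: int) -> Tuple[int, List[int]]:
    """Return the XOR mask and the positions of the mismatched bits.

    Hypothetical helper; bit 0 is the least significant bit.
    """
    mask = expected ^ observed
    positions = [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
    return mask, positions


# Exp and Obs values taken from example #1.
mask, positions = ibist_error_bits(0x0AAAAAAAA, 0x03C345555)
print(f"XOR 0x{mask:09x}")  # prints "XOR 0x0969effff", matching the POST output
print(f"{len(positions)} bits differ")
```

Every bit set in the mask marks a position where the echoed pattern
differs from the pattern the master drove.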
- Resolution:
If the IBIST failure is an AXQ<-->RMX0 error, first confirm that
POST patch 112488-10 (or higher) is applied to the system. For any
other IBIST failure, or if the patch is already applied, consider
all IBIST failures within close proximity when deciding on the
appropriate FRU. If there is only a single failure, as shown above,
it is logical to replace what POST suggests as the primary FRU:
EX3 in the first example.
However, if multiple IBIST failures are present, they must be
considered holistically. For example, suppose SDI2 on four expanders
all report IBIST failures to a given DMX. Taken together, these
failures call the DMX (i.e., the centerplane) into question, because
it is unlikely that multiple expanders would fail at once.
Finally, improper board seating is a possible cause for IBIST
failures. If a service action involving a suspect FRU was recently
conducted, check seating.
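The holistic check described above can be sketched as a small
script that groups DMX/SDI failures by DMX. This is an illustration
under stated assumptions, not a supported tool: the regular
expression is modelled only on the single DMX/SDI line in example
#2, the function names and the min_expanders threshold are
hypothetical, and every sample line except the EX4 one is made up.

```python
import re
from collections import defaultdict

# Assumption: pattern derived from the one DMX/SDI line in example #2.
# Other IBIST message formats (e.g. the AXQ-RMX line) are not covered.
DMX_SDI_RE = re.compile(
    r"IBIST error: DMX (?P<dmx>\S+) SDI (?P<exp>EX\d+)/(?P<sdi>S\d+)"
)


def group_by_dmx(log_lines):
    """Map each DMX to the set of expanders that reported a failure to it."""
    groups = defaultdict(set)
    for line in log_lines:
        match = DMX_SDI_RE.search(line)
        if match:
            groups[match.group("dmx")].add(match.group("exp"))
    return dict(groups)


def suspect_centerplane(groups, min_expanders=2):
    """Return DMXs implicated by at least min_expanders distinct expanders.

    min_expanders is a hypothetical threshold, not a documented rule.
    """
    return sorted(d for d, exps in groups.items() if len(exps) >= min_expanders)


# The EX4 line is from example #2; the EX5-EX7 lines are invented samples.
sample_log = [
    "ERR: IBIST error: DMX C1/D0 SDI EX4/S3 Error bits = 0x1555554.",
    "ERR: IBIST error: DMX C1/D0 SDI EX5/S3 Error bits = 0x1555554.",
    "ERR: IBIST error: DMX C1/D0 SDI EX6/S3 Error bits = 0x1555554.",
    "ERR: IBIST error: DMX C1/D0 SDI EX7/S3 Error bits = 0x1555554.",
]

groups = group_by_dmx(sample_log)
print(suspect_centerplane(groups))
```

A DMX that appears in the result is shared by several failing
expanders, which points toward the centerplane rather than any one
expander. A single failure returns an empty list, in which case the
primary FRU suggested by POST remains the logical choice.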
- Summary of part number and patch IDs
112488-10
- References and bug IDs
4704614
- Additional background information:
For details on what IBIST tests are available, refer to the
online documentation in 'redx'.
redx> ? ibist
Under no circumstances should IBIST be executed on a component
supporting a running domain. It will crash all domains relying
on that component. Furthermore, if IBIST is run manually, the
component must be power cycled after completion to return the
ASIC(s) to a known, clean state. Refer to bug 4743556 for an
example of why.
- Meta-Data/Problem categorization:
Product/Platform: SF12K/SF15K
Category:
- Keywords
15K, 12K, SF15K, SF12K, Sun Fire 15K, Enterprise, Server, Sun Fire 12K,
post, ibist
INTERNAL SUMMARY:
SUBMITTER: Scott Davenport
BUG REPORT ID: 4704614, 4743556
PATCH ID: 112488-10
APPLIES TO: Hardware/Sun Fire /15000, Hardware/Sun Fire /12000
ATTACHMENTS: