| InfoDoc ID | Synopsis | Date |
| 49202 | Sun Fire[TM] 3800-6800: CPU/Memory Board Dynamic Reconfiguration (DR) Considerations | 8 Jan 2003 |
| Status | Issued |
| Description |
This document provides guidance for using dynamic reconfiguration (DR) on CPU/Memory boards in Sun Fire 3800-6800 systems. Using DR for configuration changes and for servicing CPU/Memory boards increases overall application uptime, because the Solaris[TM] Operating Environment (OE) and applications keep running while DR tasks are performed.
TERMS:
When describing DR operations, this document uses terms as defined by the Solaris OE DR command cfgadm.
Configure: Adding a CPU/Memory board to a running Domain.
Disconnect: Removing a CPU/Memory board from a running Domain.
ASSUMPTIONS:
It's assumed that the Domain and the Sun Fire System Controller (SSC) are running the minimum Solaris/Firmware versions required for supporting DR:
All loaded third-party device drivers must conform to the device driver specifications (see Writing Device Drivers).
http://www.sun.com/io_technologies/pci/pci.cards.cat.html
CONFIGURING A CPU/MEMORY BOARD INTO A RUNNING DOMAIN:
High level overview of Procedure:
Detailed Procedure:
sunfire-sc0:SC> showboards -p prom
Component   Compatible   Version
---------   ----------   -------
SSC0        Reference    5.13.4
SB0         Yes          5.13.4
/N0/SB2     Yes          5.13.4
/N0/IB6     Yes          5.13.4
/N0/IB8     Yes          5.13.4
If the CPU/Memory board has a different Firmware level, use the flashupdate command on the SSC to change it appropriately.
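The firmware comparison described above can be automated once the showboards listing has been captured as text. The following is a minimal sketch, assuming the three-column layout shown in the example; the function name firmware_mismatches is hypothetical, and the script only parses text, it does not talk to the SSC.

```python
# Sample text copied from the showboards -p prom example in this document.
SHOWBOARDS_OUTPUT = """\
Component Compatible Version
--------- ---------- -------
SSC0 Reference 5.13.4
SB0 Yes 5.13.4
/N0/SB2 Yes 5.13.4
/N0/IB6 Yes 5.13.4
/N0/IB8 Yes 5.13.4
"""


def firmware_mismatches(text):
    """Return {component: version} for boards whose version differs
    from the row marked 'Reference' (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator.
        if len(fields) != 3 or fields[0] == "Component" or set(fields[0]) == {"-"}:
            continue
        rows.append(fields)
    reference = next((ver for _, compat, ver in rows if compat == "Reference"), None)
    if reference is None:
        raise ValueError("no Reference row found in showboards output")
    return {comp: ver for comp, compat, ver in rows
            if compat != "Reference" and ver != reference}


if __name__ == "__main__":
    mismatches = firmware_mismatches(SHOWBOARDS_OUTPUT)
    print(mismatches or "all boards match the reference firmware")
```

Any board reported by this check would be a candidate for flashupdate before the configure operation is attempted.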
# cfgadm
Ap_Id    Type          Receptacle     Occupant       Condition
N0.IB6   PCI_I/O_Boa   connected      configured     ok
N0.SB0   CPU_Board_V   connected      configured     ok
N0.SB2   CPU_Board_V   disconnected   unconfigured   unknown
c0       scsi-bus      connected      configured     unknown
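As an illustration of how the listing above identifies the board to configure, the following sketch picks out CPU/Memory boards whose occupant state is unconfigured. It assumes the five-column cfgadm layout shown in the example, with no embedded spaces in any column; the function name is hypothetical.

```python
# Sample text copied from the cfgadm example in this document.
CFGADM_OUTPUT = """\
Ap_Id Type Receptacle Occupant Condition
N0.IB6 PCI_I/O_Boa connected configured ok
N0.SB0 CPU_Board_V connected configured ok
N0.SB2 CPU_Board_V disconnected unconfigured unknown
c0 scsi-bus connected configured unknown
"""


def unconfigured_cpu_boards(text):
    """Return the Ap_Ids of CPU boards that are not yet configured
    (hypothetical helper; assumes five whitespace-separated columns)."""
    boards = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue
        ap_id, ap_type, _receptacle, occupant, _condition = fields
        if ap_type.startswith("CPU_Board") and occupant == "unconfigured":
            boards.append(ap_id)
    return boards


if __name__ == "__main__":
    for ap_id in unconfigured_cpu_boards(CFGADM_OUTPUT):
        print("candidate for 'cfgadm -c configure':", ap_id)
```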
# cfgadm -o platform=diag=default -c configure N0.SB2
Dec 9 09:19:01 domA unix: cpu 0 initialization complete - restarted
Dec 9 09:19:01 domA unix: cpu 1 initialization complete - restarted
Dec 9 09:19:01 domA unix: cpu 2 initialization complete - restarted
Dec 9 09:19:01 domA unix: cpu 3 initialization complete - restarted
DISCONNECTING A CPU/MEMORY BOARD FROM A RUNNING DOMAIN:
High level overview of Procedure:
Detailed Procedure:
# cfgadm -av | grep memory
Ap_Id            Receptacle   Occupant     Condition   Information
When             Type         Busy         Phys_Id
N0.SB0::memory   connected    configured   ok          base address 0x0, 8388608 KBytes total, 1529776 KBytes permanent
Jul 26 13:30     memory       n            /devices/ssm@0,0:N0.SB0::memory
N0.SB2::memory   connected    configured   ok          base address 0x400000000, 8388608 KBytes total
Jul 26 13:30     memory       n            /devices/ssm@0,0:N0.SB2::memory
In the example above, CPU/Memory board SB0 contains permanent memory.
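The check for permanent memory can be sketched as a small parser over the cfgadm -av memory lines. This is only an illustration: it assumes each board's Information field sits on a single line that begins with the Ap_Id, and the function name is hypothetical. Continuation lines (When, Type, Busy, Phys_Id) are ignored because they do not start with a "::memory" Ap_Id.

```python
import re

# Ap_Id lines taken from the cfgadm -av example in this document,
# with each board's Information field joined onto one line.
CFGADM_AV_MEMORY = """\
N0.SB0::memory connected configured ok base address 0x0, 8388608 KBytes total, 1529776 KBytes permanent
N0.SB2::memory connected configured ok base address 0x400000000, 8388608 KBytes total
"""


def permanent_memory_kbytes(text):
    """Map each board Ap_Id to its permanent memory in KBytes (0 if none).
    Hypothetical helper for illustration."""
    result = {}
    for line in text.splitlines():
        board = re.match(r"(\S+)::memory\s", line)
        if not board:
            continue  # header or continuation line
        perm = re.search(r"(\d+) KBytes permanent", line)
        result[board.group(1)] = int(perm.group(1)) if perm else 0
    return result


if __name__ == "__main__":
    for ap_id, kbytes in permanent_memory_kbytes(CFGADM_AV_MEMORY).items():
        note = "contains permanent memory" if kbytes else "no permanent memory"
        print(ap_id, note)
```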
sunfire-sc0:SC> showdomain -p bootparams
diag-level = quick
verbosity-level = max
error-level = max
interleave-scope = within-board
# pbind
process id 181: 0
In this example, process ID 181 is bound to CPU 0. If CPU 0 resides on the CPU/Memory board that is to be disconnected, the process must first be bound to a different CPU. This can also be done with pbind.
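To illustrate the binding check, the sketch below scans pbind output for processes bound to CPUs on the board being removed. The set of CPU IDs on that board must be supplied by the caller; the set {0, 1, 2, 3} used here is purely an assumption for illustration, as is the function name.

```python
import re

# Sample text copied from the pbind example in this document.
PBIND_OUTPUT = "process id 181: 0\n"


def bound_to_board(pbind_text, board_cpus):
    """Return [(pid, cpu)] for every binding whose CPU is in board_cpus.
    Hypothetical helper; board_cpus is supplied by the caller."""
    hits = []
    for line in pbind_text.splitlines():
        match = re.match(r"process id (\d+): (\d+)$", line.strip())
        if match and int(match.group(2)) in board_cpus:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


if __name__ == "__main__":
    # Assumed CPU IDs for the board to be disconnected (illustrative only).
    for pid, cpu in bound_to_board(PBIND_OUTPUT, {0, 1, 2, 3}):
        print(f"process {pid} is bound to CPU {cpu} and must be rebound first")
```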
If the requirements are fulfilled, proceed to step 3. If the CPU/Memory board contains permanent memory, additional requirements must be met. The system checks these conditions automatically and aborts the DR operation if they are not met.
# ps -efc
UID    PID   PPID   CLS   PRI   STIME      TTY   TIME   CMD
root   0     0      SYS   96    02:56:47   ?     0:00   sched
root   1     0      TS    58    02:56:47   ?     0:00   /etc/init -
root   367   1      RT    140   19:23:16   ?     0:00   /opt/perf/bin/midaemon
Real Time processes can be identified by the RT tag in the CLS column. In the above example, the midaemon with PID 367 is running in the RT class.
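The same identification can be sketched programmatically by filtering the ps -efc listing on the CLS column. This is a minimal example, assuming the column layout shown above and that every column before CMD is a single whitespace-free token; the function name is hypothetical.

```python
# Sample text copied from the ps -efc example in this document.
PS_OUTPUT = """\
UID PID PPID CLS PRI STIME TTY TIME CMD
root 0 0 SYS 96 02:56:47 ? 0:00 sched
root 1 0 TS 58 02:56:47 ? 0:00 /etc/init -
root 367 1 RT 140 19:23:16 ? 0:00 /opt/perf/bin/midaemon
"""


def realtime_processes(text):
    """Return [(pid, command)] for processes in the RT scheduling class.
    Hypothetical helper; column positions are read from the header row."""
    lines = text.splitlines()
    header = lines[0].split()
    pid_i = header.index("PID")
    cls_i = header.index("CLS")
    cmd_i = header.index("CMD")
    found = []
    for line in lines[1:]:
        # Limit the split so CMD keeps its arguments as one field.
        fields = line.split(None, cmd_i)
        if len(fields) > cmd_i and fields[cls_i] == "RT":
            found.append((int(fields[pid_i]), fields[cmd_i]))
    return found


if __name__ == "__main__":
    for pid, cmd in realtime_processes(PS_OUTPUT):
        print(f"PID {pid} runs in the RT class: {cmd}")
```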
# modinfo | grep STMS
120 781e4000 4834 - 1 STMS (Multipath Interface Library)
# modinfo | grep scsi_vhci
121 781ea000 6a20 225 1 scsi_vhci (Sun Multiplexed SCSI vHCI)
If the requirements are checked and satisfied, initiate the disconnect operation with the cfgadm command:
# cfgadm -c disconnect N0.SB0
On disconnecting a CPU/Memory board, the following messages are logged in /var/adm/messages:
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@3,400000 (mc-us36) offline
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@2,400000 (mc-us35) offline
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@1,400000 (mc-us39) offline
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@0,400000 (mc-us34) offline
References:
Sun Fire 3800-6800 Servers Dynamic Reconfiguration Blueprint
Sun Fire 3800-6800 Systems Dynamic Reconfiguration Users Guide
Sun Cluster 3.0 Concepts
Writing Device Drivers
Man pages: cfgadm, cfgadm_sbd, rcmscript
Keywords:
Sun Fire 3800-6800, Dynamic Reconfiguration, DR, best practices, permanent
INTERNAL SUMMARY: This is a living document. As features and requirements change, every attempt will be made to keep it current. If you notice an oversight or discrepancy while using its content, contact the submitter.
Internally the following URLs are most useful:
http://pts-americas.west.sun.com/esg/msg/techinfo/platform/sun_fire/
http://systems.corp.sun.com/tools/salestools/datacenter/avail/dr/index.html
SUBMITTER: Peter Gonscherowski
BUG REPORT ID: 4618861
APPLIES TO: Hardware/Sun Fire /3800, Hardware/Sun Fire /4800, Hardware/Sun Fire /4810, Hardware/Sun Fire /6800
ATTACHMENTS: