The history of the RIM10B loader is included at the end of this
document.
This program is the bootstrap loader that was used to do a "cold
start" on a PDP-10 (model KA or KI) via the paper-tape reader. It
reads 36-bit words, each assembled from 6 consecutive frames of
8-channel paper tape (6 data bits per frame), loads them into core
memory, and verifies checksums. The entire program is
less than 16 words long, and therefore can fit into locations 000000
through 000017 (octal) - the locations used by the accumulators when
the "FM Enable" switch is off. (FM = Fast Memory = 16 accumulators, 36
bits each, implemented as flip-flops instead of core memory.)
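As a rough illustration of how six tape frames become one word, the
Python sketch below packs frames into a 36-bit value. It assumes that
only the low 6 bits of each frame carry data and that the first frame
read holds the most significant bits; the name pack_word and the sample
frames are made up for this example.

```python
def pack_word(frames):
    """Assemble one 36-bit word from six paper-tape frames.

    Assumptions (illustrative only): each frame contributes its low
    6 bits, and the first frame read is the most significant.
    """
    if len(frames) != 6:
        raise ValueError("a 36-bit word needs exactly six frames")
    word = 0
    for frame in frames:
        word = (word << 6) | (frame & 0o77)  # keep 6 data bits per frame
    return word


# Frames for the word 254000,,1000 (JRST 1000). The 0o200 bit stands in
# for the channel-8 hole and is an assumption of this sketch.
frames = [0o200 | d for d in (0o25, 0o40, 0o00, 0o00, 0o10, 0o00)]
word = pack_word(frames)
print(f"{word >> 18:06o},,{word & 0o777777:06o}")  # prints 254000,,001000
```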
RIM10B Loader
RIM10B ; Causes RIM10B loader to be punched
00/ 777762,,0 XWD -16,0
01/ 710600,,60 ST: CONO PTR,60
02/ 541400,,4 ST1: HRRI A,RD+1
03/ 710740,,10 RD: CONSO PTR,10
04/ 254000,,3 JRST .-1
05/ 710470,,7 DATAI PTR,@TBL1-RD+1(A)
06/ 256010,,7 XCT TBL1-RD+1(A)
07/ 256010,,12 XCT TBL2-RD+1(A)
10/ 364400,,0 A: SOJA A, ; Magic occurs here ****
11/ 312740,,16 TBL1: CAME CKSM,ADR
12/ 270756,,1 ADD CKSM,1(ADR)
13/ 331740,,16 SKIPL CKSM,ADR
14/ 254200,,1 TBL2: JRST 4,ST
15/ 253700,,3 AOBJN ADR,RD
16/ 254000,,2 ADR: JRST ST1
17/ CKSM=ADR+1
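The octal words in the listing can be cross-checked against the
mnemonics. The Python sketch below is one way to do that for the
ordinary (non-I/O) instructions, assuming the usual PDP-10 layout of a
9-bit opcode, 4-bit accumulator, indirect bit and 4-bit index field in
the left half. The helper name encode is hypothetical, and the I/O
instructions (CONO, CONSO, DATAI) use a different layout that is not
modeled here.

```python
def encode(opcode, ac=0, indirect=0, index=0, address=0):
    """Pack a basic PDP-10 instruction into (left half, right half).

    Left half: 9-bit opcode, 4-bit AC, 1-bit indirect, 4-bit index.
    Right half: 18-bit address field.
    """
    left = (opcode << 9) | (ac << 5) | (indirect << 4) | index
    return left, address & 0o777777


# Symbol values taken from the listing above.
ST, RD, A, ADR, CKSM = 0o1, 0o3, 0o10, 0o16, 0o17

listing = {
    "HRRI A,RD+1":     encode(0o541, ac=A, address=RD + 1),
    "XCT 7(A)":        encode(0o256, index=A, address=0o7),
    "SOJA A,":         encode(0o364, ac=A),
    "CAME CKSM,ADR":   encode(0o312, ac=CKSM, address=ADR),
    "ADD CKSM,1(ADR)": encode(0o270, ac=CKSM, index=ADR, address=1),
    "JRST 4,ST":       encode(0o254, ac=4, address=ST),
    "AOBJN ADR,RD":    encode(0o253, ac=ADR, address=RD),
}

for text, (left, right) in listing.items():
    print(f"{left:06o},,{right:o}  {text}")
```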
Here is an example of a two-word program as output by RIM10B:
17/ 777776,,777 LOC 1000 ; Set starting address
20/ 201740,,3777 START: MOVEI 17,4000-1
21/ 505740,,777600 HRLI 17,-200
22/ 707677,,4576 ; Sum of previous 3 words
23/ 254000,,1000 END START
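The checksum word in this example can be reproduced by hand. The
Python sketch below adds the IOWD and the two data words modulo 2^36,
which is how a 36-bit ADD behaves when overflow is ignored. The helper
names word and fmt are made up for this illustration.

```python
MASK36 = (1 << 36) - 1
HALF = 0o777777


def word(left, right):
    """Build a 36-bit word from two 18-bit halves."""
    return ((left & HALF) << 18) | (right & HALF)


def fmt(w):
    """Format a word in the left,,right octal style used in the listing."""
    return f"{w >> 18:o},,{w & HALF:o}"


block = [
    word(0o777776, 0o777),     # IOWD -2,777
    word(0o201740, 0o3777),    # MOVEI 17,4000-1
    word(0o505740, 0o777600),  # second data word
]
checksum = sum(block) & MASK36   # 36-bit wraparound addition
print(fmt(checksum))             # prints 707677,,4576
```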
Analysis
RIM
When the Read-In Mode (RIM) switch is pressed on the console of
a KA or KI, it sends a reset pulse down the I/O bus, sets the
PC flags to zero, and executes "DATAI D,0" (where D is the
device code selected by a set of 7 switches; the paper tape
reader is device 104). The DATAI reads in an IOWD, which has
the negative word count in the left half and starting address
minus one in the right half. The CPU then repeatedly executes
"BLKI D,0" until the left half of location 0 reaches zero.
("BLKI D,X" increments both halves of location X, reads in a
word from device D, and stores it at the address that the right
half of location X now points to.)
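The read-in sequence described above can be modeled in a few lines.
The Python sketch below is a simplification that treats the tape as a
list of 36-bit words and memory as a list; the function name read_in
and the placeholder data words are assumptions for illustration.

```python
HALF = 0o777777


def read_in(tape, memory):
    """Model the hardware read-in: one DATAI, then BLKI until done.

    Returns the address of the last word loaded.
    """
    words = iter(tape)
    memory[0] = next(words)                   # DATAI D,0 fetches the IOWD
    while True:                               # repeated BLKI D,0
        left = ((memory[0] >> 18) + 1) & HALF
        right = (memory[0] + 1) & HALF
        memory[0] = (left << 18) | right      # increment both halves
        memory[right] = next(words)           # store at the new address
        if left == 0:                         # word count exhausted
            return right


# IOWD -16,0 followed by 14 placeholder words (not the real loader).
tape = [0o777762 << 18] + [0o100 + i for i in range(14)]
memory = [0] * 0o20
last = read_in(tape, memory)
print(f"last word loaded at {last:o}")        # prints last word loaded at 16
```

With the IOWD XWD -16,0 this fills locations 1 through 16 (octal) and
leaves location 0 holding the exhausted pointer.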
00/ XWD -16,0
Transfer 16 octal (14 decimal) words, starting at location 1.
01/ST: CONO PTR,60
Start paper tape reader in binary mode
02/ST1: HRRI A,RD+1
Reset finite-state machine to looking for IOWD
State RD+1 = Looking for IOWD or JRST
03/RD: CONSO PTR,10
Read paper tape reader status, skip if "DONE" bit is set
04/ JRST .-1
Not set, keep looping until the bit does get set
05/ DATAI PTR,@TBL1-RD+1(A)
Index register A has RD+1, indexing TBL1-RD+1+RD+1 is TBL1+2,
which is the SKIPL CKSM,ADR instruction, therefore the
effective address is ADR. Store the IOWD in ADR.
06/ XCT TBL1-RD+1(A)
Same effective address, "SKIPL CKSM,ADR" loads the IOWD into
accumulator CKSM, and skips the next instruction because it's
negative.
07/ XCT TBL2-RD+1(A)
Not executed first time around. At the end of the tape, a JRST
instruction will be read in instead of an IOWD. (JRST is opcode
254, which is positive). TBL2-RD+1+RD+1 is TBL2+2, which is
ADR. The JRST instruction which was just read in is executed,
and that causes the PC to jump to the beginning of the program.
10/A: SOJA A,RD+1
Calculate the effective address (RD+1), subtract one from index
register A (so it now has RD in the right half), then jump to the
remembered effective address (RD+1).
Note: This is a self-modifying instruction. The CPU, however,
remembers the effective address that the instruction used to
have.
04/ JRST .-1
Jump to location 3.
State RD+0 = Reading in data words
03/RD: CONSO PTR,10
Read paper tape reader status, skip if "DONE" bit is set
04/ JRST .-1
Not set, keep looping until the bit does get set
05/ DATAI PTR,@TBL1-RD+1(A)
Index register A has RD+0, indexing TBL1-RD+1+RD+0 is TBL1+1,
which is the ADD CKSM,1(ADR) instruction, therefore the
effective address is one greater than what ADR points to. Store
the data in memory.
06/ XCT TBL1-RD+1(A)
Same effective address, "ADD CKSM,1(ADR)" adds the word read in
to the additive checksum in accumulator CKSM.
07/ XCT TBL2-RD+1(A)
The address is TBL2-RD+1+RD+0 which is TBL2+1. That location
has "AOBJN ADR,RD". Add one to both halves of accumulator ADR.
If the result is still negative, loop back to RD (location 3).
If non-negative, continue on at location 10.
10/A: SOJA A,RD+0
Calculate the effective address (RD+0), subtract one from index
register A (so it now has RD-1 in the right half), then jump to
the remembered effective address (RD+0).
Note: This is a self-modifying instruction. The CPU, however,
remembers the effective address that the instruction used to
have.
State RD-1 = Reading in checksum
03/RD: CONSO PTR,10
Read paper tape reader status, skip if "DONE" bit is set
04/ JRST .-1
Not set, keep looping until the bit does get set
05/ DATAI PTR,@TBL1-RD+1(A)
Index register A has RD-1, indexing TBL1-RD+1+RD-1 is TBL1+0,
which is the CAME CKSM,ADR instruction, therefore the effective
address is ADR. Store the expected checksum in ADR.
06/ XCT TBL1-RD+1(A)
Same effective address, "CAME CKSM,ADR" compares the calculated
checksum in accumulator CKSM with the expected checksum stored
in memory location ADR. Skip the next instruction if they're
equal.
07/ XCT TBL2-RD+1(A)
The address is TBL2-RD+1+RD-1 which is TBL2+0. That location
has "JRST 4,ST" which is a HALT instruction. If the previous
compare instruction failed, set the program counter to ST and
halt the CPU. This allows the operator to back up the paper
tape reader and try again. If the CAME succeeded, this HALT is
not executed.
10/A: SOJA A,RD-1
Calculate the effective address (RD-1), subtract one from index
register A (so it now has RD-2 in the right half), then jump to
the remembered effective address (RD-1). This jumps to location
ST1, which resets the finite-state machine.
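The index arithmetic used by the two XCT instructions in each state
can be tabulated mechanically. The Python sketch below uses the symbol
values from the listing (RD=3, TBL1=11, TBL2=14 octal); the names
dispatch and CONTENTS are invented for this example.

```python
RD, TBL1, TBL2 = 0o3, 0o11, 0o14

# Contents of the dispatch locations, copied from the listing.
CONTENTS = {
    0o11: "CAME CKSM,ADR",
    0o12: "ADD CKSM,1(ADR)",
    0o13: "SKIPL CKSM,ADR",
    0o14: "JRST 4,ST",
    0o15: "AOBJN ADR,RD",
    0o16: "ADR (word just read)",
}


def dispatch(a):
    """Return the locations selected by XCT TBL1-RD+1(A) and XCT TBL2-RD+1(A)."""
    return TBL1 - RD + 1 + a, TBL2 - RD + 1 + a


for label, a in (("RD+1", RD + 1), ("RD+0", RD), ("RD-1", RD - 1)):
    first, second = dispatch(a)
    print(f"state {label}: {first:o} {CONTENTS[first]:<16} | "
          f"{second:o} {CONTENTS[second]}")
```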
Dispatch table for finite-state machine
11/TBL1: CAME CKSM,ADR
In state RD-1, read expected checksum into ADR, then compare
calculated checksum with expected checksum.
12/ ADD CKSM,1(ADR)
In state RD+0, store data word into memory, then add data word
into running checksum.
13/ SKIPL CKSM,ADR
In state RD+1, store IOWD or JRST in ADR, then load that word
into accumulator CKSM and skip if the word is negative.
14/TBL2: JRST 4,ST
If the checksum comparison fails, halt the CPU, with ST in the
PC.
15/ AOBJN ADR,RD
In state RD+0, increment the IOWD and jump to RD if more to go.
16/ADR: JRST ST1
This is the last word of the RIM10B loader. When the hardware
read-in process is completed, this instruction is executed,
which starts the loader at ST1.
17/CKSM=ADR+1
The additive checksum is calculated using this accumulator.
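Taken together, the three states amount to the block-loading loop
sketched below in Python. This is a high-level model of the behavior
described above, not an instruction-level emulation: the tape is a
list of 36-bit words, memory is a dict, and a checksum failure raises
an exception where the real loader would halt. The name rim10b_load is
hypothetical.

```python
MASK36 = (1 << 36) - 1
HALF = 0o777777
SIGN = 1 << 35


def rim10b_load(tape, memory):
    """Load RIM10B-format blocks; return the address in the final JRST."""
    words = iter(tape)
    while True:
        adr = next(words)                  # state RD+1: IOWD or JRST
        if not adr & SIGN:                 # SKIPL does not skip: a JRST
            return adr & HALF              # XCT ADR jumps to the program
        cksm = adr                         # SKIPL loaded the IOWD into CKSM
        while True:                        # state RD+0: data words
            data = next(words)
            memory[((adr & HALF) + 1) & HALF] = data   # store at 1(ADR)
            cksm = (cksm + data) & MASK36              # ADD CKSM,1(ADR)
            left = ((adr >> 18) + 1) & HALF            # AOBJN ADR,RD
            adr = (left << 18) | ((adr + 1) & HALF)
            if not adr & SIGN:
                break
        if next(words) != cksm:            # state RD-1: CAME CKSM,ADR
            raise ValueError("checksum mismatch")      # JRST 4,ST halts


# The two-word example tape from earlier in this document.
tape = [
    0o777776_000777,   # IOWD -2,777
    0o201740_003777,   # first data word
    0o505740_777600,   # second data word
    0o707677_004576,   # checksum
    0o254000_001000,   # JRST 1000
]
memory = {}
start = rim10b_load(tape, memory)
print(f"{start:o}")    # prints 1000
```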
Storing bootstrap in core memory
The FM ENB switch enables Fast Memory, causing references to the
accumulators (locations 00 through 17) to go to RAM instead of core
memory. When FM ENB is off, the above bootstrap can be toggled into
locations 01 through 16. (Locations 00 and 17 need not be
initialized.)
_________________________________________________________________
Notes
History of the RIM10B loader
From: Bob Clements
To: inwap@best.com
Subject: Re: RIM10B bootstrap loader for the PDP-10
Newsgroups: alt.folklore.computers,alt.sys.pdp10
Hi, Joe,
As I said in a previous article about RUNOFF, I'm avoiding USENET
postings until I get a little spare time to fake my address to avoid
email spam. Feel free to post this if you omit my email address except
in the form I put on the last line.
>>The true magic of the PDP-10 instruction set was in the RIM-10B loader.
>>The program is 14 instructions long and uses 2 accumlators for data.
>>It reads in 36-bit words from an 8-bit paper tape reader, deposits the
>>data in memory, and verifies the checksums of the program it is loading.
Add "and can be restarted on a block boundary in case of a checksum
error", which was another requirement of the task.
>Somewhere I read the story of how it was created; a really bright hacker
>was tricked into creating it when his colleagues kept saying that it
>couldn't be done. Anyone have the details?
> -Joe
No trickery. Just the challenge of doing it.
I think I posted this some years ago. If anyone has an old copy they
can compare my current fading memory to the old version. Here's how I
remember it now.
This loader was written in an all-out brainstorming effort. It
happened at DEC, Maynard, building 5 rather than at TMRC or Project
MAC where other such fests happened.
Somehow the challenge came up of writing a paper tape loader that
would not require the use of any fixed memory locations. The idea was
that any program might be loaded in pieces, and you wouldn't want to
clobber any previous part with storage/code used by the loader. Also,
to take dumps of a dead program, we didn't want to clobber any core.
This should fit entirely in the ACs.
A few previous attempts had been done, but they all took somewhat more
than sixteen words. Finally, a bunch of serious bit bummers decided to
work on it and get it solved.
My memory may be wrong, but I think the group that worked on it
included Alan Kotok, Tom Eggers, Dave Gross and myself, and a couple
more whom I'm not so sure of. Maybe Peter Hurley? Maybe Tom Hastings?
Anyway, we came up with a LOT of ways of doing it in fifteen
instructions plus the two registers to hold the checksum and the AOBJN
pointer. Seventeen words in all. We considered using location 40 for
the 17th word, but that didn't feel fair.
Then, out of the blue, Dave Gross came in with this wonderful hack,
using two indexed XCT instructions, which was totally unlike anything
else we had tried. It used thirteen instructions plus the registers,
so it could fit in the ACs WITH the count word needed by the RIM
hardware! Register zero was actually a spare. Most other attempts had
used it for the checksum.
There was great rejoicing, and the quiet, reserved Dave Gross actually
looked quite pleased with himself.
Bob Clements, K1BC, my-last-name at BBN dot COM. (w) +1 617 USE K1BC
Disclaimer
From: JCGreen@ix.netcom.com (John C Green Jr)
By the time the 1971 edition of "PDP-10 Reference Handbook" was
published there had been so many questions asked by people using it as
an example of good programming technique that this comment was added
in the margin:
This loader is written for min-
imum size and is quite com-
plex. Do not approach it as a
simple programming example.
Note from Dave
From: "Dave Gross HLO2-2/B10, pole G13, dtn 225-4317 31-Mar-1998 1403
-0500"
Subject: RE: History of Rim10b loader
Date: Tue, 31 Mar 98 14:03:56 EST
I am the Dave Gross mentioned by Bob Clements in the History of the
rim10b loader page. I don't remember Peter Hurley or Tom Hastings
working on the code, but there was someone else. I'm not sure
who...maybe Peter Conklin or Dave Stone ... who motivated the effort.
The problem was presented to me as a theoretical one: is it possible
for a paper tape loader to fit in the register space, load data,
compute and check checksums, and jump to the loaded program when done?
The others came close as Bob pointed out. My challenge was to "bum"
one more location out of the loader. I don't remember how I found that
SOJA hack, but when the dust settled, the loader was 2 instructions
shorter. Indeed, I nearly broke my arm patting myself on the back.
Then, to my surprise, we actually made use of the loader for most
paper tapes. The loader was written up in the programming manuals but
very few programmers could figure it out. I kept getting phone calls
about that loader for years afterward - right up to the time the 10/20
line was retired.
Dave
Effective address calculation
Later versions of the "Processor Reference Manual" had this paragraph
repeated twice:
PLEASE READ THIS
The calculation of E is the first step in the execution of every
instruction. No other action taken by any instruction, no matter
what it is, can possibly precede that calculation. There is
absolutely nothing whatsoever that any instruction can do to any
accumulator or memory location that can in any way affect its own
effective address calculation.
Note that "A: SOJA A," does not mean "decrement accumulator 10 and
then set the PC to the current value of that accumulator". Instead,
the effective address E is calculated first, then the accumulator is
decremented, then the PC is set to the remembered value of E.
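The same ordering applies to the loader's own "A: SOJA A," at location
10, where the instruction's address field is also the right half of
the accumulator being decremented. The Python sketch below models only
that ordering; the function name soja_self is invented, and the right
half of A is represented as a plain integer.

```python
def soja_self(a_right):
    """Model `A: SOJA A,` where A holds both the instruction and the index.

    Returns (new right half of A, address jumped to).
    """
    e = a_right              # E is calculated first, from the old contents
    a_right = a_right - 1    # only then is the accumulator decremented
    return a_right, e        # the jump goes to the remembered E


RD = 0o3
a = RD + 1                   # HRRI A,RD+1 resets the state machine
targets = []
for _ in range(3):
    a, target = soja_self(a)
    targets.append(target)
print(targets)               # prints [4, 3, 2]
```

The three jump targets are RD+1, RD and RD-1 (ST1), matching the three
states of the loader.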