Multiplication instructions
Opcode | P/U | Category | Description |
DSL | user | ALU: multiply | double shift left |
MH | user | ALU: multiply | multiply high |
MHL | user | ALU: multiply | multiply high and low |
MHL0 | user | ALU: multiply | multiply high and low, tribble 0 |
MHL1 | user | ALU: multiply | multiply high and low, tribble 1 |
MHL2 | user | ALU: multiply | multiply high and low, tribble 2 |
MHL3 | user | ALU: multiply | multiply high and low, tribble 3 |
MHL4 | user | ALU: multiply | multiply high and low, tribble 4 |
MHL5 | user | ALU: multiply | multiply high and low, tribble 5 |
MHNS | user | ALU: multiply | multiply high no shift |
ML | user | ALU: multiply | multiply low |
DSL
Double shift left
Syntax |
c = a dsl b |
Register | Signedness |
All | ignored |
1 opcode only |
Flag | Set if and only if |
N | bit 35 of the result is set |
Z | all result bits are zero |
T | flag does not change |
R | flag does not change |
DSL (double shift left) is a critical instruction for long multiplication, providing in one CPU instruction what would otherwise take four instructions. Sample code is available under 36-bit multiplication.
DSL adds the T flag with wrapping to b, and then shifts the sum left six bits. The six vacated bits are filled using the six leftmost bits of a. The result is written to c.
N and Z are set as if the destination is a signed register. The N and Z flags have no purpose in the long multiplication application for which DSL was designed, but I chose to update them in case someone invents a use for this information at a later date. T and R do not change.
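The behavior described above can be sketched in Python. This is only an illustrative model under stated assumptions, not the hardware or firmware implementation: the function name `dsl`, the constant `MASK36`, and the tuple return value are hypothetical, and both operands are assumed to already fit in 36 bits.

```python
MASK36 = (1 << 36) - 1  # assumed helper: mask for a 36-bit word

def dsl(a, b, t_flag):
    """Hypothetical model of c = a dsl b.

    Returns (c, n_flag, z_flag). T and R are not modeled as outputs
    because the instruction leaves them unchanged.
    """
    total = (b + int(t_flag)) & MASK36        # add T to b, wrapping at 36 bits
    c = ((total << 6) & MASK36) | (a >> 30)   # shift left 6; fill with the six leftmost bits of a
    n_flag = bool((c >> 35) & 1)              # N: bit 35 of the result
    z_flag = c == 0                           # Z: all result bits are zero
    return c, n_flag, z_flag

# The top tribble of a (63) lands in the six vacated low bits of the result.
print(dsl(0o770000000000, 1, 0))  # (127, False, False)
```

With T set, the same call would shift 2 rather than 1 into the upper positions, which is how a carry from a previous step could be folded in.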
This documentation does not match what the dissertation says about DSL, in that the left and right operands have since been interchanged. This switch was made so that DSL can directly obtain the correct register copy after an MHL5 instruction in long multiplication.
MH
Multiply high
Syntax |
c = a mh b |
Register | Signedness |
All | ignored |
1 opcode only |
Flag | Set if and only if |
N | never; flag is cleared |
Z | c = 0 |
T | c mod 64 ≠ 0 |
R | T is set or R is already set |
This is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. The smaller of the factors must be copied into all of the tribbles via CX or an assembler constant. MH multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, MH retains only the six most significant bits of each 12-bit result. ML is the complementary instruction that retains the six least significant bits of each.
To meaningfully add the output of MH and ML, their place values must be aligned consistently. This means that MH needs a 6-position left shift, and that the result of that shift can spill to as many as 42 bits, which will not fit in a 36-bit register. The solution is that instead of shifting, MH rotates its result six bits left. If the six bits rotated into the rightmost places are not all zeros, the T and R flags are set because the eventual product will not fit in 36 bits. Otherwise T is cleared, R is left unchanged, and the output of MH can be directly added to ML to obtain the 36-bit product. Z will be set if the output of MH is all zeros. N is always cleared.
Here is an unsigned short multiplication example with full range checking, and an always-accurate Z flag at the end whether or not overflow occurs. Four instructions are needed. When multiplying by a small constant, the CX can be optimized out by hand.
unsigned big small t result    ; will multiply big * small
t = cx small                   ; copy small into all tribbles
result = big mh t              ; high bits of product
t = big ml t                   ; low bits of product
result = result + t            ; result is now big * small
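To check the arithmetic behind this four-instruction sequence, here is a rough Python model. It is a sketch only: the names `tribbles`, `pack`, `ml`, `mh`, and `short_multiply` are hypothetical, the replication step merely stands in for CX, and the model reports only the overflow that MH detects through T. Any range checking performed by the final addition is not modeled.

```python
MASK36 = (1 << 36) - 1  # assumed helper: mask for a 36-bit word

def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, least significant first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]

def pack(ts):
    """Reassemble six tribbles (least significant first) into one word."""
    return sum(t << (6 * i) for i, t in enumerate(ts))

def ml(a, b):
    """Model of ML: keep the low six bits of each pairwise tribble product."""
    return pack([(x * y) & 0o77 for x, y in zip(tribbles(a), tribbles(b))])

def mh(a, b):
    """Model of MH: keep the high six bits of each product, rotate left six.

    Returns (result, t_flag), where t_flag follows the rule c mod 64 != 0.
    """
    high = pack([(x * y) >> 6 for x, y in zip(tribbles(a), tribbles(b))])
    rotated = ((high << 6) & MASK36) | (high >> 30)
    return rotated, rotated % 64 != 0

def short_multiply(big, small):
    """Model of the MH + ML sequence; returns (36-bit sum, MH overflow flag)."""
    if not 0 <= small <= 63:
        raise ValueError("small must fit in six bits")  # modeling choice, not CX behavior
    t = pack([small] * 6)            # stands in for: t = cx small
    high, overflow = mh(big, t)      # result = big mh t
    low = ml(big, t)                 # t = big ml t
    return (high + low) & MASK36, overflow

print(short_multiply(123456789, 37))  # (4567901193, False)
```

When the top tribble of `big` times `small` needs more than six bits, as in `short_multiply(MASK36, 2)`, the rotated-in tribble is nonzero and the model's overflow flag comes back true.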
Warning
Like other macros, CX is not yet available. Although it may be tempting to use SWIZ in place of CX like this:
t = small swiz 0
SWIZ will not check to verify small is between 0 and 63. CX will have this verification and set T and R if small is out of range.
Note about replacing MH with MHL
The MHL family of instructions cannot improve on the performance of MH and ML for short multiplication, because MH is able to include a 6-bit shift that MHL and its derivatives cannot. (The issue is that only the beta RAMs can shift six bits, and only the gamma RAMs can split registers.) The MHL family is for long multiplication.
MHL
Multiply high and low
Syntax |
c = a mhl b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL is the flagship of the MHL family of instructions for long multiplication and is the most flexible, although ordinarily the MHL0 through MHL5 instructions are used instead. MHL is a simultaneous execution of MHNS and ML, where the left and right copies of register c are allowed to desynchronize. Specifically, the MHNS result is stored in the left copy of register c, and the ML result is stored in the right copy.
For a drawing that shows the two copies of the register file in relation to the architecture, see page 200 of the dissertation. A short discussion of register splitting can be found on page 187; however, that discussion assumes the presence of a fast hardware multiplier that stores an entire multiplication result in a split register. MHL stores a partial result.
MHL multiplies the tribbles of a and b pairwise. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. These writes are done simultaneously. To preclude any semantic confusion as to whether flags follow the left or right copy of a result, none of the MHL instructions change any CPU flags at all.
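The split write can be pictured with a small Python model. This is an illustrative sketch, not a description of the hardware: the names `tribbles`, `pack`, and `mhl` are hypothetical, and the two register copies are represented simply as a `(left, right)` tuple.

```python
def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, least significant first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]

def pack(ts):
    """Reassemble six tribbles (least significant first) into one word."""
    return sum(t << (6 * i) for i, t in enumerate(ts))

def mhl(a, b):
    """Hypothetical model of c = a mhl b.

    Returns (left_copy, right_copy): the high six bits of each pairwise
    tribble product go to the left copy, the low six bits to the right.
    No flags are modeled because MHL changes none.
    """
    products = [x * y for x, y in zip(tribbles(a), tribbles(b))]
    left = pack([p >> 6 for p in products])
    right = pack([p & 0o77 for p in products])
    return left, right

# 10 * 25 = 250 = 3 * 64 + 58, so tribble 0 holds 3 on the left and 58 on the right.
print(mhl(10, 25))  # (3, 58)
```

Each tribble position of the two copies, taken together, holds one complete 12-bit product.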
The differing values in the left and right copies of c can be selected in subsequent instructions by using the left and right operand positions of subsequent ALU instructions. Certain ALU instructions such as shifts are not symmetric in their left and right operands, so very unusual code may require an intervening instruction to transfer a value from one copy of the register file to the other copy. There are also two instructions that you probably don’t need to worry about after MHL, namely BOUND and WCM, where the syntactic left operand is actually the electrically right operand and vice versa. Also, most assignment instructions place the electrically left operand on the right side of the = sign.
MHL0
Multiply high and low, tribble 0
Syntax |
c = a mhl0 b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL0 replicates tribble 0 (bits 0–5) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL0, it is equivalent to:
t = a swiz 000000000000`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
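As a rough illustration of why these instructions suit long multiplication, the Python sketch below models the whole MHL0 through MHL5 family with one function parameterized by the tribble number. The names `tribbles`, `pack`, and `mhl_n` are hypothetical, and the final check is plain integer arithmetic rather than anything the CPU does in a single step.

```python
def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, least significant first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]

def pack(ts):
    """Reassemble six tribbles (least significant first) into one word."""
    return sum(t << (6 * i) for i, t in enumerate(ts))

def mhl_n(n, a, b):
    """Hypothetical model of c = a mhlN b for N in 0..5.

    Tribble n of a is replicated across all six subwords and multiplied
    pairwise with the tribbles of b. Returns (left_copy, right_copy).
    """
    factor = (a >> (6 * n)) & 0o77             # the replicated tribble of a
    products = [factor * y for y in tribbles(b)]
    left = pack([p >> 6 for p in products])    # high six bits of each product
    right = pack([p & 0o77 for p in products]) # low six bits of each product
    return left, right

a, b = 0o123456701234, 0o765432107654
for n in range(6):
    left, right = mhl_n(n, a, b)
    # Realigning the high halves by one tribble recovers (tribble n of a) * b.
    assert right + (left << 6) == ((a >> (6 * n)) & 0o77) * b
```

The assertion shows that the two register copies together carry one full partial product, with the left copy offset by six bit positions.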
MHL1
Multiply high and low, tribble 1
Syntax |
c = a mhl1 b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL1 replicates tribble 1 (bits 6–11) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL1, it is equivalent to:
t = a swiz 010101010101`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL2
Multiply high and low, tribble 2
Syntax |
c = a mhl2 b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL2 replicates tribble 2 (bits 12–17) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL2, it is equivalent to:
t = a swiz 020202020202`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL3
Multiply high and low, tribble 3
Syntax |
c = a mhl3 b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL3 replicates tribble 3 (bits 18–23) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL3, it is equivalent to:
t = a swiz 030303030303`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL4
Multiply high and low, tribble 4
Syntax |
c = a mhl4 b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL4 replicates tribble 4 (bits 24–29) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL4, it is equivalent to:
t = a swiz 040404040404`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHL5
Multiply high and low, tribble 5
Syntax |
c = a mhl5 b |
Register | Signedness |
All | ignored |
1 opcode only |
No flags changed |
MHL5 replicates tribble 5 (bits 30–35) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.
Except that only one instruction is required for MHL5, it is equivalent to:
t = a swiz 050505050505`o
c = t mhl b
The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
MHNS
Multiply high no shift
Syntax |
c = a mhns b |
Register | Signedness |
All | ignored |
1 opcode only |
Flag | Set if and only if |
N | never; flag is cleared |
Z | all result bits are zero |
T | flag does not change |
R | flag does not change |
MHNS was formerly a key instruction for unsigned long multiplication, where two 36-bit factors are multiplied as 6-bit tribbles and the partial products eventually sum to produce a 72-bit result. MHNS multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, MHNS retains only the six most significant bits of each result. The tribbles are output in their original positions, instead of being rotated left as with MH. The Z flag is set if the outcome of MHNS is all zeros, and cleared otherwise. N is always cleared, and T and R do not change.
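The way tribble-wise partial products sum to a 72-bit result can be checked with a short Python model. This is a sketch of the arithmetic only, using Python's unbounded integers. It is not the instruction sequence from the dissertation, and the names `tribbles`, `pack`, `mhns`, `ml`, and `long_multiply` are hypothetical.

```python
def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, least significant first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]

def pack(ts):
    """Reassemble six tribbles (least significant first) into one word."""
    return sum(t << (6 * i) for i, t in enumerate(ts))

def mhns(a, b):
    """Model of MHNS: high six bits of each pairwise product, not rotated."""
    return pack([(x * y) >> 6 for x, y in zip(tribbles(a), tribbles(b))])

def ml(a, b):
    """Model of ML: low six bits of each pairwise product."""
    return pack([(x * y) & 0o77 for x, y in zip(tribbles(a), tribbles(b))])

def long_multiply(a, b):
    """Sum six partial products, one per tribble of a, into a 72-bit result."""
    total = 0
    for n, digit in enumerate(tribbles(a)):
        rep = pack([digit] * 6)                      # digit copied into all tribbles
        partial = ml(rep, b) + (mhns(rep, b) << 6)   # digit * b, up to 42 bits
        total += partial << (6 * n)                  # align to the digit's place value
    return total

m = (1 << 36) - 1
assert long_multiply(m, m) == m * m
```

On the real machine the 42-bit partial products and the running 72-bit sum must be spread across several 36-bit registers, which is where most of the instruction count comes from.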
MHNS has been supplanted by the MHL family of instructions, allowing the number of instructions required for long multiplication to be reduced from 47 to 35. But the MHL opcodes require a little more hardware and firmware loader support, due to their register splitting. Architectures derived from Dauug|36 that either do not split registers or have reduced ALU memory may benefit from using MHNS to multiply. For sample code showing how this used to be done, see page 113 of the dissertation.
ML
Multiply low
Syntax |
c = a ml b |
Register | Signedness |
All | ignored |
1 opcode only |
Flag | Set if and only if |
N | never; flag is cleared |
Z | all result bits are zero |
T | flag does not change |
R | flag does not change |
ML is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. It was also a key instruction for unsigned long multiplication until being supplanted by the MHL family.
The smaller of the factors must be copied into all of the tribbles via CX or an assembler constant. ML multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, ML retains only the six least significant bits of each 12-bit result. The Z flag is set if the outcome of ML is all zeros, and cleared otherwise. N is always cleared, and T and R do not change. See MH and MHNS for more information and sample code.