Multiplication instructions

Opcode	P/U	Category	Description
`DSL`	user	ALU: multiply	double shift left
`MH`	user	ALU: multiply	multiply high
`MHL`	user	ALU: multiply	multiply high and low
`MHL0`	user	ALU: multiply	multiply high and low, tribble 0
`MHL1`	user	ALU: multiply	multiply high and low, tribble 1
`MHL2`	user	ALU: multiply	multiply high and low, tribble 2
`MHL3`	user	ALU: multiply	multiply high and low, tribble 3
`MHL4`	user	ALU: multiply	multiply high and low, tribble 4
`MHL5`	user	ALU: multiply	multiply high and low, tribble 5
`MHNS`	user	ALU: multiply	multiply high no shift
`ML`	user	ALU: multiply	multiply low

`DSL` Double shift left

Syntax

c = a dsl b

Register	Signedness
All	ignored
	1 opcode only

Flag	Set if and only if
`N`	bit 35 of the result is set
`Z`	all result bits are zero
`T`	flag does not change
`R`	flag does not change

DSL (double shift left) is a critical instruction for long multiplication, providing in one CPU instruction what would otherwise take four instructions. Sample code is available under 36-bit multiplication.

DSL adds the T flag with wrapping to b, and then shifts the sum left six bits. The six vacated bits are filled using the six leftmost bits of a. The result is written to c.

N and Z are set as if the destination is a signed register. The N and Z flags have no purpose in the long multiplication application for which DSL was designed, but I chose to update them in case someone invents a use for this information at a later date. T and R do not change.
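
The behavior described above can be modeled in a few lines of Python. This is only a sketch for reasoning about DSL, not the ALU implementation. It assumes that the "six leftmost bits of a" are bits 30–35 and that they land in bits 0–5 of the result in the same order. The names `dsl`, `t_flag`, and `MASK36` are made up for this illustration.

```python
MASK36 = (1 << 36) - 1  # a 36-bit register holds 0 .. 2**36 - 1


def dsl(a, b, t_flag):
    """Model of c = a dsl b. Returns (c, N, Z); T and R are untouched."""
    total = (b + int(t_flag)) & MASK36        # add T to b, with wrapping
    fill = (a & MASK36) >> 30                 # six leftmost bits of a (assumed bits 30-35)
    c = ((total << 6) & MASK36) | fill        # shift left six, fill the vacated bits
    n = bool(c >> 35)                         # N: bit 35 of the result
    z = c == 0                                # Z: all result bits are zero
    return c, n, z
```

For example, `dsl(0o770000000000, 1, False)` yields a result of `0o177`: the 1 in b moves up one tribble, and the top tribble of a fills the bottom.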

This documentation does not match what the dissertation says about DSL, in that the left and right operands have since been interchanged. This switch was made so that DSL can directly obtain the correct register copy after an MHL5 instruction in long multiplication.

`MH` Multiply high

Syntax

c = a mh b

Register	Signedness
All	ignored
	1 opcode only

Flag	Set if and only if
`N`	never; flag is cleared
`Z`	`c` = 0
`T`	`c` mod 64 ≠ 0
`R`	`T` is set or `R` is already set

This is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. The smaller of the factors must be copied into all of the tribbles via CX or an assembler constant. MH multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, MH retains only the six most significant bits of each 12-bit result. ML is the complementary instruction that retains the six least significant bits of each.

To meaningfully add the outputs of MH and ML, their place values must be aligned consistently. This means that MH needs a 6-position left shift, and that the shift can spill the result to as many as 42 bits, which will not fit in a 36-bit register. The solution is that instead of shifting, MH rotates its result six bits left. If the six bits rotated into the rightmost places are not all zeros, the T and R flags are set because the eventual product will not fit in 36 bits. Otherwise T is cleared, R is left unchanged, and the output of MH can be directly added to the output of ML to obtain the 36-bit product. Z is set if the output of MH is all zeros. N is always cleared.
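
As a rough illustration of the rotate and the flag rules, here is a Python model of MH. It is a sketch based only on the description and flag table above. The names `mh`, `tribbles`, and `r_flag` (the value of R before the instruction) are invented for this example.

```python
MASK36 = (1 << 36) - 1


def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def mh(a, b, r_flag=False):
    """Model of c = a mh b. Returns (c, flags)."""
    high = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        high |= ((x * y) >> 6) << (6 * i)     # keep the six high bits of each product
    c = ((high << 6) | (high >> 30)) & MASK36  # rotate left six bits
    t = c % 64 != 0                            # nonzero bits wrapped around to tribble 0
    return c, {"N": False, "Z": c == 0, "T": t, "R": t or r_flag}
```

With b holding 63 in every tribble, `mh(0o77, 0o777777777777)` gives 3968 with T clear, because 63 × 63 = 3969 = 62 × 64 + 1 and the 62 moves up one tribble. `mh(0o770000000000, 0o777777777777)` gives 62 with T and R set, because the same 62 wraps around from tribble 5 into tribble 0.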

Here is an unsigned short multiplication example with full range checking, and an always-accurate Z flag at the end whether or not overflow occurs. Four instructions are needed. When multiplying by a small constant, the CX can be optimized out by hand.

unsigned big small t result     ; will multiply big * small

t = cx small                    ; copy small into all tribbles
result = big mh t               ; high bits of product
t = big ml t                    ; low bits of product
result = result + t             ; result is now big * small
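
The four-instruction sequence above can be checked with a small Python model. This is a sketch under assumptions, not a simulator: CX is modeled as plain replication of an in-range value, and the final add is assumed to report a carry out of bit 35 as overflow. `short_multiply` and `pairwise` are names invented for this illustration.

```python
MASK36 = (1 << 36) - 1
ONES = 0o010101010101  # the value 1 in every tribble


def tribbles(x):
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def pairwise(a, b, pick):
    """Multiply tribbles pairwise and keep pick(product) from each."""
    out = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        out |= pick(x * y) << (6 * i)
    return out


def short_multiply(big, small):
    """Return (result, overflow) for big * small, with 0 <= small <= 63."""
    assert 0 <= big <= MASK36 and 0 <= small <= 63
    t = small * ONES                                  # t = cx small
    high = pairwise(big, t, lambda p: p >> 6)
    result = ((high << 6) | (high >> 30)) & MASK36    # result = big mh t
    overflow = result % 64 != 0                       # T as set by MH
    t = pairwise(big, t, lambda p: p & 0o77)          # t = big ml t
    total = result + t                                # result = result + t
    overflow = overflow or total > MASK36             # assumed: the add flags a carry
    return total & MASK36, overflow
```

`short_multiply(1000000, 37)` returns `(37000000, False)`. Note that the add can still carry out even when MH leaves T clear, for example with `big = 2 * 64**5 - 1` and `small = 63`, so in this model the range check relies on both instructions.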

Warning

Like other macros, CX is not yet available. Although it may be tempting to use SWIZ in place of CX like this:

t = small swiz 0

SWIZ does not verify that small is between 0 and 63. CX will perform this verification and set T and R if small is out of range.
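
To see why the missing check matters, here is a hedged Python sketch of the difference. It assumes that `small swiz 0` copies tribble 0 of small into every tribble, as the MHL0 equivalence on this page suggests. CX is not yet available, so the value that `cx` returns for an out-of-range input is purely a guess. Both function names are invented for this example.

```python
ONES = 0o010101010101  # the value 1 in every tribble


def swiz0(x):
    """Assumed model of x swiz 0: replicate tribble 0 with no range check."""
    return (x & 0o77) * ONES


def cx(x):
    """Assumed model of CX. Returns (value, out_of_range).

    The out-of-range indication stands in for setting T and R. The
    returned value for out-of-range input is a guess, not documented.
    """
    return swiz0(x), not (0 <= x <= 63)
```

With `small = 100`, `swiz0(100)` silently produces 36 in every tribble (100 mod 64 = 36), so the multiplication that follows would compute `big * 36` with no flag raised. The `cx` model reports the same input as out of range.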

Note about replacing `MH` with `MHL`

The MHL family of instructions cannot improve over the performance of MH and ML for short multiplication, because MH is able to include a 6-bit shift that MHL and its derivatives cannot. (The issue is that only the beta RAMs can shift six bits, and only the gamma RAMs can split registers.) The MHL family is for long multiplication.

`MHL` Multiply high and low

Syntax

c = a mhl b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL is the flagship of the MHL family of instructions for long multiplication and is the most flexible, although ordinarily the MHL0 through MHL5 instructions are used instead. MHL is a simultaneous execution of MHNS and ML, where the left and right copies of register c are allowed to desynchronize. Specifically, the MHNS result is stored in the left copy of register c, and the ML result is stored in the right copy.

For a drawing that shows the two copies of the register file in relation to the architecture, see page 200 of the dissertation. A short discussion of register splitting can be found on page 187; however, that discussion assumes the presence of a fast hardware multiplier that stores an entire multiplication result in a split register. MHL stores a partial result.

MHL multiplies the tribbles of a and b pairwise. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. These writes are done simultaneously. To preclude any semantic confusion as to whether flags follow the left or right copy of a result, none of the MHL instructions change any CPU flags at all.
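
The split write can be pictured with a short Python model that returns the two copies of c as a pair. This is a sketch of the arithmetic only; it says nothing about how the register file is built. `mhl` and `tribbles` are names chosen for this illustration.

```python
def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def mhl(a, b):
    """Model of c = a mhl b. Returns (left copy of c, right copy of c)."""
    left = right = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        product = x * y                        # 12-bit product of one tribble pair
        left |= (product >> 6) << (6 * i)      # six most significant bits (as MHNS)
        right |= (product & 0o77) << (6 * i)   # six least significant bits (as ML)
    return left, right                         # no flags are modeled: none change
```

For instance, `mhl(0o777777777777, 0o777777777777)` returns `(0o767676767676, 0o010101010101)`, since every tribble product is 63 × 63 = 62 × 64 + 1.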

The differing values in the left and right copies of c can be selected in subsequent instructions by using the left and right operand positions of subsequent ALU instructions. Certain ALU instructions such as shifts are not symmetric in their left and right operands, so very unusual code may require an intervening instruction to transfer a value from one copy of the register file to the other copy. There are also two instructions that you probably don’t need to worry about after MHL, namely BOUND and WCM, where the syntactic left operand is actually the electrically right operand and vice versa. Also, most assignment instructions place the electrically left operand on the right side of the = sign.

`MHL0` Multiply high and low, tribble 0

Syntax

c = a mhl0 b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL0 replicates tribble 0 (bits 0–5) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.

Except that only one instruction is required for MHL0, it is equivalent to:

t = a swiz 000000000000`o
c = t mhl b

The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
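
A Python sketch of the whole MHL0 through MHL5 group follows, parameterized by the tribble number n. It models the replicate-then-multiply behavior described above and assumes nothing beyond it. `mhl_n` and `mhl` are illustrative names.

```python
ONES = 0o010101010101  # the value 1 in every tribble


def tribbles(x):
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def mhl(a, b):
    """Return (left copy, right copy) of the pairwise tribble products."""
    left = right = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        left |= ((x * y) >> 6) << (6 * i)
        right |= ((x * y) & 0o77) << (6 * i)
    return left, right


def mhl_n(n, a, b):
    """Model of c = a mhlN b for N = n in 0..5."""
    replicated = tribbles(a)[n] * ONES   # the swiz step: tribble n of a, six times
    return mhl(replicated, b)            # the mhl step
```

`mhl_n(0, a, b)` corresponds to MHL0, `mhl_n(1, a, b)` to MHL1, and so on through MHL5.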

`MHL1` Multiply high and low, tribble 1

Syntax

c = a mhl1 b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL1 replicates tribble 1 (bits 6–11) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.

Except that only one instruction is required for MHL1, it is equivalent to:

t = a swiz 010101010101`o
c = t mhl b

The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.

`MHL2` Multiply high and low, tribble 2

Syntax

c = a mhl2 b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL2 replicates tribble 2 (bits 12–17) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.

Except that only one instruction is required for MHL2, it is equivalent to:

t = a swiz 020202020202`o
c = t mhl b

The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.

`MHL3` Multiply high and low, tribble 3

Syntax

c = a mhl3 b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL3 replicates tribble 3 (bits 18–23) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.

Except that only one instruction is required for MHL3, it is equivalent to:

t = a swiz 030303030303`o
c = t mhl b

The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.

`MHL4` Multiply high and low, tribble 4

Syntax

c = a mhl4 b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL4 replicates tribble 4 (bits 24–29) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.

Except that only one instruction is required for MHL4, it is equivalent to:

t = a swiz 040404040404`o
c = t mhl b

The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.

`MHL5` Multiply high and low, tribble 5

Syntax

c = a mhl5 b

Register	Signedness
All	ignored
	1 opcode only

No flags changed

MHL5 replicates tribble 5 (bits 30–35) of a across all six subwords, and then multiplies them pairwise with the tribbles of b. In order to fit the six 12-bit results into c, the six most significant bits of each product are written to the left copy of c, and the six least significant bits of each product are written to the right copy of c. No flags are changed. See also MHL.

Except that only one instruction is required for MHL5, it is equivalent to:

t = a swiz 050505050505`o
c = t mhl b

The MHL0–MHL5 instructions can be seen in action under 36-bit multiplication.
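
As a sanity check on why six MHLn instructions suffice, the following Python sketch adds up the twelve partial results with their place values and recovers the full 72-bit product. It is not the instruction sequence from the 36-bit multiplication page: Python's unbounded integers stand in for the DSL and add steps that the real code uses. All function names are invented for this example.

```python
ONES = 0o010101010101  # the value 1 in every tribble


def tribbles(x):
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def mhl_n(n, a, b):
    """Return (left, right) copies for tribble n of a times each tribble of b."""
    t = tribbles(a)[n] * ONES
    left = right = 0
    for i, (x, y) in enumerate(zip(tribbles(t), tribbles(b))):
        left |= ((x * y) >> 6) << (6 * i)
        right |= ((x * y) & 0o77) << (6 * i)
    return left, right


def long_multiply(a, b):
    """Combine MHL0..MHL5 results into the full product of two 36-bit values."""
    product = 0
    for n in range(6):
        left, right = mhl_n(n, a, b)
        # The left copy is one tribble more significant than the right copy,
        # and the pair as a whole is weighted by the position of tribble n.
        product += ((left << 6) + right) << (6 * n)
    return product
```

For any 36-bit a and b, `long_multiply(a, b)` equals `a * b`, because `(left << 6) + right` for tribble n is exactly that tribble of a multiplied by all of b.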

`MHNS` Multiply high no shift

Syntax

c = a mhns b

Register	Signedness
All	ignored
	1 opcode only

Flag	Set if and only if
`N`	never; flag is cleared
`Z`	all result bits are zero
`T`	flag does not change
`R`	flag does not change

MHNS is a former key instruction for unsigned long multiplication, where two 36-bit factors are multiplied as 6-bit tribbles and eventually sum to produce a 72-bit result. MHNS multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, MHNS retains only the six most significant bits of each result. The tribbles are output in their original positions, instead of being rotated left as with MH. The Z flag is set if the outcome of MHNS is all zeros, and cleared otherwise. N is always cleared, and T and R do not change.
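
For comparison with MH, here is a minimal Python model of MHNS based on the description above. The helper `rotl6` is included only to show how the two results relate. All names are made up for this sketch.

```python
MASK36 = (1 << 36) - 1


def tribbles(x):
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def mhns(a, b):
    """Model of c = a mhns b. Returns (c, flags); T and R do not change."""
    c = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        c |= ((x * y) >> 6) << (6 * i)   # high six bits, left in their original position
    return c, {"N": False, "Z": c == 0}


def rotl6(x):
    """Rotate a 36-bit word left by six bits, as MH does to its result."""
    return ((x << 6) | (x >> 30)) & MASK36
```

`mhns(0o77, 0o777777777777)` produces 62 in tribble 0, whereas MH would produce `rotl6(62)`, which is 3968, for the same operands.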

MHNS has been supplanted by the MHL family of instructions, allowing the number of instructions required for long multiplication to be reduced from 47 to 35. But the MHL opcodes require a little more hardware and firmware loader support, due to their register splitting. Architectures derived from Dauug|36 which either do not split registers or have reduced ALU memory may benefit from using MHNS to multiply. For sample code showing how this used to be done, see page 113 of the dissertation.

`ML` Multiply low

Syntax

c = a ml b

Register	Signedness
All	ignored
	1 opcode only

Flag	Set if and only if
`N`	never; flag is cleared
`Z`	all result bits are zero
`T`	flag does not change
`R`	flag does not change

ML is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. It was also a key instruction for unsigned long multiplication until being supplanted by the MHL family.

The smaller of the factors must be copied into all of the tribbles via CX or an assembler constant. ML multiplies the tribbles of a and b pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of c. Instead, ML retains only the six least significant bits of each 12-bit result. The Z flag is set if the outcome of ML is all zeros, and cleared otherwise. N is always cleared, and T and R do not change. See MH and MHNS for more information and sample code.
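
A matching Python model of ML, again only a sketch of the behavior described above, with invented names:

```python
def tribbles(x):
    """Split a 36-bit word into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def ml(a, b):
    """Model of c = a ml b. Returns (c, flags); T and R do not change."""
    c = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        c |= ((x * y) & 0o77) << (6 * i)   # keep the six low bits of each product
    return c, {"N": False, "Z": c == 0}
```

`ml(0o77, 0o777777777777)` gives 1, the low tribble of 63 × 63 = 3969. `ml(0o10, 0o10)` gives 0 with Z set, because 8 × 8 = 64 has no low bits.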
