# Multiplication instructions

| Opcode | P/U | Category | Description |
|---|---|---|---|
| `DSL` | user | ALU: multiply | double shift left |
| `MH` | user | ALU: multiply | multiply high |
| `MHL` | user | ALU: multiply | multiply high and low |
| `MHL0` | user | ALU: multiply | multiply high and low, tribble 0 |
| `MHL1` | user | ALU: multiply | multiply high and low, tribble 1 |
| `MHL2` | user | ALU: multiply | multiply high and low, tribble 2 |
| `MHL3` | user | ALU: multiply | multiply high and low, tribble 3 |
| `MHL4` | user | ALU: multiply | multiply high and low, tribble 4 |
| `MHL5` | user | ALU: multiply | multiply high and low, tribble 5 |
| `MHNS` | user | ALU: multiply | multiply high no shift |
| `ML` | user | ALU: multiply | multiply low |

## `DSL`

Double shift left

| Syntax |
|---|
| `c = a dsl b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

| Flag | Set if and only if |
|---|---|
| `N` | bit 35 of the result is set |
| `Z` | all result bits are zero |
| `T` | flag does not change |
| `R` | flag does not change |

`DSL` (double shift left) is a critical instruction for long multiplication, providing in one CPU instruction what would otherwise take four instructions. Sample code is available under 36-bit multiplication.

`DSL` adds the `T` flag with wrapping to `b`, and then shifts the sum left six bits. The six vacated bits are filled using the six leftmost bits of `a`. The result is written to `c`.
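As a cross-check of the description above, here is a minimal Python sketch of the `DSL` data path. The function name `dsl`, the `MASK36` constant, and the use of booleans for flags are assumptions made for illustration; they are not part of the Dauug|36 toolchain, and the sketch models only the result value and the `N` and `Z` flags.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def dsl(a, b, t_flag):
    """Model of c = a dsl b. Returns (c, n_flag, z_flag)."""
    total = (b + (1 if t_flag else 0)) & MASK36  # add T to b, wrapping at 36 bits
    shifted = (total << 6) & MASK36              # shift left six bits, dropping overflow
    c = shifted | ((a & MASK36) >> 30)           # fill vacated bits with a's six leftmost bits
    return c, bool(c >> 35), c == 0              # N follows bit 35, Z means all bits zero


if __name__ == "__main__":
    c, n, z = dsl(0o770000000000, 1, t_flag=True)
    print(oct(c), n, z)
```

With `T` set, `b` = 1 becomes 2 before the shift, so the upper part of the result is 2 × 64 and the low six bits come from the top tribble of `a`.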

`N` and `Z` are set as if the destination is a signed register. The `N` and `Z` flags have no purpose in the long multiplication application for which `DSL` was designed, but I chose to update them in case someone invents a use for this information at a later date. `T` and `R` do not change.

This documentation does not match what the dissertation says about `DSL`, in that the left and right operands have since been interchanged. This switch was made so that `DSL` can directly obtain the correct register copy after an `MHL5` instruction in long multiplication.

## `MH`

Multiply high

| Syntax |
|---|
| `c = a mh b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

| Flag | Set if and only if |
|---|---|
| `N` | never; flag is cleared |
| `Z` | `c` = 0 |
| `T` | `c` mod 64 ≠ 0 |
| `R` | `T` is set or `R` is already set |

This is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. The smaller of the factors must be copied into all of the tribbles via `CX` or an assembler constant. `MH` multiplies the tribbles of `a` and `b` pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of `c`. Instead, `MH` retains only the six most significant bits of each 12-bit result. `ML` is the complementary instruction that retains the six least significant bits of each.

To meaningfully add the output of `MH` and `ML`, their place values must be aligned consistently. This means that `MH` needs a 6-position left shift, and that the result of that shift can spill to as many as 42 bits, which will not fit in a 36-bit register. The solution is that instead of shifting, `MH` rotates its result six bits left. If the six bits rotated into the rightmost places are not all zeros, the `T` and `R` flags are set because the eventual product will not fit in 36 bits. Otherwise `T` is cleared, `R` is left unchanged, and the output of `MH` can be directly added to `ML` to obtain the 36-bit product. `Z` will be set if the output of `MH` is all zeros. `N` is always cleared.

Here is an unsigned short multiplication example with full range checking, and an always-accurate `Z` flag at the end whether or not overflow occurs. Four instructions are needed. When multiplying by a small constant, the `CX` can be optimized out by hand.

```
unsigned big small t result   ; will multiply big * small

t = cx small                  ; copy small into all tribbles
result = big mh t             ; high bits of product
t = big ml t                  ; low bits of product
result = result + t           ; result is now big * small
```
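The arithmetic behind this sequence can be checked with a small Python model. This is only a sketch under stated assumptions: the helper names (`tribbles`, `pack`, `replicate`, `mh`, `ml`, `short_multiply`) are invented here, `replicate` stands in for `CX` without its range check, only the `T` flag of `MH` is modeled, and the final add is a plain wrapping add that does not model any range checking performed by the add instruction itself.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def tribbles(x):
    """Split a 36-bit value into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def pack(ts):
    """Reassemble six tribbles (tribble 0 first) into one value."""
    return sum(t << (6 * i) for i, t in enumerate(ts))


def replicate(small):
    """Copy a 6-bit value into all six tribbles (hypothetical stand-in for CX)."""
    return pack([small & 0o77] * 6)


def ml(a, b):
    """Low six bits of each pairwise tribble product, left in place."""
    return pack([(x * y) & 0o77 for x, y in zip(tribbles(a), tribbles(b))])


def mh(a, b):
    """High six bits of each pairwise product, rotated left six bits.

    Returns (c, t_flag); T is set when the bits rotated into the
    rightmost tribble are not all zero.
    """
    high = pack([(x * y) >> 6 for x, y in zip(tribbles(a), tribbles(b))])
    c = ((high << 6) & MASK36) | (high >> 30)
    return c, (c & 0o77) != 0


def short_multiply(big, small):
    """Mirror the four-instruction sequence; returns (result, t_flag_from_mh)."""
    t = replicate(small)
    high, overflow = mh(big, t)
    low = ml(big, t)
    return (high + low) & MASK36, overflow


if __name__ == "__main__":
    print(short_multiply(1000000, 37))
```

When `MH` leaves `T` clear and the sum fits in 36 bits, the result equals the ordinary integer product, because each 12-bit tribble product contributes its low half in place and its high half one tribble to the left.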

### Warning

Like other macros, `CX` is not yet available. Although it may be tempting to use `SWIZ` in place of `CX` like this:

```
t = small swiz 0
```

`SWIZ` does not verify that `small` is between 0 and 63. `CX` will have this verification and set `T` and `R` if `small` is out of range.

### Note about replacing `MH` with `MHL`

The `MHL` family of instructions cannot improve over the performance of `MH` and `ML` for short multiplication, because `MH` is able to include a 6-bit shift that `MHL` and its derivatives cannot. (The issue is that only the beta RAMs can shift six bits, and only the gamma RAMs can split registers.) The `MHL` family is for long multiplication.

## `MHL`

Multiply high and low

| Syntax |
|---|
| `c = a mhl b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL` is the flagship of the `MHL` family of instructions for long multiplication and is the most flexible, although ordinarily the `MHL0` through `MHL5` instructions are used instead. `MHL` is a simultaneous execution of `MHNS` and `ML`, where the left and right copies of register `c` are allowed to desynchronize. Specifically, the `MHNS` result is stored in the left copy of register `c`, and the `ML` result is stored in the right copy.

For a drawing that shows the two copies of the register file in relation to the architecture, see page 200 of the dissertation. A short discussion of register splitting can be found on page 187; however, that discussion assumes the presence of a fast hardware multiplier that stores an *entire* multiplication result in a split register. `MHL` stores a *partial* result.

`MHL` multiplies the tribbles of `a` and `b` pairwise. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. These writes are done simultaneously. To preclude any semantic confusion as to whether flags follow the left or right copy of a result, none of the `MHL` instructions change any CPU flags at all.
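The split write can be pictured with a short Python model that returns the two register copies as a pair. This is an illustrative sketch only: the names `tribbles`, `pack`, and `mhl` are assumptions, and a tuple `(left, right)` is merely a convenient stand-in for the two copies of the register file.

```python
def tribbles(x):
    """Split a 36-bit value into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def pack(ts):
    """Reassemble six tribbles (tribble 0 first) into one value."""
    return sum(t << (6 * i) for i, t in enumerate(ts))


def mhl(a, b):
    """Model of c = a mhl b. Returns (left_copy, right_copy) of c."""
    products = [x * y for x, y in zip(tribbles(a), tribbles(b))]  # six 12-bit products
    left = pack([p >> 6 for p in products])     # six most significant bits of each
    right = pack([p & 0o77 for p in products])  # six least significant bits of each
    return left, right


if __name__ == "__main__":
    left, right = mhl(0o777777777777, 0o777777777777)
    print(oct(left), oct(right))
```

Each tribble position satisfies `(left_i << 6) + right_i == a_i * b_i`, so no bits of the six partial products are lost between the two copies.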

The differing values in the left and right copies of `c` can be selected in subsequent instructions by using the left and right operand positions of subsequent ALU instructions. Certain ALU instructions such as shifts are not symmetric in their left and right operands, so very unusual code may require an intervening instruction to transfer a value from one copy of the register file to the other copy. There are also two instructions that you probably don’t need to worry about after `MHL`, namely `BOUND` and `WCM`, where the syntactic left operand is actually the electrically right operand and vice versa. Also, most assignment instructions place the electrically left operand on the right side of the `=` sign.

## `MHL0`

Multiply high and low, tribble 0

| Syntax |
|---|
| `c = a mhl0 b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL0` replicates tribble 0 (bits 0–5) of `a` across all six subwords, and then multiplies them pairwise with the tribbles of `b`. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. No flags are changed. See also `MHL`.

Except that only one instruction is required for `MHL0`, it is equivalent to:

```
t = a swiz 000000000000`o
c = t mhl b
```

The `MHL0`–`MHL5` instructions can be seen in action under 36-bit multiplication.
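To see why six such instructions carry enough information for a full 72-bit product, the following Python sketch models `MHL0` through `MHL5` as one parameterized function and recombines the partial results with ordinary big-integer arithmetic. The names `mhl_n` and `product_from_partials` are hypothetical, and the recombination shown is a mathematical check only; it is not the instruction sequence used on the hardware, which is documented under 36-bit multiplication.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def tribbles(x):
    """Split a 36-bit value into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def pack(ts):
    """Reassemble six tribbles (tribble 0 first) into one value."""
    return sum(t << (6 * i) for i, t in enumerate(ts))


def mhl(a, b):
    """High halves go to the left copy, low halves to the right copy."""
    products = [x * y for x, y in zip(tribbles(a), tribbles(b))]
    return pack([p >> 6 for p in products]), pack([p & 0o77 for p in products])


def mhl_n(n, a, b):
    """Model of MHL0..MHL5: replicate tribble n of a, then behave like MHL."""
    t = pack([tribbles(a)[n]] * 6)
    return mhl(t, b)


def product_from_partials(a, b):
    """Recombine the six (left, right) pairs into the full product."""
    total = 0
    for n in range(6):
        left, right = mhl_n(n, a, b)
        # right + (left << 6) equals (tribble n of a) * b
        total += (right + (left << 6)) << (6 * n)
    return total


if __name__ == "__main__":
    a, b = 12345678901, 9876543210
    print(product_from_partials(a, b) == a * b)
```

The identity holds because replicating tribble n of `a` makes every pairwise product a multiple of that one tribble, so each `(left, right)` pair encodes that tribble times all of `b`.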

## `MHL1`

Multiply high and low, tribble 1

| Syntax |
|---|
| `c = a mhl1 b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL1` replicates tribble 1 (bits 6–11) of `a` across all six subwords, and then multiplies them pairwise with the tribbles of `b`. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. No flags are changed. See also `MHL`.

Except that only one instruction is required for `MHL1`, it is equivalent to:

```
t = a swiz 010101010101`o
c = t mhl b
```

The `MHL0`–`MHL5` instructions can be seen in action under 36-bit multiplication.

## `MHL2`

Multiply high and low, tribble 2

| Syntax |
|---|
| `c = a mhl2 b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL2` replicates tribble 2 (bits 12–17) of `a` across all six subwords, and then multiplies them pairwise with the tribbles of `b`. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. No flags are changed. See also `MHL`.

Except that only one instruction is required for `MHL2`, it is equivalent to:

```
t = a swiz 020202020202`o
c = t mhl b
```

The `MHL0`–`MHL5` instructions can be seen in action under 36-bit multiplication.

## `MHL3`

Multiply high and low, tribble 3

| Syntax |
|---|
| `c = a mhl3 b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL3` replicates tribble 3 (bits 18–23) of `a` across all six subwords, and then multiplies them pairwise with the tribbles of `b`. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. No flags are changed. See also `MHL`.

Except that only one instruction is required for `MHL3`, it is equivalent to:

```
t = a swiz 030303030303`o
c = t mhl b
```

The `MHL0`–`MHL5` instructions can be seen in action under 36-bit multiplication.

## `MHL4`

Multiply high and low, tribble 4

| Syntax |
|---|
| `c = a mhl4 b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL4` replicates tribble 4 (bits 24–29) of `a` across all six subwords, and then multiplies them pairwise with the tribbles of `b`. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. No flags are changed. See also `MHL`.

Except that only one instruction is required for `MHL4`, it is equivalent to:

```
t = a swiz 040404040404`o
c = t mhl b
```

The `MHL0`–`MHL5` instructions can be seen in action under 36-bit multiplication.

## `MHL5`

Multiply high and low, tribble 5

| Syntax |
|---|
| `c = a mhl5 b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

No flags changed

`MHL5` replicates tribble 5 (bits 30–35) of `a` across all six subwords, and then multiplies them pairwise with the tribbles of `b`. In order to fit the six 12-bit results into `c`, the six most significant bits of each product are written to the left copy of `c`, and the six least significant bits of each product are written to the right copy of `c`. No flags are changed. See also `MHL`.

Except that only one instruction is required for `MHL5`, it is equivalent to:

```
t = a swiz 050505050505`o
c = t mhl b
```

The `MHL0`–`MHL5` instructions can be seen in action under 36-bit multiplication.

## `MHNS`

Multiply high no shift

| Syntax |
|---|
| `c = a mhns b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

| Flag | Set if and only if |
|---|---|
| `N` | never; flag is cleared |
| `Z` | all result bits are zero |
| `T` | flag does not change |
| `R` | flag does not change |

`MHNS` is a former key instruction for unsigned long multiplication, where two 36-bit factors are multiplied as 6-bit tribbles and eventually sum to produce a 72-bit result. `MHNS` multiplies the tribbles of `a` and `b` pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of `c`. Instead, `MHNS` retains only the six most significant bits of each result. The tribbles are output in their original positions, instead of being rotated left as with `MH`. The `Z` flag is set if the outcome of `MHNS` is all zeros, and cleared otherwise. `N` is always cleared, and `T` and `R` do not change.
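The difference from `MH` is only the missing rotation, which a brief Python sketch can make concrete. The names `mhns` and `rotl6` are assumptions introduced for illustration, and only the result value and the `Z` flag are modeled.

```python
MASK36 = (1 << 36) - 1  # registers are 36 bits wide


def tribbles(x):
    """Split a 36-bit value into six 6-bit tribbles, tribble 0 first."""
    return [(x >> (6 * i)) & 0o77 for i in range(6)]


def mhns(a, b):
    """Model of c = a mhns b. Returns (c, z_flag)."""
    c = 0
    for i, (x, y) in enumerate(zip(tribbles(a), tribbles(b))):
        c |= ((x * y) >> 6) << (6 * i)  # high six bits stay in tribble position i
    return c, c == 0


def rotl6(x):
    """Rotate a 36-bit value left by six bits, as MH does to its result."""
    return ((x << 6) & MASK36) | (x >> 30)


if __name__ == "__main__":
    c, z = mhns(0o777777777777, 0o777777777777)
    print(oct(c), z, oct(rotl6(c)))
```

Applying `rotl6` to the `mhns` result gives the value that the `MH` description above calls for, which is why `MHNS` output needs separate place-value handling in long multiplication.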

`MHNS` has been supplanted by the `MHL` family of instructions, allowing the number of instructions required for long multiplication to be reduced from 47 to 35. But the `MHL` opcodes require a little more hardware and firmware loader support, due to their register splitting. Architectures derived from Dauug|36 which either do not split registers or have reduced ALU memory may benefit from using `MHNS` to multiply. For sample code showing how this used to be done, see page 113 of the dissertation.

## `ML`

Multiply low

| Syntax |
|---|
| `c = a ml b` |

| Register | Signedness |
|---|---|
| All | ignored |

1 opcode only

| Flag | Set if and only if |
|---|---|
| `N` | never; flag is cleared |
| `Z` | all result bits are zero |
| `T` | flag does not change |
| `R` | flag does not change |

`ML` is a key instruction for unsigned “short” multiplication where one of the factors fits into six bits, and the product fits into 36 bits. It was also a key instruction for unsigned long multiplication until being supplanted by the `MHL` family.

The smaller of the factors must be copied into all of the tribbles via `CX` or an assembler constant. `ML` multiplies the tribbles of `a` and `b` pairwise, but the six 12-bit results cannot fit the 6-bit spaces afforded by the tribbles of `c`. Instead, `ML` retains only the six least significant bits of each 12-bit result. The `Z` flag is set if the outcome of `ML` is all zeros, and cleared otherwise. `N` is always cleared, and `T` and `R` do not change. See `MH` and `MHNS` for more information and sample code.