Assembler keywords

The Dauug|36 assembler has a few keywords that you should recognize. These words are not reserved, so you are free to use them as register names if warranted. Here are the keywords in alphabetical order:

`emit`

In rare instances, we need to insert hand-assembled instructions within a program. This sometimes happens in OS kernels where hand-chosen register numbers are needed, or where opcodes need to be inserted as immediate values. The emit keyword generates one instruction from four nine-bit fields containing literal values, like this:

emit 111000111`b 101101101`b 000010000`b 110010011`b

This inserts the instruction 111000111_101101101_000010000_110010011`b into the program. Any radix can be used for the four fields, so the same result would come from the decimal form:

emit 455 365 16 403
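The equivalence of the two forms can be checked with a short Python sketch. This is not part of the assembler: the helper names `pack_fields` and `show_fields` are hypothetical, and the sketch assumes only what the text states, namely four nine-bit fields with the first field most significant.

```python
def pack_fields(a, b, c, d):
    """Pack four nine-bit fields into one 36-bit word, first field most significant."""
    word = 0
    for field in (a, b, c, d):
        if not 0 <= field <= 0o777:
            raise ValueError(f"field {field} does not fit in nine bits")
        word = (word << 9) | field
    return word


def show_fields(word):
    """Format a 36-bit word as four nine-bit binary groups joined by underscores."""
    return "_".join(format((word >> shift) & 0o777, "09b") for shift in (27, 18, 9, 0))


# The same instruction written with binary fields and with decimal fields.
binary_form = pack_fields(0b111000111, 0b101101101, 0b000010000, 0b110010011)
decimal_form = pack_fields(455, 365, 16, 403)

print(show_fields(decimal_form))
print(binary_form == decimal_form)
```

Both calls yield the same 36-bit word, which is why the radix of each field is a matter of convenience.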

emit can look up opcode values for any of its four operands. For example, the statements

nop
emit nop 1 2 txor

both produce NOP instructions, but the emit version customizes its ignored register numbers. The generated instructions are:

010011001_000000000_000000000_000000000`b
010011001_000000001_000000010_001110000`b

This binary is current as of 20 March 2024 when NOP and TXOR had opcodes 153 and 112 respectively. Using TXOR’s opcode as a right operand register is a whimsical illustration. For a real use of emit with opcodes, see wipe.user.registers:: in the Osmin source (netsim/os.a).
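As a rough model of the opcode lookup, the Python sketch below resolves each operand either as a name or as a number before packing. The `OPCODES` table and the `resolve` and `emit` helpers are hypothetical; only the two opcode values (153 and 112, as of 20 March 2024) come from the text, and the sketch accepts decimal literals only.

```python
# Hypothetical opcode table holding only the two values quoted in the text.
OPCODES = {"nop": 153, "txor": 112}


def resolve(operand):
    """Return the opcode number for a known name, else parse a decimal literal."""
    if operand in OPCODES:
        return OPCODES[operand]
    return int(operand)


def emit(*operands):
    """Resolve four operands and pack them into a 36-bit word, first field highest."""
    if len(operands) != 4:
        raise ValueError("emit takes exactly four operands")
    word = 0
    for operand in operands:
        value = resolve(operand)
        if not 0 <= value < 512:
            raise ValueError(f"operand {operand!r} does not fit in nine bits")
        word = (word << 9) | value
    return word


word = emit("nop", "1", "2", "txor")
print(format(word, "036b"))
```

Under these assumptions, `emit("nop", "0", "0", "0")` gives the same word as a plain nop, matching the first of the two generated instructions above.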

`keep`

Registers that are declared within a scope are only accessible within the scope, unless they are “kept” using the keep keyword. You may list any number of registers (even zero registers) after the keyword. For example:

mycode::
    signed w x y z
    unsigned a b c d
    keep c
    keep d z w

Now mycode contains two groups of registers.

a, b, x, and y are unkept, only exist while mycode is in scope, and are only accessible from within mycode. Note that the assembler never initializes any registers for you, so you should always set these registers prior to use with every call to mycode. The values of these registers may change between calls to mycode, because the assembler is free to reuse a, b, x, and y for other purposes whenever mycode is not being called. Even if the operating system initializes these registers to zeros, you cannot count on this happening, because register sharing may un-initialize them via another scope.

c, d, w, and z are kept, are reserved for exclusive use by mycode, and therefore cannot “spontaneously” change values. mycode can alter or read these registers directly, and other scopes can alter or read these registers by prefixing them with the scope name, for example:

mycode::w = 12345

Again, the assembler will never initialize c, d, w, and z for you, but the operating system may (and should be guaranteed to) initialize them to zero.

To force use of specific register numbers

Osmin’s API requires coordination of register numbers between the kernel and user programs. Here are the register numbers that sometimes have special meaning as of 20 March 2024:

Reg. No.	Name	Purpose
`0`	`0`	“zero register” containing the constant 0
`1`	`api::request`	API request to Osmin kernel
`2`	`api::result`	API result from Osmin kernel

The zero register is used for operations such as negation, which the assembler translates into subtraction from zero.

Programs use the keep keyword to force register numbers where absolutely necessary. Here is a real-life example from the Osmin kernel:

((  ------------------------------------------------------------------------
    Data-only scope for user-superuser intercommunication.
    ------------------------------------------------------------------------ ))
api::
    u. request              ; INPUT
    u. result               ; OUTPUT
    keep request 1 result 2
    ; No code goes in this scope.

`opcode`

The opcode keyword is infrequently used. Ordinarily it would only appear in privileged programs that need to compute an instruction to place in code memory or execute from a register via the XANY (execute anything) instruction.

opcode converts an opcode’s assembler name to its numeric equivalent, which will be in the range 0–511 inclusive, and shifts the result left 27 bits. opcode only works in assignment statements, and is useful because the assembler name for an opcode is easier to remember and synchronize within source code than its numeric value. Example:

unsigned nop_inst           ; oc_NOP is 231`o as of 19 July 2023.
nop_inst = opcode nop       ; nop_inst now 231_000_000_000`o.
xany nop_inst               ; Execute the NOP.
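The arithmetic behind this example can be sketched in Python. The helper name `opcode_word` is hypothetical; the shift of 27 bits and the NOP value 231`o are taken from the text above.

```python
NOP_OPCODE = 0o231  # value quoted in the example above (153 decimal)


def opcode_word(opcode):
    """Place a nine-bit opcode in the top field of a 36-bit instruction word."""
    if not 0 <= opcode <= 511:
        raise ValueError("opcode must be in the range 0-511")
    return opcode << 27


nop_inst = opcode_word(NOP_OPCODE)
print(format(nop_inst, "012o"))  # twelve octal digits cover all 36 bits
```

Because 27 bits is exactly nine octal digits, the opcode appears unchanged as the leading three octal digits, followed by nine zeros.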

The assembler name is defined by the contents of the name field of struct instruction in opcode.c. Alternatively, the Dauug|36 electrical simulation has a command line option to list all assembler names alongside their numeric opcodes. You can invoke this as:

$ ./ns -l

`s.`

s. is an abbreviation for signed. The period is mandatory.

`scope`

scope is a proposed keyword to replace the :: symbol. The reason for this change is that although scopes are the principal separators of a program’s sections, their notation with :: is so compact as to be nearly invisible. It doesn’t help that labels within a scope, which are notated with :, look very similar. So instead of writing:

decrypt::
    unsigned in out key.1 key.2
    keep in out key.1 key.2

    out =  in xim key.2
    out = out xim key.1
    return

you would change the first line so the scope reads as:

scope decrypt
    unsigned in out key.1 key.2
    keep in out key.1 key.2

    out =  in xim key.2
    out = out xim key.1
    return

The decision to implement scope as a keyword has not been finalized, and the change has not been implemented. If the change goes through, the :: operator for assignments would also change. For example, a privileged program that needs the address of decrypt in a register would now read

addr = scope decrypt

instead of

addr = ::decrypt

`signed`

signed is followed by zero or more names, and declares these names as signed registers within the present scope. Examples:

signed
signed a b c
signed offset account_balance

signed is also used in exceptional cases as a cast on either operand or the destination of ALU arithmetic instructions to override a register’s declared signedness. For example:

unsigned i j k

<signed> i = j - k
i = <signed> j + k
i = j + <signed> k

`u.`

u. is an abbreviation for unsigned. The period is mandatory.

`unsigned`

unsigned is followed by zero or more names, and declares these names as unsigned registers within the present scope. Examples:

unsigned
unsigned i j k
unsigned absolute_value distance magnitude

unsigned is also used in exceptional cases as a cast on either operand or the destination of ALU arithmetic instructions to override a register’s declared signedness. For example:

signed i j k

<unsigned> i = j - k
i = <unsigned> j + k
i = j + <unsigned> k

`wrap`

wrap is used as a cast on the destination register of ALU arithmetic instructions to indicate that the result may wrap around without being identified as out-of-range. Example:

unsigned downcounter upcounter

<wrap> downcounter = downcounter - 1
<wrap> upcounter = upcounter + 1
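To illustrate what wrapping around means numerically, here is a minimal Python sketch. It assumes 36-bit unsigned registers and models wrap-around as arithmetic modulo 2^36; the names `WORD_BITS`, `wrapped`, and `checked` are hypothetical, and the sketch says nothing about how the hardware actually identifies an out-of-range result.

```python
WORD_BITS = 36               # assumption: registers are 36 bits wide
MASK = (1 << WORD_BITS) - 1  # largest unsigned value under that assumption


def wrapped(value):
    """Reduce a result modulo 2**36, as a wrapping destination would."""
    return value & MASK


def checked(value):
    """Reject a result outside the unsigned range instead of wrapping it."""
    if not 0 <= value <= MASK:
        raise OverflowError("result is out of range for an unsigned register")
    return value


downcounter = wrapped(0 - 1)   # counting down past zero wraps to the maximum
upcounter = wrapped(MASK + 1)  # counting up past the maximum wraps to zero
print(downcounter == MASK, upcounter == 0)
```

In this model, the same two computations passed to `checked` would raise an error, which corresponds loosely to a result being identified as out-of-range when no wrap cast is present.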