Osmin memory structures
Near-equivalence of kernel’s physical and virtual memory
This section is about data memory, not code memory.
At startup, Osmin allocates all memory the superuser will ever use, so there will never be a problem with kernel memory exhaustion. Osmin never frees any of its own memory for other programs to use. Despite Osmin’s superuser standing, Osmin uses nonprivileged memory instructions wherever it can, because these instructions come with the protections afforded by the page table.
Osmin’s simple memory scheme lets it “grow” its memory contiguously until Osmin is fully initialized. Because Osmin obtains all of its data memory before user programs receive any, Osmin’s allocated physical and virtual addresses grow in lockstep. In most cases, the address of a physical memory page and its corresponding virtual address have the same numeric value. A disparity will occur if data RAM M0 is not present, in which case the physical and virtual addresses will differ at the 2³⁴ place value in order to select data RAM M1.
There is another disparity, but it happens on both sides and effectively cancels itself. Neither virtual page 0 nor physical page 0 is used for traditional storage. Virtual page 0 is a 4096-word space the superuser can use for tight page fill loops such as memsets. This page offers a small performance advantage over the remaining pages, because it includes the all-zero virtual address that loops can leverage to terminate without requiring a `CMP` (compare) instruction. Physical page 0 is a 4096-word space that only contains zeros. This page is mapped with write protection to the unused page table entries for all programs, in order to preclude page table “holes” from compromising the system.
Thus “ordinary” kernel memory begins at virtual address 4096 instead of 0, and at either physical address 4096 or 2³⁴ + 4096.
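To make the relation concrete, here is a minimal C sketch of the kernel’s address correspondence. All names are illustrative, not Osmin symbols; only the bit-34 chip-select disparity is modeled, since the page 0 disparity cancels as described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define M1_SELECT (UINT64_C(1) << 34)   /* chip-select bit for data RAM M1 */

/* Kernel virtual-to-physical relation: the numeric values match when M0
 * is installed; otherwise the same offset is taken within M1. */
uint64_t kernel_phys_addr(uint64_t virt_addr, bool m0_present)
{
    return m0_present ? virt_addr : (virt_addr | M1_SELECT);
}
```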
Order of kernel data memory
Kernel data memory is a “packed” collection of data structures that starts at address 0 and does not add padding between these structures. Their order starting from address 0 is:
| Nature of data | Register pointer |
|---|---|
| zero page | 0 -or- `p.addr::zero.page` |
| physical page pool | `v.addr::phys.page.pool` |
| user id pool | `user.pool::a` |
| text segment (code segment) pool | `v.addr::text.pool` |
| schedule (list of program instances) | `v.addr::schedule` |
| (top-of-used-memory marker) | `v.addr::used.to` |
| (top-of-backed-memory marker) | `v.addr::backed.to` |
Except for the top-of-memory markers, the order of these structures isn’t important to the kernel. They appear in this order simply because it was convenient to create them in this order.
The kernel data doesn’t require much memory. Here are the maximum RAM amounts required by these data, assuming support for 256 users (the architecture limit) and 8 Mibyte of data RAM (using the largest ICs on the market for data RAM ICs M0 and M1).
| Register pointer | Size determined by | Max. words |
|---|---|---|
| 0 -or- `p.addr::zero.page` | 4096 words always | 4096 |
| `v.addr::phys.page.pool` | 1 word / page of data memory | 2048 |
| `user.pool::a` | 1 word / user | 256 |
| `v.addr::text.pool` | 4 words + 4 words / text segment | 1028 |
| `v.addr::schedule` | 4 words + 2 words / program instance | 516 |
| `v.addr::used.to` | none (high-water mark) | |
| `v.addr::backed.to` | varies (pad to page boundary) | |
| Total | | 7944 |
Because the kernel cannot allocate fractional pages of data memory, its footprint will be rounded up to 8192 words (two pages). Reducing the allowed number of users or using smaller data RAMs would still result in a two-page footprint, because kernel data will consume some portion of one page, while the zero page must not contain kernel data.
Address fixup scratch space (location)
The `load.text.segment::` scope must update `CALL` and `JUMP` instructions to point to branch targets in physical code memory. The final addresses depend on which code memory isn’t occupied by other programs at load time, so the executable file can’t contain them. `load.text.segment::` uses a temporary allocation of data memory to map assembler pages to physical code pages. But neither of the two preceding tables shows this allocation. Here’s why.
As of 18 August 2023, `load.text.segment::` runs directly within the scheduler between time slices as requested via the Osmin API. Thus `load.text.segment::` runs as user 0 and is not subject to preemption for multitasking. This has to change, because `load.text.segment::` executes in time proportional to the length of the program it’s loading, which disqualifies Osmin from being a real-time operating system (RTOS). But as things stand today, there is no possibility of a context switch occurring while a program is being loaded.
Which brings us to where the address fixup scratch space is: the kernel maps virtual page 0 onto the zero page without write protection! Osmin guarantees that user programs will only find zeros in this page, but under the hood, Osmin puts all the user programs to sleep, uses the zero page for its own processing data, re-zeros the page, and awakens the user programs with no trace left behind.
As Osmin matures, `load.text.segment::` will be moved to a privileged, interruptible user that is not the superuser. At that time, the zero page will no longer be suitable for use as scratch space, so memory will be permanently allocated as address fixup scratch space.
See also “Address fixup scratch space (format)” below.
Data structures for kernel data memory
As a reminder, the physical and virtual memory page size is 4096 words. This size is determined by the electrical configuration of the page table and arithmetic logic unit, so this size isn’t likely or easy to change.
The code memory page size is a power of two, but is not otherwise directly constrained by hardware. User programs must be assembled with a code page size that is equal to or a divisor of the operating system’s page size. This is because Osmin requires one word at the end of every code page to `JUMP` to the next page, so the assembler adds `NOP` (no operation) instructions at these locations. There isn’t serious harm if the assembler uses a smaller page size than the target OS, because the only result of the mismatch is a few more `NOP`s in the executable. In any event, a code page size of 512 words is about optimal for production systems: more than 512 words will waste more storage on average when text segments aren’t an exact multiple of the page size, while less than 512 words will tie up more storage for `JUMP` instructions between pages. For electrical simulations of the architecture with short programs, a 16-word page size works well.
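As a rough illustration of this tradeoff, the following C sketch estimates both losses for one text segment. It is a model, not Osmin code; the only assumption taken from the text above is that one word per page is reserved for the inter-page link.

```c
#include <stdint.h>

typedef struct {
    uint32_t jump_words;     /* words reserved for inter-page JUMPs */
    uint32_t leftover_words; /* unused words at the end of the last page */
} code_overhead;

/* Each page offers page_size - 1 usable words, because the last word of
 * every page (including the last) holds a linking JUMP. page_size must
 * be a power of two greater than 1. */
code_overhead overhead_for(uint32_t text_words, uint32_t page_size)
{
    uint32_t usable = page_size - 1;
    uint32_t pages  = (text_words + usable - 1) / usable;
    code_overhead o;
    o.jump_words     = pages;
    o.leftover_words = pages * usable - text_words;
    return o;
}
```

For a 2048-word program, this model gives 5 `JUMP` words and 507 leftover words at a 512-word page size, versus 137 `JUMP` words and 7 leftover words at a 16-word page size, matching the direction of the tradeoff described above.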
Zero page
The Dauug|36 architecture has no concept of a page fault, or of a page table entry that isn’t in use. This leaves every user program with many unneeded page table slots that need to be secured, to prevent a program from accessing memory it’s not authorized to. Osmin solves this problem by having a write-protected page that contains just zeros, and placing that page in every page table entry that isn’t required for user program memory or kernel memory. Reads and writes to the zero page are legal and cannot be trapped, but the value read will always be zero, and the value written will always be discarded.
The format of the zero page is simply 4096 words that are all zero.
Write protection is turned on using bit 35 in the physical address. Like all kernel memory, the zero page may need to be in the M1 RAM if the M0 RAM is not installed. M1 is selected using bit 34 in the physical address. Physical address bits 0–33 are all zero, although placing the zero page at the lowest physical address is an Osmin design choice, not an architecture requirement.
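A small C sketch of that address composition, using assumed names rather than Osmin symbols:

```c
#include <stdbool.h>
#include <stdint.h>

#define WRITE_PROTECT (UINT64_C(1) << 35)  /* bit 35: write protection */
#define M1_SELECT     (UINT64_C(1) << 34)  /* bit 34: select data RAM M1 */

/* The zero page sits at the lowest physical address (an Osmin choice),
 * is always mapped write-protected, and lands in M1 when M0 is absent. */
uint64_t zero_page_phys(bool m0_present)
{
    uint64_t base = 0;
    if (!m0_present)
        base |= M1_SELECT;
    return base | WRITE_PROTECT;
}
```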
Physical page pool
The physical page pool is Osmin’s map of physical pages that are free and not free. This pool, or map, enables the free/non-free condition and retention count of any physical page to be inspected or changed in O(1) time. This is because the pool is a simple array, containing one word in the pool for every page of physical memory in the machine.
This pool also enables a free page to be allocated in O(1) time. This is because the pool entry for each free page in the list holds the pool index of the next free page. Thus the physical page pool is an array with a linked list superposed over a portion of it, where the array modality inventories pages in use, and the linked list modality organizes pages that are free.
The pages themselves live in either one or two data RAMs, depending on whether RAM M0, M1, or both are installed in the system. The pool describing these pages is contiguous, despite the discontinuity of physical addresses between M0 and M1. As a simple, smaller-than-life example, suppose M0 has 65,536 words and M1 has 32,768 words. This means M0 contains 16 pages that span physical addresses 0 through 177_777`o, and M1 contains 8 pages that span physical addresses 200_000_000_000`o through 200_000_077_777`o, because bit 34 is the chip select. Because there are 24 physical pages in total, the physical page pool contains 24 words, where the first 16 words describe the usage of RAM M0, and the final 8 words describe the usage of M1.
Because the lowest physical page address is permanently assigned to Osmin’s zero page, offset 0 in the pool is initialized to 0 and is never read or written again. The remaining offsets that represent free pages are linked in a chain. For example, if pages 2, 9, 18, and 21 are free and all other pages are in use, then:
```
index::next.free.phys.page = 2
pool[2]  = 9
pool[9]  = 18
pool[18] = 21
pool[21] = 0
```
In the above chain, the page indices happen to be linked in ascending order. Memory allocation and deallocation may lead to the list being in some other order. The order doesn’t matter, because a program’s virtual memory doesn’t care at all where each physical page is—that’s the whole point of virtual memory. So an equally valid order might be:
```
index::next.free.phys.page = 18
pool[2]  = 21
pool[9]  = 0
pool[18] = 2
pool[21] = 9
```
Because we already know that physical page 0 does not participate in the pool, we are able to use 0 as a “no more free pages” marker within the pool. If no free pages remain in the pool, `index::next.free.phys.page` will be 0.
Osmin’s practice is to zero all 4096 words of any physical page as it is freed, and to zero all data memory at startup. This way, requests for physical memory will not have to wait for any pages to be zeroed.
The words in the physical page pool are signed, where negative quantities indicate that a page is in use, and how many times it has been retained. This reference counting scheme permits a physical page to be mapped to multiple virtual pages. Perhaps a physical page sometimes warrants write protection, but not at other times, or a physical page is mapped by multiple users for interprocess communication. A physical page that is mapped to one virtual page is denoted -1 in the pool, a page that is mapped to two virtual pages is denoted -2, and so on. Thus a complete representation of a physical page pool may look like this:
```
REGISTERS
---------
count::M0.words            = 65536   ; M0 has 16 pages
count::M1.words            = 32768   ; M1 has 8 pages
v.addr::phys.page.pool     = 4096    ; virtual address of pool[0]
index::next.free.phys.page = 18      ; will be next page allocated

DATA MEMORY
-----------
pool[0]  = 0     ; zero page is not part of the pool
pool[1]  = -1    ; in use by 1 virtual page
pool[2]  = 21    ; available, next available is 21
pool[3]  = -1    ; in use by 1 virtual page
pool[4]  = -1    ; ...
pool[5]  = -1
pool[6]  = -1
pool[7]  = -1
pool[8]  = -1
pool[9]  = 0     ; available, last in list
pool[10] = -1
pool[11] = -1
pool[12] = -2    ; in use by 2 virtual pages
pool[13] = -1
pool[14] = -1
pool[15] = -1
pool[16] = -29   ; in use by 29 virtual pages
pool[17] = -1
pool[18] = 2     ; first available, next available is 2
pool[19] = -1
pool[20] = -1
pool[21] = 9     ; available, next available is 9
pool[22] = -1
pool[23] = -1
```
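A C sketch of the O(1) operations over this layout may clarify the mechanics. Everything here (names, signatures) is illustrative; Osmin implements this in Dauug|36 assembly.

```c
#include <stdint.h>

extern int64_t  pool[];          /* one signed word per physical page */
extern uint32_t next_free_page;  /* index::next.free.phys.page analogue */

/* Allocate a free page in O(1); returns 0 when the pool is exhausted,
 * since page 0 (the zero page) never participates in the free list. */
uint32_t page_alloc(void)
{
    uint32_t page = next_free_page;
    if (page != 0) {
        next_free_page = (uint32_t)pool[page];  /* unlink from free chain */
        pool[page] = -1;                        /* one virtual mapping */
    }
    return page;
}

/* Map the same physical page to an additional virtual page. */
void page_retain(uint32_t page)
{
    pool[page] -= 1;             /* -1 becomes -2, and so on */
}

/* Unmap one virtual mapping; the page returns to the free list when the
 * retention count reaches zero. Osmin zeros all 4096 words of the page
 * at this point (see above), so allocations never wait on zeroing. */
void page_release(uint32_t page)
{
    if (++pool[page] == 0) {
        pool[page] = (int64_t)next_free_page;
        next_free_page = page;
    }
}
```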
User id pool
Active Dauug|36 programs are segregated in hardware by an eight-bit user id that partitions the registers, call stack, and page table among up to 256 programs. The superuser, also called the kernel, has user id 0. Every other program has one of the remaining 255 user ids. The user id pool is a memory structure that allows a user id for a new program instance to be assigned in O(1) time.
Because user ids are recycled and will be used in the API to refer to programs that are running or even terminated recently, Osmin uses a 36-bit extended user id that does not repeat frequently. Bits 0–7 of the extended user id are the hardware user id, and bits 8–35 are a counter that increments every time the corresponding 8-bit user id is reused. This counter will eventually wrap around after 2²⁸ reuses of the same 8-bit user id.
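In C terms, the layout might be sketched as follows; the helper names are assumptions, and the 36-bit mask reflects the word size.

```c
#include <stdint.h>

#define WORD_MASK ((UINT64_C(1) << 36) - 1)  /* Dauug|36 words are 36 bits */

/* Bits 0-7: hardware user id. */
uint32_t hw_user_id(uint64_t xuid)  { return (uint32_t)(xuid & 0xFF); }

/* Bits 8-35: reuse counter. */
uint64_t reuse_count(uint64_t xuid) { return xuid >> 8; }

/* Adding 256 bumps the reuse counter; it wraps after 2^28 reuses. */
uint64_t next_reuse(uint64_t xuid)  { return (xuid + 256) & WORD_MASK; }
```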
The user id pool is a FIFO circular buffer containing extended user ids that are available for use. Barring a strange customization, this pool will use 256 words of memory and contain no more than 255 extended user ids. This buffer occupies the virtual memory from `user.pool::a` up to but not including `user.pool::b`. `user.pool::r` and `user.pool::w` indicate the virtual memory addresses of the next slots to read and write in the pool. The pool is considered empty when `user.pool::r` is equal to `user.pool::w`, which is why the pool cannot hold 256 extended user ids. It needn’t anyhow, since user 0 is not managed by the pool.
When a program is to be started, an extended user id to represent the program is obtained from the user id pool. The extended user id is placed in the schedule (list of running program instances) without modification, where it remains until the program terminates. After the program is finished running, 256 is added to its extended user id, which has the effect of adding 1 to the reuse counter in bits 8–35, and then the extended user id is appended to the end of the circular queue.
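A minimal C sketch of this FIFO discipline, using array indices where Osmin uses the virtual addresses `user.pool::r` and `user.pool::w`; all identifiers are illustrative.

```c
#include <stdint.h>

#define POOL_WORDS 256   /* barring a strange customization (see above) */
#define WORD_MASK  ((UINT64_C(1) << 36) - 1)

extern uint64_t user_pool[POOL_WORDS];
extern uint32_t pool_r, pool_w;  /* analogues of user.pool::r and ::w */

/* Take an extended user id for a new program; returns -1 if none remain.
 * The pool is empty exactly when the read and write positions coincide. */
int acquire_user(uint64_t *xuid)
{
    if (pool_r == pool_w) return -1;
    *xuid = user_pool[pool_r];
    pool_r = (pool_r + 1) % POOL_WORDS;
    return 0;
}

/* Return a terminated program's id: adding 256 bumps the reuse counter
 * in bits 8-35 before the id rejoins the end of the queue. At most 255
 * ids circulate, so the buffer cannot overfill. */
void release_user(uint64_t xuid)
{
    user_pool[pool_w] = (xuid + 256) & WORD_MASK;
    pool_w = (pool_w + 1) % POOL_WORDS;
}
```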
Using extended user ids directly as process ids would present a security risk in some applications, because inspection of these ids may reveal private user or system information. For this reason, Osmin extended user ids will be known only to the kernel and won’t pass through the API. Instead, extended user ids will be “blinded” using `MIX` and `XIM` instructions with at least a 72-bit key whenever the API needs to specify or identify a program instance. These blinded user ids are termed process ids and will increase the difficulty of user programs inferring information from their own or other programs’ user ids.
Collections
Collections are managed by a set of related scopes named `dict.something`, referred to in the documentation as the `dict.*` scopes.
Osmin has a simple memory structure for collections that have a fixed maximum number of elements. These collections are simply unsorted arrays that can be used as arrays, association lists, sets, multisets, etc. Although their O(n) time complexity would be a concern for many operating systems, three mitigating factors apply to these structures in Osmin:
- Osmin is an RTOS for small machines that may run between one and five programs at a time. The architecture’s 256-program maximum isn’t a typical use case, and such a case may not require real-time scheduling anyway.
- API calls that involve Osmin collections happen infrequently, because they do such tasks as start and end programs. In an embedded system, such calls can be days apart—or may not even occur once the system has fully booted.
- Even if an Osmin collection holds 256 elements and is touched routinely, finding a key by exhaustive search takes only 6 instructions per element plus a fixed overhead of 15 instructions including `CALL` and `RETURN`. These up to 1,551 instructions that would be executed within an API call are few compared to the probable dispatcher configuration of 65,535 instructions per time slice.
The collection format is a 4-word header followed immediately by the fixed-length array of elements. The array has no padding, so if a collection allows up to 7 elements, and each element contains 3 words, the total collection size is 4 + 7 * 3 = 25 words. The header contains the following information:
| Offset | Mutable? | Contents |
|---|---|---|
| 0 | yes | number of elements in use |
| 1 | no | number of words per element |
| 2 | yes | pointer to end of used space |
| 3 | no | pointer to end of available space |
Header data in the “mutable” words will change as the number of elements increases and decreases. The pointers at offsets 2 and 3 mark the end of open intervals; that is, they point to the first word beyond the marked region. These pointers are equal when and only when the collection is full. The first element in the collection begins at offset 4.
Some collections use the first word of each element for lookup purposes. Usually this word will be a unique key, although the code written to support collections also supports cases where the first word isn’t unique.
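As a sketch, the header and an exhaustive key search might look like the following C. One caveat: Osmin stores the offset-2 and offset-3 pointers as words, while this model uses C pointers, so it is functional rather than byte-accurate, and all names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* 4-word collection header followed by packed elements (offset 4). */
typedef struct {
    uint64_t  in_use;      /* offset 0 (mutable): elements in use */
    uint64_t  elem_words;  /* offset 1 (fixed):   words per element */
    uint64_t *used_end;    /* offset 2 (mutable): end of used space */
    uint64_t *avail_end;   /* offset 3 (fixed):   end of available space */
    uint64_t  elems[];     /* elements begin here, no padding */
} collection;

/* O(n) lookup on the first word of each element (the usual key slot). */
uint64_t *dict_find(collection *c, uint64_t key)
{
    for (uint64_t *e = c->elems; e < c->used_end; e += c->elem_words)
        if (*e == key)
            return e;
    return NULL;
}

/* The collection is full exactly when the two end pointers are equal. */
int dict_full(const collection *c)
{
    return c->used_end == c->avail_end;
}
```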
Text segment pool
The text segment pool is the kernel’s inventory of programs contained in code memory. It’s implemented as a collection, so see “Collections” above for its header format. Every program is represented by a 4-word structure in the text segment pool:
| Offset | Contents |
|---|---|
| 0 | filename |
| 1 | `JUMP` to first instruction in the text segment |
| 2 | hold bit and reference count |
| 3 | true iff segment contains privileged instructions |
Offset 0. At present, programs are uniquely identified to the kernel by their 36-bit “filename” within the paravirtualized I/O system. This filename is offset 0 of each text segment pool element. This scheme undoubtedly must change once actual storage peripherals and I/O are available. But for now, the filename in the above structure is a unique key within the pool. (More than one instance of this program may be running, but the code memory will only contain one copy of the program. Refer to “Schedule” below for how running instances are inventoried.)
Offset 1. This word identifies where in code memory a program (text segment) is kept. The format is the 27-bit address of the first instruction in bits 0–26, and an unconditional `JUMP` opcode in bits 27–35.
A program may be fragmented in code memory, so programs are segmented into power-of-two-sized pages by the assembler and kernel. Every page but the last has a `JUMP` to the next page as its last instruction. The last instruction of the last page is a `JUMP` to itself: if that instruction is reached, the program counter will no longer increment.
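A hedged C sketch of the offset-1 word’s packing follows. `JUMP_OPCODE` is a placeholder, since the architecture’s actual opcode values are not given here; only the bit positions come from the text above.

```c
#include <stdint.h>

#define JUMP_OPCODE UINT64_C(0x1FF)           /* placeholder 9-bit opcode */
#define ADDR_MASK   ((UINT64_C(1) << 27) - 1) /* 27-bit code address */

/* Build the offset-1 word: JUMP opcode in bits 27-35, address in 0-26. */
uint64_t make_entry_word(uint64_t first_instruction_addr)
{
    return (JUMP_OPCODE << 27) | (first_instruction_addr & ADDR_MASK);
}

/* Inverse: recover the text segment's first-instruction address. */
uint64_t entry_address(uint64_t entry_word)
{
    return entry_word & ADDR_MASK;
}
```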
Offset 2. This word contains the number of currently-running instances of the text segment. Theoretically this number uses bits 0–34 of the word; however, the physical machine only supports 256 users (program instances) total, so the kernel will never reach most of these bits. Bit 35 is used as a “hold” bit: the kernel will unconditionally keep this text segment in memory for as long as this bit is set.
Any time the word at offset 2 reaches zero, the kernel will remove the program from code memory, because no copies are running and there is no hold in effect. Programs are overwritten with `NOP` instructions when they are removed from code memory.
Possibility of metadata recovery. Ignore this paragraph unless you presently have expert knowledge of Dauug|36 and Osmin. Although overwriting reclaimed code memory with `NOP`s lessens the opportunity to infer what may have run in the past, the inter-page `JUMP` instructions are not replaced. It may be possible to recover some metadata by examining these `JUMP` instructions after a program has been unloaded, and such recovery might be a vulnerability if an attacker has `RCM1`/`RCM2` capability or physical access. I estimate that opportunities to gainfully exploit this are tiny to none, so I don’t have plans to “fix” this as of December 2023.
Offset 3. When transferring a program into code memory, the kernel checks whether the program contains any privileged instructions. The word at this offset will be nonzero iff any privileged instructions are present. This information allows system calls to differentiate what is allowed based on whether a program is privileged.
Schedule
The schedule is the kernel’s list of executing program instances. It’s implemented as a collection, so it includes the above-described collection header. The schedule’s elements contain only two pieces of information:
| Offset | Contents |
|---|---|
| 0 | extended user id |
| 1 | filename |
The extended user id is the decrypted process id. Bits 0–7 are the user id, which identifies the physical memory segment the process is using in the register file, page table, and stack. Bits 8–35 are a counter that increases each time the user id is recycled for another process. The intent of this counter is to reduce the frequency of process id reuse. (This counter could theoretically be used to make the process id harder to predict or decrypt, but the current implementation always starts the counter at 1, which on many embedded systems would never increment.)
The extended user id is a primary key for the schedule. It’s guaranteed not to have the same value for two simultaneously-live program instances, because the architecture relies on unique user ids in providing memory, and the user id is a portion of the extended user id.
The filename uniquely identifies the program’s text segment, and is a primary key into the text segment pool. See that pool’s documentation for more information concerning the filename.
Address fixup scratch space (format)
The address fixup scratch space is a small array of unsigned words, with the first element at virtual address 0. See “Address fixup scratch space (location)” above for more about this positioning. The offset into the array is equal to the virtual memory address. The word at some offset i in the scratch space is equal to the code memory address of page i within the program being written to code memory.
The code page size is determined when Osmin boots, but will always be a power of two. Converting the assembler address of a program to a physical code address requires splitting the assembler address into a page and offset, looking in the address fixup scratch space to find the corresponding physical page from the assembler page, and combining the physical page with the offset.
Example. Suppose the code page size is 512 words, and that a 2048-word program will be loaded as four pages located at code addresses [16,384 25,088 18,432 36,864]. These four addresses, in this order, are written at virtual addresses [0 1 2 3]. No other address fixup scratch space is needed. Suppose a `JUMP` instruction has assembler address 1,333 as its destination. Because 1333 = 2 * 512 + 309, the `JUMP` needs to point to page 2, offset 309. The physical address for page 2 is in the table at virtual address 2; this address is 18,432 and is added to offset 309 to yield physical code address 18,741 for this `JUMP`.
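The same computation, as a small C sketch with illustrative names, assuming the scratch table lives at virtual address 0 and the page size is a power of two:

```c
#include <stdint.h>

extern uint64_t fixup_table[];  /* fixup_table[i] = code address of page i */

/* Convert an assembler branch target to a physical code address. */
uint64_t fix_branch_target(uint64_t asm_addr, uint64_t page_size)
{
    uint64_t page   = asm_addr / page_size;  /* e.g. 1333 / 512 = 2   */
    uint64_t offset = asm_addr % page_size;  /* e.g. 1333 % 512 = 309 */
    return fixup_table[page] + offset;       /* 18432 + 309 = 18741   */
}
```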
Order of kernel code memory
A small bootloader (79 words as of October 2023) loads Osmin into memory. Because the bootloader is present from code address 0 until its end, Osmin must be loaded at a higher address.
The bootloader doesn’t know how to fix up addresses to account for Osmin’s location in memory, so the assembler has to correct all target addresses for `CALL`, `JUMP`, etc. to the actual location where Osmin will reside, as well as write the start address into the executable file header. The cross assembler has a `-b` option that provides this. For example,

```
$ xa -b 128 os.a -o pvio-files/file-bootMe
```
tells the cross assembler to begin the Osmin kernel at physical code address 128. This is above the bootloader’s last instruction at physical code address 78, so there is no overlap to worry about.
There are some memory addresses in Osmin that are of special relevance to code memory layout. In ascending order, these are:
| Symbolic address | Code starting at this address |
|---|---|
| `main:: - 1` | `CALL` to constant initialization section |
| `main::` | bulk of kernel code |
| `firmament::` | kernel code that only runs at startup |
| `safe.to.write::` | protective `NOP` |
| `safe.to.write:: + 1` | constant initialization section |
| (unknown) | `RETURN` from constant initializations |
Osmin’s kernel occupies some number of pages of code memory, so those pages must not be placed on the kernel’s list of pages where user programs can be loaded. These are the pages containing `main::`, `safe.to.write:: - 1`, and all pages in between. The remaining code memory pages are placed in the list of available pages for loading programs (text segments).
A tighter bound for kernel-occupied code memory would be `main::`, `firmament:: - 1`, and all pages in between. The idea is that kernel code from `firmament::` and upward only runs during operating system initialization, and can then be freed for user programs. The implementation is doable but adds some complexity. The extra memory this would yield as of October 2023 is only 307 words. This would make a one-page difference at most on production systems, because the anticipated code memory page size for most machines is 512 words.
The above special addresses are also used to clear all code memory, including the bootloader, when Osmin starts. The procedure is twisted. Osmin’s first instruction is a `CALL` to the constant initialization section appended to the kernel by the assembler. After that returns, all of the superuser’s registers are cleared, resulting in the loss of these constants. The `CALL` is repeated to restore the constants. At that time, only the instructions from `main::` up to but excluding `safe.to.write::` require keeping, and all other words in the code memory are cleared.
The protective `NOP` at location `safe.to.write::` is intended to ensure the kernel is not modified (except perhaps for this `NOP`) when code memory is being manipulated by `enschedule::` to force a program start address on a user stack. I’m not sure (October 2023) it’s implemented correctly; it perhaps should be one instruction before `safe.to.write::` instead of right on it.
How code memory pages are stored
In the Dauug|36 architecture, no electrical path exists from node 0 (the instruction pointer) or the return address stack into the ALU or registers. All that can be done with a code address is select which instruction to fetch from code memory. (See page 200 of Marc’s dissertation for a drawing.) A consequence of this is that once a program is running, the operating system has no way to identify the address of instructions that are executing. This means that a program cannot be moved to another area in code memory during execution, because the stack and program counter cannot be synchronized with the change. (It’s possible to write a program that periodically rewinds its stack and instruction pointer to a predictable state that supports relocation, but it’s not worth the hassle.)
A consequence of inability to relocate executing programs is that code memory will become fragmented as programs are loaded and unloaded. Defragmentation is not workable, because it requires terminating and restarting every program the kernel needs to move. Living with unwanted fragmentation is not workable either, because programs that are longer than available contiguous space won’t be able to start.
Dauug|36 solves the fragmentation-relocation problem by breaking every user program into fixed-length, fully-interchangeable segments called pages. The last instruction of each page is a `JUMP` to the next page. Pages need not be contiguous. When the kernel loads (for example) a 41-page program into code memory, the kernel can select any 41 free pages in any order. Programmers don’t have to think about these pages, because the assembler knows the page size and inserts a `NOP` as the last instruction of each page. The kernel’s program loader (a scope called `load.text.segment::`) replaces these `NOP`s with the necessary inter-page `JUMP`s.
Page-based allocation of code memory incurs two types of losses. One is leftover space at the end of a program’s last page. The larger the page size, the worse this loss. The other is space consumed by the `JUMP` instructions needed to link pages to each other. The smaller the page size, the worse this loss. The code RAM size, number of text segments, and the distribution of text lengths are also influences.
I did some modeling, and I suggest a code page size of 512 words. If the code RAM is particularly large, the kernel will increase the page size to ensure the number of pages is not more than 4096. This limits the address fixup scratch space to a single page, simplifying its allocation.
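A sketch of that boot-time sizing rule in C, with illustrative names:

```c
#include <stdint.h>

/* Grow the page size (a power of two) from the suggested 512-word default
 * until at most 4096 code pages exist, so the address fixup scratch space
 * fits within a single 4096-word data page. */
uint64_t choose_code_page_size(uint64_t code_ram_words)
{
    uint64_t page_size = 512;
    while (code_ram_words / page_size > 4096)
        page_size <<= 1;
    return page_size;
}
```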
A small amount of time overhead is added by `JUMP` instructions from one page to the next. In rare cases, it may help to ensure that time-critical tight loops are never split between pages. I have in mind to add an assembler directive that can control this, but as of October 2023 there isn’t one yet.
A machine’s code memory pages are organized as `N + 1` linked lists, where `N` is the number of text segments in code memory. The text segment pool’s `JUMP` instruction leads to the first page of each segment, and subsequent pages are linked by `JUMP` instructions at the end of each page. The final page is marked by a `JUMP` instruction to itself, and is trivial to distinguish from the other `JUMP`s, because the target address and therefore the entire instruction is an odd number (bit 0 is set). The final linked list is the list of unallocated pages, started via a `JUMP` instruction in register `v.addr::code.pool`.
No data memory is used to manage code memory allocation. One reason this is good is that the information is readily available from the text segment pool and the code memory itself, so duplication and potential for desynchronization are eliminated. One drawback is that allocating and freeing code memory relies on the `RCM1` and `RCM2` (read code memory 1 and 2) instruction sequence, which is so kludgy that I considered not supporting code memory reads in the architecture at all. These two instructions must execute contiguously, requiring the multitasking timer to be disabled. The most straightforward way to guarantee this is to have the superuser handle code memory allocation; however, large code memory transactions will need to happen in a manner that meets RTOS time bound requirements. This will add some complexity to the kernel.
Page table memory
Like code memory pages, data memory pages may be allocated and deallocated in sufficient number to jeopardize RTOS time bounds if the superuser must handle this work. But the superuser constraint might not exist, because the `RPT` (read page table) and `WPT` (write page table) instructions work atomically. A user program containing these privileged instructions can execute them just fine; however, only that user’s page table entries can be modified. There are ways to implement this, and it may turn out that adding some hardware capabilities (more identity-modifying privileged instructions) would be the preferred way. Stay tuned.
No data memory is used to track page table memory, because the page table includes all information needed. The trick with the page table is that for any active user, every page table entry needs to point to memory the user is permitted to access, whether the user actually needs the memory or not. A “baseline” Dauug|36 system’s page table consists of one 512Ki × 18 SRAM, with an equal number of slots for each of 256 users. So each user can access up to 2,048 pages of memory, meaning there are 2,048 page table slots that require initialization. The simplest way to fill out these slots safely is to point all unneeded slots to the zero page’s write-protected base address. All reads from that page will come back as zeros, and all writes will execute normally except have no effect.
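As a sketch of this initialization in C, with an assumed `wpt()` helper standing in for the privileged `WPT` instruction, and the baseline figure of 2,048 slots per user from above:

```c
#include <stdint.h>

#define SLOTS_PER_USER 2048
/* Zero page: write protect (bit 35) over physical page 0. This constant
 * assumes M0 is installed; with only M1, bit 34 would also be set. */
#define ZERO_PAGE_WP   (UINT64_C(1) << 35)

/* Hypothetical stand-in for the privileged WPT instruction. */
extern void wpt(uint32_t user, uint32_t slot, uint64_t phys);

/* Point every slot at the write-protected zero page, so no page table
 * "hole" grants access to memory the user shouldn't reach. Real mappings
 * are then written over the slots the program actually needs. */
void secure_page_table(uint32_t user)
{
    for (uint32_t slot = 0; slot < SLOTS_PER_USER; slot++)
        wpt(user, slot, ZERO_PAGE_WP);
}
```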
The physical and virtual data memory address formats, as well as input and output bit assignments for the page table, are described in the dissertation at pages 188–191.
Physical data memory is allocated and reclaimed as it is added to and removed from the page table. These records are maintained in the physical page pool, described above.
Stack memory
The Dauug|36 stack contains CPU flags and return addresses only. No program data (that is, no variables) is ever on the stack. The stack is implemented by a 128Ki × 36 SRAM with these data I/O bits:
| Bit(s) | Contents |
|---|---|
| 0–26 | return address |
| 27 | N(egative) flag |
| 28 | R(ange) flag |
| 29 | T(emporal range) flag |
| 30 | Z(ero) flag |
| 31–35 | pull resistor to logic 1 |
The address input bits are:
| Bit(s) | Contents |
|---|---|
| 0–7 | call stack depth |
| 8–15 | current user (but subject to identity-modifying instructions) |
| 16 | logic 0 |
To lessen the number of components, call stack depth is stepped by an 8-bit Galois linear feedback shift register (LFSR) that implements a predecessor and successor function. The LFSR has two cycles: a length 1 cycle containing only 0, and a length 255 cycle containing the nonzero 8-bit numbers. These two cycles support maximum call stack depths of 1 and 255 respectively.
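For illustration, here is one such 8-bit Galois LFSR in C. The feedback polynomial chosen here (x⁸ + x⁴ + x³ + x² + 1, a standard primitive polynomial) is an assumption; the hardware’s actual taps may differ while preserving the same 1 + 255 cycle structure.

```c
#include <stdint.h>

#define TAPS 0x1D  /* low byte of an assumed primitive polynomial 0x11D */

/* Successor: multiply by x in GF(2^8). 0 maps to 0; every nonzero value
 * lies on a single 255-long cycle because the polynomial is primitive. */
uint8_t lfsr_succ(uint8_t s)
{
    return (uint8_t)((s << 1) ^ ((s & 0x80) ? TAPS : 0));
}

/* Predecessor: exact inverse of lfsr_succ, i.e. division by x. */
uint8_t lfsr_pred(uint8_t s)
{
    return (s & 1) ? (uint8_t)(((s ^ TAPS) >> 1) | 0x80)
                   : (uint8_t)(s >> 1);
}
```

Iterating `lfsr_succ` from any nonzero seed visits all 255 nonzero values before repeating, and `lfsr_pred` undoes each step exactly, matching the predecessor and successor behavior described above.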
Because the LFSR cycles are closed under the predecessor and successor functions, we need only initialize the call depth to a nonzero value to ensure the call stack depth will be 255 forever after. This is what the `CALI` (call stack initialize) instruction does: it’s equivalent to `CALL` except that it forces one of the call depth bits to be nonzero. `CALI` is privileged, because this forcing can cause a non-contiguous jump within the call stack, which may be to a location that isn’t in the current user’s text segment. This means that `CALI` could lead to a nonprivileged program executing privileged instructions.
No program knows its actual call stack depth, because the LFSR generates a 255-value sequence that wraps around, and although `CALI` guarantees that one bit will be nonzero, it does nothing to establish the remaining seven bit values. The only instructions that modify the stack depth are the `CALL` and `RETURN` variants (`CALI`, `CALL`, `NPCALL`, `PCALL`, `RETURN`, `REVERT`, and instructions that are processing a context switch). The bottom line is that a user can only `RETURN` to the instruction following an earlier `CALL`, meaning the user can’t leverage the stack to escape its own text segment.
What happens if a user `RETURN`s more times than it `CALL`s? This is called stack underflow, and this will cause the user to escape its own text segment—but in a controlled way. Before the user program starts, the operating system initializes the stack with `CALI`. The instructions in the kernel immediately after `CALI` are where control will go in the event of stack underflow, and that code is an infinite loop to capture the user program permanently. (As of October 2023, stack underflow captures into an infinite loop of `YIELD` instructions, but in the future it will be an infinite loop of calls to terminate the program. Presumably the very first termination call will always succeed, but just in case, the kernel is prepared to try forever.)
There is also a stack overflow condition, where the user executes 255 or more `CALL` instructions without matching `RETURN`s. This will overwrite the `CALI` return address, removing the only address on the stack that leaves the user’s text segment. So stack overflow is harmless in terms of privilege escalation, because it can’t execute any branch that a regular `JUMP` instruction can’t. (Recall that the kernel’s program loader ensures that branch targets never leave their original text segment.)
Although there may be niche cases where recursive calls without stack-based data are feasible, it’s the intent of the Dauug|36 architecture to disallow stack-based recursion. Programmers are expected to not write code that reenters a scope that is already in the call stack. Given this constraint, the maximum stack depth of 255 return addresses will be adequate for essentially any program’s needs.
You may be wondering what happens if a program executes without a completely fresh stack. Perhaps a privileged program with the same user number was running earlier and left some code memory around that a subsequent program shouldn’t be able to branch to. The design ensures that only the return addresses from the current program’s `CALI` to its present call depth are accessible, so if a new program is only 10 calls deep, it doesn’t matter if there are 245 “unsafe” addresses in the stack. They aren’t accessible. But Osmin’s kernel goes beyond this minimum safety level: when a program terminates, its stack is completely overwritten by a loop of 260 `CALL` instructions. As of October 2023, this return address is not monitored by the kernel for unexpected arrivals. This monitoring still needs to be added.
In addition to return addresses, the `CALL` family of instructions stores the current CPU flags on the user’s stack. `RETURN` does not restore these flags to their original state, but there is another instruction, `REVERT`, which does. This restoration is the only difference between `RETURN` and `REVERT`. Be cautioned: `REVERT` can unset the R(ange) flag, which could invalidate arithmetic sanity assurances. Although not a privileged instruction, `REVERT` is intended principally for use within the Osmin kernel.