Memory architecture

A look into QCPU’s memory system

memory

QCPU’s 16 bit virtual memory and its addressing modes.

Published

October 8, 2023

…

1 Memory access

A memory instruction (which is either mst or mld and their word equivalent) can be composed like the following:

instr <dynamic:1> <register:2>, low byte, high byte

Dynamic offsets is a bit whether to add an 8 bit accumulator offset to the address, indicated by ' after the instruction. For the register selection, it’s a special or index register used in the address calculation:

id	short	description
`00`		none
`01`	`sf`	stack frame (user or kernel depending on current context)
`10`	`sp`	stack pointer (user or kernel depending on current context)
`11`	`idx`	general-purpose index register (`rx` and `ry`)

Special registers have a user and kernel copy. They aren’t addressable in userspace, but can be written to using the DMAC in kernel space or implicitly through certain instructions (like mutating the stack pointer with msp).

Depending on the processor’s current context (either user or kernel mode), operations using the special registers will vary between the two copies. For example, if a system call is done in userland, the stack immediately switches to the kernel’s location.

2 Address resolution

Like for the imm (load immediate) and addi (add with immediate, and alike) instructions described in the diagrams below, the memory instructions use an instruction constant. For memory instructions specifically, it’s a 2 byte, little-endian immediate.

fetch	fetch	decode	decode	execute	writeback
`imm`				through alu	written to acc
value for immediate		ready for alu

fetch	decode	execute	writeback
`ast ra`	load `ra`	through alu	written to acc
`addi`		add value with acc	written to acc
value for immediate	ready for alu

The full address calculation pipeline is shown in the pipeline diagram below. Memory instruction mldw' loads a word from memory with an additional dynamic offset (the accumulator store prefixing the instruction) and the index registers.

Together, the address is calculated as [dynamic offset +] instruction constant + register.

fetch	decode	execute	writeback
`ast rz`	load `rz`	through alu	written to acc, dynamic offset
`mldw' idx`	load `rx`	rx + imml	dynamic offset
low byte	load `ry`, low byte ready	ry + imml prop. carry	dynamic offset prop. carry
high byte	high byte ready		invalidate instr.
sync recall
`rst ra`		write to reg
`rst rb`		write to reg

The immediate and register is calculated through the ALU, but the dynamic offset is calculated with a hardware-accelerated component. Note that the intermediate address is put through in two separate bytes. The dynamic offset component consists of an adder (adding the previous accumulator value and the low byte of the intermediate address) and a conditional incrementer (if a carry-out was present in the addition of the previous cycle, an increment is done for the high byte).

3 Reference indirection

An indirect access is done by loading two bytes from an arbitrary memory location and storing the result into the index registers. This can be combined with another access using idx as address register.

mldw        0x0000        ; load first address
rst   rx
rst   ry
mld   idx   0x0000        ; load from indirect address
rst   ra

4 Async memory

Fetching from memory takes time, especially when a cache miss is present. To combat this more efficiently, results from memory can be asynchronously written to a temporary register set and read from later. If the register is still ‘pending’, the processor will stall until it resolves the register.

mldw        0x0000        ; load first address
amra                      ; mark buffer a as pending

amr   mal                 ; move register a, low byte, into acc
amr   mah                 ; move register a, high byte, into acc