Jumps and branches

Tags: memory, addressing

Efficiently moving the instruction pointer(s) around.

Published: June 20, 2024

There are a handful of instructions which (conditionally) allow the programmer to set the instruction pointer to a static or runtime-dynamic address.

1 Jumps

There are three types of jump instructions used in various scenarios:

  • 1-1010-#-00 jmp: jumps to an absolute address (3 bytes)
  • 1-1010-#-01 jmpr: jumps to a relative offset (2 bytes)
  • 1-1010-#-10 jmpd: jumps to a dynamic address (1 byte)

A pipeline diagram (for jmp in this case) is shown below. A relative jump (jmpr) uses a two-byte instruction instead of three bytes, and its offset is applied to the instruction pointer value at the address of the first byte of the jump instruction.

Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • jmp btb hit known jumped if btb miss, written to btb
  • low byte low byte shift low byte ready invalidate instr.
  • high byte or btb instr. high byte ready invalidate instr. if btb miss
  • btb instr. + 1 invalidate instr. if btb miss
  • btb instr. + 2 invalidate instr. if btb miss
  • btb instr. + 3 or jump
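
The relative-offset rule described for jmpr can be sketched in a few lines of Python. This is only an illustration: the function name `relative_target` is hypothetical, and the sketch assumes a one-byte two's-complement offset and 16-bit addresses that wrap around, neither of which is confirmed here.

```python
def relative_target(instruction_address: int, offset_byte: int) -> int:
    """Return the destination of a relative jump.

    Assumptions (not confirmed by the source): the offset is one signed
    two's-complement byte and addresses wrap at 16 bits.
    """
    if not 0 <= offset_byte <= 0xFF:
        raise ValueError("offset_byte must fit in one byte")
    # Reinterpret the raw byte as a signed value in the range -128..127.
    signed_offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    # The offset is applied to the address of the first jump instruction byte.
    return (instruction_address + signed_offset) & 0xFFFF
```

Under these assumptions, a jmpr located at 0x0100 with offset byte 0xFB (that is, -5) would land on 0x00FB.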

Each jump instruction can be made callable by setting the LSB to 1, which is indicated by an l suffix in the assembly (just as memory instructions add a ' for a dynamic offset):

.func:      ...               ; subroutine...
            ret               ; return from subroutine

main:       jmprl       .func ; assembler calculates relative offset

Pipeline diagram for a callable jump (jmpl). Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • jmpl btb hit known load spl load copy sfl add 4 to spl …, memory low byte ready
  • low byte load sph load copy sfh addition, prop. carry memory high byte ready
  • high byte or btb instr.
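
The call-and-return behaviour of a callable jump can be modelled roughly as follows. The class `CallStack` and its methods are hypothetical names, a Python list stands in for the stack memory, and the sketch assumes the saved return address is the byte right after the jump instruction. It deliberately ignores the stack pointer bytes and memory writes that appear in the diagram.

```python
class CallStack:
    """Toy model of callable jumps; a Python list stands in for stack memory."""

    def __init__(self) -> None:
        self.return_addresses = []

    def call(self, instruction_address: int, length: int, target: int) -> int:
        # Assumption: the saved return address is the byte right after the jump.
        self.return_addresses.append((instruction_address + length) & 0xFFFF)
        return target  # new instruction pointer

    def ret(self) -> int:
        # Resume at the most recently saved return address.
        return self.return_addresses.pop()
```

For example, a two-byte jmprl at a hypothetical address 0x0200 that targets 0x0180 would set the instruction pointer to 0x0180, and the matching ret would resume at 0x0202.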

1.1 Dynamic addresses

Sometimes an address is held in memory (like a switch table) or in argument registers (like a callback argument), and a jump must be performed to that address. The jmpd instruction provides this dynamic functionality by piping the last two accumulator writes into the instruction pointer.

Because a dynamic target address may change between executions, the BTB (explained in Section 3.2) compares the speculated address with the actual address on a BTB hit and recovers/invalidates if they differ.[1]

Pipeline diagram for jmpd. Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • ast ra load ra written to acc
  • ast rb load rb written to acc
  • jmpd btb hit known compare target vs. actual jumped if btb miss or invalid address
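
A minimal sketch of how the last two accumulator writes could form the jmpd target is given below. The names `Accumulator`, `jmpd_target` and `btb_prediction_ok` are hypothetical, and the byte order (older write as low byte, newer write as high byte) is an assumption that mirrors the low-byte-first order of jmp.

```python
class Accumulator:
    """Tracks the last two accumulator writes, as jmpd is described to use."""

    def __init__(self) -> None:
        self.previous = 0
        self.latest = 0

    def write(self, value: int) -> None:
        # Keep only the two most recent byte-sized writes.
        self.previous, self.latest = self.latest, value & 0xFF

    def jmpd_target(self) -> int:
        # Assumption: the older write is the low byte, the newer the high byte.
        return (self.latest << 8) | self.previous


def btb_prediction_ok(predicted: int, actual: int) -> bool:
    # On a BTB hit, the speculated target is compared with the actual one;
    # a mismatch means the speculatively fetched instructions are invalidated.
    return predicted == actual
```

Writing 0x34 and then 0x12 would, under this byte order, produce the target 0x1234.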

2 Branches

A branch is a non-callable, relative, conditional jump with the synopsis 1-1001-### brh (2 bytes), where bit T indicates whether the jump should be taken when the outcome is uncertain (only on a BTB hit). The 3-bit condition is one of the following:

  • C - carry out
  • S - sign bit (MSB)
  • Z - zero
  • U - underflow
  • !C - not carry out
  • !S - not sign bit (MSB)
  • !Z - not zero
  • !U - not underflow

Pipeline diagram for a branch (brh c). Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • brh c btb hit known recovery ready for taken -> not taken jumped if mispredicted, written to btb
  • signed offset perform offset, ready for not taken -> taken jump invalidate instr.
  • btb hit and T, btb miss or !T invalidate instr. if mispredicted
  • taken + 1 or not taken + 1 invalidate instr. if mispredicted
  • taken + 2 or not taken + 2 invalidate instr. if mispredicted
  • taken + 3 or not taken + 3 or recover
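
The interplay between the branch condition, the T bit and the BTB can be sketched as below. All names (`Flags`, `condition_met`, `predict_taken`, `mispredicted`) are hypothetical, conditions are passed as the mnemonics from the list above rather than as a 3-bit encoding (which is not specified here), and a BTB miss is assumed to be fetched as not taken.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    carry: bool = False
    sign: bool = False
    zero: bool = False
    underflow: bool = False


def condition_met(condition: str, flags: Flags) -> bool:
    """Evaluate one of C, S, Z, U or its negation (!C, !S, !Z, !U)."""
    negate = condition.startswith("!")
    name = condition.lstrip("!")
    value = {"C": flags.carry, "S": flags.sign,
             "Z": flags.zero, "U": flags.underflow}[name]
    # Flip the flag value when the condition is negated.
    return value != negate


def predict_taken(btb_hit: bool, t_bit: bool) -> bool:
    # The T bit only matters on a BTB hit; a miss is fetched as not taken.
    return btb_hit and t_bit


def mispredicted(predicted_taken: bool, condition: str, flags: Flags) -> bool:
    # Recovery is needed whenever the prediction and the outcome disagree.
    return predicted_taken != condition_met(condition, flags)
```

A branch predicted as taken whose condition turns out false is the "taken -> not taken" recovery case; the opposite case corresponds to the late "not taken -> taken" jump.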

For address calculation, branch recovery and BTB writes, the following address delay cycles are used:

  • Address calculation (relative jumps/branches): 3 cycles (relative to the instruction)
  • Branch recovery (taken -> not taken branches): 3 cycles (incremented)
  • Branch Target Buffer writes: 4 cycles

3 Fetch security and optimisations

3.1 Instruction cache

A cache miss may occur, in which case the CPU stalls until the valid/ready bit goes high for the designated page.

Pipeline diagram for a cache miss. Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • instr., miss known at end of cycle new address written in cc invalidate instr.
  • instr. + 1, second miss, ignore invalidate instr.
  • instr. + 2, hit, wait valid bit invalidate instr. if invalid bit
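
The stall-until-valid behaviour can be illustrated with a toy model. Everything here is an assumption made for the sketch: the class `InstructionCache`, the helper `cycles_until_fetch`, the page size of 32 bytes and the `fill_latency` parameter are hypothetical and do not describe the actual hardware.

```python
class InstructionCache:
    """Toy instruction cache with one valid bit per page."""

    def __init__(self, page_size: int = 32) -> None:
        self.page_size = page_size  # hypothetical page size
        self.valid_pages = set()

    def fill(self, address: int) -> None:
        # Models the valid/ready bit going high once the page has loaded.
        self.valid_pages.add(address // self.page_size)

    def fetch_ready(self, address: int) -> bool:
        # False means a miss: the CPU stalls and retries this address.
        return address // self.page_size in self.valid_pages


def cycles_until_fetch(cache: InstructionCache, address: int,
                       fill_latency: int) -> int:
    """Count stall cycles before `address` can be fetched."""
    stalls = 0
    while not cache.fetch_ready(address):
        stalls += 1
        if stalls >= fill_latency:
            cache.fill(address)  # the page becomes valid after the latency
    return stalls
```

A fetch from a page that is already valid costs no stall cycles in this model, while a miss costs as many cycles as the assumed fill latency.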

3.2 Branch Target Buffer

The Branch Target Buffer (BTB) caches jump destination addresses so that the instruction cache can perform a jump within a single cycle, avoiding a three-cycle penalty. The pipeline diagrams in Section 1 show where the BTB is consulted and written. The diagrams below show how a cache miss interacts with a jump, first on a BTB hit and then on a BTB miss.

With a BTB hit. Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • jmpr btb hit no jump necessary
  • signed offset cancel cache miss perform offset invalidate instr.
  • valid destination could be a cache miss

With a BTB miss. Stages (left to right): fetch, fetch, decode, decode, execute, writeback

  • jmpr btb miss jumped to address, written to btb
  • signed offset perform offset invalidate instr.
  • penalty cycle cancel cache miss invalidate instr.
  • penalty cycle cancel cache miss invalidate instr.
  • penalty cycle cancel cache miss invalidate instr.
  • valid destination could be a cache miss

BTB entries are invalidated when their page of origin is written to, which includes context swaps in virtual memory, such as switching between userland processes.
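
The lookup, write and page-based invalidation described in this section can be sketched as a small dictionary-backed model. The class `BranchTargetBuffer`, its method names and the 32-byte page size are hypothetical; only the three-cycle miss penalty and the invalidate-on-write rule come from the text above.

```python
class BranchTargetBuffer:
    """Toy BTB: maps a jump's own address to its cached destination."""

    MISS_PENALTY = 3  # cycles, per the three-cycle penalty mentioned above

    def __init__(self, page_size: int = 32) -> None:
        self.page_size = page_size  # hypothetical page size
        self.targets = {}

    def lookup(self, origin: int):
        # Returns the cached destination, or None on a BTB miss.
        return self.targets.get(origin)

    def record(self, origin: int, target: int) -> None:
        # Written once the jump resolves (the writeback stage in the diagrams).
        self.targets[origin] = target

    def invalidate_page(self, written_address: int) -> None:
        # Drop every entry whose origin lies in the page that was written to.
        page = written_address // self.page_size
        self.targets = {o: t for o, t in self.targets.items()
                        if o // self.page_size != page}

    def jump_cost(self, origin: int, target: int) -> int:
        """Return the penalty cycles for one jump and update the buffer."""
        if self.lookup(origin) == target:
            return 0
        self.record(origin, target)
        return self.MISS_PENALTY
```

In this model the first execution of a jump pays the penalty, repeated executions are free, and a write to the jump's page makes the next execution pay the penalty again.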

Footnotes

  1. Currently uncertain. BTB may not be used for dynamic addresses.