Close

SRB : Smooth Register Backup (it was nice)

A project log for F-CPU

The Freedom CPU project has a log here too now :-)

yann-guidon-ygdesYann Guidon / YGDES 04/05/2024 at 19:270 Comments

FC0 had a great feature, one of its defining mechanisms, which was invented to make its 64 registers easier to manage during thread/context switching.

Each register has a hidden "dirty" flag, all the flags are set when the switch starts, then each time one register is accessed by the new program, it gets swapped with the saved version (if the flag is set). This provides a progressive replacement, which does not saturate the memory bus as much as a bulk save.

FC1 has 64 registers, too, but with 4 D-cache blocks, 2 ports each, so a broadcast instruction can send 8 registers to memory in one cycle. Saving the 56 scalar registers takes 7 instructions, 7 cycles. Massive parallelism makes SRB and its state machine moot, at least for the full-featured, full-width version.

The goal is to keep the frequency high, hence the register set must be kept small and simple. So far it's 16 registers with 3R1W ports, replicated 4 times. Using a dual-banked register set doubles the size and reduces the clock frequency. This would be interesting for a multithreaded version, with 2 or 4 simultaneous threads to flood the EUs with useful operations to perform, the lower clock freq would be compensated by a better efficiency/reuse of the units/lower latency. But an idle register set is out of question: use the caches.

And the save/restore mechanism could freeze a number of cache lines on each D-cache block for a faster sequence.

Discussions