Close
0%
0%

ECM-16/TTL homebrew computer

16 bit Computer made from ttl logic chips

Public Chat
Similar projects worth following
The aim of this project is to build functional computer, based around 16 bit datapath, from scratch, by using logic chips of 74hc family. There are three parts of this project: hardware design, hardware build, writing software.
For design, the "Digital" logic simulator (Logisim derivative from H.Neeman, https://github.com/hneemann/Digital) is used. After part of the simulation is fully designed and works, and there are no further design change is planned, it is actual building/soldering begins. Throughout all this time software for computer is developed, and this development is not stopped after hardware completion. The *.dig (simulation) files and Assembler are available publicly (sim: https://github.com/Pavel-chemist/ecm16-ttl_simulation, Assembler: https://github.com/Pavel-chemist/ecm16ttl-asm/releases), so anyone interested can write their own software for this computer, or tinker with it's design.

Computer design goals:

Architecture:

RISC-like – inspired by MIPS, but is quite different. This is Load/Store architecture, meaning that ALU operations are only applied on data in registers, and for using data from memory it should be first loaded to these registers, or stored from them back to memory, in separate instruction cycle.

The instruction set is presented in its own sub-project.

The assembly language is presented also in ints own sub-project.

16-bit computer, 16-bit wide registers, 16-bit wide ALU and 16-bit bus.

Memory consists of 16-bit words.

Up to 4G bytes can be addressed in theory, by 32-bit addresses, byte-addressable memory. (Although most memory transfers would be word-sized). In real machine, no more than 16 MBytes of SRAM will ever be installed.

Register transfer scheme:

Component base: 74HCxx SSI and MSI chips (Elementary logic, multiplexers, flip-flops, 8-bit registers, and counters).

Input-output (tentative):

Input: keyboard.

Output: Monitor (VGA): characters, pseudographics, bitmap.

Mass storage: 1GB CompactFlash card through Parallel ATA interface.

Registers:

Register file: 8 16-bit registers, 3-address:

First address (A operand) is written with result of ALU operation on 2 registers (B and C operands). ;

Some ALU ops use only 1 register, as accumulator;

Memory Pointers: 8 16-bit registers paired to yield Program Counter, Stack Pointer, Frame Pointer and Base Pointer (all 32-bit);

Instruction register: 16-bit, holds running instruction;

Memory Data Buffer: 16-bit, holds additional data for some instructions;

Memory Address Buffer: 32-bit, holds result of Address Arithmetic Unit;

Interrupt Vector: 32-bit, holds address of Interrupt Service routine start, 2 parts: 16-bit Interrupt Vector Base (high word) and 8 bit Interrupt Vector address, second least significant byte, loaded when entering interrupt, the least significant byte is always zero;

Status Register: 4-bit, holds Main ALU output flags.

ALU (16-bit):

Functions: ADD, SUB, AND, NOR, XOR, SHIFT, ROTATE

B operand modifications: no, invert (1-complement), twos complement, replace with: 0, 1-255.

ALU Adder: fast adder (16-bit, with carry look-ahead) for high speed.

Memory Addressing:

Several different addressing modes are to be supported: Immediate, Direct, Indexed, Indexed with pre-increment and Indexed with post-increment.

Address Adder ( 32-bit, with carry look-ahead ), included into Address Arithmetic unit, which adds signed offset to Memory Pointer for indexed address calculation.

ALU together with register file are built, and assembled into the Calculating core. Its testing is described in more detail in its own project.

Computer050.zip

Added byte addressing mode; fixed couple of wiring bugs. Additionally, the simulation is pre-loaded with integer calculator program (decimal input and output), which does additions, subtractions, multiplications and divisions (also can calculate remainder of division). Numbers can be up to 9 digits long.

application/x-zip-compressed - 1.67 MB - 01/13/2020 at 06:37

Download

Computer042.zip

Several bugs fixed, addressing logic remade from scratch, so it more regular, plus adding true subroutine calls. Comes pre-loaded with 16-bit positive integer multiplication program, input and output are hexadecimal.

x-zip-compressed - 2.29 MB - 12/12/2019 at 05:44

Download

asm3.zip

Third version of assembler; supports constant and address labels, several directives.

x-zip-compressed - 1.46 MB - 12/12/2019 at 05:42

Download

asm2.zip

Second version of assembler: couple of bugs fixed, now can have arbitrary number of spaces in lines, and, most important, address labels are supported.

x-zip-compressed - 1.43 MB - 11/30/2019 at 08:20

Download

asm.zip

simple assembler for code for use with simulated computer

application/x-zip-compressed - 1.43 MB - 11/24/2019 at 13:24

Download

View all 8 files

  • Memory map

    Pavel12/12/2022 at 19:57 0 comments

    The computer has 32-bit address bus, which provides address space of 4 Gigabytes of memory. This is quite big, taking into account that processor will run at frequencies in range of megahertz. It will take more than half an hour for PC to traverse from address zero to the top of this address space doing only NOPs (they take 2 clock cycles) when running at 4MHz. This is impractical in my opinion. Also, as I intend to use SRAM chips, it will be very big and very, very expensive build, which may not even be capable of work at such speeds reliably.


    So there will only be maximum of 16 Megabytes of SRAM (main RAM) installed -- for this only lowest 24 bits of address are needed.

    The main RAM address range will be from 0x0000.0000 up to 0x00FF.FFFF. This is just 1/256 of all the address space. 

    Maybe some time in the future I'll implement some sort of memory management with paging and ability to swap memory to permanent storage, and thus full address range could be utilized.

    These 16 MB are the lower memory.


    There will also be a chunk of address space where ROMs and I/O is mapped. For future proofing I decided to place it at the very top, in range from 0xFF00.0000 to 0xFFFF.FFFF.

    The first megabyte (0xFF00.0000--0xFF0F.FFFF) is reserved for ROMs, where system monitor and standard interrupt service routines are to be located.

    The last megabyte (0xFFF0.0000--0xFFFF.FFFF) is for Video RAM -- there 2 video buffers are to be located, they will be switching as described on this section of wiki article (page flipping). The two pages of VRAM will not necessary occupy a whole megabyte, I think that pages 128 kBytes each will be enough. These buffer pages will be on the very top of address space.

    The next to last megabyte (0xFFE0.0000--0xFFEF.FFFF) is to be reserved for various I/O ports, video control registers, palette and character table.


    All the addresses in between will be left unused, there are just so many of them.


    Graphical overview of the whole address space:

    +-------------------------------------+ FFFF FFFF
    |       ROMs, I/O, video RAM          |  16MB higher memory
    +-------------------------------------+ FF00 0000
    |                                     | FEFF FFFF
    ~       No hardware mapped here       ~
    ~  May be used for virtual addresses  ~
    |                                     | 0100 0000
    +-------------------------------------+ 00FF FFFF
    |             Main RAM                |  16MB lower memory
    +-------------------------------------+ 0000 0000
    

    Graphical overview of the higher memory:

    +-------------------------------------+ FFFF FFFF
    |              Video RAM              |  2 pages 128 kBytes each for display data
    +-------------------------------------+ FFE0 0000
    |                 I/O                 |  keyboard, PATA, UARTs etc. mapped here
    +-------------------------------------+ FFD0 0000
    |                                     |  unused
    +-------------------------------------+ FFC0 0000
    |                                     |  unused
    +-------------------------------------+ FFB0 0000
    |                                     |  unused
    +-------------------------------------+ FFA0 0000
    |                                     |  unused
    +-------------------------------------+ FF90 0000
    |                                     |  unused
    +-------------------------------------+ FF80 0000
    |                                     |  unused
    +-------------------------------------+ FF70 0000
    |                                     |  unused
    +-------------------------------------+ FF60 0000
    |                                     |  unused
    +-------------------------------------+ FF50 0000
    |                                     |  unused
    +-------------------------------------+ FF40 0000
    |                                     |  unused
    +-------------------------------------+ FF30 0000
    |                                     |  unused
    +-------------------------------------+ FF20 0000
    |                                     |  unused
    +-------------------------------------+ FF10 0000
    |                 ROMs                |  EEPROMs with system software (monitor, ISRs)
    +-------------------------------------+ FF00 0000
    |                                     | FEFF FFFF
    ~          virtual addresses          ~
    
    

    The ROMs and VRAM will not occupy whole Megabyte slots, but placed individually for ease of addressing.


    Graphical overview of the VRAM slot:

    +-------------------------------------+ FFFF FFFF
    |               page 1                |  
    |            (128 kBytes)             |  
    +-------------------------------------+ FFFD 0000
    |               page 0                |  
    |            (128 kBytes)             |  
    +-------------------------------------+ FFFC 0000
    |                                     |  
    |                                     |  unused
    +-------------------------------------+...
    Read more »

  • Rethinking multiplexed outputs

    Pavel12/06/2022 at 07:12 0 comments

    There are several 8-to-1 and 4-to-1 multiplexers around the build, which are routing 16-bit and 32-bit wide signals. Managing all this wiring looks like a challenge (one have to look at the Pilot-1 CPU, which is mostly a register file coupled with ALU -- it is ribbon cables all around.

    There are quite a lot of wires criss-crossing the 8-to-1 multiplexer boards. It helps that these wires are all individual, and 3-d space allows to weave them as one pleases.


    Here is the example schematics for Register File, there are a LOT of 16-bit paths:


    If I to replicate such structure on PCB, turning these wires into traces on 2 planes, it seems that there will not be enough space, and I'll need multi-layered PCBs which are more complicated and expensive to order, and if there some wiring bug in buried layers, there is no way to repair it.

    The better solution, it seems, would be to replace multiplexers with tri-state buffers that feed into common bus and activated by single 3-to-8 decoder. This must greatly simplify the build.


    Revised schematics for Register File, using tri-state buffers, it looks much more manageable:


    This new solution will be applied for replacing 4-to-1 and 8-to-1 multiplexers; the 2-to-1 multiplexers are fine as they are, wiring them is not as complicated. This replacement will cut greatly into the number of ribbon cables and weird orthogonal board connections in the final build.

    The reason I was vary of using this solution earlier is that I thought it would be significantly slower. But it applies more to situations where the clock speeds are much higher and there are much more such outputs to single bus. In my case this speed penalty looks insignificant.

    Also, when actually looking into datasheets, the propagation delays for 8-to-1 multiplexer and tri-state buffer are almost the same, and my worries were unsubstantiated.

  • ALU staggering

    Pavel12/02/2022 at 19:00 0 comments

    I solved the staggering of ALU execution step and fetch of the next instruction.

    The state machine diagram is updated accordingly.

    The problem was in timings of propagation delays. All the registers are clocked on rising edge, and the update of gpr and IR should have been at the same exact moment. But in reality, there are several gate delays between rising edge of system clock signal, and the latching of particular register in the register file. On the other hand, instruction register is updated right at the system clock rising edge, thus control signals for the next instruction propagate at the time of latching registers for previous instruction, leading to wrong operation results.


    To solve this problem, I introduced "Shadow" register right after the Instruction register, that is latched some time after the IR:


    This new register is clocked with 90 degrees shift relative to system clock, so that instruction word is propagated to decoder a quarter of system clock period later, when all other registers are long latched to their values.


    Here is clock diagram for this:

    The system clock is derived from base clock by dividing its period by 2, and Shadow register clock signal is AND between base and system clock signals.


    2022-12-06 Update:

    I figured, there is no particular need to make additional clock phase -- all that needed is a sufficient delay between Instruction Register and Shadow Register latch signals. This can be accomplished by several consecutive NOT gates in the latching signal path for the Shadow Register. While tinkering with simulation, it become clear that delay of just 2 gates is sufficient. In real circuit it may better be done with longer delay, say 6 NOT gates, it will be 20-30 ns. 

  • Jumps simplification

    Pavel11/20/2022 at 18:50 0 comments

    The State Machine diagram and Jump instruction layouts are updated.

    It looks like there is no need for jumps with offsets in GPR - such jumps are unpredictable at time of coding/assembly, and it is likely that no use can be gained from such an instruction -> removing this option. 

    Jumps using MPs other than PC as base would be too confusing, and there is also no clear idea what they can be used for -> removing this option.

    It seems that jumps with current value in PC as base address are the only useful option -- they are easily predictable.

    Thus, decode complexity is reduced, and now also there is no distinction between JMP and JSR states, so only one of them remains.

    Jumping to absolute address can be accomplished using interrupts, or via direct or immediate load of address into PC register pair or via double-word MOV into this pair.

    For JSR, use of PC can be made implicit.

    All of the above simplifies the use of jumps in assembly, the instruction will look like this:

    JUMP offset

    ,where JUMP is any of 10 varieties of jump instructions, and offset is the distance from jump address and current PC value.

    In practice, the offset is to be calculated by the assembler, and user should use labels.

  • Yet another subproject

    Pavel11/12/2022 at 17:35 0 comments

    Just a note: I have created a new subproject for describing the assembly language to use with my CPU, and tracing some progress in development of assembler program.

  • Updated State Machine

    Pavel11/10/2022 at 17:07 0 comments

    As I tested the operations in simulation, I encountered the error in the design of Jump instructions, and thus the part of State Machine that is responsible for orchestrating execution of  Jump instructions was not right. Here is updated state machine diagram with corrected error:


    List of states:

    01 - F1 -- Fetch 1 state, loads 16-bit word from memory at address in PC into Instruction Register, advances PC by 2; it is a common start of any instruction execution cycle.

    02 - F2 -- Fetch 2 state, loads 16-bit word from memory at address in PC into Memory Data Buffer Register, advances PC by 2; it is used for 2-word instructions, second word containing some immediate value.

    03 - F_i -- Fetch Interrupt state, loads hardcoded JSR (SP) [r0] instruction into Instruction Register. The execution is modified so that [r0] value is ignored. This state is entered when IRQ signal is set, and current instruction is executed fully.

    04 - ALU -- ALU state, perform any of the 29 ALU operations on GPR. This state is normally staggered / concurrent with F1 state. In case there is an IRQ signal active, the next instructions Fetch state is entered only after current ALU state is executed.

    05 - MOV -- MOV state, Copy data from one register in CPU to another, can be repeated several times for multi-word transfers.

    06 - MAB load -- during this state, the 32-bit Memory Address Buffer is loaded with sum of value from MP and offset;  for loads/stores with pre-ncrement the Memory Pointer is also updated with this value.

    07 - LD/ST (MP) -- load or store value from memory at address in MAB to register, advance MAB by 2; can be repeated several times for multi-word memory transfers.

    08 - MP Post incr -- Updates 32-bit Memory Pointer Pair with sum of  its current value and signed 16-bit offset.

    09 - LD/ST direct -- Loading/storing 16-bit value from memory at immediate address (provided from combination of low byte of Instruction Register and Memory Data Buffer register).

    10 - LDi -- Loading of 16-bit value from Memory Data Buffer register into one of the GPR, or sign extended 25-bit value from combination of low 9 bits of Instruction Register and Memory Data Buffer register.

    11 - Addr Arithm -- Address Arithmetic, Loading 32-bit MP with sum of its current value and signed 16-bit offset.

    12 - Misc -- Any of 8 unsorted instructions that are executed in one clock cycle.

    13 - JMP -- Jump, Loading 32-bit PC with sum of its previous value and signed 16-bit offset.

    14 - SP-4 -- Loading 32-bit SP with its current value minus 4, also storing this new value to 32-bit Memory Address Buffer.

    15 - Push PC -- Store PC register pair to memory at current address in MAB, update MAB with its current value plus 2. This state is repeated 2 times, storing high and low halves of PC.

    16 - EINT -- Enter Interrupt, or Jump to Interrupt Service routine, Loads PC with current value from Interrupt Vector register, clears IRQ signal.

  • Update on simulation; rethinking Jumps

    Pavel11/06/2022 at 16:33 0 comments

    With my new rudimentary assembler, I wrote a series of test routines for the instructions used in CPU. This is a great help for the debugging the simulation in systematic way, and using them, to date, I tested all instruction types except Jumps and Miscellaneous.

    During the testing several wiring bugs were found and fixed.

    As I started to test the Jumps, I found that I thought out them in an erroneous way. 

    The way I designed the microinstruction sequence, the jump address is being loaded from memory, which is not how it should work: 

    -

    Old (wrong) Jump instruction sequence:
    0    Fetch1        // load IR <-- Mem:PC;       PC <-- PC+2
    0'    (Fetch2)    // load MDB <-- Mem:PC;   PC <-- PC+2
    1    Set 2w flag
    2    Load MAB <-- MP+offset  // offset may be GPR or MDB
    3    Load PCH <-- Mem:MAB; load MAB <-- MAB+2
    3*   Load PCL <-- Mem:MAB; load MAB <-- MAB+2

    This sequence is loading direct address for jump from somewhere in memory.

    How should jumps work:

    The way jumps should work is by loading the address from some Memory Pointer pair, with possible offset, into PC pair. This can be readily done using Address Arithmetic instruction for unconditional jumps. But for conditional jumps, there need to be a special kind of instruction anyway, one that encodes condition. When using the PC as base address, such jumps facilitate position independence of the piece of code.

    So, the jump instructions have quite a lot in common with Address Arithmetic instructions while also having some resemblance to Indirect Memory Accesses, but not almost identical to the latter, as I thought earlier.

    If there is a need to load direct jump address from memory (i.e. in case of returning from subroutine), it can be done with 2-word load via address pointer into the PCH and PCL registers.

    -

    Here is revised sequence for regular jumps:

    0    Fetch1        // load IR <-- Mem:PC;       PC <-- PC+2
    0'    (Fetch2)    // load MDB <-- Mem:PC;   PC <-- PC+2
    1    Load PC <-- MP+offset  // offset may be GPR or MDB. The offset is signed value.
    -

    Sequence for Jump to Subroutine:

    This is a jump combined together with storing of current PC value to memory, at address pointed by Stack Pointer (with some offset). 

    The store part has the same microinstruction sequence as regular indirect store with pre-increment, only difference is that it is hardwired to store the PCH/PCL pair, and pre-increment is always -4. The base pointer can be any of the four memory pointers, but convention is to use the second one (PC). The following jump part is the regular unconditional jump. This combination makes possible to store the address which when loaded into PC again allows the program readily proceed with execution without further modification of PC value.

    -

    mnemonic for Jump to Subroutine instr.

    JSR (SP) [offset]

    The offset is distance in bytes from current PC to start of subroutine, it is used in later, "jump" part of sequence.

    -

    0    Fetch1       // load IR <-- Mem:PC;      PC <-- PC+2
    0'   (Fetch2)    // load MDB <-- Mem:PC;  PC <-- PC+2
    1    Set 2w flag
    2    Load MAB <-- SP-4, Load SP <-- SP-4 // most common MP will be SP
    3    Store Mem:MAB <-- PCH;  MAB <-- MAB+2
    3*   Store Mem:MAB <-- PCL;   MAB <-- MAB+2
    4    Load PC <-- PC+offset  // offset may be GPR or MDB. The offset is signed value.

  • Status Register, Prefixes and Interrupt masks

    Pavel10/27/2022 at 20:03 0 comments

    There are several groups of latches scattered around the CPU, which can be collectively named as "Flags register". 

    The Status Register - 4 flags to store some properties of ALU operation result. These flags are updated only at ALU operation, or by special MOV, otherwise they stay intact during execution of other operations. Their contents influence results of conditional jumps.

    Prefixes - 4 flags for multi-word operations and 1 flag for byte transfer. These are located in Multi-word Transfer Enabler circuit that is integrated with the sequencer. The multi-word transfers are applicable to regular MOVs and memory accesses via memory pointer. The byte transfer flag is only affecting memory access operations. These 5 flags can only be explicitly set by "prefix" instructions - they are cleared automatically after transferring a number of words/bytes according to particular flag. The "byte" flag is working in parallel to the "multi-word operation" flags, so one can set the byte flag together with multi-word transfer flag, and then execute load instruction, thus loading several bytes in consecutive registers. All these flags are cleared at the end of this Load operation execution.

    The Multi-word Transfer Enabler makes some states to repeat several times in row (the looping arrows from the state into the same state on the State Machine diagram).

    The operations for setting "Prefixes" flags are in "Miscellaneous" group.

    Interrupt Masks - several flags (for now I think that 7 or 8 may be needed), that can be set or cleared using operations from "miscellaneous" group. These flags are meant to differentiate interrupts by priority, so that while serving high-priority interrupt, the low-priority ones are ignored or put on queue. This part for now is least defined. 

    The Interrupt Masks group of latches/flags are to be placed in Interrupt Controller circuit that is yet to be developed. The sequencer itself only sees if there is interrupt request pending or not.

  • Current status (oct 2022) and plans for the future

    Pavel10/27/2022 at 16:30 0 comments

    For some time ( ~8-9 months) I put this project on hold... again. This time not just for losing interest to other topic, but because I decided to (hopefully temporally) move from Belarus to other country, to get farther away from war between Russia and Ukraine.

    All the hardware was left in my home country, there was a bit of stress, and many more urgent things to do.

    Finally, for now I have the interest for this project reignite, and I continuing developing it. 

    For a time, this project will be software/documentation only - recently I updated the instruction descriptions, and started writing an assembler program that translates mnemonics into machine instructions.

    As of now, plans are as following:

    - finish defining and describing instructions in Miscellaneous group

    - finish and debug the model in logic simulator

    - make a proper assembler

    - write an emulator for the CPU - this probably will be finished before or alongside the model in logic simulator

    - (provisional) - make working model using some hardware description language - this will require me to learn such a language

    - (provisional) - make pcb designs

    I am thinking now, that ordering custom PCBs is the way to go at the time when I'll have the opportunity to return to making hardware, as wiring all parts by hand on a perf board is very time-consuming, finicky, and often error-prone process (although a satisfying one). Resulting hardware turns out to be quite fragile, with all these dangling wires.

    If I will go through this custom PCB business, the Pilot system will remain on its own, and proper processor will be built up from scratch, with parts (calculating core) functionally identical to corresponding ones already built.

    Before this the finishing of the model is top priority: as of now, not all aspects of jump implementation are thought out, and the operations in Miscellaneous group are not properly defined. The Memory subsystem is in very draft stage, and I/O is not done at all. The Interrupt system, although broadly exists, but some details need work to do. Lastly, startup system is to be developed, so that CPU at start automatically enters the right working state.

  • New sub-project describing the instruction set

    Pavel01/26/2022 at 10:39 0 comments

    I created a new project related to this one for publishing the instruction set used by this ECM-16/TTL cpu. In that project all of the instructions are going to be presented and fully described in one place.

View all 46 project logs

Enjoy this project?

Share

Discussions

Peabody1929 wrote 01/25/2021 at 19:15 point

Have you considered using a 74HC283 4 bit full adder with carry lookahead?  Or do you want to implement the adder at the gate level?  Using 4 bit or 8 bit wide parts would make the schematic simpler to read and the board easier to build.

  Are you sure? yes | no

Pavel wrote 01/26/2021 at 07:59 point

This chip is nice of course. But I tried to use the individual gates as much as possible. It also was interesting for me to develop such circuit by myself.

  Are you sure? yes | no

peter wrote 12/20/2019 at 17:39 point

Hi, great project!! how did you the transfer from Digital software to schematics and PCB design? Any tools, or made by hand?

  Are you sure? yes | no

Pavel wrote 12/26/2019 at 05:33 point

All by hand. Digital has library of DIL shapes for chips, so I start with replacing conventional logic element shapes with these, and then use it as reference when soldering the thing. I do not make or order custom PCBs, all is point-to-point soldering on perfboard.

  Are you sure? yes | no

Ken Yap wrote 11/24/2019 at 14:15 point

So no byte addressing? I suppose that keeps the addressing simple. Still, it means you will waste half the storage for characters and character strings.

This is the kind of CPU BCPL was targetting, but to handle strings, BCPL first had byte packing and unpacking routines, and later the % infix byte indirection operator.

  Are you sure? yes | no

Pavel wrote 11/24/2019 at 17:07 point

Well, I have consciously made such choice. And it doesn't mean that half of the storage is wasted, as character set could be made richer with more pseudographics and other non-Latin scripts (in my country the script used is Cyrillic, for example), and also maybe colour info encoded in the higher bits. On the other hand, basic Unicode is also 16-bit, so going this route, the strings can be made just that.

And I intend to use fairly modern and capacious storage, so having text info using up twice as much space than it would if I used byte-addressable memory is not that big of a deal.

  Are you sure? yes | no

Julian wrote 08/20/2018 at 23:31 point

Nice.  I think yours might be the only project on this site using Digital other than my 6-bit CPU (https://hackaday.io/project/159003-c61). :)

Out of interest, in case you need it I have a Digital plug-in library that provides a variable width/length FIFO component, which I've implemented for my planned IO processor project but haven't got around to actually using yet.  If you have any need for such a thing, let me know, and I'll upload it somewhere so you can use it.

  Are you sure? yes | no

Similar Projects

Does this project spark your interest?

Become a member to follow this project and never miss any updates