Building a CPU out of discrete logic ICs is something I've been wanting to do for a very long time. However, I always assumed that the number of parts I needed would be too expensive for me. That changed recently, when I had a random look through the box of unsorted electronic components that I keep at the back of my shelf and discovered a whole heap of logic ICs. Surprisingly, the heap contained a bunch of the very specific chips required for building something as complex as an entire CPU. And so, just two weeks later, I finished soldering the last component onto this monstrosity:
I call it the Scrap-CPU since I built it only using components I found in my room somewhere, which includes all ICs, LEDs, the three perfboards I put everything on, and the (at least) 20 meters of wiring used to connect everything.
The CPU itself is a Harvard architecture Microcontroller. However, as I didn't have enough D-type flip-flop ICs to build enough 8-bit registers, I had to make the compromise of reducing the CPU to 6 bits in order to implement all the functionality I wanted. As a result, the CPU can address 64 words of RAM and ROM (though I plan on implementing bank-switching for the ROM to allow complex programs to be run on the CPU).
The CPU has an instruction set of 25 instructions, which take anywhere from 2 to 6 clock cycles to execute. However, the CPU's clock speed is currently limited to around 375 kHz due to an issue with the program ROM.
The ALU includes operations for adding, subtracting, magnitude comparison, and (soon) bit shifts.
For the rest of this post, I will describe the different parts of the CPU, how they work, and how I worked around the many limitations I ran into while trying to build a CPU using the few types of ICs I had on hand.
But first, here's a video of the CPU running its first program, calculating Fibonacci numbers:
The ALU of the Scrap-CPU is the first board I built and tested, and arguably the most complex part of the CPU, as it takes up half of the total board space. It is set up in a fairly unusual way: one register, the Accumulator or A-Register, and the current bus contents act as its inputs, and another register, the B-Register, sits at its output. There's also a 2-bit status register that contains a 0-flag bit and a carry-flag bit.
An ALU operation is executed by simply putting data from Memory on the bus before clocking the B-Register to store the result. Whether or not the data in the B-Register is copied back into the Accumulator depends on the instruction that is being executed.
The operations that can be executed on the ALU are: Addition, Subtraction, two types of magnitude comparison (equal and greater-than), Add-with-carry and Subtract-with-carry. I'm also going to implement right shift and left shift operations as soon as I can get my hands on another perfboard.
Add-with-carry and Subtract-with-carry are special, as they manipulate the carry bit that's input at the first stage of the ALU's Ripple Carry Adder before adding/subtracting. This allows multiple words to be combined to represent larger numbers. Add-with-carry simply inputs the carry bit currently inside the status register, while Subtract-with-carry inputs the inverse of that bit.
Construction-wise, the ALU is mostly made out of discrete logic gates, with the exception of the registers, which are made using 74hc74 D-type flip-flops, and the magnitude comparator, which I built out of two 74hc85s. The Ripple Carry Adder and control logic are made entirely from AND and XOR gates.
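To make the carry chaining a bit more concrete, here is a minimal Python sketch of a 6-bit ripple carry adder that uses only AND and XOR operations, plus a two-word addition that feeds the carry of the low word into the high word the way Add-with-carry does. The function names and the exact gate arrangement are assumptions for illustration, not a transcription of the actual board.

```python
WORD_BITS = 6  # the Scrap-CPU works on 6-bit words


def full_adder(a, b, carry_in):
    """One adder stage built from AND and XOR only (assumed gate layout)."""
    partial = a ^ b
    total = partial ^ carry_in
    # (a AND b) and (carry_in AND partial) can never both be 1,
    # so XOR can stand in for the usual OR gate here.
    carry_out = (a & b) ^ (carry_in & partial)
    return total, carry_out


def ripple_add(a, b, carry_in=0):
    """Add two 6-bit words bit by bit, rippling the carry upwards."""
    result = 0
    carry = carry_in
    for bit in range(WORD_BITS):
        s, carry = full_adder((a >> bit) & 1, (b >> bit) & 1, carry)
        result |= s << bit
    return result, carry


def add_two_word(a_lo, a_hi, b_lo, b_hi):
    """12-bit addition: plain add on the low words, add-with-carry on the high words."""
    lo, carry = ripple_add(a_lo, b_lo)         # ADD: carry-in is 0
    hi, carry = ripple_add(a_hi, b_hi, carry)  # ADDc: carry-in is the stored carry flag
    return lo, hi, carry
```

For example, adding 1 to 63 overflows the low word, and the carry ends up in the high word, giving 64 as a two-word value.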
The RAM module is probably the simplest component in the entire CPU. It's simply a register, the Memory Address Register, which, as the name implies, stores an address in memory and is connected to an SRAM IC. The particular IC used actually contains 32KB of RAM, but only the first 6 bits of the first 64 bytes are used by the CPU. The only other thing of note here is that the register is built out of CD4013BE ICs (which are also D-type flip-flops), as I ran out of 74hc74s after building the ALU and putting the rest of them aside for use in the control logic.
There is a 7-segment display underneath the RAM module. It is the only component that is mapped to a RAM address (writing to address 63 sets the value on the display), as I ran out of ICs to build a dedicated output register. It was actually sort of hacked on at the last minute. As a result, it's driven by an ATtiny microcontroller, because of the mentioned lack of ICs to build another register and the lack of remaining board space. I absolutely needed some sort of output display to better debug the CPU at higher frequencies, and this was the only solution I could come up with.
This is where the magic happens. The Control Logic mainly consists of four parts: the program counter, ROM, the Instruction Register, and the Microcode ROMs.
There's also a little more logic to deal with lines that are active low instead of active high, and to make sure that the clock lines for all registers are only active when they're supposed to be.
However, this module is where I encountered the most issues, mainly with getting the program counter to work. I unfortunately discovered too late that I didn't have any binary counter ICs. Instead, I only had BCD counters, which have a 4-bit output but only count to 9 (1001 in binary) before resetting. Luckily, you can trick them into working like regular 3-bit binary counters by connecting the 4th bit of each IC to its own reset pin. The short pulse generated on that pin between the counter reaching 1000 and resetting can then be used to increment the next counter. So, I now had a 6-bit counter.
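The counter trick can be sketched in a few lines of Python. This is only a behavioural model under the assumption that two of the 3-bit stages are chained to get the 6 bits; the class names are made up for illustration, and the brief 1000 state is modelled as an instant reset.

```python
class BcdCounterAs3Bit:
    """A BCD counter whose 4th output bit is wired back to its own reset pin."""

    def __init__(self):
        self.value = 0

    def clock(self):
        """Advance by one; return True if the reset pulse (the carry) fired."""
        self.value += 1
        if self.value & 0b1000:  # 4th bit went high: the counter resets itself
            self.value = 0
            return True          # the same pulse clocks the next stage
        return False


class ProgramCounter:
    """Two chained 3-bit stages forming a 6-bit counter (assumed arrangement)."""

    def __init__(self):
        self.low = BcdCounterAs3Bit()
        self.high = BcdCounterAs3Bit()

    def clock(self):
        if self.low.clock():
            self.high.clock()

    @property
    def value(self):
        return (self.high.value << 3) | self.low.value
```

Clocking this model 63 times leaves it at 63, and one more clock wraps it back to 0, just like a regular 6-bit binary counter.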
Next, I ran into issues with the counter's preset function. It turns out that pulling preset enable high on the counter ICs causes them to continuously set their value to the bus contents for as long as preset enable remains high. This caused a whole bunch of issues. You can actually see my solution in the second photo at the beginning of this post, in the form of the small circuit that's dangling off the side of one of the boards. That circuit basically just turns any signal at its input into a single, reeeeally short pulse that then goes to the counter's preset enable.
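The difference between the level-sensitive preset and the pulsed one can be shown with a small discrete-time sketch in Python. It models the signals as lists of samples and ignores normal counting entirely; the function names and the sample values are hypothetical, and the real circuit is analog rather than sampled.

```python
def rising_edge_pulses(signal):
    """Turn a held-high signal into a one-sample pulse on each rising edge."""
    previous = 0
    pulses = []
    for level in signal:
        pulses.append(1 if level and not previous else 0)
        previous = level
    return pulses


def run_preset(preset_enable, bus, start=0):
    """Level-sensitive preset: the counter copies the bus whenever enable is high."""
    value = start
    history = []
    for enable, bus_value in zip(preset_enable, bus):
        if enable:
            value = bus_value
        history.append(value)
    return history


# Preset enable is held high for three samples while the bus keeps changing.
enable = [0, 1, 1, 1, 0]
bus = [5, 20, 7, 33, 2]

raw = run_preset(enable, bus)                         # follows the bus: ends up wrong
pulsed = run_preset(rising_edge_pulses(enable), bus)  # latches only the first value
```

With the raw signal the counter keeps tracking whatever is on the bus, while the pulsed version loads the jump target once and then holds it.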
Designing the instruction set for this CPU was actually quite difficult, mainly because 6 bits isn't exactly enough to implement all the opcodes I would have liked. For example, each instruction is limited to two addressing modes. What those modes are depends on the instruction executed.
Here's a quick overview of the 17 base instructions:
| Instruction | Description |
|---|---|
| LOAD | Load from memory into the Accumulator |
| STORE | Store data in B-Register to memory |
| STOREA | Store data in Accumulator to memory |
| ADD | Add to the Accumulator |
| SUB | Subtract from the Accumulator |
| EQL | Outputs 1 if Accumulator = Input, 0 otherwise |
| MAG | Outputs 1 if Accumulator > Input, 0 otherwise |
| LSH | Left-shift value in Accumulator |
| RSH | Right-shift value in Accumulator |
| JZ | Jumps only if ALU's 0-flag is set |
| JNZ | Jumps only if ALU's 0-flag is NOT set |
| LOADm | Sets the value in the Memory Address Register |
| LOADi | Loads instruction's argument from ROM to the Accumulator |
Since there are only 16 base instructions (LOADi is special, more on that in a bit), they can be encoded using only four bits. The fifth bit is used in arithmetic instructions to indicate whether they update the Accumulator or not. If the fifth bit is not set and an arithmetic instruction is executed, it copies the contents of the B-Register back into the Accumulator after computing its result. If the fifth bit is set, it skips that step: the Accumulator retains its current value, but the result of the operation is still stored in the B-Register. In the CPU's Assembly language, this bit can be set by prefixing the name of the arithmetic instruction with a lowercase 'q' (e.g. qADD, qSUBc, qEQL, etc.), which technically extends the number of unique instructions to 25.
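Here is a rough Python sketch of what the fifth bit does to the two ALU registers. It only models the Accumulator and B-Register; the status flags are left out, and the class and function names are assumptions made for this example.

```python
WORD_MASK = 0b111111  # results are truncated to 6 bits


class AluRegisters:
    """Accumulator as ALU input, B-Register as ALU output (flags not modelled)."""

    def __init__(self):
        self.accumulator = 0
        self.b_register = 0

    def execute(self, operation, bus_value, quiet=False):
        # The result always lands in the B-Register first.
        self.b_register = operation(self.accumulator, bus_value) & WORD_MASK
        # Without the 'q' prefix, the result is copied back into the Accumulator.
        if not quiet:
            self.accumulator = self.b_register


def add(a, b):
    return a + b


def equal(a, b):
    return 1 if a == b else 0


alu = AluRegisters()
alu.accumulator = 5
alu.execute(add, 3)                 # like ADD: Accumulator and B-Register become 8
alu.execute(equal, 8, quiet=True)   # like qEQL: B-Register becomes 1, Accumulator stays 8
```

The 'q' variants are handy for comparisons like this one, where the result matters but the value in the Accumulator should survive.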
This leaves one bit to indicate the addressing mode. If the bit is set, it means that the value in the Memory Address Register will not be changed, and the instruction will operate on the word already being pointed to by the Memory Address Register. This means that jump instructions will set the program counter to that value, arithmetic instructions will operate on it and LOAD, LOADm, STORE and STOREA will load/store to the address.
What happens if the bit is not set depends on the specific instruction being executed.
In the case of JMP, JZ and JNZ, the instruction's argument is used as the address to jump to. LOADm sets the Memory Address Register to the instruction's argument, while LOAD, STORE, STOREA and all arithmetic instructions first copy their argument into the Memory Address Register before executing.
LOADi is the only exception to all those rules. It is encoded as all 1s, and will always copy its argument into the Accumulator.
Instructions can be either two words or one word, depending on their addressing mode (with the exception of LOADi and NOP, which always need an argument). The first word is the opcode, and the second is the optional argument. As a result, in the Assembly language, the addressing mode of an instruction can be set by either giving it an argument or not.
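Putting the encoding rules together, a decoder for one 6-bit opcode word might look roughly like the Python sketch below. The bit positions are an assumption (opcode in the low four bits, the 'q' bit next, the addressing-mode bit on top), since the exact layout isn't spelled out here, and the names are hypothetical. NOP also always takes an argument, but its opcode number isn't given, so the sketch doesn't model that case.

```python
from dataclasses import dataclass

# Assumed bit layout; only the field widths come from the description above.
OPCODE_MASK = 0b001111   # four bits for the 16 base instructions
QUIET_BIT = 0b010000     # the "fifth bit": don't update the Accumulator
MODE_BIT = 0b100000      # set: use the address already in the Memory Address Register
LOADI = 0b111111         # LOADi is encoded as all 1s


@dataclass
class Decoded:
    opcode: int
    quiet: bool
    use_current_address: bool
    is_loadi: bool
    length: int  # number of ROM words, including the opcode word


def decode(word):
    """Split a 6-bit opcode word into its fields."""
    is_loadi = word == LOADI
    # LOADi ignores the usual rules and always takes an argument.
    use_current = bool(word & MODE_BIT) and not is_loadi
    quiet = bool(word & QUIET_BIT) and not is_loadi
    length = 1 if use_current else 2
    return Decoded(word & OPCODE_MASK, quiet, use_current, is_loadi, length)
```

With this layout, `decode(0b100011)` describes a one-word instruction that works on the current address, while `decode(0b000011)` is the same opcode in its two-word form with an argument.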
The code for the Assembler and a Simulator for the CPU can be found on my GitHub, together with a series of example programs. It can all be found here: