Published on

Nand to Tetris Part 1

Authors
  • avatar
    Name
    Joseph Fortman
    Twitter

Resources

Nand to Tetris website
Coursera Link
My solutions - GitHub

Overview

Build a Modern Computer from First Principles: From Nand to Tetris.

In this project-centered course* you will build a modern computer system, from the ground up. We'll divide this fascinating journey into six hands-on projects that will take you from constructing elementary logic gates all the way through creating a fully functioning general purpose computer. In the process, you will learn - in the most direct and constructive way - how computers work, and how they are designed.

Hardware Simulator

In my undergrad work we covered a number of these topics, but not in this level of detail. In this Coursera, we must implement chips for 16-bit operations like Mux8Way16. Of course, today's computers usually have 64-bit architectures, but the concept is the same.

Highlights

The architecture is 16bit (vs modern 64bit) and program instructions are stored in ROM, completely separate from program data stored in RAM. The advantage is simplicity and security, with the disadvantage of difficulty changing the running program.

This is a general purpose computer that can execute any arbitrary program, provided the program is compiled for the Hack architecture.

ALU

The Arithmetic Logic Unit (ALU) is the heart of the CPU. When people casually talk about a CPU, they usually mean the ALU. It exposes the lowest level API to execute an operation on the machine, without considerations for memory or clock.

Full details

As a black box, the ALU listens to two inputs (x and y) and performs the an action from the table. There are 6-bits of inputs which allows 64 possible instructions, but only <20 instructions are used/needed. If the pin names are ignored, the instruction input can be written as a single value, eg 010011 means "output x-y". This will later become part of the instruction set for the Assembly language.

ALU

Understanding the pin names reveals the hidden simplicity of the design.

zx: set x = 0
nx: negate x
zy: set y = 0
ny: negate y
f: "function": 0 => bitwise AND; 1 => binary addition
no: negate output

The circuit evaluates each pin in series, with the result being the desired higher-order operation from the table.

ALU

x&y and x+y are great examples to step through

ALU.hdl
CHIP ALU {
    IN  
        x[16], y[16],  // 16-bit inputs        
        zx, // zero the x input?
        nx, // negate the x input?
        zy, // zero the y input?
        ny, // negate the y input?
        f,  // compute out = x + y (if 1) or x & y (if 0)
        no; // negate the out output?

    OUT 
        out[16], // 16-bit output
        zr, // 1 if (out == 0), 0 otherwise
        ng; // 1 if (out < 0),  0 otherwise

    PARTS:
    // zx
    Mux16(a=x, b=false, sel=zx, out=xVal);
    // nx
    Not16(in=xVal, out=notX);
    Mux16(a=xVal, b=notX, sel=nx, out=finalX);
    // zy
    Mux16(a=y, b=false, sel=zy, out=yVal);
    // ny
    Not16(in=yVal, out=notY);
    Mux16(a=yVal, b=notY, sel=ny, out=finalY);
    // f
    Add16(a=finalX, b=finalY, out=sum);
    And16(a=finalX, b=finalY, out=and);
    Mux16(a=and, b=sum, sel=f, out=res);
    // no
    Not16(in=res, out=notResult);
    Mux16(a=res, b=notResult, sel=no, out=finalRes);
    // out
    Or16(a=finalRes, b=false, out=out);
    // zr
    Or16Way(in=finalRes, out=isNotZero);
    Not(in=isNotZero, out=zr);
    // ng
    Neg16(in=finalRes, out=ng);
}

CPU

The Central Processing Unit (CPU) is the most complicated single chip of the computer. It exposes an input for instructions and data, and outputs data according to its instructions. Every high-level command on the computer gets translated into a binary instruction for the CPU to tediously execute.

Full details

An instruction goes into the CPU as a 16 bit value. Below you see the assembly instruction D=D+1;JMP, which is converted to binary as 11100111 11010111. When executed, this command is supposed to increment register D and set the next instruction to the value of register A.

Looking closely, each bit of the sequence has an independent function. The most significant bit is the opcode, which determines whether it is an A command or C.

A Commands are simple: treat the instruction as an integer, and load the value into register A. When a jump command is issued, the PC will reflect the current value of register A.

C Commands break down into smaller parts:

  • The 2 bits after the opcode are unused here and are always 1.
  • The 4th bit controls M vs A register (eg, D=D-M vs D=D-A).
  • The next 6 bits map to the ALU instruction
CPU - Program Control
CPU - Understanding Jump Bits

The last 3 bits of the instruction translate to the jump instruction. This is a classic example of a bit field. Combining the flag for jump equal and the flag for jump greater than, the result is jump if greater than or equal to.

CPU - Program Control

The program counter determines which instruction is executed next. Reset starts the computer. Normally each step increments the program counter by 1. Instructions can be issued to override the normal sequence, and jump to another instruction.

CPU.hdl
CHIP CPU {

  IN  inM[16],         // M value input  (M = contents of RAM[A])
      instruction[16], // Instruction for execution
      reset;           // Signals whether to re-start the current
                        // program (reset==1) or continue executing
                        // the current program (reset==0).

  OUT outM[16],        // M value output
      writeM,          // Write to M? 
      addressM[15],    // Address in data memory (of M)
      pc[15];          // address of next instruction

  PARTS:
  Not(in=instruction[15], out=isAInstruction);    // otherwise: C instruction
  
  // RegA
  Mux16(a=toOutM, b=instruction, sel=isAInstruction, out=inRegA);
  And(a=instruction[5], b=instruction[15], out=isCAndDestA);
  Or(a=isAInstruction, b=isCAndDestA, out=shouldWriteA);
  ARegister(in=inRegA, load=shouldWriteA, out=regA);

  // RegD
  And(a=instruction[4], b=instruction[15], out=isCAndDestD);
  DRegister(in=toOutM, load=isCAndDestD, out=regD);

  // ALU
  Mux16(a=regA, b=inM, sel=instruction[12], out=AorM);
  ALU(x=regD, y=AorM, zx=instruction[11], nx=instruction[10], zy=instruction[9], ny=instruction[8], f=instruction[7], no=instruction[6], out=toOutM, zr=isZero, ng=isNeg);

  // ALU Outputs
  Or16(a=regA, b=false, out[0..14]=addressM);
  Or16(a=toOutM, b=false, out=outM);
  And(a=instruction[3], b=instruction[15], out=writeM);

  // Jump bits
  And(a=instruction[2], b=isNeg, out=jumpNeg);

  And(a=instruction[1], b=isZero, out=jumpZero);

  Or(a=isNeg, b=isZero, out=isNegOrZero);
  Not(in=isNegOrZero, out=isPos);
  And(a=instruction[0], b=isPos, out=jumpPos);

  Or8Way(in[0]=jumpNeg, in[1]=jumpZero, in[2]=jumpPos, out=shouldJump);

  // PC
  And(a=shouldJump, b=instruction[15], out=isCAndShouldJump);
  PC(in=regA, load=isCAndShouldJump, inc=true, reset=reset, out=nextInstruction);
  Or16(a=nextInstruction, b=false, out[0..14]=pc);
}