C. Project Design

Nate edited this page Dec 13, 2024 · 7 revisions

1: VGA Controller

1.1: General Design

The VGA controller uses two frame buffers fed by a master buffer. The controller takes a memory address and a 32-bit data word as input and stores the word at that address in the master buffer. Each word consists of four bytes. The first (most significant) byte contains an ASCII code. The other three bytes contain the red, green, and blue color values, in that order.
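The word layout can be sketched in Python as follows. This is only an illustration: it assumes the "first byte" is the most significant byte of the word, and the helper names `pack_cell` and `unpack_cell` are hypothetical and not part of the project.

```python
def pack_cell(ascii_code: int, red: int, green: int, blue: int) -> int:
    """Pack one character cell into a 32-bit word: ASCII | R | G | B."""
    for field in (ascii_code, red, green, blue):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return (ascii_code << 24) | (red << 16) | (green << 8) | blue


def unpack_cell(word: int) -> tuple[int, int, int, int]:
    """Split a 32-bit word back into (ascii, red, green, blue)."""
    return ((word >> 24) & 0xFF, (word >> 16) & 0xFF,
            (word >> 8) & 0xFF, word & 0xFF)
```

For example, a pure red 'A' would be `pack_cell(ord("A"), 0xFF, 0x00, 0x00)`.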

Each frame, one of the two frame buffers outputs its contents to the display, and the other frame buffer writes the current contents of the master buffer to itself. After each buffer finishes its task, their functions swap. The addition of the master buffer allows asynchronous writing, as the switching of the two frame buffers does not affect the master buffer's function. This also simplifies interactions between the VGA controller and processor.

When data is read from the frame buffer, it is decoded into pixel data using an ASCII bitmap stored in a ROM. The pixel data is then sent to the VGA driver, which implements proper VGA timings and sends output signals to the monitor.

1.2: File Functions

1.2.1: vga_driver.v

The vga_driver module drives and times the signals to the VGA port on the FPGA. The module consists of two finite state machines: one controls horizontal timing and synchronization, and the other controls vertical timing and synchronization. Each FSM has four states: Display, Front Porch, Sync, and Back Porch. Both are clocked by a 25 MHz clock from the clock_divider module. The horizontal states are controlled by a counter that increments every clock cycle. The vertical states use a similar counter that increments once per full cycle of the horizontal states (that is, once per line). The module uses standard timings for a 640x480@60Hz display output (defined near the top of the module), with the horizontal timings in clock cycles and the vertical timings in lines (for more information on timings, see http://www.tinyvga.com/vga-timing/640x480@60Hz).

Each state determines which signals are sent to the VGA port to remain synchronized with the monitor:

Blank signal - Active low; forces the RGB color inputs to zero.

Sync signal - Active low; two signals, hsync and vsync, that indicate the beginning of horizontal lines and entire frames respectively.

  • Display
    • Sync = 1
    • Blank = 1
  • Front Porch
    • Sync = 1
    • Blank = 0
  • Sync
    • Sync = 0
    • Blank = 0
  • Back Porch
    • Sync = 1
    • Blank = 0

The driver functions such that the monitor outputs an image while both FSMs are in the display state.
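The state sequence and signal levels above can be modeled with a short Python sketch. The timing values are the standard 640x480@60Hz figures; the function names are hypothetical, and combining the two FSMs so that blank is deasserted only when both are in Display is an assumption based on the sentence above, not a copy of vga_driver.v.

```python
# Standard 640x480@60Hz timings: pixel clocks horizontally, lines vertically.
H_TIMING = (640, 16, 96, 48)   # display, front porch, sync, back porch
V_TIMING = (480, 10, 2, 33)
STATES = ("DISPLAY", "FRONT_PORCH", "SYNC", "BACK_PORCH")


def state_for(count: int, timing: tuple[int, int, int, int]) -> str:
    """Return the FSM state for a counter value within one line or frame."""
    count %= sum(timing)
    for name, length in zip(STATES, timing):
        if count < length:
            return name
        count -= length
    raise AssertionError("unreachable")


def signals(h_count: int, v_count: int) -> dict[str, int]:
    """Active-low hsync, vsync, and blank for one pixel clock tick."""
    h_state = state_for(h_count, H_TIMING)
    v_state = state_for(v_count, V_TIMING)
    return {
        "hsync": 0 if h_state == "SYNC" else 1,
        "vsync": 0 if v_state == "SYNC" else 1,
        # Blank is deasserted (1) only while both FSMs are in DISPLAY.
        "blank": 1 if h_state == v_state == "DISPLAY" else 0,
    }
```

With these totals (800 clocks per line, 525 lines per frame), a 25 MHz pixel clock gives 25,000,000 / (800 × 525), or roughly 59.5 frames per second.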

The driver also outputs a few useful control signals:

  • x - The horizontal position of the pixel being displayed
  • y - The vertical position of the pixel being displayed
  • disp_done - pulses high when a full frame has been displayed.

1.2.2: double_buffer.v

The double_buffer module contains the logic that drives the two frame buffers used to send a smooth picture to the monitor. Each frame buffer is a single-port RAM module with 4800 words and a word width of 32 bits. The first byte of each word is an ASCII code, the second is the red color value, the third is the green color value, and the fourth is the blue color value. A finite state machine controls the inputs and outputs of each buffer. The buffer connected to the display is read-only and cannot be modified until the buffers are switched. The other buffer is connected to the module's inputs and can be written to by other modules. The FSM maintains this state until the switch_buffer signal goes high, at which point the two buffers swap functions (the write buffer becomes the display buffer, and vice versa).

1.2.3: clock_divider.v

The clock_divider is a simple module that takes a clock signal as input and outputs a clock signal with half of the original clock's frequency.
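One common way to halve a clock is a single flip-flop that toggles on every rising edge of the input. The Python model below illustrates that idea only; it is an assumption about how the divider might work, and `divide_by_two` is a hypothetical name.

```python
def divide_by_two(input_edges: int) -> list[int]:
    """Model a toggle flip-flop: the output flips on every rising input edge."""
    out = 0
    trace = []
    for _ in range(input_edges):
        out ^= 1          # toggle once per input rising edge
        trace.append(out)
    return trace
```

The output completes one full period for every two input periods. Since the vga_driver is clocked at 25 MHz, this implies a 50 MHz input clock.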

1.2.4: vga_controller.v

The vga_controller module serves as an interface between the double_buffer and the vga_driver. It contains a few logical implementations:

  1. ROM initialization
  2. combinational ASCII decoder
  3. buffer switching FSM

The Char_ROM is a 1024-word ROM with a 32-bit word width that contains encodings for standard ASCII characters. Each character is 8x8 pixels, where the top left corner of each character is at (0,0) and the bottom right corner is at (7,7). To retrieve a specific pixel, read the memory address equal to the ASCII character value multiplied by 8 plus the pixel's y-value. After the data is returned from the ROM, shift it right by the pixel's x-value; the least significant bit of the result is the pixel.

For example, the character 'A' (ASCII 0x41, so its rows start at address 0x41 × 8 = 0x208) is encoded as follows:

    (address) => (hexadecimal data) => (binary data) => (visualization)

    0x208 => 0x0C => 0000 1100 => ..XX....
    0x209 => 0x1E => 0001 1110 => .XXXX...
    0x20A => 0x33 => 0011 0011 => XX..XX..
    0x20B => 0x33 => 0011 0011 => XX..XX..
    0x20C => 0x3F => 0011 1111 => XXXXXX..
    0x20D => 0x33 => 0011 0011 => XX..XX..
    0x20E => 0x33 => 0011 0011 => XX..XX..
    0x20F => 0x00 => 0000 0000 => ........

Right-shifting the first line of 'A' to retrieve a pixel:

                         . . X X . . . .
                         | | | | | | | |
    (0x0C >> 0) & 1 == 0-+ | | | | | | |
    (0x0C >> 1) & 1 == 0---+ | | | | | |
    (0x0C >> 2) & 1 == 1-----+ | | | | |
    (0x0C >> 3) & 1 == 1-------+ | | | |
    (0x0C >> 4) & 1 == 0---------+ | | |
    (0x0C >> 5) & 1 == 0-----------+ | |
    (0x0C >> 6) & 1 == 0-------------+ |
    (0x0C >> 7) & 1 == 0---------------+
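The same lookup can be written as a short Python sketch. Only the eight rows of 'A' listed above are included, and the names `CHAR_ROM`, `glyph_pixel`, and `render` are hypothetical stand-ins for the real ROM and decoder.

```python
# Rows of 'A' copied from the table above; the rest of the ROM is omitted.
CHAR_ROM = {0x208 + i: row for i, row in enumerate(
    [0x0C, 0x1E, 0x33, 0x33, 0x3F, 0x33, 0x33, 0x00])}


def glyph_pixel(ascii_code: int, x: int, y: int) -> int:
    """Return 1 if pixel (x, y) of the 8x8 glyph is set, else 0."""
    address = ascii_code * 8 + y   # one ROM word per glyph row
    row = CHAR_ROM[address]
    return (row >> x) & 1          # bit 0 is the leftmost pixel


def render(ascii_code: int) -> list[str]:
    """Draw a glyph as eight strings of 'X' and '.' characters."""
    return ["".join("X" if glyph_pixel(ascii_code, x, y) else "."
                    for x in range(8))
            for y in range(8)]
```

Calling `render(ord("A"))` reproduces the visualization column of the table.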

The combinational ASCII decoder performs the above function. The x and y coordinates from the vga_driver are used to index into the display buffer (see double_buffer.v). The ASCII code from the first byte of the display buffer is then used in combination with the x and y coordinates to retrieve the proper pixel from the Char_ROM and send it to the vga_driver. There is a time delay to access the RAM, which is accounted for both in the logic and by automatic correction in the monitor.
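A 640x480 screen of 8x8 characters is 80 columns by 60 rows, which matches the 4800 words in each frame buffer. The sketch below shows one plausible way to turn the driver's x and y into a buffer address and a position inside the glyph. It assumes a row-major layout, which is not stated above, and the names are hypothetical.

```python
COLS, ROWS, CELL = 80, 60, 8   # 640/8 columns, 480/8 rows, 8x8 pixel cells


def cell_address(x: int, y: int) -> int:
    """Frame buffer address of the character cell containing pixel (x, y)."""
    return (y // CELL) * COLS + (x // CELL)


def glyph_coords(x: int, y: int) -> tuple[int, int]:
    """Position of pixel (x, y) inside its 8x8 character cell."""
    return x % CELL, y % CELL
```

In hardware, dividing and taking the remainder by 8 are just bit selections: the upper bits of x and y pick the cell, and the lower three bits pick the pixel within the glyph.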

The final piece of code in the vga_controller is the FSM that controls the switching of the frame buffers. It takes in two signals: the disp_done signal from the vga_driver and a vga_write_done signal from a higher-level module. Its purpose is to sync the switching of the frame buffers with the vertical blanking period of the display driver so that the switch is unnoticeable. When both signals are active, it pulses the switch_buffer signal, then waits for 16 clock cycles. This delay gives disp_done and vga_write_done adequate time to go low, which prevents the frame buffers from switching multiple times.
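A behavioral sketch of this pulse-then-hold-off behavior in Python follows. The class and attribute names are hypothetical, and the exact cycle on which the real FSM resumes sampling may differ.

```python
class BufferSwitchFSM:
    """Pulse switch_buffer once, then ignore inputs for HOLDOFF cycles."""

    HOLDOFF = 16

    def __init__(self) -> None:
        self.wait = 0

    def step(self, disp_done: bool, vga_write_done: bool) -> bool:
        """Advance one clock; return the switch_buffer output for this cycle."""
        if self.wait > 0:
            self.wait -= 1          # still in the hold-off window
            return False
        if disp_done and vga_write_done:
            self.wait = self.HOLDOFF
            return True             # single-cycle pulse
        return False
```

Even if both inputs stay high for several cycles after the pulse, no second pulse is produced until the hold-off window has elapsed.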

1.2.5: ascii_master_controller.v

The ascii_master_controller is the top-level module for the full VGA controller. Its primary purpose is to implement a third buffer, the master buffer, that can be accessed without synchronizing with the timing of the frame buffers and vga_driver. The master buffer itself is a dual-port RAM with the same properties as the frame buffers (4800 32-bit words: ASCII, red, green, blue). A finite state machine within the module detects when the frame buffers are switched, then writes the contents of the master buffer to the write buffer. Once it is done writing, it sets the vga_write_done signal high, which signals that the frame buffers can safely switch. After the buffers switch, the vga_write_done signal goes low, and the cycle repeats.

The dual-port nature of the master buffer enables simultaneous reads and writes, so a character can be written to the master buffer asynchronously, as it will be transferred to the frame buffer during the next write cycle.
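The interaction between the master buffer and the two frame buffers can be illustrated with a small Python model. It is a sketch under simplifying assumptions: the copy happens in one call rather than word by word, and the class and method names are hypothetical.

```python
class TripleBuffer:
    """Master buffer plus two frame buffers that swap roles."""

    def __init__(self, size: int = 4800) -> None:
        self.master = [0] * size
        self.frames = [[0] * size, [0] * size]
        self.display = 0               # index of the buffer being shown
        self.vga_write_done = False

    def write(self, address: int, word: int) -> None:
        """Processor-side write; allowed at any time."""
        self.master[address] = word

    def copy_master(self) -> None:
        """Copy the master buffer into the frame buffer not on display."""
        self.frames[1 - self.display][:] = self.master
        self.vga_write_done = True

    def switch(self) -> None:
        """Swap roles, but only once the copy has finished."""
        if self.vga_write_done:
            self.display = 1 - self.display
            self.vga_write_done = False

    def displayed(self) -> list[int]:
        return self.frames[self.display]
```

A write to the master buffer becomes visible only after the next copy and switch, so the processor never has to wait for the display.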

2: RISC-V Processor

2.1: General Design

The processor is an implementation of the RISC-V ISA specification using a von Neumann architecture, meaning that instructions and data are stored in the same memory space. Programs are loaded into memory using Intel .mif (memory initialization) files and executed by a five-stage finite state machine. The processor is connected to the ascii master controller and the key buttons through memory-mapped I/O ports, allowing it to read input and print to the screen with a special write assembly routine.

2.2: File Functions

2.2.1: RISC_V.v

This file contains the main control loop of the processor and all of the logic for updating registers, interfacing with memory, and modifying the program counter.

There are 5 main stages to the processor implemented as a finite state machine.

  1. Fetch
  2. Decode
  3. Execute
  4. Read Mem
  5. Update

Fetch

The fetch step includes 2 states (FETCH, WAIT_FETCH) during which the memory is read from the address of the program counter and is stored in the instruction register (IR).

Decode

During the decode step the instruction decoder receives the data from the IR and outputs a list of control flags to be used during the update step. It also parses and sign extends any immediate values (literal values encoded into instructions).

    wire decode_error; // Throws the FSM into an error state if the opcode is not recognized
    wire [4:0] rd; // Contains the ID of the register to save the result
    wire [4:0] rs1; // Contains the ID of the register to use as the first operand to the ALU
    wire [4:0] rs2; // Contains the ID of the register to use as the second operand to the ALU
    wire rs1_use_pc; // Flag controls whether to replace the first operand with the program counter
    wire rs2_use_imm; // Flag controls whether to replace the second operand with an immediate
    wire [WORD_SIZE-1:0] immediate; // The parsed and sign extended immediate value
    wire [3:0] alu_op; // Controls which operation the ALU performs (Add, Sub, Shift Left, Shift Right, Xor, Or, And)
    wire [2:0] reg_load_size; // When loading from memory determines the size of the loaded value (8 bits, 16 bits, or 32 bits) and whether to sign extend it
    wire [1:0] mem_write_size; // Controls when to write to memory and how much data to write (8 bits, 16 bits, or 32 bits)
    wire mem_to_reg; // Controls whether to store the ALU result or the memory output into the destination register
    wire [2:0] branch_condition; // Controls which comparison is done for branching
    wire branch; // Flag controls whether to branch if the branch condition is met
    wire jump; // Flag controls whether to jump
    wire jal_or_jalr; // Flag controls whether to jump relative to the program counter or relative to register 1

Execute

During this step, operands are passed to the ALU according to the rs1_use_pc and rs2_use_imm flags.
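In other words, each flag acts as a two-input multiplexer in front of the ALU. A minimal Python sketch of that selection, with a hypothetical function name and a plain list standing in for the register file:

```python
def select_operands(regs: list[int], pc: int, rs1: int, rs2: int,
                    immediate: int, rs1_use_pc: bool,
                    rs2_use_imm: bool) -> tuple[int, int]:
    """Choose the two ALU operands according to the decoder flags."""
    operand_a = pc if rs1_use_pc else regs[rs1]
    operand_b = immediate if rs2_use_imm else regs[rs2]
    return operand_a, operand_b
```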

Read Memory

This step includes two states (MEM_ACCESS, WAIT_MEM_ACCESS) during which the memory address is set to the ALU output in preparation for the update step.

Update

This step includes three states (UPDATE, WAIT_UPDATE, CLEANUP_UPDATE). The UPDATE state includes combinational logic to write the correct values into the destination register, program counter, VGA controller, and memory. The WAIT_UPDATE state allows the byte-addressable write to memory to complete. This state may take multiple clock cycles and is only used when storing to memory. The CLEANUP_UPDATE state disables the write enable signals on the register file, VGA controller, and memory.
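Putting the five steps together, the order of states for one instruction might look like the Python sketch below. The state names DECODE and EXECUTE are placeholders (the names of those two states are not given above), and it assumes every instruction passes through the memory access states.

```python
def instruction_states(is_store: bool, store_wait_cycles: int = 1) -> list[str]:
    """List the FSM states visited while executing one instruction."""
    states = ["FETCH", "WAIT_FETCH", "DECODE", "EXECUTE",
              "MEM_ACCESS", "WAIT_MEM_ACCESS", "UPDATE"]
    if is_store:
        # WAIT_UPDATE is only entered for stores and may last several cycles.
        states += ["WAIT_UPDATE"] * store_wait_cycles
    states.append("CLEANUP_UPDATE")
    return states
```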

Debug States and Error Handling

The remaining states in the FSM are used for debugging (GET_REG, WAIT_REG, DISP_REG, DISP_BYTE, WAIT_BYTE, PRINT_DONE, INCREMENT_DISPLAY, INCREMENT_BYTE) and error handling (DECODE_ERROR, MEM_ERROR, FSM_ERROR). The debug states print the contents of each register to the VGA display, but dramatically increase the time spent on each instruction cycle.

2.2.2: instruction_decoder.v

Following the RISC-V ISA instruction formats, the instruction decoder takes a 32-bit instruction as input and outputs a set of control flags.

There are 6 types of instructions described by the ISA:

Bit positions of each field, by format (a field with blank columns to its right extends across them):

    Format           31:25                  24:20  19:15  14:12   11:7         6:0
    Register         funct7                 rs2    rs1    funct3  rd           opcode
    Immediate        imm[11:0]                     rs1    funct3  rd           opcode
    Store            imm[11:5]              rs2    rs1    funct3  imm[4:0]     opcode
    Branch           imm[12|10:5]           rs2    rs1    funct3  imm[4:1|11]  opcode
    Upper Immediate  imm[31:12]                                   rd           opcode
    Jump             imm[20|10:1|11|19:12]                        rd           opcode

Notably, the instruction decoder also parses and sign extends the various formats for the immediate values using the sign extender utility module.
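The immediate layouts differ per format, so reassembling them is mostly bit shuffling. The Python sketch below follows the RV32I field positions; the helper names (`field`, `sign_extend`, `decode_immediate`) and the single-letter format codes are hypothetical and do not come from instruction_decoder.v.

```python
def sign_extend(value: int, width: int) -> int:
    """Interpret the low `width` bits as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def field(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_immediate(instr: int, fmt: str) -> int:
    """Return the sign-extended immediate for format I, S, B, U, or J."""
    if fmt == "I":
        return sign_extend(field(instr, 31, 20), 12)
    if fmt == "S":
        return sign_extend((field(instr, 31, 25) << 5) | field(instr, 11, 7), 12)
    if fmt == "B":
        raw = ((field(instr, 31, 31) << 12) | (field(instr, 7, 7) << 11)
               | (field(instr, 30, 25) << 5) | (field(instr, 11, 8) << 1))
        return sign_extend(raw, 13)
    if fmt == "U":
        return sign_extend(field(instr, 31, 12) << 12, 32)
    if fmt == "J":
        raw = ((field(instr, 31, 31) << 20) | (field(instr, 19, 12) << 12)
               | (field(instr, 20, 20) << 11) | (field(instr, 30, 21) << 1))
        return sign_extend(raw, 21)
    raise ValueError(f"unknown format {fmt!r}")
```

For instance, `0xFFF00093` encodes `addi x1, x0, -1`, and `decode_immediate(0xFFF00093, "I")` yields -1.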

2.2.3: sign_extender.v

This is a simple utility module that sign-extends a given input. It is a parameterized module, so you can specify at compile time how many bits to sign-extend.
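Sign extension copies the top bit of the narrow input into all of the added upper bits. A small Python model of that operation, with the input and output widths as parameters (the function name and default output width are assumptions):

```python
def sign_extend_bits(value: int, in_width: int, out_width: int = 32) -> int:
    """Replicate the top input bit into the upper bits of the result."""
    value &= (1 << in_width) - 1
    sign = (value >> (in_width - 1)) & 1
    # Mask covering the bits between in_width and out_width.
    fill = ((1 << (out_width - in_width)) - 1) << in_width if sign else 0
    return value | fill
```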

2.2.4: register_file.v

This file contains the 32 general-purpose registers defined by the ISA. Its implementation is relatively simple: an array of 32 flip-flop registers, each storing 32 bits.

The RISC-V spec also gives recommendations for the usage of each of the registers.

    Register  ABI Name  Description                        Saver
    x0        zero      Hard-wired zero                    -
    x1        ra        Return address                     Caller
    x2        sp        Stack pointer                      Callee
    x3        gp        Global pointer                     -
    x4        tp        Thread pointer                     -
    x5        t0        Temporary/alternate link register  Caller
    x6–7      t1–2      Temporaries                        Caller
    x8        s0/fp     Saved register/frame pointer       Callee
    x9        s1        Saved register                     Callee
    x10–11    a0–1      Function arguments/return values   Caller
    x12–17    a2–7      Function arguments                 Caller
    x18–27    s2–11     Saved registers                    Callee
    x28–31    t3–6      Temporaries                        Caller
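A register file with a hard-wired zero register can be modeled in a few lines of Python. This is only a sketch: whether register_file.v enforces x0 on reads, on writes, or at all is not stated here, and the class name is hypothetical.

```python
class RegisterFile:
    """32 registers of 32 bits each; x0 always reads as zero."""

    def __init__(self) -> None:
        self.regs = [0] * 32

    def read(self, index: int) -> int:
        return 0 if index == 0 else self.regs[index]

    def write(self, index: int, value: int) -> None:
        if index != 0:                      # writes to x0 are discarded
            self.regs[index] = value & 0xFFFFFFFF
```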

2.2.5: ALU.v

The ALU is relatively simple: it takes two 32-bit operands and a 4-bit operation code as input and outputs a 32-bit result. The operations it can perform are Add, Sub, Shift Left, Shift Right Logical, Shift Right Arithmetic, Xor, Or, And, and Set Less Than (signed and unsigned).
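The listed operations can be modeled in Python as shown below. The mnemonic strings stand in for the 4-bit operation codes, whose actual encoding is not given here, and using only the low five bits of the second operand as the shift amount is an assumption taken from the RV32I shift instructions.

```python
MASK = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Reinterpret a 32-bit pattern as a signed integer."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value


def alu(op: str, a: int, b: int) -> int:
    """Compute one ALU operation on two 32-bit operands."""
    a &= MASK
    b &= MASK
    shamt = b & 0x1F                      # shift amount: low five bits of b
    results = {
        "ADD": a + b,
        "SUB": a - b,
        "SLL": a << shamt,
        "SRL": a >> shamt,                # zero-fills from the left
        "SRA": to_signed(a) >> shamt,     # copies the sign bit in from the left
        "XOR": a ^ b,
        "OR": a | b,
        "AND": a & b,
        "SLT": int(to_signed(a) < to_signed(b)),
        "SLTU": int(a < b),
    }
    return results[op] & MASK             # wrap the result to 32 bits
```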

2.2.6: byte_addressable.v

The byte-addressable memory module is a wrapper around the standard two-port altsyncram module. The RISC-V ISA specifies three sizes of load and store instructions: bytes, half-words, and words. We follow the 32-bit (rv32i) architecture, so our word size is 32 bits (and half-words are 16 bits). The ISA also specifies that half-words and words should be readable and writable regardless of alignment, not only at addresses divisible by 4. To allow quick access for all read instructions, the byte-addressable module alternates between two ports of different sizes on the altsyncram module, depending on which type of read operation is required. It also includes a small finite state machine which ensures that a half-word write does not overwrite the other half of the word it is writing into.
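The key requirement is that a narrow store changes only the addressed bytes. The Python model below illustrates this with a read-modify-write on 32-bit words. It is not a description of the altsyncram wrapper: the class name is hypothetical and little-endian byte order is assumed.

```python
class ByteAddressableMemory:
    """Array of 32-bit words with byte, half-word, and word access."""

    def __init__(self, words: int) -> None:
        self.words = [0] * words

    def _read_byte(self, address: int) -> int:
        return (self.words[address // 4] >> (8 * (address % 4))) & 0xFF

    def _write_byte(self, address: int, value: int) -> None:
        shift = 8 * (address % 4)
        word = self.words[address // 4]
        # Read-modify-write: clear one byte lane, keep the other three.
        word = (word & ~(0xFF << shift)) | ((value & 0xFF) << shift)
        self.words[address // 4] = word & 0xFFFFFFFF

    def store(self, address: int, value: int, size: int) -> None:
        """Store `size` bytes (1, 2, or 4) starting at a byte address."""
        for i in range(size):
            self._write_byte(address + i, value >> (8 * i))

    def load(self, address: int, size: int, signed: bool = False) -> int:
        """Load `size` bytes, optionally sign-extending the result."""
        value = sum(self._read_byte(address + i) << (8 * i) for i in range(size))
        if signed and value >> (8 * size - 1):
            value -= 1 << (8 * size)
        return value
```

Storing a half-word at byte address 2 of a word leaves bytes 0 and 1 untouched, and an access at address 3 simply spills into the next word.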

2.2.7: branch_condition.v

The branch condition module, similar to the ALU, performs comparisons to determine whether to take a branch or to continue with the normal flow of the program. It takes two 32-bit operands and a 3-bit operation as input and outputs a single bit that indicates whether to take the branch. It can perform six operations: Equals, Not Equals, Signed Less Than, Unsigned Less Than, Signed Greater Than or Equal, and Unsigned Greater Than or Equal.
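The six comparisons correspond to the RV32I branch instructions. A Python sketch, using the instruction mnemonics in place of the 3-bit operation code (whose encoding is not given here) and hypothetical function names:

```python
def to_signed32(value: int) -> int:
    """Reinterpret a 32-bit pattern as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def branch_taken(op: str, a: int, b: int) -> bool:
    """Return True if the branch condition holds for operands a and b."""
    ua, ub = a & 0xFFFFFFFF, b & 0xFFFFFFFF
    sa, sb = to_signed32(a), to_signed32(b)
    conditions = {
        "BEQ": ua == ub,
        "BNE": ua != ub,
        "BLT": sa < sb,
        "BGE": sa >= sb,
        "BLTU": ua < ub,
        "BGEU": ua >= ub,
    }
    return conditions[op]
```

The signed and unsigned variants differ only when the top bit is set: 0xFFFFFFFF is -1 for BLT but the largest possible value for BLTU.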

2.2.8: I/O Files

There are a few miscellaneous files in the IO folder which are used for debugging. These include a seven-segment display module and a hex display module. During testing, a few switches were used to toggle what was displayed on the seven-segment displays.

2.3.1: Memory Mapped IO Ports

There are many ways to interface a processor with external devices; we chose memory-mapped IO ports. Our processor memory has an address space of 17 bits, which doesn't use the full 32 bits of a register. With this in mind, we decided to connect our VGA module to the address space above the memory. We used a similar technique for reading the value of the key buttons: a load instruction with a very high address.
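Address decoding for such a scheme can be sketched in Python as follows. Every constant except the 17-bit memory size is a made-up placeholder: the real VGA base address, its extent, and the key button address are not given here, and byte addressing is assumed.

```python
MEM_ADDR_BITS = 17
MEM_SIZE = 1 << MEM_ADDR_BITS     # addresses 0x00000 to 0x1FFFF hit memory
VGA_BASE = MEM_SIZE               # hypothetical: VGA starts right above memory
VGA_WORDS = 4800                  # one 32-bit word per character cell
KEY_ADDRESS = 0xFFFFFFFC          # hypothetical "very high" address


def route(address: int) -> str:
    """Decide which device a load or store at `address` is directed to."""
    address &= 0xFFFFFFFF
    if address < MEM_SIZE:
        return "memory"
    if VGA_BASE <= address < VGA_BASE + 4 * VGA_WORDS:
        return "vga"
    if address == KEY_ADDRESS:
        return "keys"
    return "unmapped"
```

Because the devices sit outside the 17-bit range, ordinary load and store instructions reach them without any dedicated I/O instructions.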