Tutorial 15 · Capstone Project

Verilog Capstone: UART Receiver + FIFO

This is the series finale. You will build a complete, synthesizable RTL system from scratch — a UART receiver FSM that deserializes incoming serial bytes, a synchronous FIFO that buffers them, a top-level module that wires everything together, and a self-checking testbench that verifies end-to-end behavior. Every concept from the previous 14 tutorials meets here.

Builds on: FSM (T13), Parameters (T08), Tasks (T10), Testbench (T09), System Tasks (T11), RTL Patterns (T14), Blocking/Non-Blocking (T06), generate (T08)
Covers: UART RX FSM, sync FIFO, top-level integration, self-checking testbench, Icarus Verilog, GTKWave

[Figure: UART receiver + FIFO system block diagram. Serial input rx → UART RX FSM (IDLE→START→DATA→STOP, CLK_PER_BIT oversampling, 8-bit shift register) → data[7:0] + valid → Sync FIFO (DEPTH=16, WIDTH=8, MSB pointer trick, full / empty flags) → rd_data + empty → CPU / Host. clk / rst_n go to both modules; the enclosing module is uart_fifo_top.v.]

1. System Architecture

The system has three RTL modules and one testbench:

Module | File | Role
UART Receiver | uart_rx.v | Deserializes 8N1 serial data; outputs byte + valid pulse
Synchronous FIFO | sync_fifo.v | Buffers received bytes; full/empty flags
Top Level | uart_fifo_top.v | Wires UART valid → FIFO wr_en; exposes FIFO read port
Testbench | tb_uart_fifo.v | Sends bytes serially; reads FIFO; self-checks

The UART RX uses 8N1 framing: 1 start bit (low), 8 data bits (LSB first), 1 stop bit (high). The FIFO's wr_en is driven directly by UART's valid pulse — one byte written per received frame.
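
To make the 8N1 bit order concrete, here is a minimal simulation-only sketch that prints one frame in the order the bits appear on the wire. The module name frame_8n1_demo and the frame register are hypothetical and are not part of the project files.

```verilog
`timescale 1ns/1ps
// Illustrative only: prints the 10 bits of one 8N1 frame in wire order.
module frame_8n1_demo;
  reg [7:0] byte_val;
  reg [9:0] frame;      // {stop bit, data[7:0], start bit}
  integer   i;

  initial begin
    byte_val = 8'hA5;
    // Bit 0 of frame goes out first, so the start bit sits at index 0
    // and the data follows LSB first, with the stop bit last.
    frame = {1'b1, byte_val, 1'b0};
    for (i = 0; i < 10; i = i + 1)
      $display("bit %0d on the wire: %b", i, frame[i]);
    $finish;
  end
endmodule
```

For 8'hA5 (binary 1010_0101) the wire order is 0 (start), then 1 0 1 0 0 1 0 1 (data, LSB first), then 1 (stop).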

2. File Structure

uart_fifo_project/
  uart_rx.v ← UART receiver FSM
  sync_fifo.v ← Synchronous FIFO
  uart_fifo_top.v ← Top-level integration
  tb_uart_fifo.v ← Self-checking testbench
  wave.vcd ← Generated by simulation

3. Module 1: uart_rx.v

Four-state FSM: IDLE → START → DATA → STOP. The valid output pulses high for exactly one clock cycle after a complete byte is received with a valid stop bit.

`timescale 1ns/1ps
module uart_rx #(
  parameter CLK_PER_BIT = 868   // 100 MHz / 115200 baud ≈ 868
)(
  input               clk, rst_n,
  input               rx,        // serial input (idle = 1)
  output reg [7:0]   data,      // parallel byte output
  output reg          valid      // 1-cycle pulse when byte ready
);
  localparam IDLE  = 2'd0,
             START = 2'd1,
             DATA  = 2'd2,
             STOP  = 2'd3;

  reg [1:0]  state;
  reg [15:0] clk_cnt;
  reg [2:0]  bit_cnt;
  reg [7:0]  shift;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      state   <= IDLE; clk_cnt <= 0;
      bit_cnt <= 0;    valid   <= 0;
    end else begin
      valid <= 0;           // default: deassert every cycle
      case (state)

        IDLE:
          if (!rx) begin        // rx low while idle marks the start bit
            state   <= START;
            clk_cnt <= 0;
          end

        START:                   // wait half-bit to sample center
          if (clk_cnt == CLK_PER_BIT/2 - 1) begin
            clk_cnt <= 0;
            bit_cnt <= 0;
            state   <= DATA;
          end else clk_cnt <= clk_cnt + 1;

        DATA:                    // sample 8 bits, LSB first
          if (clk_cnt == CLK_PER_BIT - 1) begin
            shift   <= {rx, shift[7:1]};
            clk_cnt <= 0;
            if (bit_cnt == 3'd7) state <= STOP;
            else                   bit_cnt <= bit_cnt + 1;
          end else clk_cnt <= clk_cnt + 1;

        STOP:                    // validate stop bit
          if (clk_cnt == CLK_PER_BIT - 1) begin
            if (rx) begin         // stop bit must be high
              data  <= shift;
              valid <= 1;
            end
            state   <= IDLE;
            clk_cnt <= 0;
          end else clk_cnt <= clk_cnt + 1;

        default: state <= IDLE;
      endcase
    end
  end
endmodule
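
The listing above samples rx directly. On real hardware the serial line is usually asynchronous to clk, and a common precaution is to pass it through two flip-flops first. The following is a minimal sketch of that idea, not part of the project files; the module name rx_sync2ff and its port names are hypothetical.

```verilog
`timescale 1ns/1ps
// Illustrative only: two-flop synchronizer for an asynchronous serial input.
module rx_sync2ff (
  input  clk, rst_n,
  input  rx_async,      // raw pin, may change at any time
  output rx_synced      // safe to use inside the clk domain
);
  reg [1:0] sync_ff;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) sync_ff <= 2'b11;                  // reset to the idle (high) level
    else        sync_ff <= {sync_ff[0], rx_async}; // shift the new sample in
  end

  assign rx_synced = sync_ff[1];
endmodule
```

If used, rx_synced would drive the rx port of uart_rx. The two-cycle delay it adds is small compared with a bit period of CLK_PER_BIT cycles, so the mid-bit sampling point barely moves.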

4. Module 2: sync_fifo.v

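Parameterized FIFO using the extra-MSB pointer trick: each pointer is one bit wider than the memory address, so full and empty stay unambiguous when the pointers wrap around. Equal pointers mean empty; equal address bits with different MSBs mean full. rd_data is a combinational read of the head entry, so the oldest byte is visible before rd_en is asserted, and rd_en only advances the read pointer.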

`timescale 1ns/1ps
module sync_fifo #(
  parameter WIDTH = 8,
  parameter DEPTH = 16    // must be a power of 2 (and at least 2)
)(
  input               clk, rst_n,
  input               wr_en, rd_en,
  input  [WIDTH-1:0]  wr_data,
  output [WIDTH-1:0]  rd_data,
  output              full, empty,
  output [$clog2(DEPTH):0] count  // occupancy
);
  localparam PTR_W = $clog2(DEPTH);

  reg [WIDTH-1:0] mem [0:DEPTH-1];
  reg [PTR_W:0]   wr_ptr, rd_ptr;

  assign empty   = (wr_ptr == rd_ptr);
  assign full    = (wr_ptr[PTR_W] != rd_ptr[PTR_W]) &&
                    (wr_ptr[PTR_W-1:0] == rd_ptr[PTR_W-1:0]);
  assign rd_data = mem[rd_ptr[PTR_W-1:0]];
  assign count   = wr_ptr - rd_ptr;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      wr_ptr <= 0; rd_ptr <= 0;
    end else begin
      if (wr_en && !full) begin
        mem[wr_ptr[PTR_W-1:0]] <= wr_data;
        wr_ptr <= wr_ptr + 1;
      end
      if (rd_en && !empty)
        rd_ptr <= rd_ptr + 1;
    end
  end
endmodule
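
The pointer comparison can be checked by hand with a smaller FIFO. This simulation-only sketch assumes DEPTH = 4 (2 address bits, 3-bit pointers) and evaluates the same empty, full and count expressions as sync_fifo for three pointer pairs. The module name ptr_trick_demo is hypothetical.

```verilog
`timescale 1ns/1ps
// Illustrative only: the extra-MSB pointer trick for DEPTH = 4.
module ptr_trick_demo;
  localparam PTR_W = 2;                // address width for DEPTH = 4
  reg  [PTR_W:0] wr_ptr, rd_ptr;       // one extra bit tracks wrap-around

  wire empty = (wr_ptr == rd_ptr);
  wire full  = (wr_ptr[PTR_W] != rd_ptr[PTR_W]) &&
               (wr_ptr[PTR_W-1:0] == rd_ptr[PTR_W-1:0]);
  wire [PTR_W:0] count = wr_ptr - rd_ptr;   // modular subtraction

  task show;
    begin
      #1;  // let the continuous assignments settle
      $display("wr=%b rd=%b empty=%b full=%b count=%0d",
               wr_ptr, rd_ptr, empty, full, count);
    end
  endtask

  initial begin
    wr_ptr = 3'b000; rd_ptr = 3'b000; show;  // nothing written yet
    wr_ptr = 3'b100; rd_ptr = 3'b000; show;  // 4 writes, 0 reads
    wr_ptr = 3'b001; rd_ptr = 3'b110; show;  // 9 writes, 6 reads
    $finish;
  end
endmodule
```

The three cases give empty (count 0), full (count 4), and neither flag with count 3. The last case shows that the subtraction stays correct after the write pointer has wrapped past the read pointer.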

5. Module 3: uart_fifo_top.v

The top-level module is thin — it exists purely to wire the UART's valid signal to the FIFO's wr_en and expose the FIFO's read interface to the outside world.

`timescale 1ns/1ps
module uart_fifo_top #(
  parameter CLK_PER_BIT = 868,
  parameter FIFO_DEPTH  = 16
)(
  input        clk, rst_n,
  input        rx,           // serial input
  // FIFO read port (host reads received bytes)
  input        rd_en,
  output [7:0] rd_data,
  output       fifo_empty,
  output       fifo_full
);
  wire [7:0] uart_data;
  wire        uart_valid;

  uart_rx #(.CLK_PER_BIT(CLK_PER_BIT)) u_rx (
    .clk   (clk),   .rst_n(rst_n),
    .rx    (rx),
    .data  (uart_data),
    .valid (uart_valid)
  );

  sync_fifo #(.WIDTH(8), .DEPTH(FIFO_DEPTH)) u_fifo (
    .clk    (clk),          .rst_n  (rst_n),
    .wr_en  (uart_valid),   // write when UART has a byte
    .wr_data(uart_data),
    .rd_en  (rd_en),        .rd_data(rd_data),
    .full   (fifo_full),   .empty  (fifo_empty),
    .count  ()               // unused at top; available internally
  );
endmodule
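
On the host side, the read port can be drained by any logic that asserts rd_en while fifo_empty is low. The sketch below is one possible consumer, not part of the project files; the module name host_reader and the outputs last_byte and byte_strobe are hypothetical. It relies on rd_data being a combinational view of the head entry, as in sync_fifo.

```verilog
`timescale 1ns/1ps
// Illustrative only: pops one byte per clock whenever the FIFO is not empty.
module host_reader (
  input            clk, rst_n,
  input            fifo_empty,
  input      [7:0] rd_data,      // head entry, valid while fifo_empty is low
  output           rd_en,
  output reg [7:0] last_byte,    // most recently popped byte
  output reg       byte_strobe   // 1-cycle pulse: last_byte was just updated
);
  assign rd_en = !fifo_empty;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      last_byte   <= 8'h00;
      byte_strobe <= 1'b0;
    end else begin
      byte_strobe <= rd_en;
      // Capture the head on the same edge that advances the read pointer.
      if (rd_en) last_byte <= rd_data;
    end
  end
endmodule
```

The capture and the pop happen on the same clock edge, so each byte is taken exactly once.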

6. Testbench: tb_uart_fifo.v

The testbench sends 4 known bytes over the serial line, waits for the last byte to be written into the FIFO, then reads and verifies each byte in order.

`timescale 1ns/1ps
module tb_uart_fifo;
  // ---- parameters match DUT ----
  localparam CLK_PER_BIT = 10;   // fast sim: 10 cycles/bit
  localparam BIT_PERIOD  = CLK_PER_BIT;   // bit length in clock cycles (x10 ns in the delays below)

  reg        clk, rst_n, rx, rd_en;
  wire [7:0] rd_data;
  wire       fifo_empty, fifo_full;
  integer    pass_cnt, fail_cnt;

  // DUT
  uart_fifo_top #(.CLK_PER_BIT(CLK_PER_BIT), .FIFO_DEPTH(16)) dut (.*);

  // 100 MHz clock
  initial clk = 0;
  always #5 clk = ~clk;

  // ---- task: send one byte over serial line ----
  task automatic send_byte;
    input [7:0] byte_val;
    integer i;
    begin
      // start bit
      rx = 0; #(BIT_PERIOD * 10);  // 10 ns/cycle × 10 cycles
      // 8 data bits (LSB first)
      for (i=0; i<8; i++) begin
        rx = byte_val[i];
        #(BIT_PERIOD * 10);
      end
      // stop bit
      rx = 1; #(BIT_PERIOD * 10);
    end
  endtask

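  // ---- task: check the FIFO head, then pop it ----
  // rd_data is a combinational view of the head entry, so it must be
  // compared before rd_en advances the read pointer.
  task automatic check_byte;
    input [7:0] expected;
    begin
      @(negedge clk);               // sample away from the active clock edge
      if (rd_data === expected) begin
        $display("PASS: got 8'h%02h", rd_data);
        pass_cnt++;
      end else begin
        $error("FAIL: expected 8'h%02h got 8'h%02h", expected, rd_data);
        fail_cnt++;
      end
      rd_en = 1;                    // pop exactly one entry
      @(negedge clk);
      rd_en = 0;
    end
  endtask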

  initial begin
    $dumpfile("wave.vcd");
    $dumpvars(0, tb_uart_fifo);

    pass_cnt=0; fail_cnt=0;
    rx=1; rd_en=0; rst_n=0;
    repeat(4) @(posedge clk);
    rst_n=1;
    repeat(2) @(posedge clk);

    // Send 4 bytes
    send_byte(8'hA5);
    send_byte(8'h3C);
    send_byte(8'hFF);
    send_byte(8'h00);

    // Wait for last byte to settle in FIFO
    repeat(20) @(posedge clk);

    // Check FIFO not empty
    if (fifo_empty) $error("FIFO unexpectedly empty after sends");

    // Read and verify all 4 bytes
    check_byte(8'hA5);
    check_byte(8'h3C);
    check_byte(8'hFF);
    check_byte(8'h00);

    if (!fifo_empty) $error("FIFO not empty after all reads");

    $display("--- %0d passed, %0d failed ---", pass_cnt, fail_cnt);
    $finish;
  end
endmodule

7. Simulate with Icarus Verilog

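# Compile all four files (-g2012 enables the SystemVerilog shorthand the
# testbench uses: .* port connections, ++, and $error)
iverilog -g2012 -o sim uart_rx.v sync_fifo.v uart_fifo_top.v tb_uart_fifo.v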

# Run simulation
vvp sim

Expected terminal output:

PASS: got 8'ha5
PASS: got 8'h3c
PASS: got 8'hff
PASS: got 8'h00
--- 4 passed, 0 failed ---

CLK_PER_BIT for simulation: The testbench uses CLK_PER_BIT=10 to keep simulation fast. In real hardware targeting 115200 baud at 100 MHz, set CLK_PER_BIT=868. Both the DUT and testbench must use the same value.
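
The 868 figure comes from dividing the clock frequency by the baud rate. As a minimal sketch, the divider can be derived from the two rates rather than typed in by hand. The names CLK_HZ and BAUD are hypothetical; the project modules only take CLK_PER_BIT.

```verilog
`timescale 1ns/1ps
// Illustrative only: derive the bit divider from clock and baud rates.
module baud_div_demo;
  localparam integer CLK_HZ      = 100_000_000;  // assumed system clock
  localparam integer BAUD        = 115_200;      // assumed line rate
  // Adding BAUD/2 before dividing rounds to the nearest integer.
  localparam integer CLK_PER_BIT = (CLK_HZ + BAUD/2) / BAUD;

  initial begin
    $display("CLK_PER_BIT = %0d", CLK_PER_BIT);
    $finish;
  end
endmodule
```

With these values the exact ratio is about 868.06, so the rounded divider is 868 and the timing error per bit is well under 0.01%.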

8. View Waveform in GTKWave

gtkwave wave.vcd

Signals to add in GTKWave for debugging:

Signal | What to look for
tb_uart_fifo.rx | Serial waveform: start bit low, 8 data bits, stop bit high
dut.u_rx.state | FSM transitions: IDLE → START → DATA (×8) → STOP → IDLE
dut.u_rx.valid | 1-cycle pulse after each byte; triggers FIFO write
dut.u_rx.data | Parallel byte output; should match sent values
dut.u_fifo.wr_ptr | Increments by 1 on each valid pulse
dut.u_fifo.rd_ptr | Increments by 1 on each rd_en
fifo_empty | High at start, low after first byte, high again after all reads

9. How to Extend This Project

Extension | What to add
UART TX | Mirror FSM: load parallel byte → serialize with start/stop bits → drive tx pin
Parity checking | Add parity accumulator in DATA state; compare against parity bit in STOP state
Baud rate auto-detect | Measure start-bit duration to calibrate CLK_PER_BIT dynamically
Async FIFO (dual-clock) | Replace sync_fifo with async FIFO using Gray-coded pointers + 2-FF synchronizers
AXI-S interface | Replace rd_en/rd_data with valid-ready handshake to connect to an AXI4-Stream bus
Interrupt output | Assert an IRQ line when FIFO occupancy exceeds a threshold parameter
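
As a starting point for the interrupt extension, here is a minimal sketch of a level-sensitive IRQ derived from the FIFO's count output. The module name fifo_irq and the THRESHOLD parameter are hypothetical, and uart_fifo_top would need to connect count (currently left open) for this to work.

```verilog
`timescale 1ns/1ps
// Illustrative only: raise irq while FIFO occupancy exceeds THRESHOLD.
module fifo_irq #(
  parameter DEPTH     = 16,   // must match the FIFO's DEPTH
  parameter THRESHOLD = 8     // assumed default: half full
)(
  input  [$clog2(DEPTH):0] count,   // occupancy from sync_fifo
  output                   irq
);
  // Level-sensitive: irq drops again once the host drains the FIFO
  // down to THRESHOLD entries or fewer.
  assign irq = (count > THRESHOLD);
endmodule
```

A level-sensitive output keeps the sketch stateless. An edge-triggered or latched interrupt would need a register and a clear mechanism on top of this.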