This is the series finale. You will build a complete, synthesizable RTL system from scratch — a UART receiver FSM that deserializes incoming serial bytes, a synchronous FIFO that buffers them, a top-level module that wires everything together, and a self-checking testbench that verifies end-to-end behavior. Every concept from the previous 14 tutorials meets here.
The system has three RTL modules and one testbench:
| Module | File | Role |
|---|---|---|
| UART Receiver | uart_rx.v | Deserializes 8N1 serial data; outputs byte + valid pulse |
| Synchronous FIFO | sync_fifo.v | Buffers received bytes; full/empty flags |
| Top Level | uart_fifo_top.v | Wires UART valid → FIFO wr_en; exposes FIFO read port |
| Testbench | tb_uart_fifo.v | Sends bytes serially; reads FIFO; self-checks |
The UART RX uses 8N1 framing: 1 start bit (low), 8 data bits (LSB first), 1 stop bit (high). The FIFO's wr_en is driven directly by UART's valid pulse — one byte written per received frame.
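The 8N1 frame layout described above can be sketched as a small simulation-only module. The names frame_demo and make_frame are hypothetical and not part of the tutorial's design; the sketch only prints the line level for each of the ten bit slots in transmission order.

```verilog
`timescale 1ns/1ps
// Sketch only: frame_demo and make_frame are hypothetical names for illustration.
module frame_demo;
    // Build a 10-bit 8N1 frame. Bit 0 of the result goes on the wire first.
    function [9:0] make_frame;
        input [7:0] data;
        begin
            make_frame = {1'b1, data, 1'b0};  // {stop, data[7:0], start}
        end
    endfunction

    reg [9:0] frame;
    integer   i;

    initial begin
        frame = make_frame(8'hA5);
        // Print the line level for each bit slot in transmission order
        for (i = 0; i < 10; i = i + 1)
            $display("slot %0d: rx = %b", i, frame[i]);
        $finish;
    end
endmodule
```

For 8'hA5 (binary 1010_0101) the wire carries the start bit 0, then the data bits 1 0 1 0 0 1 0 1 (LSB first), then the stop bit 1.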
Four-state FSM: IDLE → START → DATA → STOP. The valid output pulses high for exactly one clock cycle after a complete byte is received with a valid stop bit.

    `timescale 1ns/1ps

    module uart_rx #(
        parameter CLK_PER_BIT = 868   // 100 MHz / 115200 baud ≈ 868
    )(
        input            clk, rst_n,
        input            rx,          // serial input (idle = 1)
        output reg [7:0] data,        // parallel byte output
        output reg       valid        // 1-cycle pulse when byte ready
    );
        localparam IDLE = 2'd0, START = 2'd1, DATA = 2'd2, STOP = 2'd3;

        reg [1:0]  state;
        reg [15:0] clk_cnt;
        reg [2:0]  bit_cnt;
        reg [7:0]  shift;

        always @(posedge clk or negedge rst_n) begin
            if (!rst_n) begin
                state   <= IDLE;
                clk_cnt <= 0;
                bit_cnt <= 0;
                valid   <= 0;
            end else begin
                valid <= 0;  // default: deassert every cycle
                case (state)
                    IDLE:
                        if (!rx) begin  // falling edge = start bit
                            state   <= START;
                            clk_cnt <= 0;
                        end
                    START:  // wait half-bit to sample center
                        if (clk_cnt == CLK_PER_BIT/2 - 1) begin
                            clk_cnt <= 0;
                            bit_cnt <= 0;
                            state   <= DATA;
                        end else
                            clk_cnt <= clk_cnt + 1;
                    DATA:   // sample 8 bits, LSB first
                        if (clk_cnt == CLK_PER_BIT - 1) begin
                            shift   <= {rx, shift[7:1]};
                            clk_cnt <= 0;
                            if (bit_cnt == 3'd7)
                                state <= STOP;
                            else
                                bit_cnt <= bit_cnt + 1;
                        end else
                            clk_cnt <= clk_cnt + 1;
                    STOP:   // validate stop bit
                        if (clk_cnt == CLK_PER_BIT - 1) begin
                            if (rx) begin  // stop bit must be high
                                data  <= shift;
                                valid <= 1;
                            end
                            state   <= IDLE;
                            clk_cnt <= 0;
                        end else
                            clk_cnt <= clk_cnt + 1;
                    default: state <= IDLE;
                endcase
            end
        end
    endmodule

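The receiver above samples rx directly. On real hardware the rx pin is asynchronous to clk, so a two-flop synchronizer is commonly placed in front of the FSM. The following is a minimal sketch of that idea, assuming a hypothetical helper module rx_sync that is not one of the tutorial's files.

```verilog
`timescale 1ns/1ps
// Sketch only: rx_sync is a hypothetical helper, not one of the tutorial files.
module rx_sync (
    input  clk, rst_n,
    input  rx_async,   // raw pin, asynchronous to clk
    output rx_clean    // intended to drive uart_rx.rx
);
    reg [1:0] sync_ff;

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            sync_ff <= 2'b11;                   // a UART line idles high
        else
            sync_ff <= {sync_ff[0], rx_async};  // shift through two flops
    end

    assign rx_clean = sync_ff[1];
endmodule
```

The two extra cycles of latency are small compared with a bit period of 868 cycles, so the mid-bit sampling point barely moves.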
Parameterized FIFO using the extra-MSB pointer trick so full and empty are unambiguous when pointers wrap around.

    `timescale 1ns/1ps

    module sync_fifo #(
        parameter WIDTH = 8,
        parameter DEPTH = 16   // must be a power of 2
    )(
        input                    clk, rst_n,
        input                    wr_en, rd_en,
        input  [WIDTH-1:0]       wr_data,
        output [WIDTH-1:0]       rd_data,
        output                   full, empty,
        output [$clog2(DEPTH):0] count   // occupancy
    );
        localparam PTR_W = $clog2(DEPTH);

        reg [WIDTH-1:0] mem [0:DEPTH-1];
        reg [PTR_W:0]   wr_ptr, rd_ptr;

        assign empty   = (wr_ptr == rd_ptr);
        assign full    = (wr_ptr[PTR_W] != rd_ptr[PTR_W]) &&
                         (wr_ptr[PTR_W-1:0] == rd_ptr[PTR_W-1:0]);
        assign rd_data = mem[rd_ptr[PTR_W-1:0]];
        assign count   = wr_ptr - rd_ptr;

        always @(posedge clk or negedge rst_n) begin
            if (!rst_n) begin
                wr_ptr <= 0;
                rd_ptr <= 0;
            end else begin
                if (wr_en && !full) begin
                    mem[wr_ptr[PTR_W-1:0]] <= wr_data;
                    wr_ptr <= wr_ptr + 1;
                end
                if (rd_en && !empty)
                    rd_ptr <= rd_ptr + 1;
            end
        end
    endmodule

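To make the extra-MSB pointer trick concrete, here is a small simulation-only sketch that evaluates the same empty, full, and count expressions for a few hand-picked pointer values. It assumes DEPTH = 4 (so PTR_W = 2) for readability; ptr_demo and show are hypothetical names.

```verilog
`timescale 1ns/1ps
// Sketch only: ptr_demo is a hypothetical module for illustration, DEPTH = 4.
module ptr_demo;
    localparam PTR_W = 2;            // $clog2(4)
    reg [PTR_W:0] wr_ptr, rd_ptr;    // 3 bits: 1 wrap bit + 2 address bits

    // Same expressions as in sync_fifo
    wire empty = (wr_ptr == rd_ptr);
    wire full  = (wr_ptr[PTR_W] != rd_ptr[PTR_W]) &&
                 (wr_ptr[PTR_W-1:0] == rd_ptr[PTR_W-1:0]);
    wire [PTR_W:0] count = wr_ptr - rd_ptr;

    // Let the continuous assignments settle, then print the flags
    task show;
        begin
            #1 $display("wr=%b rd=%b empty=%b full=%b count=%0d",
                        wr_ptr, rd_ptr, empty, full, count);
        end
    endtask

    initial begin
        wr_ptr = 3'b000; rd_ptr = 3'b000; show;  // identical pointers: empty
        wr_ptr = 3'b100; rd_ptr = 3'b000; show;  // same address, wrap bits differ: full
        wr_ptr = 3'b101; rd_ptr = 3'b011; show;  // write pointer has wrapped, 2 entries stored
        $finish;
    end
endmodule
```

Without the extra bit, the first two cases would both have equal address bits and could not be told apart.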
The top-level module is thin — it exists purely to wire the UART's valid signal to the FIFO's wr_en and expose the FIFO's read interface to the outside world.

    `timescale 1ns/1ps

    module uart_fifo_top #(
        parameter CLK_PER_BIT = 868,
        parameter FIFO_DEPTH  = 16
    )(
        input        clk, rst_n,
        input        rx,          // serial input
        // FIFO read port (host reads received bytes)
        input        rd_en,
        output [7:0] rd_data,
        output       fifo_empty,
        output       fifo_full
    );
        wire [7:0] uart_data;
        wire       uart_valid;

        uart_rx #(.CLK_PER_BIT(CLK_PER_BIT)) u_rx (
            .clk  (clk),
            .rst_n(rst_n),
            .rx   (rx),
            .data (uart_data),
            .valid(uart_valid)
        );

        sync_fifo #(.WIDTH(8), .DEPTH(FIFO_DEPTH)) u_fifo (
            .clk    (clk),
            .rst_n  (rst_n),
            .wr_en  (uart_valid),  // write when UART has a byte
            .wr_data(uart_data),
            .rd_en  (rd_en),
            .rd_data(rd_data),
            .full   (fifo_full),
            .empty  (fifo_empty),
            .count  ()             // unused at top; available internally
        );
    endmodule

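The default CLK_PER_BIT = 868 is the clock frequency divided by the baud rate. One way to avoid hard-coding it is to derive it from two parameters, as in the sketch below. CLK_HZ and BAUD are hypothetical parameters that do not exist in the tutorial's modules, and baud_calc_demo is a throwaway simulation module.

```verilog
`timescale 1ns/1ps
// Sketch only: CLK_HZ and BAUD are hypothetical parameters for illustration.
module baud_calc_demo #(
    parameter CLK_HZ = 100_000_000,
    parameter BAUD   = 115_200
);
    // Integer division, rounded to the nearest cycle count instead of truncated
    localparam CLK_PER_BIT = (CLK_HZ + BAUD/2) / BAUD;

    initial begin
        $display("CLK_PER_BIT = %0d", CLK_PER_BIT);  // 868 for the defaults above
        $finish;
    end
endmodule
```

Any derived value must still fit the receiver's 16-bit clk_cnt register, which holds counts up to 65535.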
The testbench sends 4 known bytes over the serial line, waits for the last byte to land in the FIFO, then reads and verifies each byte in order.

    `timescale 1ns/1ps

    module tb_uart_fifo;
        // ---- parameters match DUT ----
        localparam CLK_PER_BIT = 10;           // fast sim: 10 cycles/bit
        localparam BIT_PERIOD  = CLK_PER_BIT;

        reg        clk, rst_n, rx, rd_en;
        wire [7:0] rd_data;
        wire       fifo_empty, fifo_full;
        integer    pass_cnt, fail_cnt;

        // DUT, connected by name so the file stays plain Verilog
        uart_fifo_top #(.CLK_PER_BIT(CLK_PER_BIT), .FIFO_DEPTH(16)) dut (
            .clk       (clk),
            .rst_n     (rst_n),
            .rx        (rx),
            .rd_en     (rd_en),
            .rd_data   (rd_data),
            .fifo_empty(fifo_empty),
            .fifo_full (fifo_full)
        );

        // 100 MHz clock
        initial clk = 0;
        always #5 clk = ~clk;

        // ---- task: send one byte over serial line ----
        task automatic send_byte;
            input [7:0] byte_val;
            integer i;
            begin
                // start bit
                rx = 0;
                #(BIT_PERIOD * 10);  // 10 ns/cycle × 10 cycles
                // 8 data bits (LSB first)
                for (i = 0; i < 8; i = i + 1) begin
                    rx = byte_val[i];
                    #(BIT_PERIOD * 10);
                end
                // stop bit
                rx = 1;
                #(BIT_PERIOD * 10);
            end
        endtask

        // ---- task: read + check one FIFO entry ----
        task automatic check_byte;
            input [7:0] expected;
            reg   [7:0] got;
            begin
                @(negedge clk);      // drive rd_en away from the sampling edge
                got   = rd_data;     // rd_data is combinational: capture the head before the pop
                rd_en = 1;
                @(posedge clk); #1;  // this edge advances rd_ptr to the next entry
                rd_en = 0;
                if (got === expected) begin
                    $display("PASS: got 8'h%02h", got);
                    pass_cnt = pass_cnt + 1;
                end else begin
                    $display("FAIL: expected 8'h%02h got 8'h%02h", expected, got);
                    fail_cnt = fail_cnt + 1;
                end
            end
        endtask

        initial begin
            $dumpfile("wave.vcd");
            $dumpvars(0, tb_uart_fifo);
            pass_cnt = 0; fail_cnt = 0;
            rx = 1; rd_en = 0; rst_n = 0;
            repeat (4) @(posedge clk);
            rst_n = 1;
            repeat (2) @(posedge clk);

            // Send 4 bytes
            send_byte(8'hA5);
            send_byte(8'h3C);
            send_byte(8'hFF);
            send_byte(8'h00);

            // Wait for last byte to settle in FIFO
            repeat (20) @(posedge clk);

            // Check FIFO not empty
            if (fifo_empty) $display("ERROR: FIFO unexpectedly empty after sends");

            // Read and verify all 4 bytes
            check_byte(8'hA5);
            check_byte(8'h3C);
            check_byte(8'hFF);
            check_byte(8'h00);

            if (!fifo_empty) $display("ERROR: FIFO not empty after all reads");

            $display("--- %0d passed, %0d failed ---", pass_cnt, fail_cnt);
            $finish;
        end
    endmodule


    # Compile all four files
    iverilog -o sim uart_rx.v sync_fifo.v uart_fifo_top.v tb_uart_fifo.v

    # Run simulation
    vvp sim

Expected terminal output:

    PASS: got 8'ha5
    PASS: got 8'h3c
    PASS: got 8'hff
    PASS: got 8'h00
    --- 4 passed, 0 failed ---

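Note: the testbench sets CLK_PER_BIT=10 to keep simulation fast. In real hardware targeting 115200 baud at 100 MHz, set CLK_PER_BIT=868. The DUT parameter and the testbench's bit timing must use the same value.
To inspect the waveforms, open the VCD dump in GTKWave:

    gtkwave wave.vcd
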
Signals to add in GTKWave for debugging:
| Signal | What to look for |
|---|---|
| tb_uart_fifo.rx | Serial waveform: start bit low, 8 data bits, stop bit high |
| dut.u_rx.state | FSM transitions: IDLE → START → DATA (×8) → STOP → IDLE |
| dut.u_rx.valid | 1-cycle pulse after each byte; triggers the FIFO write |
| dut.u_rx.data | Parallel byte output; should match the sent values |
| dut.u_fifo.wr_ptr | Increments by 1 on each valid pulse (while the FIFO is not full) |
| dut.u_fifo.rd_ptr | Increments by 1 on each clock edge where rd_en is high and the FIFO is not empty |
| fifo_empty | High at start, low after the first byte, high again after all reads |
| Extension | What to add |
|---|---|
| UART TX | Mirror FSM: load parallel byte → serialize with start/stop bits → drive tx pin |
| Parity checking | Add parity accumulator in DATA state; compare against parity bit in STOP state |
| Baud rate auto-detect | Measure start-bit duration to calibrate CLK_PER_BIT dynamically |
| Async FIFO (dual-clock) | Replace sync_fifo with async FIFO using Gray-coded pointers + 2-FF synchronizers |
| AXI-S interface | Replace rd_en/rd_data with valid-ready handshake to connect to an AXI4-Stream bus |
| Interrupt output | Assert an IRQ line when FIFO occupancy exceeds a threshold parameter |
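As a starting point for the last row of the table, the interrupt output could be sketched as below. This is one possible shape, not a finished design: fifo_irq, THRESHOLD, and irq are hypothetical names, and the threshold value is an arbitrary assumption.

```verilog
`timescale 1ns/1ps
// Sketch only: fifo_irq, THRESHOLD and irq are hypothetical names.
module fifo_irq #(
    parameter DEPTH     = 16,
    parameter THRESHOLD = 12   // assumed value: raise irq when count > THRESHOLD
)(
    input                    clk, rst_n,
    input  [$clog2(DEPTH):0] count,   // occupancy, as produced by sync_fifo.count
    output reg               irq
);
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            irq <= 1'b0;
        else
            irq <= (count > THRESHOLD);  // level-sensitive: clears as the host drains the FIFO
    end
endmodule
```

Using it would mean connecting the FIFO's count port in uart_fifo_top, which the top level currently leaves unconnected.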