Transmitting is easy — you control every bit. Receiving is harder because the incoming data arrives on its own clock. This lesson builds uart_rx.v: a robust 8N1 UART receiver with 16× oversampling and mid-bit sampling that works reliably even when TX and RX clocks drift slightly apart.
In Day 13 you set a baud counter and drove bits out. The receiver cannot do that — it has no idea when the transmitter started. The only indication is the falling edge of the start bit: the line drops from its idle-high state to 0. That edge is the receiver's synchronisation point.
Because the transmitter runs on its own oscillator, TX and RX clocks accumulate phase error over the course of a frame. Over a 10-bit 8N1 frame (start bit, 8 data bits, stop bit), the sample point must not drift by more than half a bit period, which limits the combined TX/RX clock mismatch to roughly ±5%. The classic fix is oversampling: run the receiver at 16× the baud rate and sample each bit at its centre, maximising distance from both edges.
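The ±5% figure can be checked with a short worst-case estimate. This is a simplified sketch: it assumes an 8N1 frame, lumps the whole mismatch into a single relative error ε, and ignores edge rise times and sampling jitter. The second line additionally assumes that the start edge is only located to within one oversampling tick (T/16).

```latex
% T: nominal bit period, \varepsilon: relative mismatch between TX and RX bit clocks
% The stop bit is sampled 9.5 bit periods after the start edge (8N1 frame).
\begin{align*}
  9.5\,|\varepsilon|\,T &< 0.5\,T
    &&\Rightarrow\quad |\varepsilon| < \frac{0.5}{9.5} \approx 5.3\% \\
  9.5\,|\varepsilon|\,T + \frac{T}{16} &< 0.5\,T
    &&\Rightarrow\quad |\varepsilon| < \frac{0.5 - 1/16}{9.5} \approx 4.6\%
\end{align*}
```

Under these assumptions the budget is shared between both ends of the link, so each side should stay well below the limit.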
If the stop bit is sampled as 1, assert valid. If it is 0, assert err.

| Port | Dir | Width | Description |
|---|---|---|---|
| clk | IN | 1 | System clock, ideally an integer multiple of 16 × baud rate. Example: 100 MHz at 115200 baud gives CLKS_PER_BIT = 868; the oversample tick divider is CLKS_PER_BIT / 16 = 54.25, truncated to the integer 54 |
| rst | IN | 1 | Synchronous active-high reset |
| rx | IN | 1 | Serial data input (idle high). Connect to UART RX pin. |
| data | OUT | 8 | Received byte, valid when valid is high for one clock cycle |
| valid | OUT | 1 | Pulses high for 1 clock when a complete byte has been received and the stop bit was correct |
| err | OUT | 1 | Pulses high for 1 clock when stop bit sampled as 0 (framing error) |
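Because valid and err are one-clock strobes, downstream logic has to capture data in the cycle the strobe is high. The following is a minimal sketch of such a consumer. The module rx_byte_latch and its ports last_byte and err_count are hypothetical names chosen for illustration and are not part of the lesson's design.

```verilog
// Hypothetical consumer of the uart_rx outputs (illustration only).
module rx_byte_latch (
    input  wire       clk,
    input  wire       rst,        // synchronous, active high (same convention as uart_rx)
    input  wire [7:0] data,       // connect to uart_rx data
    input  wire       valid,      // one-clock strobe from uart_rx
    input  wire       err,        // one-clock strobe from uart_rx
    output reg  [7:0] last_byte,  // most recent byte received without a framing error
    output reg  [7:0] err_count   // framing errors since reset (wraps after 255)
);
    always @(posedge clk) begin
        if (rst) begin
            last_byte <= 8'h00;
            err_count <= 8'd0;
        end else begin
            if (valid) last_byte <= data;              // capture while the strobe is high
            if (err)   err_count <= err_count + 8'd1;  // count discarded frames
        end
    end
endmodule
```

In a larger design the same pattern would typically feed a FIFO write enable instead of a single holding register.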
The receiver uses four states: IDLE, START, DATA, and STOP.
// uart_rx.v: 8N1 UART receiver with 16x oversampling
// CLKS_PER_BIT = system_clk_freq / baud_rate
// e.g. 100 MHz / 115200 = 868
module uart_rx #(
    parameter CLKS_PER_BIT = 868  // system clocks per UART bit
)(
    input  wire       clk,
    input  wire       rst,
    input  wire       rx,
    output reg  [7:0] data,
    output reg        valid,
    output reg        err
);

    // Oversample at 16x: system clocks per oversample tick
    localparam OSAMP_DIV = CLKS_PER_BIT / 16; // 54 for 100 MHz / 115200 (integer division)
    localparam HALF_BIT  = 8;                 // 8 oversampling ticks = half bit period
    localparam FULL_BIT  = 16;                // 16 oversampling ticks = full bit period

    // FSM states
    localparam IDLE  = 2'd0,
               START = 2'd1,
               DATA  = 2'd2,
               STOP  = 2'd3;

    reg [1:0] state     = IDLE;
    reg [9:0] clk_cnt   = 0;  // counts system clocks for one oversample tick
    reg [3:0] osamp_cnt = 0;  // counts oversampling ticks within a bit
    reg [2:0] bit_idx   = 0;  // which data bit we are receiving (0-7)
    reg [7:0] shift_reg = 0;  // shift register accumulates data bits

    // Two-flop synchroniser on rx to guard against metastability.
    // Initialised to the idle-high level so no false start bit is seen at power-up.
    reg rx_s1 = 1'b1, rx_s2 = 1'b1;
    always @(posedge clk) begin
        rx_s1 <= rx;
        rx_s2 <= rx_s1;
    end

    // Oversample tick strobe (pulses high once every OSAMP_DIV clocks).
    // The counter is free-running, so the first tick after a start edge
    // can arrive anywhere within one tick period.
    reg tick;
    always @(posedge clk) begin
        if (rst) begin
            clk_cnt <= 0;
            tick    <= 0;
        end else begin
            tick <= 0;
            if (clk_cnt == OSAMP_DIV - 1) begin
                clk_cnt <= 0;
                tick    <= 1;
            end else begin
                clk_cnt <= clk_cnt + 1;
            end
        end
    end

    // Main receiver FSM
    always @(posedge clk) begin
        if (rst) begin
            state     <= IDLE;
            osamp_cnt <= 0;
            bit_idx   <= 0;
            shift_reg <= 0;
            data      <= 0;
            valid     <= 0;
            err       <= 0;
        end else begin
            // valid and err are one-clock strobes: default low every cycle
            valid <= 0;
            err   <= 0;
            case (state)
                // -------------------------------------------------------
                IDLE: begin
                    osamp_cnt <= 0;
                    bit_idx   <= 0;
                    // Start-bit detection: the line is low while idle.
                    // This is a level check, so a line still held low after a
                    // framing error re-enters START immediately.
                    if (rx_s2 == 1'b0)
                        state <= START;
                end
                // -------------------------------------------------------
                // Wait half a bit period, then re-check RX is still 0
                START: begin
                    if (tick) begin
                        if (osamp_cnt == HALF_BIT - 1) begin
                            osamp_cnt <= 0;
                            if (rx_s2 == 1'b0)
                                state <= DATA;  // confirmed start bit
                            else
                                state <= IDLE;  // noise glitch, ignore
                        end else begin
                            osamp_cnt <= osamp_cnt + 1;
                        end
                    end
                end
                // -------------------------------------------------------
                // Sample each data bit at its centre (every 16 ticks)
                DATA: begin
                    if (tick) begin
                        if (osamp_cnt == FULL_BIT - 1) begin
                            osamp_cnt <= 0;
                            // Sample bit at centre
                            shift_reg <= {rx_s2, shift_reg[7:1]};  // LSB first
                            if (bit_idx == 3'd7) begin
                                bit_idx <= 0;
                                state   <= STOP;
                            end else begin
                                bit_idx <= bit_idx + 1;
                            end
                        end else begin
                            osamp_cnt <= osamp_cnt + 1;
                        end
                    end
                end
                // -------------------------------------------------------
                // Wait one full bit period, then sample the stop bit at its centre
                STOP: begin
                    if (tick) begin
                        if (osamp_cnt == FULL_BIT - 1) begin
                            osamp_cnt <= 0;
                            state     <= IDLE;
                            if (rx_s2 == 1'b1) begin
                                data  <= shift_reg;
                                valid <= 1;
                            end else begin
                                err <= 1;  // framing error
                            end
                        end else begin
                            osamp_cnt <= osamp_cnt + 1;
                        end
                    end
                end
                default: state <= IDLE;
            endcase
        end
    end

endmodule
The rx pin arrives asynchronously — it can change at any time relative to clk. Sampling it directly can cause metastability and corrupt the entire FSM. The two-flop chain (rx_s1 → rx_s2) resolves metastability before the signal reaches any combinational logic. This is mandatory on real hardware.
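The two-flop chain can also be packaged as a small reusable module. The sketch below is one possible form, not the lesson's code: the module name sync_2ff, its port names, and the RESET_VALUE parameter are assumptions made for illustration, and unlike the inline version in uart_rx.v it adds a synchronous reset to the idle-high level.

```verilog
// Hypothetical stand-alone two-flop synchroniser (illustration only).
module sync_2ff #(
    parameter RESET_VALUE = 1'b1   // a UART line idles high
)(
    input  wire clk,
    input  wire rst,               // synchronous, active high
    input  wire async_in,          // asynchronous input, e.g. the rx pin
    output wire sync_out           // safe to use in the clk domain
);
    reg s1, s2;

    always @(posedge clk) begin
        if (rst) begin
            s1 <= RESET_VALUE;
            s2 <= RESET_VALUE;
        end else begin
            s1 <= async_in;  // first flop may go metastable
            s2 <= s1;        // second flop samples it one full clock later
        end
    end

    assign sync_out = s2;
endmodule
```

The output lags the pin by two clock cycles, which is negligible next to the 868-clock bit period used in this lesson.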
The constant OSAMP_DIV = CLKS_PER_BIT / 16 generates the oversampling tick. For 100 MHz / 115200 baud = 868 clocks per bit, OSAMP_DIV = 54 (integer division). This gives 54 system clocks per oversample tick, so the receiver's bit period is 54 × 16 = 864 clocks instead of the exact 868 and the receiver runs slightly fast. The error is 0.46%, well within UART's ±5% tolerance.
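The integer-division arithmetic can be reproduced for any clock and baud combination with a few constant expressions. This simulation-only sketch uses a hypothetical module name, osamp_div_check, and reports the rounding error in hundredths of a percent to stay within integer arithmetic.

```verilog
// Simulation-only helper (hypothetical): reports the tick-divider rounding error.
module osamp_div_check #(
    parameter integer CLKS_PER_BIT = 868   // 100 MHz / 115200 baud
);
    localparam integer OSAMP_DIV = CLKS_PER_BIT / 16;   // truncating division
    localparam integer EFFECTIVE = OSAMP_DIV * 16;      // clocks per bit actually counted
    // Relative error in hundredths of a percent (integer arithmetic, truncated)
    localparam integer ERR_X100  = ((CLKS_PER_BIT - EFFECTIVE) * 10000) / CLKS_PER_BIT;

    initial begin
        $display("OSAMP_DIV = %0d", OSAMP_DIV);
        $display("effective clocks per bit = %0d (ideal %0d)", EFFECTIVE, CLKS_PER_BIT);
        $display("error = %0d hundredths of a percent", ERR_X100);
    end
endmodule
```

With the default parameter this prints OSAMP_DIV = 54, an effective 864 clocks per bit, and an error of 46 hundredths of a percent, matching the 0.46% quoted above.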
UART sends LSB first. The shift register captures each incoming bit into shift_reg[7] and shifts right, so after 8 bits the register holds the byte in correct orientation. No bit-reversal is needed.
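The shift direction is easy to verify in isolation. The sketch below is a simulation-only demo with a hypothetical module name, lsb_first_demo. It feeds the bits of 0xA3 in transmission order (bit 0 first) through the same {new_bit, shift_reg[7:1]} expression and compares the result with the original byte.

```verilog
// Simulation-only demo (hypothetical): LSB-first reassembly by right shift.
module lsb_first_demo;
    reg [7:0] tx_byte;
    reg [7:0] shift_reg;
    integer i;

    initial begin
        tx_byte   = 8'hA3;
        shift_reg = 8'h00;
        // Bits arrive in the order tx_byte[0], tx_byte[1], ..., tx_byte[7]
        for (i = 0; i < 8; i = i + 1) begin
            shift_reg = {tx_byte[i], shift_reg[7:1]};  // new bit enters at the MSB
            $display("after bit %0d: %b", i, shift_reg);
        end
        if (shift_reg === tx_byte)
            $display("OK: reassembled byte matches, no bit reversal needed");
        else
            $display("MISMATCH: got %h", shift_reg);
    end
endmodule
```

The first bit received is pushed one position to the right by each later bit and ends up in position 0 after eight shifts, which is where the LSB belongs.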
The testbench drives the RX line manually, toggling it at the correct baud timing to transmit four bytes in turn: 0x55 (binary 01010101), 0xA3, 0x00, and 0xFF. For each byte it waits for the valid pulse and checks that data matches the byte sent.
// tb_uart_rx.v: self-checking testbench for uart_rx
`timescale 1ns/1ps
module tb_uart_rx;

    // ---- Parameters matching DUT ----
    parameter CLK_PERIOD   = 10;                         // 100 MHz
    parameter CLKS_PER_BIT = 868;                        // 115200 baud at 100 MHz
    parameter BIT_PERIOD   = CLK_PERIOD * CLKS_PER_BIT;  // ns per bit

    // ---- DUT signals ----
    reg        clk = 0;
    reg        rst = 1;
    reg        rx  = 1;  // idle high
    wire [7:0] data;
    wire       valid;
    wire       err;

    // ---- Instantiate DUT ----
    uart_rx #(.CLKS_PER_BIT(CLKS_PER_BIT)) dut (
        .clk(clk), .rst(rst), .rx(rx),
        .data(data), .valid(valid), .err(err)
    );

    // ---- Clock ----
    always #(CLK_PERIOD/2) clk = ~clk;

    // ---- Task: send one byte over serial line ----
    task send_byte;
        input [7:0] byte_in;
        integer i;
        begin
            // Start bit (low)
            rx = 1'b0;
            #(BIT_PERIOD);
            // 8 data bits, LSB first
            for (i = 0; i < 8; i = i + 1) begin
                rx = byte_in[i];
                #(BIT_PERIOD);
            end
            // Stop bit (high)
            rx = 1'b1;
            #(BIT_PERIOD);
        end
    endtask

    // ---- Check helper ----
    integer pass_cnt = 0;
    integer fail_cnt = 0;

    task check_byte;
        input [7:0] expected;
        begin
            // Wait for valid pulse
            @(posedge valid);
            @(posedge clk);
            if (data === expected) begin
                $display("PASS: received 0x%02X (expected 0x%02X)", data, expected);
                pass_cnt = pass_cnt + 1;
            end else begin
                $display("FAIL: received 0x%02X expected 0x%02X", data, expected);
                fail_cnt = fail_cnt + 1;
            end
            if (err) begin
                $display("FAIL: unexpected framing error");
                fail_cnt = fail_cnt + 1;
            end
        end
    endtask

    // ---- Stimulus ----
    initial begin
        $dumpfile("tb_uart_rx.vcd");
        $dumpvars(0, tb_uart_rx);

        // Reset
        @(posedge clk);
        rst = 1;
        repeat(10) @(posedge clk);
        rst = 0;
        #(BIT_PERIOD);  // idle gap

        // Test 1: send 0x55 (alternating bits 01010101)
        fork
            send_byte(8'h55);
            check_byte(8'h55);
        join
        #(BIT_PERIOD * 2);  // inter-byte gap

        // Test 2: send 0xA3 (10100011)
        fork
            send_byte(8'hA3);
            check_byte(8'hA3);
        join
        #(BIT_PERIOD * 2);

        // Test 3: send 0x00
        fork
            send_byte(8'h00);
            check_byte(8'h00);
        join
        #(BIT_PERIOD * 2);

        // Test 4: send 0xFF
        fork
            send_byte(8'hFF);
            check_byte(8'hFF);
        join
        #(BIT_PERIOD * 2);

        // Summary
        if (fail_cnt == 0)
            $display("\nALL TESTS PASSED (%0d/%0d)", pass_cnt, pass_cnt+fail_cnt);
        else
            $display("\nFAILED: %0d/%0d tests passed", pass_cnt, pass_cnt+fail_cnt);
        $finish;
    end

    // ---- Timeout watchdog ----
    initial begin
        #(BIT_PERIOD * 100);
        $display("TIMEOUT");
        $finish;
    end

endmodule
PASS: received 0x55 (expected 0x55)
PASS: received 0xA3 (expected 0xA3)
PASS: received 0x00 (expected 0x00)
PASS: received 0xFF (expected 0xFF)

ALL TESTS PASSED (4/4)
Timing constraint: set_false_path -from [get_ports rx], since the two-flop synchroniser handles the clock-domain crossing.
Oversampling (16×) lets the receiver resynchronise at every start bit. TX and RX run on independent oscillators that drift apart. By sampling at the midpoint (sample 8 of 16), the receiver is furthest from both bit edges and most immune to clock skew and jitter.
After detecting the start-bit falling edge, the receiver waits 8 oversampling ticks (half a bit period) to land at the centre of the start bit, then samples every 16 ticks to hit the centre of each subsequent bit.
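The resulting schedule can be listed tick by tick. The sketch below is simulation-only and uses a hypothetical module name, sample_schedule. Ticks are counted from the first oversampling tick after the start bit is detected, and HALF_BIT and FULL_BIT take the same values as in uart_rx.v.

```verilog
// Simulation-only helper (hypothetical): prints the tick index of every sample point.
module sample_schedule;
    localparam integer HALF_BIT = 8;   // ticks from start edge to start-bit centre
    localparam integer FULL_BIT = 16;  // ticks between consecutive bit centres
    integer k;

    initial begin
        $display("start bit re-checked at tick %0d", HALF_BIT);
        for (k = 0; k < 8; k = k + 1)
            $display("data bit %0d sampled at tick %0d", k, HALF_BIT + (k + 1) * FULL_BIT);
        $display("stop bit sampled at tick %0d", HALF_BIT + 9 * FULL_BIT);
    end
endmodule
```

The samples land at ticks 8, 24, 40 and so on up to 152 for the stop bit, each one 16 ticks after the previous and 8 ticks into its bit.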
The err flag is asserted when the stop bit is sampled as 0. A valid UART frame must end with a high stop bit. A 0 indicates framing loss or line noise — the byte should be discarded and the receiver returned to IDLE.