DAY 14 · SERIAL COMMUNICATION

UART Receiver — Oversampling & Mid-Bit Sampling

By EcrioniX · Updated Jun 11, 2026

Transmitting is easy — you control every bit. Receiving is harder because the incoming data arrives on its own clock. This lesson builds uart_rx.v: a robust 8N1 UART receiver with 16× oversampling and mid-bit sampling that works reliably even when TX and RX clocks drift slightly apart.

1. The RX challenge — two independent clocks

In Day 13 you set a baud counter and drove bits out. The receiver cannot do that — it has no idea when the transmitter started. The only indication is the falling edge of the start bit: the line drops from its idle-high state to 0. That edge is the receiver's synchronisation point.

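Because the transmitter runs on its own oscillator, the TX and RX clocks accumulate phase error over the length of a frame. For a 10-bit 8N1 frame (start, 8 data, stop) at 115200 baud, the last sample falls about 9.5 bit periods after the start edge and must still land inside the stop bit, so the total TX/RX rate mismatch has to stay within roughly ±5%. The classic fix is oversampling: run the receiver's sampling tick at 16× the baud rate, resynchronise on every start bit, and sample each bit at its centre, maximising the distance from both edges.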
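
As a rough worked estimate of that budget, assume an idealised 8N1 frame with perfectly square edges, where the stop bit is sampled 9.5 bit periods after the start edge and the sample must stay within half a bit of the centre. The symbols ε (fractional TX/RX rate mismatch) and T_bit (bit period) are introduced here for illustration only:

```latex
|\varepsilon| \cdot 9.5\,T_{\text{bit}} < 0.5\,T_{\text{bit}}
\quad\Longrightarrow\quad
|\varepsilon| < \frac{0.5}{9.5} \approx 5.3\%

% With 16x oversampling the start edge is only located to within one tick
% (1/16 of a bit), which eats into the half-bit margin:
|\varepsilon| < \frac{0.5 - \tfrac{1}{16}}{9.5} \approx 4.6\%
```

Real links lose further margin to slow edges and noise, so these figures are best read as an upper bound rather than a design target.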

The mid-bit sampling trick

2. Port table — uart_rx

| Port | Dir | Width | Description |
|------|-----|-------|-------------|
| clk | IN | 1 | System clock, ideally an integer multiple of 16 × baud rate. For example, 100 MHz at 115200 baud gives CLKS_PER_BIT = 868; the oversample divider is CLKS_PER_BIT / 16 = 54.25, truncated to the integer 54. |
| rst | IN | 1 | Synchronous active-high reset |
| rx | IN | 1 | Serial data input (idle high). Connect to the UART RX pin. |
| data | OUT | 8 | Received byte. Updated when valid pulses and held until the next byte arrives. |
| valid | OUT | 1 | Pulses high for 1 clock when a complete byte has been received and the stop bit was correct |
| err | OUT | 1 | Pulses high for 1 clock when the stop bit is sampled as 0 (framing error) |
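
To show how these ports might be wired up, here is a minimal sketch of a top-level wrapper. The module name uart_rx_top, the ports uart_rxd, led and frame_err, and the CLK_HZ and BAUD parameters are all hypothetical names chosen for illustration; only the uart_rx port list comes from the table above.

```verilog
// Hypothetical wrapper: latches each received byte onto 8 LEDs and
// keeps a sticky framing-error flag. Names are illustrative only.
module uart_rx_top #(
    parameter CLK_HZ = 100_000_000,   // assumed board clock
    parameter BAUD   = 115_200
)(
    input  wire       clk,
    input  wire       rst,
    input  wire       uart_rxd,       // from the external UART TX pin
    output reg  [7:0] led,
    output reg        frame_err
);
    // Integer division: 100_000_000 / 115_200 = 868
    localparam CLKS_PER_BIT = CLK_HZ / BAUD;

    wire [7:0] rx_data;
    wire       rx_valid;
    wire       rx_err;

    uart_rx #(.CLKS_PER_BIT(CLKS_PER_BIT)) u_rx (
        .clk(clk), .rst(rst), .rx(uart_rxd),
        .data(rx_data), .valid(rx_valid), .err(rx_err)
    );

    always @(posedge clk) begin
        if (rst) begin
            led       <= 8'h00;
            frame_err <= 1'b0;
        end else begin
            if (rx_valid) begin
                led       <= rx_data;   // valid is a one-clock pulse
                frame_err <= 1'b0;      // a good byte clears the flag
            end
            if (rx_err)
                frame_err <= 1'b1;      // sticky until the next good byte
        end
    end
endmodule
```

Because valid and err are single-cycle pulses, anything slower than one clock (an LED, a status register) needs to latch them as shown.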

3. FSM state diagram

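The receiver uses four states: IDLE (wait for the line to go low), START (wait half a bit period and confirm the line is still low), DATA (sample eight bits, one every 16 ticks) and STOP (sample the stop bit, then pulse valid or err and return to IDLE).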

4. uart_rx.v — full design

uart_rx.v
// uart_rx.v — 8N1 UART Receiver with 16x oversampling
// CLKS_PER_BIT = system_clk_freq / baud_rate
// e.g. 100 MHz / 115200 = 868

module uart_rx #(
    parameter CLKS_PER_BIT = 868   // clocks per one UART bit
)(
    input  wire       clk,
    input  wire       rst,
    input  wire       rx,
    output reg  [7:0] data,
    output reg        valid,
    output reg        err
);

// Oversample at 16x: ticks per oversample period
localparam OSAMP_DIV  = CLKS_PER_BIT / 16;   // ~54 for 100MHz/115200
localparam HALF_BIT   = 8;   // 8 oversampling ticks = half bit period
localparam FULL_BIT   = 16;  // 16 oversampling ticks = full bit period

// FSM states
localparam IDLE  = 2'd0,
           START = 2'd1,
           DATA  = 2'd2,
           STOP  = 2'd3;

reg [1:0]  state     = IDLE;
reg [9:0]  clk_cnt   = 0;    // counts system clocks for one oversample tick
reg [3:0]  osamp_cnt = 0;    // counts oversampling ticks within a bit
reg [2:0]  bit_idx   = 0;    // which data bit we are receiving (0-7)
reg [7:0]  shift_reg = 0;    // shift register accumulates data bits

// Two-flop synchroniser: guards the FSM against metastability on rx.
// Both flops start at the idle-high level.
reg rx_s1 = 1'b1, rx_s2 = 1'b1;
always @(posedge clk) begin
    rx_s1 <= rx;
    rx_s2 <= rx_s1;
end

// Oversample tick strobe (pulses high once every OSAMP_DIV clocks)
reg tick;
always @(posedge clk) begin
    if (rst) begin
        clk_cnt <= 0;
        tick    <= 0;
    end else begin
        tick <= 0;
        if (clk_cnt == OSAMP_DIV - 1) begin
            clk_cnt <= 0;
            tick    <= 1;
        end else begin
            clk_cnt <= clk_cnt + 1;
        end
    end
end

// Main receiver FSM
always @(posedge clk) begin
    if (rst) begin
        state     <= IDLE;
        osamp_cnt <= 0;
        bit_idx   <= 0;
        shift_reg <= 0;
        data      <= 0;
        valid     <= 0;
        err       <= 0;
    end else begin
        valid <= 0;
        err   <= 0;

        case (state)
            // -------------------------------------------------------
            IDLE: begin
                osamp_cnt <= 0;
                bit_idx   <= 0;
                // Start bit: the synchronised line has gone low
                // (a level check; START re-checks it half a bit later)
                if (rx_s2 == 1'b0)
                    state <= START;
            end

            // -------------------------------------------------------
            // Wait half a bit period, then re-check RX is still 0
            START: begin
                if (tick) begin
                    if (osamp_cnt == HALF_BIT - 1) begin
                        osamp_cnt <= 0;
                        if (rx_s2 == 1'b0)
                            state <= DATA;  // confirmed start bit
                        else
                            state <= IDLE;  // noise glitch, ignore
                    end else begin
                        osamp_cnt <= osamp_cnt + 1;
                    end
                end
            end

            // -------------------------------------------------------
            // Sample each data bit at its centre (every 16 ticks)
            DATA: begin
                if (tick) begin
                    if (osamp_cnt == FULL_BIT - 1) begin
                        osamp_cnt <= 0;
                        // Sample bit at centre
                        shift_reg <= {rx_s2, shift_reg[7:1]};  // LSB first
                        if (bit_idx == 3'd7) begin
                            bit_idx <= 0;
                            state   <= STOP;
                        end else begin
                            bit_idx <= bit_idx + 1;
                        end
                    end else begin
                        osamp_cnt <= osamp_cnt + 1;
                    end
                end
            end

            // -------------------------------------------------------
            // Wait one full bit period then sample stop bit
            STOP: begin
                if (tick) begin
                    if (osamp_cnt == FULL_BIT - 1) begin
                        osamp_cnt <= 0;
                        state     <= IDLE;
                        if (rx_s2 == 1'b1) begin
                            data  <= shift_reg;
                            valid <= 1;
                        end else begin
                            err   <= 1;  // framing error
                        end
                    end else begin
                        osamp_cnt <= osamp_cnt + 1;
                    end
                end
            end

            default: state <= IDLE;
        endcase
    end
end

endmodule

5. Design notes

Two-flop synchroniser on RX input

The rx pin arrives asynchronously: it can change at any time relative to clk. Sampling it directly can leave a flip-flop metastable and corrupt the FSM state. The two-flop chain (rx_s1 → rx_s2) gives a metastable rx_s1 a full clock period to settle before rx_s2 feeds any FSM logic. This is mandatory on real hardware.
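
As a hedged variation, the same synchroniser can be packaged as a small module with a third register for true falling-edge detection. The module name rx_sync and the outputs rx_q and fall are assumptions for illustration; the lesson's uart_rx keeps the two flops inline and tests the level of rx_s2 in IDLE.

```verilog
// Sketch only: two-flop synchroniser plus a one-clock falling-edge strobe.
module rx_sync (
    input  wire clk,
    input  wire rx,     // asynchronous serial input, idle high
    output wire rx_q,   // synchronised copy of rx, two clocks late
    output wire fall    // high for one clk when rx_q goes 1 -> 0
);
    // Start at the idle level so no false edge is seen after power-up
    reg s1 = 1'b1, s2 = 1'b1, s3 = 1'b1;

    always @(posedge clk) begin
        s1 <= rx;   // may go metastable
        s2 <= s1;   // settled copy
        s3 <= s2;   // previous value of s2, used for edge detection
    end

    assign rx_q = s2;
    assign fall = s3 & ~s2;   // was high last clock, is low now
endmodule
```

Using fall instead of a level test would keep the receiver from re-triggering while the line is simply stuck low, at the cost of one extra flip-flop.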

Oversample divider

The constant OSAMP_DIV = CLKS_PER_BIT / 16 sets the period of the oversampling tick. At 100 MHz and 115200 baud, CLKS_PER_BIT = 868, so OSAMP_DIV = 868 / 16 = 54 after integer division. Sixteen ticks therefore span 54 × 16 = 864 system clocks instead of the ideal 868, so the receiver's bit period is 0.46% short and it runs slightly fast, not slow. That is well within the roughly ±5% mismatch an 8N1 frame can tolerate.
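
The truncation error can be checked for any clock and baud combination with a few lines of simulation-only Verilog. This is a sketch; the module name osamp_div_check and the EFFECTIVE constant are made up for the example.

```verilog
// Simulation-only helper: reports the rounding error of the tick divider.
module osamp_div_check;
    localparam integer CLKS_PER_BIT = 868;                // 100 MHz / 115200
    localparam integer OSAMP_DIV    = CLKS_PER_BIT / 16;  // integer division
    localparam integer EFFECTIVE    = OSAMP_DIV * 16;     // clocks per received bit

    real err_pct;

    initial begin
        // Relative error of the receiver's bit period, in percent
        err_pct = 100.0 * (CLKS_PER_BIT - EFFECTIVE) / CLKS_PER_BIT;
        $display("OSAMP_DIV=%0d effective=%0d error=%.2f%%",
                 OSAMP_DIV, EFFECTIVE, err_pct);
        $finish;
    end
endmodule
```

Changing CLKS_PER_BIT to match another clock or baud rate shows quickly whether the truncated divider still leaves comfortable margin.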

LSB-first shift register

UART sends LSB first. The shift register captures each incoming bit into shift_reg[7] and shifts right, so after 8 bits the register holds the byte in correct orientation. No bit-reversal is needed.
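
The orientation claim can be checked in isolation. The sketch below (module name lsb_first_demo is hypothetical) feeds the bits of a byte, LSB first, through the same `{bit, shift_reg[7:1]}` expression used in the DATA state and compares the result with the original byte.

```verilog
// Simulation-only check of the LSB-first shift expression.
module lsb_first_demo;
    reg [7:0] tx_byte;
    reg [7:0] shift_reg;
    integer   i;

    initial begin
        tx_byte   = 8'hA3;
        shift_reg = 8'h00;
        // Bit 0 goes in first at position 7 and is pushed right by the
        // seven bits that follow, ending up back at position 0.
        for (i = 0; i < 8; i = i + 1)
            shift_reg = {tx_byte[i], shift_reg[7:1]};

        if (shift_reg === tx_byte)
            $display("PASS: shift_reg matches tx_byte");
        else
            $display("FAIL: shift_reg=%b tx_byte=%b", shift_reg, tx_byte);
        $finish;
    end
endmodule
```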

6. Testbench — tb_uart_rx.v

The testbench drives the RX line manually at the correct baud timing to transmit four bytes in turn: 0x55 (binary 01010101), 0xA3, 0x00 and 0xFF. For each byte it waits for the valid pulse and checks that data matches.

tb_uart_rx.v
// tb_uart_rx.v — self-checking testbench for uart_rx
`timescale 1ns/1ps

module tb_uart_rx;

// ---- Parameters matching DUT ----
parameter CLK_PERIOD  = 10;       // 100 MHz
parameter CLKS_PER_BIT = 868;     // 115200 baud at 100 MHz
parameter BIT_PERIOD   = CLK_PERIOD * CLKS_PER_BIT; // ns per bit

// ---- DUT signals ----
reg        clk = 0;
reg        rst = 1;
reg        rx  = 1;   // idle high
wire [7:0] data;
wire       valid;
wire       err;

// ---- Instantiate DUT ----
uart_rx #(.CLKS_PER_BIT(CLKS_PER_BIT)) dut (
    .clk(clk), .rst(rst), .rx(rx),
    .data(data), .valid(valid), .err(err)
);

// ---- Clock ----
always #(CLK_PERIOD/2) clk = ~clk;

// ---- Task: send one byte over serial line ----
task send_byte;
    input [7:0] byte_in;
    integer i;
    begin
        // Start bit (low)
        rx = 1'b0;
        #(BIT_PERIOD);
        // 8 data bits, LSB first
        for (i = 0; i < 8; i = i + 1) begin
            rx = byte_in[i];
            #(BIT_PERIOD);
        end
        // Stop bit (high)
        rx = 1'b1;
        #(BIT_PERIOD);
    end
endtask

// ---- Check helper ----
integer pass_cnt = 0;
integer fail_cnt = 0;

task check_byte;
    input [7:0] expected;
    begin
        // Wait for valid pulse
        @(posedge valid);
        @(posedge clk);
        if (data === expected) begin
            $display("PASS: received 0x%02X (expected 0x%02X)", data, expected);
            pass_cnt = pass_cnt + 1;
        end else begin
            $display("FAIL: received 0x%02X expected 0x%02X", data, expected);
            fail_cnt = fail_cnt + 1;
        end
        if (err) begin
            $display("FAIL: unexpected framing error");
            fail_cnt = fail_cnt + 1;
        end
    end
endtask

// ---- Stimulus ----
initial begin
    $dumpfile("tb_uart_rx.vcd");
    $dumpvars(0, tb_uart_rx);

    // Reset
    @(posedge clk);
    rst = 1;
    repeat(10) @(posedge clk);
    rst = 0;
    #(BIT_PERIOD);   // idle gap

    // Test 1: send 0x55 (alternating bits 01010101)
    fork
        send_byte(8'h55);
        check_byte(8'h55);
    join

    #(BIT_PERIOD * 2);  // inter-byte gap

    // Test 2: send 0xA3 (10100011)
    fork
        send_byte(8'hA3);
        check_byte(8'hA3);
    join

    #(BIT_PERIOD * 2);

    // Test 3: send 0x00
    fork
        send_byte(8'h00);
        check_byte(8'h00);
    join

    #(BIT_PERIOD * 2);

    // Test 4: send 0xFF
    fork
        send_byte(8'hFF);
        check_byte(8'hFF);
    join

    #(BIT_PERIOD * 2);

    // Summary
    if (fail_cnt == 0)
        $display("\nALL TESTS PASSED (%0d/%0d)", pass_cnt, pass_cnt+fail_cnt);
    else
        $display("\nFAILED: %0d/%0d tests passed", pass_cnt, pass_cnt+fail_cnt);

    $finish;
end

// ---- Timeout watchdog ----
initial begin
    #(BIT_PERIOD * 100);
    $display("TIMEOUT");
    $finish;
end

endmodule

7. Expected simulation output

PASS: received 0x55 (expected 0x55)
PASS: received 0xA3 (expected 0xA3)
PASS: received 0x00 (expected 0x00)
PASS: received 0xFF (expected 0xFF)

ALL TESTS PASSED (4/4)
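
The four tests above only exercise the valid path. A possible companion test for the err output is sketched below: it sends 0x55 with the stop bit forced low and expects one err pulse and no valid pulse. The testbench name tb_uart_rx_err and the saw_err / saw_valid latches are assumptions for this example, and it needs uart_rx.v compiled alongside it.

```verilog
// tb_uart_rx_err.v (hypothetical): framing-error check for uart_rx
`timescale 1ns/1ps

module tb_uart_rx_err;
    localparam CLK_PERIOD   = 10;                        // 100 MHz
    localparam CLKS_PER_BIT = 868;                       // 115200 baud
    localparam BIT_PERIOD   = CLK_PERIOD * CLKS_PER_BIT; // ns per bit

    reg        clk = 0;
    reg        rst = 1;
    reg        rx  = 1;   // idle high
    wire [7:0] data;
    wire       valid;
    wire       err;

    reg        saw_err   = 0;
    reg        saw_valid = 0;
    reg  [7:0] payload;
    integer    i;

    uart_rx #(.CLKS_PER_BIT(CLKS_PER_BIT)) dut (
        .clk(clk), .rst(rst), .rx(rx),
        .data(data), .valid(valid), .err(err)
    );

    always #(CLK_PERIOD/2) clk = ~clk;

    // valid and err are one-clock pulses, so latch them for later checking
    always @(posedge clk) begin
        if (err)   saw_err   <= 1;
        if (valid) saw_valid <= 1;
    end

    initial begin
        payload = 8'h55;
        repeat (10) @(posedge clk);
        rst = 0;
        #(BIT_PERIOD);                       // idle gap

        rx = 1'b0; #(BIT_PERIOD);            // start bit
        for (i = 0; i < 8; i = i + 1) begin  // data bits, LSB first
            rx = payload[i]; #(BIT_PERIOD);
        end
        rx = 1'b0; #(BIT_PERIOD);            // broken stop bit (should be 1)
        rx = 1'b1; #(BIT_PERIOD);            // line returns to idle

        if (saw_err === 1'b1 && saw_valid === 1'b0)
            $display("PASS: framing error flagged, no valid pulse");
        else
            $display("FAIL: saw_err=%b saw_valid=%b", saw_err, saw_valid);
        $finish;
    end
endmodule
```

One caveat: because IDLE tests the level of rx_s2, a line that is still low right after the error sends the receiver straight back into START, so it may go on to decode a spurious byte from the idle line. This test finishes before that could show up on valid.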

8. FPGA implementation tips

Key Takeaways

Frequently Asked Questions

Why does UART RX use oversampling?

Oversampling (16×) lets the receiver locate the start-bit edge to within 1/16 of a bit period and resynchronise on every frame. TX and RX run on independent oscillators that drift apart. By sampling at the midpoint (tick 8 of 16), the receiver is as far as possible from both bit edges, which gives it the most margin against clock drift and jitter.

What is mid-bit sampling?

After detecting the start-bit falling edge, the receiver waits 8 oversampling ticks (half a bit period) to land at the centre of the start bit, then samples every 16 ticks to hit the centre of each subsequent bit.
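
The resulting sample schedule can be printed with a short simulation-only sketch (module name sample_points is hypothetical). Tick numbers are counted from the start edge under ideal alignment; since the tick strobe in uart_rx is free-running, the real instants can be up to one tick off.

```verilog
// Simulation-only: lists the oversampling tick at which each bit is sampled.
module sample_points;
    localparam HALF_BIT = 8;    // ticks from the start edge to mid start bit
    localparam FULL_BIT = 16;   // ticks between consecutive bit centres
    integer n;

    initial begin
        $display("start check : tick %0d", HALF_BIT);
        for (n = 0; n < 8; n = n + 1)
            $display("data bit %0d  : tick %0d", n, HALF_BIT + FULL_BIT * (n + 1));
        $display("stop bit    : tick %0d", HALF_BIT + FULL_BIT * 9);
        $finish;
    end
endmodule
```

Data bit 0 lands on tick 24 and the stop bit on tick 152, which is 9.5 bit periods after the start edge.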

What does the err output signal mean?

The err flag is asserted when the stop bit is sampled as 0. A valid UART frame must end with a high stop bit. A 0 indicates framing loss or line noise — the byte should be discarded and the receiver returned to IDLE.
