Topic 25 · Digital Electronics

Synchronous FIFO Design
Pointer Logic · Full/Empty · Verilog

The fundamental data buffer behind every bus protocol, memory controller, and streaming interface — learn to build one from scratch with correct full and empty detection.

Keywords: Write Pointer · Read Pointer · MSB Trick · Circular Buffer · Parameterized Verilog

[Figure: block diagram. A producer (wr_en, wr_data[W-1:0], full) writes into a circular buffer mem[DEPTH-1:0][WIDTH-1:0] at wr_ptr; a consumer (rd_en, rd_data[W-1:0], empty) reads from rd_ptr. Empty: wr_ptr == rd_ptr. Full: MSBs differ, address bits equal. Single clock domain (clk), active-low synchronous reset (rst_n).]

1. What is a FIFO?

A FIFO (First-In First-Out) buffer is a queue: the first word written is the first word read out. It decouples a producer that writes data from a consumer that reads it, absorbing mismatches in rate or timing.

Application              Why a FIFO?
AXI / APB bus bridges    Decouple fast master from slow peripheral
UART TX buffer           CPU writes bytes; serializer drains them at baud rate
DDR write buffer         Burst writes absorb latency of memory controller
Video line buffer        Pixel pipeline: one line written, next line read
CDC (dual-clock)         Async FIFO passes data between clock domains safely

Synchronous vs. async: A synchronous FIFO has one shared clock, which keeps the logic simple and makes full/empty detection straightforward. An async FIFO has two independent clocks (write clock + read clock) and needs Gray-code pointers synchronized across clock domains. This page covers the synchronous case.

2. Synchronous FIFO Architecture

A synchronous FIFO has three components: a register array (the storage), a write pointer (wr_ptr), and a read pointer (rd_ptr). All three are clocked by clk. The module ports are:

Signal                   Direction  Description
clk                      Input      Single clock for all operations
rst_n                    Input      Active-low synchronous reset; zeros both pointers
wr_en                    Input      Write enable; wr_data is written when asserted and not full
wr_data[W-1:0]           Input      Data to write (W = WIDTH parameter)
rd_en                    Input      Read enable; rd_ptr advances when asserted and not empty
rd_data[W-1:0]           Output     Data at read pointer (registered or combinational)
full                     Output     Asserted when no more writes can be accepted
empty                    Output     Asserted when no data is available to read
count[$clog2(DEPTH):0]   Output     Number of valid entries currently in the FIFO
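
Write-when-full / read-when-empty: Guard every write with !full and every read with !empty. Without these guards one pointer overruns the other, so unread data is silently overwritten or stale entries are read back. The module in Section 5 applies both guards internally and ignores the offending request.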
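
One way to catch violations of this rule in simulation is a small monitor that latches a sticky flag whenever a request arrives in the forbidden state. The sketch below is illustrative only: the module name fifo_guard_monitor and the overflow / underflow outputs are hypothetical and are not part of the FIFO described on this page.

```verilog
// Hypothetical helper: flags unguarded FIFO requests (not part of sync_fifo).
module fifo_guard_monitor (
  input      clk,
  input      rst_n,      // active-low synchronous reset, as in the FIFO
  input      wr_en,
  input      full,
  input      rd_en,
  input      empty,
  output reg overflow,   // sticky: a write was requested while full
  output reg underflow   // sticky: a read was requested while empty
);
  always @(posedge clk) begin
    if (!rst_n) begin
      overflow  <= 1'b0;
      underflow <= 1'b0;
    end else begin
      if (wr_en && full)  overflow  <= 1'b1;
      if (rd_en && empty) underflow <= 1'b1;
    end
  end
endmodule
```

Connected alongside the FIFO, either flag going high would point to a producer or consumer that ignores the status outputs.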

3. Pointer Logic & Full/Empty Detection

Both pointers start at zero. Each accepted write increments wr_ptr; each accepted read increments rd_ptr. The address bits of both pointers wrap modulo DEPTH. The count is wr_ptr − rd_ptr, taken modulo the pointer range.

Naive approach — and the problem

With pointers that are exactly log2(DEPTH) bits wide and wrap at DEPTH:

// Naive: BROKEN, the two conditions are identical
assign empty = (wr_ptr == rd_ptr);
assign full  = (wr_ptr == rd_ptr);  // same condition! ambiguous

When both pointers are equal the FIFO could be completely empty or completely full — we've gone all the way around. A simple == comparison cannot tell these cases apart.
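
The ambiguity can be reproduced with a few lines of simulation. This is a minimal sketch, assuming DEPTH = 4 and pointers with no extra bit; the module name tb_naive_ambiguity is made up for the demonstration.

```verilog
// Demonstration only: 2-bit pointers for DEPTH = 4, no extra lap bit.
module tb_naive_ambiguity;
  localparam DEPTH = 4;
  reg [1:0] wr_ptr, rd_ptr;
  integer   i;

  initial begin
    wr_ptr = 0;
    rd_ptr = 0;
    // Empty FIFO: the pointers are equal.
    $display("empty FIFO: wr_ptr=%0d rd_ptr=%0d equal=%b",
             wr_ptr, rd_ptr, wr_ptr == rd_ptr);

    // DEPTH writes with no reads: wr_ptr wraps 0 -> 1 -> 2 -> 3 -> 0.
    for (i = 0; i < DEPTH; i = i + 1)
      wr_ptr = wr_ptr + 1;

    // Full FIFO: the pointers are equal again.
    $display("full FIFO : wr_ptr=%0d rd_ptr=%0d equal=%b",
             wr_ptr, rd_ptr, wr_ptr == rd_ptr);
    $finish;
  end
endmodule
```

Both lines report equal=1, so the comparison alone carries no information about which of the two states the FIFO is in.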

One common fix — separate count register

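wire wr = wr_en && !full;     // accepted write
wire rd = rd_en && !empty;    // accepted read
reg [$clog2(DEPTH):0] count;  // holds 0..DEPTH, so one bit wider than the address

always @(posedge clk) begin
  if (!rst_n)         count <= 0;
  else if (wr && !rd) count <= count + 1;
  else if (rd && !wr) count <= count - 1;
  // wr && rd: one word in, one word out, count unchanged
end
assign full  = (count == DEPTH);
assign empty = (count == 0);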

This works, but it needs an extra up/down counter. The MSB trick below derives full and empty from the pointers alone, with no separate counter.

4. The MSB Pointer Trick

Key insight: Use ($clog2(DEPTH) + 1)-bit pointers, one bit more than is needed to address the memory. The lower bits address the circular buffer; the extra MSB acts as a "lap counter" that flips every time a pointer wraps around.

Full and empty rules

Condition  Expression                                                   Meaning
Empty      wr_ptr == rd_ptr                                             All bits equal: same lap, same position
Full       wr_ptr[MSB] != rd_ptr[MSB] && wr_ptr[ADDR] == rd_ptr[ADDR]   MSBs differ (write lapped read once) but address bits are equal
Count      wr_ptr - rd_ptr                                              Fixed-width subtraction wraps modulo 2·DEPTH, so the result stays correct across the wrap

Example: DEPTH=4, pointer width=3 bits (2 address bits + 1 MSB). Pointers 3'b100 and 3'b000 — MSBs differ (1 vs 0), address bits equal (00 vs 00) → FULL.

Walking example — DEPTH=4

Action       wr_ptr (3-bit)  rd_ptr (3-bit)  Count  Status
Reset        000             000             0      EMPTY
Write A      001             000             1
Write B      010             000             2
Write C      011             000             3
Write D      100             000             4      FULL: MSBs 1≠0, addr 00=00
Read A       100             001             3
Read B,C,D   100             100             0      EMPTY: all bits equal

DEPTH must be a power of 2 for this trick to work correctly (so the lower-bit wrap-around is exact). If you need a non-power-of-2 depth, use a separate count register instead.
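
The walking example can be replayed with a pointer-only model. The following is a sketch under stated assumptions: DEPTH = 4, no memory and no clock, pointer values forced by hand, and a made-up module name tb_msb_trick. It also adds one second-lap case that the table does not cover, to show that the subtraction still gives the right count after wr_ptr has wrapped past 3'b111.

```verilog
// Pointer-only model of the MSB trick for DEPTH = 4 (no memory, no clock).
module tb_msb_trick;
  localparam ADDR_W = 2;                    // log2(DEPTH) for DEPTH = 4
  reg  [ADDR_W:0] wr_ptr, rd_ptr;           // one extra lap bit

  wire            empty = (wr_ptr == rd_ptr);
  wire            full  = (wr_ptr[ADDR_W]     != rd_ptr[ADDR_W]) &&
                          (wr_ptr[ADDR_W-1:0] == rd_ptr[ADDR_W-1:0]);
  wire [ADDR_W:0] count = wr_ptr - rd_ptr;  // wraps modulo 8 = 2*DEPTH

  task show;
    begin
      #1;  // let the continuous assignments settle
      $display("wr=%b rd=%b count=%0d full=%b empty=%b",
               wr_ptr, rd_ptr, count, full, empty);
    end
  endtask

  initial begin
    wr_ptr = 3'b000; rd_ptr = 3'b000; show;  // reset: empty
    wr_ptr = 3'b100;                  show;  // after 4 writes: full
    rd_ptr = 3'b001;                  show;  // after 1 read: count 3
    rd_ptr = 3'b100;                  show;  // after 3 more reads: empty
    // Second lap: wr_ptr has wrapped past 3'b111, rd_ptr has not yet.
    wr_ptr = 3'b001; rd_ptr = 3'b110; show;  // 1 - 6 = -5, which is 3 modulo 8
    $finish;
  end
endmodule
```

The first four printed lines correspond to the Reset, Write D, Read A, and Read B,C,D rows of the table; the last line reports count=3 with neither flag set.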

5. Parameterized Verilog Implementation

module sync_fifo #(
  parameter WIDTH = 8,
  parameter DEPTH = 16   // must be a power of 2
)(
  input                    clk,
  input                    rst_n,
  input                    wr_en,
  input  [WIDTH-1:0]       wr_data,
  input                    rd_en,
  output reg [WIDTH-1:0]  rd_data,
  output                   full,
  output                   empty,
  output [$clog2(DEPTH):0] count
);
  // ── Storage ──────────────────────────────────────────────────
  reg [WIDTH-1:0] mem [0:DEPTH-1];

  // ── Pointers: clog2(DEPTH)+1 bits (MSB = lap bit) ─────────
  reg [$clog2(DEPTH):0] wr_ptr, rd_ptr;

  // ── Write ─────────────────────────────────────────────────────
  always @(posedge clk) begin
    if (!rst_n)
      wr_ptr <= 0;
    else if (wr_en && !full) begin
      mem[wr_ptr[$clog2(DEPTH)-1:0]] <= wr_data;
      wr_ptr <= wr_ptr + 1;
    end
  end

  // ── Read ──────────────────────────────────────────────────────
  always @(posedge clk) begin
    if (!rst_n)
      rd_ptr <= 0;
    else if (rd_en && !empty) begin
      rd_data <= mem[rd_ptr[$clog2(DEPTH)-1:0]];
      rd_ptr  <= rd_ptr + 1;
    end
  end

  // ── Status flags ──────────────────────────────────────────────
  localparam ADDR_W = $clog2(DEPTH);

  assign empty = (wr_ptr == rd_ptr);
  assign full  = (wr_ptr[ADDR_W] != rd_ptr[ADDR_W]) &&
                 (wr_ptr[ADDR_W-1:0] == rd_ptr[ADDR_W-1:0]);
  assign count = wr_ptr - rd_ptr;

endmodule
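
Simultaneous read and write: This implementation accepts a write and a read in the same cycle when the FIFO is neither full nor empty. Both pointers then advance on the same clock edge, so their difference, and therefore count, is unchanged. Because full and empty are computed from the current pointers, a write to a full FIFO is still rejected even if a read is accepted in the same cycle, and likewise a read from an empty FIFO.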
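
A pointer-only sketch makes the same-cycle case concrete. It assumes both enables are active and accepted, uses arbitrary starting values (three entries held), and is not taken from the module above; the name tb_simultaneous_rw is hypothetical.

```verilog
// Pointer-only sketch: one accepted write and one accepted read on the same edge.
module tb_simultaneous_rw;
  reg        clk    = 1'b0;
  reg  [3:0] wr_ptr = 4'd5;            // arbitrary starting state, 3 entries held
  reg  [3:0] rd_ptr = 4'd2;
  wire [3:0] count  = wr_ptr - rd_ptr;

  // Both enables are assumed active, so both pointers advance together.
  always @(posedge clk) begin
    wr_ptr <= wr_ptr + 1;
    rd_ptr <= rd_ptr + 1;
  end

  initial begin
    #1 $display("before edge: wr=%0d rd=%0d count=%0d", wr_ptr, rd_ptr, count);
    #1 clk = 1'b1;                     // the single rising edge
    #1 $display("after edge : wr=%0d rd=%0d count=%0d", wr_ptr, rd_ptr, count);
    $finish;
  end
endmodule
```

Both pointers move by one, and count reads 3 before and after the edge.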

Registered vs combinational read data

The implementation above registers rd_data — it appears one cycle after rd_en. For fall-through / lookahead behaviour (data valid the same cycle rd_en is asserted):

// Combinational (fall-through) read: replace the rd always block,
// and declare the port as "output [WIDTH-1:0] rd_data" (a wire, not a reg).
assign rd_data = mem[rd_ptr[$clog2(DEPTH)-1:0]];

always @(posedge clk) begin
  if (!rst_n)               rd_ptr <= 0;
  else if (rd_en && !empty) rd_ptr <= rd_ptr + 1;
end

Trade-off: a combinational read puts the memory read mux, addressed by rd_ptr, directly in the path to the output, which lengthens the timing path into downstream logic. A registered read adds one cycle of latency but makes timing easier to meet.

6. Self-Checking Testbench

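module tb_sync_fifo;
  parameter WIDTH = 8;
  parameter DEPTH = 8;

  reg              clk, rst_n, wr_en, rd_en;
  reg  [WIDTH-1:0] wr_data;
  wire [WIDTH-1:0] rd_data;
  wire             full, empty;
  wire [$clog2(DEPTH):0] count;

  // Named port connections keep the testbench plain Verilog-2005,
  // so it compiles with the iverilog command below without extra flags.
  sync_fifo #(.WIDTH(WIDTH), .DEPTH(DEPTH)) dut (
    .clk(clk), .rst_n(rst_n),
    .wr_en(wr_en), .wr_data(wr_data),
    .rd_en(rd_en), .rd_data(rd_data),
    .full(full), .empty(empty), .count(count)
  );

  initial clk = 0;
  always #5 clk = ~clk;

  integer i;
  integer errors;   // number of failed checks

  // Write one word; abort if the FIFO is already full.
  task automatic fifo_write(input [WIDTH-1:0] d);
    begin
      @(posedge clk); #1;
      if (full) begin $display("ERROR: write to full FIFO"); $finish; end
      wr_en = 1; wr_data = d;
      @(posedge clk); #1; wr_en = 0;
    end
  endtask

  // Read one word and compare it with the expected value.
  task automatic fifo_read(input [WIDTH-1:0] expected);
    begin
      @(posedge clk); #1;
      if (empty) begin $display("ERROR: read from empty FIFO"); $finish; end
      rd_en = 1;
      @(posedge clk); #1; rd_en = 0;  // registered read: rd_data is valid after this edge
      if (rd_data !== expected) begin
        $display("FAIL: got %0h, expected %0h", rd_data, expected);
        errors = errors + 1;
      end else
        $display("PASS: read %0h", rd_data);
    end
  endtask

  initial begin
    $dumpfile("wave.vcd"); $dumpvars(0, tb_sync_fifo);
    {rst_n, wr_en, rd_en, wr_data} = 0;
    errors = 0;
    repeat(2) @(posedge clk);
    #1 rst_n = 1;

    // Fill to full
    for (i = 0; i < DEPTH; i = i + 1) fifo_write(i * 8'h11);
    $display("full=%b (expect 1)", full);
    if (full !== 1'b1) errors = errors + 1;

    // Drain all and check order
    for (i = 0; i < DEPTH; i = i + 1) fifo_read(i * 8'h11);
    $display("empty=%b (expect 1)", empty);
    if (empty !== 1'b1) errors = errors + 1;

    // Simultaneous read+write: AB is popped while EF is pushed
    fifo_write(8'hAB); fifo_write(8'hCD);
    @(posedge clk); #1;
    wr_en = 1; rd_en = 1; wr_data = 8'hEF;
    @(posedge clk); #1; wr_en = 0; rd_en = 0;
    if (rd_data !== 8'hAB || count !== 2) begin
      $display("FAIL: simultaneous read+write, rd_data=%0h count=%0d", rd_data, count);
      errors = errors + 1;
    end
    fifo_read(8'hCD); fifo_read(8'hEF);

    if (errors == 0) $display("All tests passed.");
    else             $display("%0d check(s) failed.", errors);
    $finish;
  end
endmodule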

Simulate with Icarus Verilog

iverilog -o fifo_sim sync_fifo.v tb_sync_fifo.v
vvp fifo_sim
gtkwave wave.vcd

7. Timing & Waveform Behaviour

Cycle  wr_en  wr_data  rd_en  rd_data  count  Flags / notes
0      0      -        0      -        0      empty
1      1      A        0      -        1
2      1      B        0      -        2
3      0      -        1      A        1
4      1      C        1      B        1      simultaneous read+write
5      0      -        1      C        0      empty after this cycle

Note: the rd_data and count columns show the values after the clock edge that samples that row's enables. With the registered read path, rd_en asserted during cycle 3 makes A appear on rd_data from the following edge onward; it is not available while rd_en is first asserted.

Don't sample rd_data in the same cycle you assert rd_en: with a registered read path, rd_data is valid one cycle later. If your consumer needs fall-through behaviour, switch to the combinational read variant shown in Section 5.

8. FIFO Depth Calculation

The minimum FIFO depth needed to prevent overflow during a burst:

Depth ≥ Burst_Length × (1 − Drain_Rate / Fill_Rate)

Scenario               Fill        Drain      Burst   Min Depth
USB FS → slow SPI      12 Mb/s     1 Mb/s     64 B    59 bytes
DDR burst to APB       3200 MT/s   200 MT/s   16      15 entries
UART TX (CPU writes)   CPU speed   baud rate  256 B   ~256 bytes

Add 20–50% margin for real designs, then round up to the next power of 2 so the MSB trick applies. For example, 59 bytes plus 25% margin is 74, which rounds up to 128.
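
The first two table rows can be checked with integer arithmetic in a short simulation. This is a sketch only: the function names min_depth and pow2_ceil are hypothetical helpers, the rates are passed in the same units as the table, and margin is left out so the numbers match the Min Depth column.

```verilog
// Worked depth calculation using integer arithmetic (simulation only).
module tb_depth_calc;
  // ceil(burst * (1 - drain/fill)) computed as ceil(burst * (fill - drain) / fill)
  function integer min_depth(input integer burst,
                             input integer fill,
                             input integer drain);
    begin
      min_depth = (burst * (fill - drain) + fill - 1) / fill;
    end
  endfunction

  // Smallest power of 2 that is >= n
  function integer pow2_ceil(input integer n);
    begin
      pow2_ceil = 1;
      while (pow2_ceil < n)
        pow2_ceil = pow2_ceil * 2;
    end
  endfunction

  integer d;
  initial begin
    d = min_depth(64, 12, 1);       // USB FS -> slow SPI row
    $display("USB->SPI: min depth %0d, power of 2 %0d", d, pow2_ceil(d));
    d = min_depth(16, 3200, 200);   // DDR burst to APB row
    $display("DDR->APB: min depth %0d, power of 2 %0d", d, pow2_ceil(d));
    $finish;
  end
endmodule
```

The run reports 59 rounding up to 64 for the first row and 15 rounding up to 16 for the second.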

9. Variants & Extensions

Variant                               Key change                               Use case
Asynchronous (dual-clock) FIFO        Gray-code pointers + 2-FF sync per bit   Crossing clock domains (see CDC page)
Show-ahead / Fall-through FIFO        Combinational rd_data                    Zero-latency consumer interfaces
FWFT FIFO (First Word Fall-Through)   Pre-read first word after write          AXI-Stream source
Almost-full / Almost-empty            Extra threshold comparators on count     Flow control with lookahead
Programmable depth                    Run-time DEPTH register                  DMA channel rings
ECC-protected FIFO                    SECDED on each word in mem               Safety-critical / radiation-hardened
Skid buffer                           Depth=2, valid-ready handshake wrapper   AXI-Stream pipeline stage (see RTL patterns)

The RTL Design Patterns tutorial includes a complete synchronous FIFO and a skid buffer with valid-ready handshake in the context of a full pipeline design.
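
As an illustration of the almost-full / almost-empty row, the threshold comparators can sit outside the FIFO and use only its count output. This is a minimal sketch: the module name fifo_thresholds, the parameters AFULL_TH and AEMPTY_TH, and their default values are assumptions chosen for the example, not part of the design on this page.

```verilog
// Hypothetical almost-full / almost-empty flags derived from count.
module fifo_thresholds #(
  parameter DEPTH     = 16,
  parameter AFULL_TH  = DEPTH - 2,  // almost_full at this many entries or more
  parameter AEMPTY_TH = 2           // almost_empty at this many entries or fewer
)(
  input  [$clog2(DEPTH):0] count,   // same width as the count port of sync_fifo
  output                   almost_full,
  output                   almost_empty
);
  assign almost_full  = (count >= AFULL_TH);
  assign almost_empty = (count <= AEMPTY_TH);
endmodule
```

A producer that needs a few cycles to stop can throttle on almost_full instead of full, with AFULL_TH set to leave that many free slots.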