Memory Design Series

True Dual Port RAM — TDP RAM Verilog

Two fully independent read/write ports. Same-address collision modes, BRAM primitive inference, dual-clock operation, and an interactive simulator.

Port A + Port B, Read-First / Write-First, Dual-Clock, RAMB36E2, Collision Handling, FPGA BRAM
What is True Dual Port RAM?

A True Dual Port (TDP) RAM exposes two completely symmetric ports — Port A and Port B — each capable of independent read and write operations at any address, on every clock cycle. Both ports share the same physical memory array.

[Figure: a shared memory array (addr[0], addr[1], addr[2], ...) accessed by Port A (clka, addra[N-1:0], dina[W-1:0], wea, douta[W-1:0]) and Port B (clkb, addrb[N-1:0], dinb[W-1:0], web, doutb[W-1:0]). Both ports can access any address, so collisions must be managed.]

TDP vs SDP at a glance: SDP has one port locked to write and one to read. TDP gives both ports full R/W capability. On an FPGA BRAM, TDP mode uses more configuration bits but is essential for cache RAMs, multi-master register files, and processor interfaces.
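
For contrast, a simple dual-port RAM could look roughly like the sketch below. It is not taken from the series: the module name sdp_ram and its port names are assumptions chosen to mirror the TDP listings that follow. Port A can only write and Port B can only read, which is why a write-write collision cannot occur.

```verilog
// Hypothetical SDP RAM for comparison (module and port names are illustrative).
module sdp_ram #(
  parameter DATA_W = 8,
  parameter ADDR_W = 4
)(
  input  wire              clk,
  // Port A: write only
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  // Port B: read only
  input  wire [ADDR_W-1:0] addrb,
  output reg  [DATA_W-1:0] doutb
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  always @(posedge clk) begin
    if (wea) mem[addra] <= dina;   // write port
    doutb <= mem[addrb];           // read port, returns old data on a same-address write
  end
endmodule
```

A TDP RAM differs only in that each port gets both a write enable and a data output, which is exactly what makes collision handling necessary.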
Basic TDP RAM — Single Clock

The simplest TDP RAM: both ports share one clock. Each port independently reads or writes on every rising edge. The collision mode shown here is read-first: when a port reads the same address being written by the other port, it captures the old value.

module tdp_ram_rf #(
  parameter DATA_W = 8,
  parameter ADDR_W = 4
)(
  input  wire              clk,
  // Port A
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  output reg  [DATA_W-1:0] douta,
  // Port B
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  input  wire [DATA_W-1:0] dinb,
  output reg  [DATA_W-1:0] doutb
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Port A — read-first: douta captures old data
  always @(posedge clk) begin
    if (wea) mem[addra] <= dina;
    douta <= mem[addra];          // reads old value on collision
  end

  // Port B — read-first: doutb captures old data
  always @(posedge clk) begin
    if (web) mem[addrb] <= dinb;
    doutb <= mem[addrb];
  end
endmodule

The write-first variant returns the newly written data on the writing port's own output:

module tdp_ram_wf #(
  parameter DATA_W = 8,
  parameter ADDR_W = 4
)(
  input  wire              clk,
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  output reg  [DATA_W-1:0] douta,
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  input  wire [DATA_W-1:0] dinb,
  output reg  [DATA_W-1:0] doutb
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Port A — write-first: read sees new data when wea and same address
  always @(posedge clk) begin
    if (wea) begin
      mem[addra] <= dina;
      douta      <= dina;
    end else begin
      douta <= mem[addra];
    end
  end

  // Port B — write-first
  always @(posedge clk) begin
    if (web) begin
      mem[addrb] <= dinb;
      doutb      <= dinb;
    end else begin
      doutb <= mem[addrb];
    end
  end
endmodule

The no-change variant leaves a port's output untouched while that port is writing:

module tdp_ram_nc #(
  parameter DATA_W = 8,
  parameter ADDR_W = 4
)(
  input  wire              clk,
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  output reg  [DATA_W-1:0] douta,
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  input  wire [DATA_W-1:0] dinb,
  output reg  [DATA_W-1:0] doutb
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Port A — no-change: output holds during write
  always @(posedge clk) begin
    if (wea) mem[addra] <= dina;
    else     douta <= mem[addra];
  end

  // Port B — no-change
  always @(posedge clk) begin
    if (web) mem[addrb] <= dinb;
    else     doutb <= mem[addrb];
  end
endmodule
Dual-Clock TDP RAM

Each port gets its own independent clock — clka and clkb. This is the configuration used for cross-domain memory sharing, e.g., a CPU on one clock feeding a DSP engine on another.

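Collision in dual-clock mode: If Port A and Port B access the same address at overlapping times and at least one of them is writing, the result is undefined: a read may return corrupted data, and two writes leave the stored value unknown. The safe approach is external arbitration (e.g., a handshake or Gray-coded pointer scheme) so that the two ports never touch the same location at the same time.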
Verilog — Dual-Clock TDP RAM
module tdp_ram_2clk #(
  parameter DATA_W = 8,
  parameter ADDR_W = 4
)(
  // Port A
  input  wire              clka,
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  output reg  [DATA_W-1:0] douta,
  // Port B
  input  wire              clkb,
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  input  wire [DATA_W-1:0] dinb,
  output reg  [DATA_W-1:0] doutb
);
  (* ram_style = "block" *)
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Port A — clocked on clka
  always @(posedge clka) begin
    if (wea) mem[addra] <= dina;
    douta <= mem[addra];
  end

  // Port B — clocked on clkb
  always @(posedge clkb) begin
    if (web) mem[addrb] <= dinb;
    doutb <= mem[addrb];
  end
endmodule
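
One way to approach the Gray-coded pointer scheme mentioned above is to convert a pointer to Gray code in the writing domain and pass it through a two-flop synchronizer in the other domain. The sketch below is illustrative only: the module name gray_sync, its ports, and the two-stage depth are assumptions, not part of the original design.

```verilog
// Hypothetical helper: carries a pointer from the clka domain into the
// clkb domain as a Gray-coded value.
module gray_sync #(
  parameter PTR_W = 4
)(
  input  wire             clka,
  input  wire [PTR_W-1:0] ptr_bin_a,    // binary pointer, clka domain
  input  wire             clkb,
  output wire [PTR_W-1:0] ptr_gray_b    // synchronized Gray pointer, clkb domain
);
  // Binary to Gray: consecutive values differ in exactly one bit
  reg [PTR_W-1:0] gray_a;
  always @(posedge clka)
    gray_a <= ptr_bin_a ^ (ptr_bin_a >> 1);

  // Two-flop synchronizer into the clkb domain
  reg [PTR_W-1:0] sync1, sync2;
  always @(posedge clkb) begin
    sync1 <= gray_a;
    sync2 <= sync1;
  end

  assign ptr_gray_b = sync2;
endmodule
```

The single-bit-change property only holds if the pointer advances one step at a time, so this suits sequential buffer access rather than random addressing.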
Collision Modes

Collision occurs when both ports access the same address in the same clock cycle. The three cases are: read-read (safe), read-write (defined by mode), and write-write (always destructive).

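Port A action | Port B action | Same addr? | Read-First result | Write-First result | No-Change result
Read | Read | Yes | Both see the same stored value | Both see the same stored value | Both see the same stored value
Write | Read | Yes | douta=old, doutb=old | douta=new, doutb=old | douta holds, doutb=old
Read | Write | Yes | douta=old, doutb=old | douta=old, doutb=new | douta=old, doutb holds
Write | Write | Yes | Undefined: memory data corrupted. Prevent with arbitration. (all modes)
Any | Any | No | No collision: both ports operate independently (all modes)
The read values for the non-writing port are what the single-clock RTL models above produce in simulation. On Xilinx/AMD block RAM hardware, the non-writing port's read data is only guaranteed when the writing port is in read-first mode.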
Write-Write to the same address is always unsafe. Even in single-clock TDP mode, the final value stored in memory is undefined. Always use an arbitration unit (priority encoder, round-robin, or handshake) to prevent simultaneous writes to the same address.
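
During simulation, a small monitor can flag write-write collisions before they corrupt a test. The following is a minimal sketch for the single-clock case; the module name tdp_ww_monitor and the ww_collision output are hypothetical and not part of the designs above.

```verilog
// Hypothetical simulation-only monitor for a single-clock TDP RAM.
// ww_collision is high for one cycle after both ports wrote the same address.
module tdp_ww_monitor #(
  parameter ADDR_W = 4
)(
  input  wire              clk,
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  output reg               ww_collision
);
  initial ww_collision = 1'b0;

  always @(posedge clk) begin
    ww_collision <= wea && web && (addra == addrb);
    if (wea && web && (addra == addrb))
      $display("%0t: write-write collision at address %0h", $time, addra);
  end
endmodule
```

Connecting it alongside the RAM in a testbench makes an arbitration bug show up in the log at the cycle it happens.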
Byte-Enable TDP RAM

Both ports support byte-lane write enables. Common in processor data memories where the CPU writes 1, 2, or 4 bytes per transaction.

Verilog — 32-bit TDP with 4-bit byte enables
module tdp_ram_be #(
  parameter DATA_W = 32,
  parameter ADDR_W = 10,
  parameter BE_W   = DATA_W/8   // 4 byte-enables
)(
  input  wire              clka,
  input  wire [BE_W-1:0]   wea,            // per-byte write enable A
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  output reg  [DATA_W-1:0] douta,

  input  wire              clkb,
  input  wire [BE_W-1:0]   web,
  input  wire [ADDR_W-1:0] addrb,
  input  wire [DATA_W-1:0] dinb,
  output reg  [DATA_W-1:0] doutb
);
  (* ram_style = "block" *)
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Port A
  always @(posedge clka) begin
    if (wea[0]) mem[addra][ 7: 0] <= dina[ 7: 0];
    if (wea[1]) mem[addra][15: 8] <= dina[15: 8];
    if (wea[2]) mem[addra][23:16] <= dina[23:16];
    if (wea[3]) mem[addra][31:24] <= dina[31:24];
    douta <= mem[addra];
  end

  // Port B
  always @(posedge clkb) begin
    if (web[0]) mem[addrb][ 7: 0] <= dinb[ 7: 0];
    if (web[1]) mem[addrb][15: 8] <= dinb[15: 8];
    if (web[2]) mem[addrb][23:16] <= dinb[23:16];
    if (web[3]) mem[addrb][31:24] <= dinb[31:24];
    doutb <= mem[addrb];
  end
endmodule
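
The listing above spells out four byte lanes, so it only matches DATA_W = 32. A width-independent variant can loop over the lanes with an indexed part-select. The sketch below shows a single port for brevity; the module name ram_be_generic and the single-port simplification are assumptions, and a second port would repeat the same always block on its own clock.

```verilog
// Hypothetical width-independent byte-enable RAM (one port shown).
module ram_be_generic #(
  parameter DATA_W = 32,
  parameter ADDR_W = 10,
  parameter BE_W   = DATA_W/8
)(
  input  wire              clk,
  input  wire [BE_W-1:0]   we,     // one write enable per byte lane
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];
  integer i;

  always @(posedge clk) begin
    // Update only the byte lanes whose enable bit is set
    for (i = 0; i < BE_W; i = i + 1)
      if (we[i]) mem[addr][i*8 +: 8] <= din[i*8 +: 8];
    dout <= mem[addr];   // read-first: returns the word as it was before this edge
  end
endmodule
```

DATA_W is assumed to be a multiple of 8 here; other widths would need a different lane size.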
FPGA BRAM — TDP Inference & Primitive

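Xilinx/AMD synthesis tools infer a block RAM in TDP mode from the pattern above. The (* ram_style = "block" *) attribute forces block RAM; without it, the tool chooses between block and distributed RAM based on the array size. Depending on depth and width, the array is mapped to RAMB18E2 or RAMB36E2 primitives. You can also instantiate the primitive directly for full control over collision modes, output registers, and INIT strings.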

// Vivado infers a TDP block RAM from this pattern
// (RAMB18E2 or RAMB36E2, depending on the array size)
// Guidelines:
//   1. (* ram_style = "block" *) forces block RAM instead of LUT RAM
//   2. Synchronous reads (dout assigned in clocked always block)
//   3. No async reset on dout
//   4. Two separate always blocks, one per port

(* ram_style = "block" *)
reg [7:0] mem [0:255];

always @(posedge clka) begin
  if (wea) mem[addra] <= dina;
  douta <= mem[addra];
end

always @(posedge clkb) begin
  if (web) mem[addrb] <= dinb;
  doutb <= mem[addrb];
end
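
The output registers mentioned above can also be described in RTL by adding a second register stage after the memory read. The following is a sketch under that assumption: the module name tdp_ram_oreg is hypothetical, and whether the synthesis tool absorbs the extra stage into the block RAM depends on the tool and its settings. Read latency becomes two clock cycles.

```verilog
// Hypothetical TDP RAM with an extra output register stage per port.
module tdp_ram_oreg #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clka,
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire [DATA_W-1:0] dina,
  output reg  [DATA_W-1:0] douta,
  input  wire              clkb,
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  input  wire [DATA_W-1:0] dinb,
  output reg  [DATA_W-1:0] doutb
);
  (* ram_style = "block" *)
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];
  reg [DATA_W-1:0] rda, rdb;   // first-stage read registers

  always @(posedge clka) begin
    if (wea) mem[addra] <= dina;
    rda   <= mem[addra];   // stage 1: memory read
    douta <= rda;          // stage 2: output register
  end

  always @(posedge clkb) begin
    if (web) mem[addrb] <= dinb;
    rdb   <= mem[addrb];
    doutb <= rdb;
  end
endmodule
```

The extra stage usually helps timing at high clock rates, at the cost of one more cycle before read data is valid.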
// RAMB36E2 instantiation (Xilinx UltraScale/UltraScale+)
// Assumes 12-bit addra/addrb and 8-bit dina/dinb (4K x 9 configuration).
// Port widths of 36 or less select true dual-port operation.
// READ_WIDTH_A/B, WRITE_WIDTH_A/B: 1,2,4,9,18,36 (including parity)
// Verify attribute and port names against the libraries guide template
// for your device family before use.
RAMB36E2 #(
  .READ_WIDTH_A   (9),      // 8 data + 1 parity
  .WRITE_WIDTH_A  (9),
  .READ_WIDTH_B   (9),
  .WRITE_WIDTH_B  (9),
  .WRITE_MODE_A   ("READ_FIRST"),
  .WRITE_MODE_B   ("READ_FIRST"),
  .DOA_REG        (0),      // 0 = no output register, 1 = registered
  .DOB_REG        (0)
) u_bram (
  .CLKARDCLK    (clka),
  .CLKBWRCLK    (clkb),
  .ADDRARDADDR  ({addra, 3'b000}),   // 9-bit mode uses address bits [14:3]
  .ADDRBWRADDR  ({addrb, 3'b000}),
  .DINADIN      ({24'b0, dina}),     // 32-bit data input, low byte used
  .DINBDIN      ({24'b0, dinb}),
  .WEA          ({4{wea}}),
  .WEBWE        ({8{web}}),
  .ENARDEN      (1'b1),
  .ENBWREN      (1'b1),
  .RSTRAMARSTRAM(1'b0),
  .RSTRAMB      (1'b0),
  .DOUTADOUT    (douta),             // 32-bit data output, low byte used
  .DOUTBDOUT    (doutb)
);
Arbitration — Preventing Write-Write Collision

In a shared memory system, a simple round-robin arbiter grants one port at a time when both request a write to the same address. The grants are registered, so they take effect one cycle after the requests.

Verilog — Round-Robin Arbiter for TDP Write Collision
module tdp_arbiter #(parameter ADDR_W = 4)(
  input  wire              clk,
  input  wire              rst_n,
  // Port A request
  input  wire              req_a,
  input  wire [ADDR_W-1:0] addr_a,
  // Port B request
  input  wire              req_b,
  input  wire [ADDR_W-1:0] addr_b,
  // Grants
  output reg               gnt_a,
  output reg               gnt_b,
  // Collision flag
  output wire              collision
);
  wire same_addr = (addr_a == addr_b);
  assign collision = req_a & req_b & same_addr;

  // Selects the winner of the next collision: 0 = Port A, 1 = Port B
  reg prio_b;

  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      gnt_a  <= 1'b0;
      gnt_b  <= 1'b0;
      prio_b <= 1'b0;
    end else begin
      if (collision) begin
        // Round-robin: grant exactly one port, then flip priority
        gnt_a  <= ~prio_b;
        gnt_b  <=  prio_b;
        prio_b <= ~prio_b;
      end else begin
        gnt_a <= req_a;
        gnt_b <= req_b;
      end
    end
  end
endmodule
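
Where a registered grant is too slow, a fixed-priority guard can suppress one of the two writes combinationally in the same cycle the collision occurs. This is a sketch of that alternative rather than the arbiter above: the module name tdp_write_guard, the wea_safe, web_safe and b_blocked outputs, and the choice to let Port A win are all assumptions. It only applies when both ports run on the same clock.

```verilog
// Hypothetical combinational guard: Port A wins a same-address write-write
// collision, Port B's write is dropped and reported on b_blocked.
module tdp_write_guard #(
  parameter ADDR_W = 4
)(
  input  wire              wea,
  input  wire [ADDR_W-1:0] addra,
  input  wire              web,
  input  wire [ADDR_W-1:0] addrb,
  output wire              wea_safe,   // connect to the RAM's wea
  output wire              web_safe,   // connect to the RAM's web
  output wire              b_blocked   // tells Port B's master to retry
);
  wire ww_collision = wea & web & (addra == addrb);

  assign wea_safe  = wea;
  assign web_safe  = web & ~ww_collision;
  assign b_blocked = ww_collision;
endmodule
```

Port B's master has to watch b_blocked and retry, otherwise its write is silently lost.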
Testbench
SystemVerilog — TDP RAM Testbench
module tb_tdp_ram;
  parameter DATA_W = 8, ADDR_W = 4;

  logic              clka, clkb;
  logic              wea,  web;
  logic [ADDR_W-1:0] addra, addrb;
  logic [DATA_W-1:0] dina,  dinb;
  logic [DATA_W-1:0] douta, doutb;

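  // The dual-clock RAM is used here because it has the clka/clkb ports
  // that this testbench drives; tdp_ram_rf has a single clk port.
  tdp_ram_2clk #(.DATA_W(DATA_W),.ADDR_W(ADDR_W)) dut(.*);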

  // Independent clocks (different frequencies)
  initial clka = 0;
  always #5  clka = ~clka;   // 100 MHz
  initial clkb = 0;
  always #7  clkb = ~clkb;   // ~71 MHz

  // Nonblocking drives avoid races with the DUT sampling on the same edge
  task write_a(input [ADDR_W-1:0] a, input [DATA_W-1:0] d);
    @(posedge clka); wea <= 1; addra <= a; dina <= d;
    @(posedge clka); wea <= 0;
  endtask

  task write_b(input [ADDR_W-1:0] a, input [DATA_W-1:0] d);
    @(posedge clkb); web <= 1; addrb <= a; dinb <= d;
    @(posedge clkb); web <= 0;
  endtask

  initial begin
    wea=0; web=0; addra=0; addrb=0; dina=0; dinb=0;
    #20;
    // Port A writes, Port B reads same address
    fork
      write_a(4'h3, 8'hAB);
      begin @(posedge clkb); web <= 0; addrb <= 4'h3; end
    join
    #30;
    $display("douta=%0h doutb=%0h", douta, doutb);

    // Simultaneous writes — different addresses (safe)
    fork
      write_a(4'h1, 8'h11);
      write_b(4'h2, 8'h22);
    join
    #30;
    $finish;
  end
endmodule
TDP vs SDP vs Single-Port RAM
Feature | Single Port | Simple Dual Port (SDP) | True Dual Port (TDP)
Ports | 1 (R or W) | 2 (1 write, 1 read) | 2 (each R+W)
Simultaneous access | No | Yes (diff ops) | Yes (any combo)
Port symmetry | N/A | Asymmetric | Symmetric
Write-write collision | N/A | Impossible | Possible
BRAM usage (Xilinx) | ½ RAMB36 | ½ RAMB36 | 1× RAMB36
Typical use | LUT RAM, small ROM | FIFO, ping-pong buffer | Register file, shared bus memory
FAQ
What is a True Dual Port RAM?

A TDP RAM exposes two completely symmetric ports — Port A and Port B — each capable of independent read and write at any address on every clock cycle. Both ports share one physical memory array.

What happens when both ports write to the same address?

The final memory content is undefined. On Xilinx BRAMs, one write may win, or the data may be corrupted. Always prevent write-write collisions with an arbiter — check the collision flag before granting both write enables.

What is READ_FIRST vs WRITE_FIRST mode?

READ_FIRST: when a port reads and writes the same address in the same cycle, the output captures the old (pre-write) value. WRITE_FIRST: the output shows the new data, because the value being written is forwarded to the output. NO_CHANGE: the output holds its previous value during a write. The mode is configurable per port in RAMB36E2.

How does Xilinx RAMB36E2 implement TDP RAM?

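Configure independent READ_WIDTH_A/B and WRITE_WIDTH_A/B of 36 bits or less (the 72-bit width is reserved for SDP operation) and set WRITE_MODE_A/B to READ_FIRST, WRITE_FIRST, or NO_CHANGE. On the older 7-series RAMB36E1, the mode is chosen with RAM_MODE="TDP". Vivado also infers a TDP block RAM from RTL patterns such as the ones in this article; (* ram_style = "block" *) forces the block RAM mapping.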

When should I use TDP instead of SDP RAM?

Use TDP when both masters need to write, e.g., a shared buffer that two processors both update, or a processor register file where two instructions write different registers in the same cycle. Use SDP when one master writes and the other only reads, e.g., a CPU filling a frame buffer while a display engine reads it; SDP is simpler and avoids write collision issues.
