What is a single port RAM?

A single port RAM has one shared port for both read and write operations. In any given clock cycle, the port can either read or write — not both simultaneously. This is the most area-efficient memory type and maps directly to a single BRAM port on FPGAs. It is used for lookup tables, coefficient storage, line buffers, and any application where simultaneous read and write are not required.

What is the difference between write-first and read-first RAM?

In write-first (or write-through) mode, when a write occurs, the new data being written is also immediately visible on the read output at the same address — the output shows the new value. In read-first mode, when a write occurs, the output shows the OLD data at that address before it gets overwritten — the read is of the previous value. In no-change mode, the output register does not change during a write cycle at all. The choice affects BRAM primitive inference and read-after-write timing behavior.

How do you infer a BRAM from Verilog?

A synthesis tool infers a Block RAM (BRAM) when the memory array is large enough to justify one (the threshold depends on the tool and device), reads are synchronous (dout is assigned in a clocked always block), and the coding style matches the target BRAM primitive. For Xilinx: declare the memory as a 2D reg array and read it inside an always @(posedge clk) block. For write-first behavior, assign dout <= din in the write branch; with non-blocking assignments, statement order alone does not select the mode. Use the (* ram_style = "block" *) attribute to request BRAM and keep the tool from using distributed LUT RAM.

What is an asynchronous RAM read?

In asynchronous read mode, the output changes combinationally as soon as the address changes — no clock edge is needed to get the output. This is modeled in Verilog with a continuous assign statement: assign dout = mem[addr]. Asynchronous reads infer distributed RAM (LUT RAM) on FPGAs, not BRAM. They have lower read latency (no register delay) but consume more LUT resources and have worse timing at high frequencies.

Single Port RAM — Verilog Guide

The most common memory primitive: one port, shared between reads and writes. Covers synchronous read-first, write-first, and no-change modes, asynchronous read, and BRAM inference.

Synchronous, Asynchronous, Write-First, Read-First, No-Change, BRAM Inference

Single Port RAM — Block Diagram

Port Description

Port	Width	Direction	Description
clk	1	Input	Clock — all synchronous operations occur on rising edge
en	1	Input	Chip enable — when 0, no read or write occurs (optional in some designs)
we	1	Input	Write enable — 1 = write, 0 = read
addr	ADDR_W	Input	Address bus — selects memory location. DEPTH = 2^ADDR_W
din	DATA_W	Input	Write data — valid when we=1
dout	DATA_W	Output	Read data — registered (sync) or combinational (async)
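
The en port listed in the table is not used by the modules later in this guide. As a minimal sketch of how a chip enable is commonly wired in, assuming read-first behavior (the module name sp_ram_en is hypothetical and not part of the original code):

```verilog
// Sketch: single port RAM with chip enable (read-first).
// The module name sp_ram_en is illustrative only.
module sp_ram_en #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clk,
  input  wire              en,    // chip enable: 0 = port idle
  input  wire              we,    // write enable, honored only when en = 1
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  always @(posedge clk) begin
    if (en) begin
      if (we)
        mem[addr] <= din;   // write when enabled and we = 1
      dout <= mem[addr];    // read-first: the old value is sampled
    end
    // en = 0: memory contents and dout both hold their values
  end
endmodule
```

With en held low, neither the array nor dout changes, which matches the "no read or write occurs" description in the table.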

Synchronous Read Modes — Verilog Code

A single port RAM has only one address, so every write cycle is also a potential read of the same location. The three modes differ in what dout shows during that write cycle, and the choice determines which BRAM write mode is inferred.

On a write, dout = new data (din). The write happens first, then the read sees the written value. Maps to Xilinx "write-first" BRAM mode.

// Single Port RAM — Write-First (Write-Through) Mode
// dout shows the NEW data on a write-read to same address
module sp_ram_write_first #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8   // depth = 2^8 = 256 locations
)(
  input  wire              clk,
  input  wire              we,      // write enable
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  // Memory array
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  always @(posedge clk) begin
    if (we) begin
      mem[addr] <= din;   // write the new data
      dout      <= din;   // output = new data (write-first)
    end else begin
      dout <= mem[addr];  // read normally
    end
  end
endmodule

On a write, dout = old data (value before the write). Read happens before the write. Maps to Xilinx "read-first" / Intel "old data" mode.

// Single Port RAM — Read-First Mode
// dout shows the OLD data on a write-read to same address
module sp_ram_read_first #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clk,
  input  wire              we,
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  always @(posedge clk) begin
    if (we)
      mem[addr] <= din;    // write
    dout <= mem[addr];     // mem[addr] is sampled before the write lands: sees OLD value
    // Note: read is always registered regardless of we
  end
endmodule

On a write, dout holds its previous value — it does not update. This is the most power-efficient mode. Maps to Xilinx "no-change" BRAM mode.

// Single Port RAM — No-Change Mode
// dout holds its previous value during a write cycle
module sp_ram_no_change #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clk,
  input  wire              we,
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  always @(posedge clk) begin
    if (we) begin
      mem[addr] <= din;    // write only
      // dout NOT updated — holds previous value
    end else begin
      dout <= mem[addr];   // read only on non-write cycles
    end
  end
endmodule

dout changes immediately when addr changes — no clock needed. Write is still synchronous. Infers distributed (LUT) RAM, not BRAM. Lower latency, higher LUT usage.

// Single Port RAM — Asynchronous Read (Distributed RAM)
// Read is combinational: dout changes when addr changes
// Write is synchronous: data captured on posedge clk
// Synthesis: infers LUT-based distributed RAM, NOT BRAM
module sp_ram_async_read #(
  parameter DATA_W = 8,
  parameter ADDR_W = 6   // 64 locations — typical for distributed RAM
)(
  input  wire              clk,
  input  wire              we,
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output wire [DATA_W-1:0] dout  // wire, not reg — combinational output
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Synchronous write
  always @(posedge clk) begin
    if (we)
      mem[addr] <= din;   // non-blocking: write on clock edge
  end

  // Asynchronous (combinational) read
  assign dout = mem[addr];   // output changes immediately with addr
endmodule

With Byte Write Enable (Byte-Enable RAM)

Real SoC designs need byte-granularity writes — write only byte 0, 1, 2, or 3 of a 32-bit word. Each bit in we controls one byte lane.

// Single Port RAM — Byte Write Enable (32-bit word, 4 byte lanes)
// we[0] = byte 0 (bits 7:0), we[1] = byte 1, we[2] = byte 2, we[3] = byte 3
module sp_ram_byte_en #(
  parameter DATA_W = 32,  // must be a multiple of 8
  parameter ADDR_W = 10   // 1024 words × 4 bytes = 4 KB
)(
  input  wire                clk,
  input  wire [DATA_W/8-1:0] we,     // per-byte write enables (4 bits for 32-bit data)
  input  wire [ADDR_W-1:0]   addr,
  input  wire [DATA_W-1:0]   din,
  output reg  [DATA_W-1:0]   dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];
  integer i;

  always @(posedge clk) begin
    // Byte-granularity write: lane i covers bits [8*i+7 : 8*i]
    for (i = 0; i < DATA_W/8; i = i + 1)
      if (we[i]) mem[addr][8*i +: 8] <= din[8*i +: 8];
    // Read: always registered
    dout <= mem[addr];  // read-first mode (reads before writes take effect)
  end
endmodule

Timing Diagram — Read-First vs Write-First

Read Mode Comparison

Write-First

Same-address simultaneous R/W: dout = din (new data). Useful when you write and immediately need to read the new value. Xilinx RAMB36: WRITE_MODE = "WRITE_FIRST".

Read-First

Same-address simultaneous R/W: dout = old data. Needed for shift-register-based constructs. Default mode for many tools. Supports ECC in Xilinx BRAMs. WRITE_MODE = "READ_FIRST".

No-Change

dout unchanged during write. Best power efficiency — output register doesn't toggle. WRITE_MODE = "NO_CHANGE". Cannot be used if you need read-while-write behavior.

Async Read

dout = mem[addr] combinationally. No clock-cycle read latency, but infers distributed LUT RAM (not BRAM), and the combinational read path limits the achievable clock frequency. Use only for small memories (around 64 entries or fewer).
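
The three synchronous modes compared above can be checked side by side in simulation. The testbench below is a sketch rather than part of the original guide: it drives one shared memory with a single stimulus and keeps three output registers, one per mode. All names in it (tb_read_modes, dout_wf, dout_rf, dout_nc) are hypothetical.

```verilog
// Sketch testbench: one stimulus, three output registers modeling
// write-first, read-first and no-change behavior. Names are illustrative.
`timescale 1ns/1ps
module tb_read_modes;
  reg       clk  = 0;
  reg       we   = 0;
  reg [3:0] addr = 0;
  reg [7:0] din  = 0;

  reg [7:0] mem [0:15];
  reg [7:0] dout_wf, dout_rf, dout_nc;

  always #5 clk = ~clk;   // rising edges at 5, 15, 25, ... ns

  always @(posedge clk) begin
    if (we) mem[addr] <= din;
    dout_wf <= we ? din : mem[addr];   // write-first: new data on a write
    dout_rf <= mem[addr];              // read-first: old data on a write
    if (!we) dout_nc <= mem[addr];     // no-change: holds during a write
  end

  initial begin
    mem[2] = 8'h11;
    mem[3] = 8'hAA;

    // Cycle 1: read address 2
    @(negedge clk); we = 0; addr = 4'd2;

    // Cycle 2: write 0x55 to address 3
    @(negedge clk);
    $display("read  addr 2: wf=%h rf=%h nc=%h", dout_wf, dout_rf, dout_nc);
    we = 1; addr = 4'd3; din = 8'h55;

    // Cycle 3: read address 3 back
    @(negedge clk);
    $display("write addr 3: wf=%h rf=%h nc=%h", dout_wf, dout_rf, dout_nc);
    we = 0;

    @(negedge clk);
    $display("read  addr 3: wf=%h rf=%h nc=%h", dout_wf, dout_rf, dout_nc);
    $finish;
  end
endmodule
```

After the write cycle, the write-first output shows 55 (the new data), the read-first output shows aa (the old contents of address 3), and the no-change output still shows 11 from the earlier read of address 2. On the following read cycle all three show 55.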

BRAM Inference Tips

Tool	Requirement for BRAM Inference	Attribute to Force
Xilinx Vivado	Synchronous read, array ≥ 1Kbit, no reset on dout	`(* ram_style = "block" *)`
Intel Quartus	Synchronous read, registered output, single clock	`(* ramstyle = "M20K" *)`
Synopsys DC	Memory compiler + rf2gen for ASIC; BRAM not applicable	Instantiate hard macro directly
Cadence Genus	Same as DC — use memory compiler for SRAM macros	Use `// cadence map_to_module`

// Force BRAM inference — Xilinx/Vivado attribute
(* ram_style = "block" *)    // force Block RAM (not distributed)
module sp_ram_bram #(
  parameter DATA_W = 18,
  parameter ADDR_W = 10    // 1024 × 18 = 18 Kbits — fits in one RAMB18
)(
  input  wire              clk,
  input  wire              we,
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  (* ram_style = "block" *)
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Initialize from file (optional — for ROM-like use)
  // initial $readmemh("init.hex", mem);

  always @(posedge clk) begin
    if (we)
      mem[addr] <= din;    // synchronous write
    dout <= mem[addr];     // synchronous read (read-first)
  end
endmodule

FAQ

Can a single port RAM read and write simultaneously?

No — a single port RAM has one address bus shared between read and write. You can only do one operation per cycle. If your design needs simultaneous read and write, use a simple dual-port RAM (separate read address + write address) or a true dual-port RAM.
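
For comparison, a simple dual-port RAM might look like the sketch below. It is not taken from this guide; the module and port names (sdp_ram, waddr, raddr) are assumptions chosen for illustration.

```verilog
// Sketch: simple dual-port RAM, one write port and one read port, same clock.
// Module and port names are illustrative.
module sdp_ram #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clk,
  input  wire              we,
  input  wire [ADDR_W-1:0] waddr,  // write address
  input  wire [DATA_W-1:0] din,
  input  wire [ADDR_W-1:0] raddr,  // read address, independent of waddr
  output reg  [DATA_W-1:0] dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  always @(posedge clk) begin
    if (we)
      mem[waddr] <= din;   // write port
    dout <= mem[raddr];    // read port: may run in the same cycle as a write
  end
endmodule
```

Because the read uses its own address, a read and a write can complete in the same cycle. As written, a same-cycle read of the address being written returns the old value.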

Why does adding a reset to dout break BRAM inference?

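BRAM output registers support only limited reset behavior. On Xilinx devices, for example, the block RAM output can be reset synchronously but not asynchronously, and the memory array itself cannot be reset at all. If your Verilog describes a reset the primitive cannot implement (for example, an asynchronous if (!rst_n) dout <= 0, or a loop that clears mem), the synthesis tool may not be able to map the design to the BRAM primitive and will fall back to flip-flops or LUT-based memory. Solution: remove the reset on dout, check which reset style your target device supports, or accept that the tool will use fabric flip-flops for the output register.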
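
One common workaround, sketched here under assumptions, is to leave dout without a reset and reset a separate valid flag instead, so downstream logic ignores dout until a real read has completed. The module name sp_ram_valid and the dout_valid port are hypothetical.

```verilog
// Sketch: reset-free dout plus a resettable valid flag.
// Module name and dout_valid port are illustrative.
module sp_ram_valid #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clk,
  input  wire              rst_n,      // synchronous, active-low; resets only dout_valid
  input  wire              we,
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout,       // no reset, so it can map to the BRAM output
  output reg               dout_valid  // ordinary flip-flop, safe to reset
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];

  // Memory process: no reset anywhere
  always @(posedge clk) begin
    if (we)
      mem[addr] <= din;
    dout <= mem[addr];
  end

  // Valid flag: high one cycle after a non-write (read) cycle
  always @(posedge clk) begin
    if (!rst_n)
      dout_valid <= 1'b0;
    else
      dout_valid <= !we;
  end
endmodule
```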

What is the latency of a synchronous RAM read?

One clock cycle. You present the address on cycle N, and dout is valid on cycle N+1. This is called "1-cycle read latency" or "registered read." Some BRAMs support a pipeline register that adds a second cycle of latency but allows higher clock frequency — useful for very deep memory arrays where internal propagation is the bottleneck.
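
The optional pipeline register can be described by adding a second register stage after the read. The sketch below is illustrative (sp_ram_pipelined and ram_q are hypothetical names); whether the second stage is absorbed into the BRAM's built-in output register depends on the tool and device.

```verilog
// Sketch: synchronous read with an extra output pipeline stage
// (2-cycle read latency). Names are illustrative.
module sp_ram_pipelined #(
  parameter DATA_W = 8,
  parameter ADDR_W = 8
)(
  input  wire              clk,
  input  wire              we,
  input  wire [ADDR_W-1:0] addr,
  input  wire [DATA_W-1:0] din,
  output reg  [DATA_W-1:0] dout
);
  reg [DATA_W-1:0] mem [0:(2**ADDR_W)-1];
  reg [DATA_W-1:0] ram_q;   // first stage: raw memory read

  always @(posedge clk) begin
    if (we)
      mem[addr] <= din;
    ram_q <= mem[addr];     // valid on cycle N+1
    dout  <= ram_q;         // valid on cycle N+2
  end
endmodule
```

Here an address presented on cycle N produces data on dout at cycle N+2.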