Serial Peripheral Interface
SPI · CPOL/CPHA · Verilog Master
A fast, simple synchronous serial protocol: full-duplex data transfer with no addressing or acknowledge overhead, at clock speeds that can exceed 100 MHz.
SPI Mode 0 (CPOL=0, CPHA=0) — 8-bit full-duplex transfer. MOSI = 0xB4, MISO = 0x4B. Data sampled on each rising SCLK edge (dashed lines). CS active-low throughout.
1. What is SPI?
SPI (Serial Peripheral Interface) is a synchronous serial communication protocol developed by Motorola in the 1980s. It is the go-to protocol for short-distance, high-speed communication between a microcontroller or FPGA and peripheral ICs such as ADCs, DACs, display drivers, NOR Flash, SD cards, and wireless transceivers. For longer-distance or multi-drop buses, compare with I2C or UART.
SPI uses a master-slave architecture. The master always drives the clock. Communication is full-duplex — data moves in both directions simultaneously on every clock cycle. There is no addressing scheme; instead, the master asserts an active-low CS (Chip Select) line to select the intended slave.
Clock frequencies typically range from 1 MHz to 100 MHz. High-speed variants (Quad-SPI, Octal-SPI) add extra data lines to multiply throughput, and some run beyond 200 MHz. Standard SPI uses four wires: SCLK, MOSI, MISO, and CS.
SPI is part of the broader Serial Protocols family. The protocol has no formal standard body — each device datasheet defines exact timing, frame width, and mode requirements — but the core four-wire interface is universally understood.
2. The Four SPI Signals
| Signal | Full Name | Direction | Description |
|---|---|---|---|
| SCLK | Serial Clock | Master → All Slaves | Clock generated exclusively by the master. Frequency sets the transfer rate. Slaves never generate SCLK; their SPI logic is clocked entirely by the master. |
| MOSI | Master Out Slave In | Master → Active Slave | Serial data driven by the master to the selected slave. In daisy-chain mode, only the first slave receives the master's MOSI; each later slave takes its input from the previous slave's data output. Labeled SDO on the master side and SDI (or DIN) on the slave side in many datasheets; newer nomenclature uses COPI. |
| MISO | Master In Slave Out | Active Slave → Master | Serial data driven by the selected slave back to the master. When CS is deasserted (high), the slave tristates MISO to avoid bus conflicts. Labeled SDO (or DOUT) on the slave side in many datasheets; newer nomenclature uses CIPO. |
| CS / SS | Chip Select / Slave Select | Master → Individual Slave | Active-low enable. The master asserts CS low before the first SCLK edge and deasserts it high after the last bit. Each slave has its own dedicated CS line. A slave ignores SCLK and MOSI when its CS is high. |
Some devices, SPI flash in particular, add extra pins such as WP (Write Protect, active-low) and HOLD (pause a transfer without deasserting CS). Quad-SPI devices add IO2 and IO3 to send 4 bits per clock (4× throughput). These are extensions, not part of base SPI.
3. CPOL and CPHA — The Four Modes
SPI defines two configuration bits that control when data is sampled relative to the clock: CPOL (Clock Polarity) and CPHA (Clock Phase). These yield four distinct modes. Master and slave must use the same mode — mismatched modes are a common source of SPI bugs.
| Mode | CPOL | CPHA | Clock Idle State | Data Sampled On | Data Shifted On | Common Devices |
|---|---|---|---|---|---|---|
| Mode 0 | 0 | 0 | Low | Rising edge | Falling edge | Most ADCs (e.g., MCP3208), SD cards (SPI mode), nRF24L01 |
| Mode 1 | 0 | 1 | Low | Falling edge | Rising edge | Some delta-sigma ADCs (e.g., ADS1118) |
| Mode 2 | 1 | 0 | High | Falling edge | Rising edge | Some SD/MMC controllers |
| Mode 3 | 1 | 1 | High | Rising edge | Falling edge | ADXL345 accelerometer; W25Q NOR flash, ILI9341 display, and MCP4921 DAC (these three also accept Mode 0) |
How to Remember CPHA
CPHA=0: the first bit must be valid before the first clock edge. It is driven as soon as CS is asserted (the master drives MOSI, the slave drives MISO) and sampled on the first (leading) clock edge.
CPHA=1: data is set up on the first (leading) edge and sampled on the second (trailing) edge, so each bit gets roughly half a clock period of setup time.
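The mode table reduces to one rule: data is sampled on the rising edge exactly when CPOL equals CPHA. The sketch below expresses that as combinational logic. The module name `spi_mode_decode` and its ports are hypothetical and are not used elsewhere in this article.

```verilog
// Hypothetical helper: decodes a 2-bit SPI mode number into clock settings.
module spi_mode_decode (
    input  wire [1:0] mode,           // SPI mode number, 0..3
    output wire       cpol,           // clock idle level
    output wire       cpha,           // 0: sample on leading edge, 1: trailing
    output wire       sample_on_rise  // 1: data is sampled on rising SCLK
);
    // The mode number is the two-bit value {CPOL, CPHA}.
    assign cpol = mode[1];
    assign cpha = mode[0];

    // The leading edge is rising when CPOL=0 and falling when CPOL=1,
    // so the sampling edge is rising exactly when CPOL and CPHA match.
    assign sample_on_rise = (cpol == cpha);
endmodule
```

Modes 0 and 3 give `sample_on_rise = 1`, and Modes 1 and 2 give 0, which matches the "Data Sampled On" column.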
4. SPI Frame Format
A standard SPI byte transfer (Mode 0, 8-bit MSB-first) proceeds as follows:
- Master asserts CS low — selects the slave. SCLK remains low (idle).
- Master loads bit D7 (MSB) onto MOSI before the first rising edge.
- On each rising SCLK edge: the slave samples the current bit on MOSI and the master samples the current bit on MISO.
- On each falling SCLK edge: both shift the next bit onto the line.
- After 8 rising edges, the full byte has been transferred in both directions simultaneously.
- Master deasserts CS high. Slave tristates MISO.
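The steps above can be walked through in a small behavioral testbench. This is a simulation-only sketch, not synthesizable, and the module name `tb_mode0_frame` is hypothetical. It reuses the byte values from the figure caption (MOSI = 0xB4, MISO = 0x4B) and models both ends of the link inside one `initial` block.

```verilog
`timescale 1ns/1ps
// Simulation-only sketch of one Mode 0, MSB-first byte transfer.
module tb_mode0_frame;
    reg sclk = 1'b0;             // Mode 0: clock idles low
    reg cs_n = 1'b1;             // CS idles high (deasserted)
    reg mosi = 1'b0;
    reg miso = 1'b0;
    reg [7:0] master_tx = 8'hB4; // byte the master sends
    reg [7:0] slave_tx  = 8'h4B; // byte the modeled slave returns
    reg [7:0] master_rx, slave_rx;
    integer i;

    initial begin
        #50 cs_n = 1'b0;                     // assert CS, SCLK still low
        for (i = 7; i >= 0; i = i - 1) begin
            mosi = master_tx[i];             // bit is on the line before the rising edge
            miso = slave_tx[i];
            #50 sclk = 1'b1;                 // rising edge: both sides sample
            master_rx = {master_rx[6:0], miso};
            slave_rx  = {slave_rx[6:0],  mosi};
            #50 sclk = 1'b0;                 // falling edge: next bit is shifted out
        end
        #50 cs_n = 1'b1;                     // deassert CS
        $display("master_rx=%h slave_rx=%h", master_rx, slave_rx);
        // prints: master_rx=4b slave_rx=b4
        $finish;
    end
endmodule
```

After eight rising edges each side holds the other's byte, which is the full-duplex exchange described in the list.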
5. Multi-Slave Configurations
SPI supports two physical topologies for connecting multiple slaves to one master.
Option A — Independent SS Lines (Most Common)
Each slave has a dedicated CS line. SCLK, MOSI, and MISO are shared on a common bus. Only the slave whose CS is asserted will drive MISO; all others tristate their MISO pins.
Master
 ├─ SCLK ──────┬────────┬────────┐
 ├─ MOSI ──────┼────────┼────────┤
 ├─ MISO ──────┼────────┼────────┤   (each slave tristates when CS high)
 ├─ CS0  ──────┤        │        │
 ├─ CS1  ──────┼────────┤        │
 └─ CS2  ──────┼────────┼────────┘
            Slave0   Slave1   Slave2
Advantage: Each slave can be accessed at any time, and the maximum speed per slave is not limited by chain length. Limitation: N slaves need N CS pins on the master.
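One way to generate the per-slave CS lines is a small decoder. The following is a minimal sketch, assuming at least two slaves. The module name `spi_cs_decode` and the `active` and `sel` ports are hypothetical.

```verilog
// Hypothetical CS decoder for the independent-SS topology.
// Assumes N_SLAVES >= 2 so that $clog2(N_SLAVES) is at least 1.
module spi_cs_decode #(
    parameter N_SLAVES = 3
)(
    input  wire                        active, // high while a transaction is in progress
    input  wire [$clog2(N_SLAVES)-1:0] sel,    // index of the target slave
    output wire [N_SLAVES-1:0]         cs_n    // one active-low CS per slave
);
    genvar i;
    generate
        for (i = 0; i < N_SLAVES; i = i + 1) begin : g_cs
            // Line i goes low only while a transaction targets slave i,
            // so at most one slave is ever enabled to drive MISO.
            assign cs_n[i] = !(active && (sel == i));
        end
    endgenerate
endmodule
```

With `active` low, every CS line is high and all slaves release MISO.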
Option B — Daisy-Chain (Shift-Register Chain)
All slaves share a single CS line. MISO of each slave feeds MOSI of the next. The master must clock N × (frame width) bits to reach the last device. Used in 74HC595 shift registers and LED driver chains.
Master (single CS for all)
 ├─ SCLK ──────────┬─────────────┬──────────────┐
 ├─ CS   ──────────┼─────────────┼──────────────┤
 ├─ MOSI ──► Slave0 ─►(MISO→MOSI)─► Slave1 ──►(MISO→MOSI)─► Slave2
 └─ MISO ◄─────────────────────────────────────── Slave2 MISO
Advantage: Only one CS line regardless of chain length. Limitation: Cannot address individual slaves; all shift together. Latency grows as O(N). Only works if all slaves support daisy-chain (SDO→SDI passthrough).
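Electrically, a daisy chain behaves like one long shift register. The behavioral sketch below models N devices of WIDTH bits each under that assumption, in the style of a 74HC595 chain that shifts on the rising SCLK edge. The module name `spi_daisy_chain` and the `latched` output are hypothetical.

```verilog
// Hypothetical behavioral model of N daisy-chained shift-register slaves.
// Device k holds chain[(k+1)*WIDTH-1 : k*WIDTH]; device 0 is nearest the master.
module spi_daisy_chain #(
    parameter N     = 3,  // number of devices in the chain
    parameter WIDTH = 8   // frame width per device
)(
    input  wire               sclk,
    input  wire               cs_n,
    input  wire               mosi,    // from the master into device 0
    output wire               miso,    // from the last device back to the master
    output reg  [N*WIDTH-1:0] latched  // register contents applied when CS rises
);
    reg [N*WIDTH-1:0] chain;

    // Every device shifts one bit per SCLK while the shared CS is low.
    always @(posedge sclk)
        if (!cs_n)
            chain <= {chain[N*WIDTH-2:0], mosi};

    // The output of the last device feeds the master's MISO.
    assign miso = chain[N*WIDTH-1];

    // All devices latch together when the master deasserts CS.
    always @(posedge cs_n)
        latched <= chain;
endmodule
```

After N × WIDTH clocks, the first frame the master sent sits in the last device, so frames are sent in reverse device order.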
6. Verilog SPI Master
The following is a complete, synthesizable, parameterized SPI master supporting all four modes via CPOL and CPHA parameters. A configurable clock divider generates SCLK from the system clock. The FSM cycles: IDLE → ASSERT_CS → TRANSFER → DONE → IDLE.
// ─────────────────────────────────────────────────────
// SPI Master — parameterized, all 4 modes, CLK_DIV
// sclk_freq = clk / (2 * CLK_DIV)
// ─────────────────────────────────────────────────────
module spi_master #(
parameter CLK_DIV = 4, // sclk = clk / (2*CLK_DIV)
parameter CPOL = 0, // 0: idle low, 1: idle high
parameter CPHA = 0 // 0: sample leading, 1: sample trailing
)(
input clk, rst_n,
input start,
input [7:0] tx_data,
output reg [7:0] rx_data,
output reg done,
// SPI pins
output reg sclk,
output reg mosi,
input miso,
output reg cs_n
);
// FSM states
localparam IDLE = 2'd0,
ASSERT_CS = 2'd1,
TRANSFER = 2'd2,
DONE = 2'd3;
reg [1:0] state;
reg [7:0] tx_shift, rx_shift;
reg [3:0] bit_cnt; // counts sampled bits, 0..8
reg [$clog2(CLK_DIV)-1:0] div_cnt;
reg clk_en; // sclk toggle strobe
reg sclk_r; // internal sclk state
// ── Clock divider ────────────────────────────────
always @(posedge clk or negedge rst_n) begin
if (!rst_n) begin
div_cnt <= 0;
clk_en <= 1'b0;
end else begin
if (div_cnt == CLK_DIV - 1) begin // strobe once every CLK_DIV system clocks
div_cnt <= 0;
clk_en <= 1'b1;
end else begin
div_cnt <= div_cnt + 1;
clk_en <= 1'b0;
end
end
end
// ── Main FSM ─────────────────────────────────────
always @(posedge clk or negedge rst_n) begin
if (!rst_n) begin
state <= IDLE;
cs_n <= 1'b1;
sclk_r <= CPOL[0]; // idle state
sclk <= CPOL[0];
mosi <= 1'b0;
done <= 1'b0;
bit_cnt <= 0;
tx_shift <= 0;
rx_shift <= 0;
rx_data <= 0;
end else begin
done <= 1'b0; // pulse for one cycle only
case (state)
// ── IDLE: wait for start pulse ──────────────
IDLE: begin
cs_n <= 1'b1;
sclk_r <= CPOL[0];
sclk <= CPOL[0];
if (start) begin
tx_shift <= tx_data;
bit_cnt <= 0;
state <= ASSERT_CS;
end
end
// ── ASSERT_CS: pull CS low, wait one div ───
ASSERT_CS: begin
cs_n <= 1'b0;
// For CPHA=0, put MSB on MOSI now (before first edge)
if (CPHA == 0)
mosi <= tx_shift[7];
if (clk_en)
state <= TRANSFER;
end
// ── TRANSFER: 8 bits, 16 half-cycles ───────
TRANSFER: begin
if (clk_en) begin
sclk_r <= ~sclk_r;
sclk <= ~sclk_r; // sclk mirrors sclk_r (both load the same value)
if (sclk_r == CPOL[0]) begin
// SCLK is at its idle level, so this toggle is the leading edge
if (CPHA == 0) begin
// CPHA=0: sample on the leading edge
rx_shift <= {rx_shift[6:0], miso};
bit_cnt <= bit_cnt + 1;
end else begin
// CPHA=1: shift out on the leading edge
mosi <= tx_shift[7 - bit_cnt[2:0]];
end
end else begin
// This toggle is the trailing edge (back to the idle level)
if (CPHA == 0) begin
// CPHA=0: shift out the next bit, or finish after the 8th bit
if (bit_cnt == 8)
state <= DONE;
else
mosi <= tx_shift[7 - bit_cnt[2:0]];
end else begin
// CPHA=1: sample on the trailing edge; the 8th sample ends the frame
rx_shift <= {rx_shift[6:0], miso};
bit_cnt <= bit_cnt + 1;
if (bit_cnt == 7)
state <= DONE;
end
end
end
end
// ── DONE: deassert CS, capture rx_data ─────
DONE: begin
cs_n <= 1'b1;
rx_data <= rx_shift;
done <= 1'b1;
sclk_r <= CPOL[0];
sclk <= CPOL[0];
mosi <= 1'b0;
state <= IDLE;
end
default: state <= IDLE;
endcase
end
end
endmodule
SPI Master — FSM State Diagram
With CLK_DIV=4 and a 100 MHz system clock, the SCLK frequency is 100 MHz / (2×4) = 12.5 MHz. Set CLK_DIV=1 for the fastest possible SCLK (clk/2). Increase CLK_DIV for slower, more robust operation with long PCB traces or slow slaves.
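Going the other way, from a slave's maximum SCLK to a divider value, takes a ceiling division so the generated clock never exceeds the limit. The sketch below works through that arithmetic at elaboration time. The module name `spi_clk_div_calc` and the constants `CLK_HZ` and `SCLK_MAX_HZ` are hypothetical, and the 10 MHz target is only an example value.

```verilog
// Hypothetical elaboration-time calculation of CLK_DIV.
module spi_clk_div_calc;
    localparam integer CLK_HZ      = 100_000_000; // system clock (assumed)
    localparam integer SCLK_MAX_HZ = 10_000_000;  // slave's SCLK limit (assumed)

    // Ceiling of CLK_HZ / (2 * SCLK_MAX_HZ): rounding up keeps SCLK at or
    // below the limit when the ratio is not an integer.
    localparam integer CLK_DIV =
        (CLK_HZ + 2*SCLK_MAX_HZ - 1) / (2*SCLK_MAX_HZ);

    // Resulting SCLK, from sclk_freq = clk / (2 * CLK_DIV).
    localparam integer SCLK_HZ = CLK_HZ / (2*CLK_DIV);

    initial $display("CLK_DIV=%0d SCLK_HZ=%0d", CLK_DIV, SCLK_HZ);
    // prints: CLK_DIV=5 SCLK_HZ=10000000
endmodule
```

The computed value could then be passed to the master as `spi_master #(.CLK_DIV(5))`.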
7. Verilog SPI Slave
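The SPI slave receives SCLK and MOSI from the master, samples MOSI on the correct clock edge, and drives MISO from its own TX shift register. Because the slave is entirely clock-slaved, it needs no clock divider. The implementation below does not clock its logic from SCLK directly: it oversamples SCLK with the system clock and reacts to the detected edges, so the system clock must run several times faster than SCLK.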
// ─────────────────────────────────────────────────────
// SPI Slave — Mode 0 (CPOL=0, CPHA=0)
// Samples MOSI on rising SCLK, drives MISO MSB-first
// ─────────────────────────────────────────────────────
module spi_slave (
input clk, // system clock (for output staging)
input rst_n,
// SPI pins
input sclk,
input mosi,
output reg miso,
input cs_n,
// Application interface
input [7:0] tx_data, // byte to send back to master
output reg [7:0] rx_data, // byte received from master
output reg rx_valid // pulses high for one sys clk after full byte
);
reg [7:0] rx_shift, tx_shift;
reg [2:0] bit_cnt;
// ── Edge detection on SPI clock ─────────────────
reg sclk_d;
wire sclk_rising = (!sclk_d) && sclk;
wire sclk_falling = sclk_d && (!sclk);
always @(posedge clk or negedge rst_n)
if (!rst_n) sclk_d <= 1'b0;
else sclk_d <= sclk;
// ── Shift logic: preload TX while CS is high, shift while CS is low ──
always @(posedge clk or negedge rst_n) begin
if (!rst_n) begin
tx_shift <= 8'h00;
rx_shift <= 8'h00;
rx_data <= 8'h00;
rx_valid <= 1'b0;
bit_cnt <= 3'd7; // MSB first
miso <= 1'b0;
end else begin
rx_valid <= 1'b0;
if (cs_n) begin
// CS deasserted: reset counters, pre-load TX
bit_cnt <= 3'd7;
tx_shift <= tx_data;
miso <= 1'b0; // tristate (driven low when idle)
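end else begin
// Rising SCLK: sample MOSI (Mode 0)
if (sclk_rising) begin
rx_shift <= {rx_shift[6:0], mosi};
if (bit_cnt == 3'd0) begin
rx_data <= {rx_shift[6:0], mosi}; // latch full byte
rx_valid <= 1'b1;
end
end
// Falling SCLK: shift out the next MISO bit (Mode 0)
if (sclk_falling) begin
if (bit_cnt != 3'd0) begin
miso <= tx_shift[bit_cnt - 3'd1];
bit_cnt <= bit_cnt - 3'd1;
end else begin
miso <= tx_shift[7]; // wrap for continuous transfer
bit_cnt <= 3'd7;
end
end else if (bit_cnt == 3'd7) begin
// Before the first falling edge: present the MSB as soon as CS is low,
// so the master's first rising edge samples tx_data[7]
miso <= tx_shift[7];
end
end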
end
end
endmodule
A real slave tristates MISO (assign miso = cs_n ? 1'bz : miso_r;) when CS is high, so the MISO line is free for other slaves. The Verilog above drives low instead of Z for simulation clarity; add tristate logic for a multi-slave FPGA/ASIC implementation.
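One way to add the tristate without touching the slave itself is a thin wrapper around the `spi_slave` module from this section. The wrapper name `spi_slave_bus` and the internal net `miso_r` are hypothetical. On most FPGAs a tristate driver like this is only available at a top-level I/O pin, so treat this as a sketch of the idea.

```verilog
// Hypothetical wrapper: adds a tristate MISO to the spi_slave module above.
module spi_slave_bus (
    input  wire       clk,
    input  wire       rst_n,
    input  wire       sclk,
    input  wire       mosi,
    output wire       miso,     // shared MISO bus pin
    input  wire       cs_n,
    input  wire [7:0] tx_data,
    output wire [7:0] rx_data,
    output wire       rx_valid
);
    wire miso_r;                // always-driven MISO from the slave core

    spi_slave u_slave (
        .clk(clk),         .rst_n(rst_n),
        .sclk(sclk),       .mosi(mosi),
        .miso(miso_r),     .cs_n(cs_n),
        .tx_data(tx_data), .rx_data(rx_data),
        .rx_valid(rx_valid)
    );

    // Release the bus whenever this slave is not selected.
    assign miso = cs_n ? 1'bz : miso_r;
endmodule
```

Several such wrappers can then share one MISO net, provided the master asserts only one CS at a time.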
8. Applications Table
| Device / IC | Category | SPI Mode | Max Clock | Notes |
|---|---|---|---|---|
| SD / microSD Card | Storage | Mode 0 | 25 MHz | SPI mode is a compatibility fallback. SDIO is faster for production. CS=DAT3, CMD=MOSI, DAT0=MISO. |
| W25Q128 NOR Flash | Storage | Mode 0 / Mode 3 | 104 MHz (STD), 133 MHz (QPI) | Winbond W25Q series — industry-standard SPI NOR. Supports Dual and Quad SPI for 2× / 4× throughput. |
| SSD1306 OLED | Display | Mode 0 | 10 MHz | 128×64 monochrome OLED. Uses extra D/C (Data/Command) pin alongside SPI. Also available in I2C. |
| ILI9341 TFT LCD | Display | Mode 0 / Mode 3 | 42 MHz (write) | 240×320 color TFT. SPI pixel writes common on microcontrollers. Parallel bus used for higher refresh rates. |
| ADS1118 ADC | ADC | Mode 1 | 4 MHz | 16-bit, 860 SPS. SPI counterpart of the ADS1115, which is I2C-only. Other SPI ADCs such as the MCP3208 follow the same pattern but use Mode 0 / Mode 3. |
| MCP4921 DAC | DAC | Mode 0 / Mode 3 | 20 MHz | 12-bit single-channel DAC from Microchip. 16-bit SPI frame: 4 config bits + 12 data bits. LDAC pin latches output. |
| nRF24L01+ Radio | Wireless | Mode 0 | 10 MHz | 2.4 GHz ISM band transceiver. SPI used for register config and FIFO access. IRQ pin signals events. |
| ENC28J60 Ethernet | Networking | Mode 0 | 20 MHz | 10BASE-T MAC/PHY with SPI interface. Widely used in embedded Ethernet. Buffer managed via SPI register commands. |
Frequently Asked Questions
What is SPI and how does it work?
SPI is a synchronous, full-duplex, master-slave serial protocol. The master generates SCLK and uses one CS line per slave. Data moves simultaneously on MOSI (master to slave) and MISO (slave to master) on every clock cycle. 8 SCLK pulses transfer one byte in both directions with no ACK overhead. Speeds reach 10–100+ MHz depending on hardware.
What are CPOL and CPHA in SPI?
CPOL sets the clock idle state (0 = idle low, 1 = idle high). CPHA sets which edge data is sampled on (0 = leading edge, 1 = trailing edge). The four combinations give four SPI modes. Mode 0 (idle low, sample rising) and Mode 3 (idle high, sample rising) are most common. Always match master and slave to the same mode: a mismatch typically produces bit-shifted or corrupted bytes.
How do you connect multiple slaves on SPI?
Use independent SS lines (one CS GPIO per slave) for independent access at full speed, or daisy-chain for shift-register chains with a single CS. Independent SS is strongly preferred for mixed-device buses. Ensure only one slave drives MISO at a time — assert one CS, complete the transaction, then assert the next. Add a brief dead-time (1–2 SCLK cycles) between deassert and reassert.
What is the maximum SPI clock speed?
There is no protocol-mandated maximum. Microcontrollers typically reach 1–50 MHz. SPI NOR flash supports up to 104 MHz (standard) or 133 MHz (QPI). FPGA-based masters can exceed 200 MHz. Above 50 MHz, PCB layout matters: keep traces short and impedance-matched (50 Ω), use series termination (22–33 Ω) at the driver, and avoid stubs or vias in the signal path.