Serial Peripheral Interface
SPI · CPOL/CPHA · Verilog Master

One of the fastest and simplest synchronous serial protocols: full-duplex data transfer with no addressing, no acknowledgement overhead, and clock speeds that can exceed 100 MHz.

SPI Mode 0 (CPOL=0, CPHA=0) — 8-bit full-duplex transfer. MOSI = 0xB4, MISO = 0x4B. Data sampled on each rising SCLK edge (dashed lines). CS active-low throughout.

Table of Contents

  1. What is SPI?
  2. The Four SPI Signals
  3. CPOL and CPHA — The Four Modes
  4. SPI Frame Format
  5. Multi-Slave Configurations
  6. Verilog SPI Master
  7. Verilog SPI Slave
  8. Applications Table

1. What is SPI?

SPI (Serial Peripheral Interface) is a synchronous serial communication protocol developed by Motorola in the 1980s. It is the go-to protocol for short-distance, high-speed communication between a microcontroller or FPGA and peripheral ICs such as ADCs, DACs, display drivers, NOR Flash, SD cards, and wireless transceivers. For longer-distance or multi-drop buses, compare with I2C or UART.

SPI uses a master-slave architecture. The master always drives the clock. Communication is full-duplex — data moves in both directions simultaneously on every clock cycle. There is no addressing scheme; instead, the master asserts an active-low CS (Chip Select) line to select the intended slave.

Clock frequencies typically range from 1 MHz to 100 MHz. High-throughput variants (Quad-SPI, Octal-SPI) add extra data lines to move 4 or 8 bits per clock, and some of these parts clock at 200 MHz or more. Standard SPI uses four wires: SCLK, MOSI, MISO, and CS.

Why choose SPI over I2C? SPI is faster (no start/stop bits, no ACK, no address phase), truly full-duplex, and simpler to implement in RTL. I2C wins when pin count is tight or when many devices share one bus with 7-bit addressing. SPI wins for ADCs, displays, and flash memories where throughput is paramount.

SPI is part of the broader Serial Protocols family. The protocol has no formal standard body — each device datasheet defines exact timing, frame width, and mode requirements — but the core four-wire interface is universally understood.

2. The Four SPI Signals

| Signal | Full Name | Direction | Description |
| --- | --- | --- | --- |
| SCLK | Serial Clock | Master → All Slaves | Clock generated exclusively by the master. Its frequency sets the transfer rate. Slaves never generate SCLK; all of their SPI shifting is timed by the master's clock. |
| MOSI | Master Out Slave In | Master → Active Slave | Serial data driven by the master to the selected slave. In daisy-chain mode only the first slave is fed from MOSI; each later slave takes its input from the previous slave's data output. Peripheral datasheets often label this pin SDI or DIN; newer nomenclature calls the line COPI. |
| MISO | Master In Slave Out | Active Slave → Master | Serial data driven by the selected slave back to the master. When CS is deasserted (high), the slave tristates MISO to avoid bus conflicts. Peripheral datasheets often label this pin SDO or DOUT; newer nomenclature calls the line CIPO. |
| CS / SS | Chip Select / Slave Select | Master → Individual Slave | Active-low enable. The master asserts CS low before the first SCLK edge and deasserts it high after the last bit. Each slave has its own dedicated CS line. A slave ignores SCLK and MOSI when its CS is high. |

Optional pins on some devices: SPI NOR flash (e.g. W25Q128) adds WP (Write Protect, active-low) and HOLD (pause transfer without deasserting CS). Quad-SPI devices add IO2 and IO3 to send 4 bits per clock (x4 throughput). These are extensions, not part of base SPI.

3. CPOL and CPHA — The Four Modes

SPI defines two configuration bits that control when data is sampled relative to the clock: CPOL (Clock Polarity) and CPHA (Clock Phase). These yield four distinct modes. Master and slave must use the same mode — mismatched modes are a common source of SPI bugs.

| Mode | CPOL | CPHA | Clock Idle State | Data Sampled On | Data Shifted On | Common Devices |
| --- | --- | --- | --- | --- | --- | --- |
| Mode 0 | 0 | 0 | Low | Rising edge | Falling edge | Most ADCs, SD cards (SPI mode), nRF24L01 |
| Mode 1 | 0 | 1 | Low | Falling edge | Rising edge | Less common; some ADC and sensor ICs |
| Mode 2 | 1 | 0 | High | Falling edge | Rising edge | Some SD/MMC controllers |
| Mode 3 | 1 | 1 | High | Rising edge | Falling edge | ADXL345 accelerometer; W25Q NOR flash, ILI9341 display, and MCP4921 DAC (these three also accept Mode 0) |

Rule of thumb: Check your slave's datasheet for "SPI Mode" or the CPOL/CPHA table. Mode 0 and Mode 3 are by far the most common, and many peripherals accept either. Both sample on the rising edge and shift on the falling edge; the only difference is the clock idle state (low for Mode 0, high for Mode 3).

How to Remember CPHA

CPHA=0: data is valid before the first edge. Each side drives its first bit as soon as CS goes low, and that bit is sampled on the first (leading) clock edge.
CPHA=1: data is set up on the first edge and sampled on the second (trailing) edge — one half clock period of setup time is guaranteed by the protocol.
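
The mode table and the CPHA rule can be condensed into a few lines of logic. The following is a minimal sketch, not part of the original design: the module name `spi_mode_decode` and its port names are hypothetical. It relies only on the fact that the leading edge is rising when CPOL=0 and falling when CPOL=1.

```verilog
// Sketch: derive the edge roles from CPOL/CPHA.
// Module and port names are hypothetical, chosen for illustration.
module spi_mode_decode #(
  parameter CPOL = 0,
  parameter CPHA = 0
)(
  output wire idle_level,        // SCLK level while CS is high
  output wire sample_on_rising,  // 1: sample on rising SCLK, 0: on falling
  output wire shift_on_rising    // 1: change data on rising SCLK
);
  assign idle_level = (CPOL != 0);

  // CPHA=0 samples on the leading edge, CPHA=1 on the trailing edge.
  // The leading edge is rising for CPOL=0 and falling for CPOL=1,
  // so sampling lands on a rising edge exactly when CPOL == CPHA.
  assign sample_on_rising = ((CPOL != 0) == (CPHA != 0));

  // Data always changes on the opposite edge from the sampling edge.
  assign shift_on_rising  = !sample_on_rising;
endmodule
```

Instantiated with CPOL=0, CPHA=0 or CPOL=1, CPHA=1 this reports rising-edge sampling (Modes 0 and 3); the other two combinations report falling-edge sampling (Modes 1 and 2), matching the table.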

4. SPI Frame Format

A standard SPI byte transfer (Mode 0, 8-bit MSB-first) proceeds as follows:

  1. Master asserts CS low — selects the slave. SCLK remains low (idle).
  2. Master loads bit D7 (MSB) onto MOSI before the first rising edge.
  3. On each rising SCLK edge: both sides sample the current bit (the slave samples MOSI, the master samples MISO).
  4. On each falling SCLK edge: both shift the next bit onto the line.
  5. After 8 rising edges, the full byte has been transferred in both directions simultaneously.
  6. Master deasserts CS high. Slave tristates MISO.

No ACK, no error detection: SPI has no built-in acknowledgement or CRC. If the slave is busy, absent, or powered off, the master will not know; it just clocks out bits into the void. Higher-level application protocols (e.g. SD card command responses, flash status registers) implement their own status/busy signaling within the data bytes.

Frame width: SPI frames are not restricted to 8 bits. Many controllers support 4, 8, 16, or 32-bit transfer widths per CS assertion. Multi-byte transfers simply keep CS low and clock continuously; the slave buffers all incoming bytes in its internal shift register or FIFO.
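
The six-step Mode 0 sequence in this section can be walked through in simulation. The sketch below is a simulation-only illustration under stated assumptions: the testbench name, the task `spi_xfer`, the 100 ns SCLK period, and the stand-in slave (which simply returns each MOSI bit inverted) are all hypothetical and are not taken from a real device.

```verilog
`timescale 1ns/1ps
// Simulation-only sketch of one Mode 0 byte transfer.
// All names and delays are hypothetical.
module spi_mode0_frame_tb;
  reg  cs_n = 1'b1, sclk = 1'b0, mosi = 1'b0;
  wire miso;
  reg  [7:0] rx;

  // Stand-in slave: drives the inverted MOSI bit while selected,
  // releases the line (high-Z) while CS is high.
  assign miso = cs_n ? 1'bz : ~mosi;

  task spi_xfer(input [7:0] tx);
    integer i;
    begin
      cs_n = 1'b0;                  // step 1: select the slave, SCLK idle low
      for (i = 7; i >= 0; i = i - 1) begin
        mosi = tx[i];               // steps 2 and 4: present the bit, MSB first
        #50 sclk = 1'b1;            // step 3: rising edge, data is sampled
        rx = {rx[6:0], miso};       //         master shifts in MISO
        #50 sclk = 1'b0;            // falling edge, next bit may change
      end
      #50 cs_n = 1'b1;              // step 6: deselect, slave releases MISO
    end
  endtask

  initial begin
    spi_xfer(8'hB4);
    $display("sent 0xB4, received 0x%h", rx);
    $finish;
  end
endmodule
```

Because the stand-in slave inverts every bit, sending 0xB4 returns its complement, 0x4B, which is the MOSI/MISO pair shown in the Mode 0 timing figure.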

5. Multi-Slave Configurations

SPI supports two physical topologies for connecting multiple slaves to one master.

Option A — Independent SS Lines (Most Common)

Each slave has a dedicated CS line. SCLK, MOSI, and MISO are shared on a common bus. Only the slave whose CS is asserted will drive MISO; all others tristate their MISO pins.

 Master
  ├─ SCLK ──────┬──────┬──────┐
  ├─ MOSI ──────┼──────┼──────┤
  ├─ MISO ──────┼──────┼──────┤ (each slave tristates when CS high)
  ├─ CS0  ──────┤      │      │
  ├─ CS1  ──────┼──────┤      │
  └─ CS2  ──────┼──────┼──────┘
              Slave0  Slave1  Slave2

Advantage: Each slave can be accessed at any time, and the maximum speed per slave is not limited by chain length. Cost: N slaves need N CS pins on the master.
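
On an FPGA master, the independent CS lines are often generated from a slave index rather than driven one by one. The following is a minimal sketch of that idea; the module name `spi_cs_decode` and the `active`/`sel` ports are hypothetical, and it assumes at least two slaves so that the index has a non-zero width.

```verilog
// Sketch: chip-select decoder for independent CS lines.
// Names are hypothetical; assumes N_SLAVES >= 2.
module spi_cs_decode #(
  parameter N_SLAVES = 3
)(
  input  wire                        active, // high while a transaction runs
  input  wire [$clog2(N_SLAVES)-1:0] sel,    // index of the target slave
  output reg  [N_SLAVES-1:0]         cs_n    // one active-low line per slave
);
  always @* begin
    cs_n = {N_SLAVES{1'b1}};        // default: every slave deselected
    if (active && sel < N_SLAVES)
      cs_n[sel] = 1'b0;             // at most one line can be low
  end
endmodule
```

Decoding from a single index makes it structurally impossible for two CS lines to be low at once. The decoder does not insert any gap between transactions; dropping `active` for a few cycles before changing `sel` is left to the controlling logic.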

Option B — Daisy-Chain (Shift-Register Chain)

All slaves share a single CS line. MISO of each slave feeds MOSI of the next. The master must clock N × (frame width) bits to reach the last device. Used in 74HC595 shift registers and LED driver chains.

 Master                                             (single CS for all)
  ├─ SCLK ──────────┬─────────────┬──────────────┐
  ├─ CS    ──────────┼─────────────┼──────────────┤
  ├─ MOSI ──► Slave0 ─►(MISO→MOSI)─► Slave1 ──►(MISO→MOSI)─► Slave2
  └─ MISO ◄─────────────────────────────────────── Slave2 MISO

Advantage: Only one CS line regardless of chain length. Limitation: Cannot address individual slaves; all shift together. Latency grows as O(N). Only works if all slaves support daisy-chain (SDO→SDI passthrough).
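
A daisy chain behaves like one long shift register split across devices. The sketch below models that behaviour with a generic 8-bit stage, loosely in the spirit of a 74HC595 but not a pin-accurate model of it; the module names `chain_stage` and `chain3` and all port names are hypothetical.

```verilog
// Sketch: one 8-bit daisy-chain stage (hypothetical names, generic model).
module chain_stage (
  input  wire       sclk,
  input  wire       cs_n,     // shared by every stage in the chain
  input  wire       sdi,      // from the master or the previous stage
  output wire       sdo,      // to the next stage or back to the master
  output reg  [7:0] latched   // parallel outputs, updated when CS rises
);
  reg [7:0] shift = 8'h00;

  assign sdo = shift[7];                     // oldest bit leaves first

  always @(posedge sclk)
    if (!cs_n) shift <= {shift[6:0], sdi};   // shift only while selected

  always @(posedge cs_n)
    latched <= shift;                        // all stages latch together
endmodule

// Three stages in series: the master clocks 3 x 8 = 24 bits per update.
module chain3 (
  input  wire        sclk, cs_n, mosi,
  output wire        miso,
  output wire [23:0] outputs
);
  wire s0_to_s1, s1_to_s2;

  chain_stage u0 (.sclk(sclk), .cs_n(cs_n), .sdi(mosi),
                  .sdo(s0_to_s1), .latched(outputs[7:0]));
  chain_stage u1 (.sclk(sclk), .cs_n(cs_n), .sdi(s0_to_s1),
                  .sdo(s1_to_s2), .latched(outputs[15:8]));
  chain_stage u2 (.sclk(sclk), .cs_n(cs_n), .sdi(s1_to_s2),
                  .sdo(miso),     .latched(outputs[23:16]));
endmodule
```

In this model the first byte the master sends ends up in the last stage (`u2`), so a 24-bit word shifted MSB-first appears unchanged on `outputs` after CS rises. Updating any one stage still costs a full 24-clock frame, which is the O(N) latency noted above.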

MISO bus conflict: In independent SS mode, only one slave may drive MISO at a time. If two CS lines are accidentally asserted simultaneously, two slaves fight for MISO. Always ensure CS deasserts the previous slave before asserting the next, and add a small dead-time between CS edges.

6. Verilog SPI Master

The following is a complete, synthesizable, parameterized SPI master supporting all four modes via CPOL and CPHA parameters. A configurable clock divider generates SCLK from the system clock. The FSM cycles: IDLE → ASSERT_CS → TRANSFER → DONE → IDLE.

// ─────────────────────────────────────────────────────
//  SPI Master — parameterized, all 4 modes, CLK_DIV
//  sclk_freq = clk / (2 * CLK_DIV)
// ─────────────────────────────────────────────────────
module spi_master #(
  parameter CLK_DIV = 4,   // sclk = clk / (2*CLK_DIV)
  parameter CPOL    = 0,   // 0: idle low,  1: idle high
  parameter CPHA    = 0    // 0: sample leading, 1: sample trailing
)(
  input             clk, rst_n,
  input             start,
  input  [7:0]      tx_data,
  output reg [7:0] rx_data,
  output reg        done,
  // SPI pins
  output reg        sclk,
  output reg        mosi,
  input             miso,
  output reg        cs_n
);

  // FSM states
  localparam IDLE      = 2'd0,
             ASSERT_CS = 2'd1,
             TRANSFER  = 2'd2,
             DONE      = 2'd3;

  reg [1:0]  state;
  reg [7:0]  tx_shift, rx_shift;
  reg [3:0]  bit_cnt;    // counts sampled bits, 0..8
  // At least 1 bit wide, so CLK_DIV=1 is legal
  reg [((CLK_DIV > 1) ? $clog2(CLK_DIV) : 1)-1:0] div_cnt;
  reg         clk_en;    // sclk toggle strobe
  reg         sclk_r;    // internal sclk state

  // ── Clock divider ────────────────────────────────
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      div_cnt <= 0;
      clk_en  <= 1'b0;
    end else begin
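      // Full-width compare: a truncated CLK_DIV would read as 0 for
      // power-of-two values (including the default 4) and never match
      if (div_cnt == CLK_DIV - 1) begin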
        div_cnt <= 0;
        clk_en  <= 1'b1;
      end else begin
        div_cnt <= div_cnt + 1;
        clk_en  <= 1'b0;
      end
    end
  end

  // ── Main FSM ─────────────────────────────────────
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      state    <= IDLE;
      cs_n     <= 1'b1;
      sclk_r   <= CPOL[0];    // idle state
      sclk     <= CPOL[0];
      mosi     <= 1'b0;
      done     <= 1'b0;
      bit_cnt  <= 0;
      tx_shift <= 0;
      rx_shift <= 0;
      rx_data  <= 0;
    end else begin
      done <= 1'b0;  // pulse for one cycle only

      case (state)

        // ── IDLE: wait for start pulse ──────────────
        IDLE: begin
          cs_n   <= 1'b1;
          sclk_r <= CPOL[0];
          sclk   <= CPOL[0];
          if (start) begin
            tx_shift <= tx_data;
            bit_cnt  <= 0;
            state    <= ASSERT_CS;
          end
        end

        // ── ASSERT_CS: pull CS low, wait one div ───
        ASSERT_CS: begin
          cs_n <= 1'b0;
          // For CPHA=0, put MSB on MOSI now (before first edge)
          if (CPHA == 0)
            mosi <= tx_shift[7];
          if (clk_en)
            state <= TRANSFER;
        end

        // ── TRANSFER: 8 bits, 16 half-cycles ───────
        TRANSFER: begin
          if (clk_en) begin
            sclk_r <= ~sclk_r;
            sclk   <= ~sclk_r;  // one cycle delay (pipeline)

            if (CPHA == 0) begin
              // Sample on leading edge, shift on trailing
              if (sclk_r == CPOL[0]) begin   // leading edge
                rx_shift <= {rx_shift[6:0], miso};
                bit_cnt  <= bit_cnt + 1;
              end else begin                    // trailing edge
                mosi <= tx_shift[7 - (bit_cnt[2:0])];
              end
            end else begin
              // CPHA=1: shift on leading edge, sample on trailing
              if (sclk_r == CPOL[0]) begin   // leading edge — shift
                mosi <= tx_shift[7 - bit_cnt[2:0]];
              end else begin                    // trailing edge — sample
                rx_shift <= {rx_shift[6:0], miso};
                bit_cnt  <= bit_cnt + 1;
              end
            end

            if (bit_cnt == 8)
              state <= DONE;
          end
        end

        // ── DONE: deassert CS, capture rx_data ─────
        DONE: begin
          cs_n    <= 1'b1;
          rx_data <= rx_shift;
          done    <= 1'b1;
          sclk_r  <= CPOL[0];
          sclk    <= CPOL[0];
          mosi    <= 1'b0;
          state   <= IDLE;
        end

        default: state <= IDLE;
      endcase
    end
  end

endmodule

SPI Master — FSM State Diagram

IDLE → (start) → ASSERT_CS → (clk_en) → TRANSFER → (bit_cnt = 8) → DONE → IDLE. TRANSFER repeats while bit_cnt < 8; DONE pulses done and captures rx_data.

Clock divider: With CLK_DIV=4 and a 100 MHz system clock, SCLK frequency = 100 MHz / (2×4) = 12.5 MHz. Set CLK_DIV=1 for the fastest possible SCLK (clk/2). Increase CLK_DIV for slower, more robust operation with long PCB traces or slow slaves.
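
Going the other way, CLK_DIV can be derived from the slave's maximum SCLK by rounding the division up, so the resulting clock never exceeds the limit. The following is a small elaboration-time sketch; the module name `spi_div_calc`, the parameter names, and the example frequencies are hypothetical.

```verilog
// Sketch: pick CLK_DIV at elaboration time.
// Names and example frequencies are hypothetical.
module spi_div_calc #(
  parameter integer CLK_HZ      = 100_000_000,  // system clock
  parameter integer SCLK_MAX_HZ = 10_000_000    // slave's datasheet limit
);
  // sclk = clk / (2 * CLK_DIV), so CLK_DIV = ceil(clk / (2 * sclk_max)).
  // Adding (divisor - 1) before the integer division rounds up.
  localparam integer CLK_DIV =
      (CLK_HZ + 2*SCLK_MAX_HZ - 1) / (2*SCLK_MAX_HZ);

  // Resulting SCLK frequency, always <= SCLK_MAX_HZ.
  localparam integer SCLK_HZ = CLK_HZ / (2*CLK_DIV);

  initial
    $display("CLK_DIV = %0d, SCLK = %0d Hz", CLK_DIV, SCLK_HZ);
endmodule
```

With the defaults shown (100 MHz system clock, 10 MHz limit) this yields CLK_DIV = 5 and a 10 MHz SCLK. The same expression could be placed in a parent module and passed to the master as its CLK_DIV parameter.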

7. Verilog SPI Slave

The SPI slave receives SCLK and MOSI from the master, samples MOSI on the correct clock edge, and drives MISO from its own TX shift register. Because the slave is entirely clock-slaved, it simply responds to edges — no clock divider needed.

// ─────────────────────────────────────────────────────
//  SPI Slave — Mode 0 (CPOL=0, CPHA=0)
//  Samples MOSI on rising SCLK, drives MISO MSB-first
// ─────────────────────────────────────────────────────
module spi_slave (
  input             clk,       // system clock (for output staging)
  input             rst_n,
  // SPI pins
  input             sclk,
  input             mosi,
  output reg        miso,
  input             cs_n,
  // Application interface
  input  [7:0]      tx_data,   // byte to send back to master
  output reg [7:0] rx_data,   // byte received from master
  output reg        rx_valid   // pulses high for one sys clk after full byte
);

  reg [7:0] rx_shift, tx_shift;
  reg [2:0] bit_cnt;

  // ── Edge detection on SPI clock ─────────────────
  reg sclk_d;
  wire sclk_rising  = (!sclk_d) && sclk;
  wire sclk_falling =  sclk_d  && (!sclk);

  always @(posedge clk or negedge rst_n)
    if (!rst_n) sclk_d <= 1'b0;
    else        sclk_d <= sclk;

  // ── Shift logic: pre-load TX while CS is high, shift while low ──
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      tx_shift <= 8'h00;
      rx_shift <= 8'h00;
      rx_data  <= 8'h00;
      rx_valid <= 1'b0;
      bit_cnt  <= 3'd7;  // MSB first
      miso     <= 1'b0;
    end else begin
      rx_valid <= 1'b0;

      if (cs_n) begin
        // CS deasserted: reset counter, pre-load TX
        bit_cnt  <= 3'd7;
        tx_shift <= tx_data;
        // Mode 0: the MSB must be valid before the first rising edge,
        // so present it while idle (a real pin would be tristated here)
        miso     <= tx_data[7];
      end else begin
        // Rising SCLK: sample MOSI (Mode 0)
        if (sclk_rising) begin
          rx_shift <= {rx_shift[6:0], mosi};
          if (bit_cnt == 3'd0) begin
            rx_data  <= {rx_shift[6:0], mosi};  // latch full byte
            rx_valid <= 1'b1;
          end
        end
        // Falling SCLK: present the next bit on MISO (Mode 0)
        if (sclk_falling) begin
          if (bit_cnt != 3'd0) begin
            miso    <= tx_shift[bit_cnt - 3'd1];
            bit_cnt <= bit_cnt - 3'd1;
          end else begin
            // Byte complete: reload for a continuous transfer
            tx_shift <= tx_data;
            miso     <= tx_data[7];
            bit_cnt  <= 3'd7;
          end
        end
      end
    end
  end

endmodule

Edge detection by oversampling: The slave registers SCLK with the system clock and detects edges by comparing the current pin value with the previous registered value. This works when the system clock is at least 4× faster than SCLK, but a single register is not a true synchronizer. SCLK, MOSI, and CS are asynchronous to the system clock, so a production FPGA or ASIC design should pass them through a two-flop CDC synchronizer before the edge detector.

MISO tristate: In real silicon, release MISO by driving it to high-Z (assign miso = cs_n ? 1'bz : miso_r;) when CS is high, so the MISO line is free for other slaves. The Verilog above keeps driving MISO while CS is high, which keeps simulation waveforms simple; add tristate logic for a multi-slave FPGA/ASIC implementation.
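
The two notes above can be combined into a small input/output stage placed in front of the slave's shift logic. This is a sketch under stated assumptions, not a drop-in replacement: the module name `spi_slave_frontend` and all signal names are hypothetical, `miso_r` stands for whatever registered data bit the shift logic produces, and the reset value of the SCLK chain assumes an idle-low clock (CPOL=0).

```verilog
// Sketch: synchronized SPI inputs plus a tristate MISO pin.
// Module and signal names are hypothetical.
module spi_slave_frontend (
  input  wire clk, rst_n,
  input  wire sclk, cs_n,      // asynchronous pins from the master
  input  wire miso_r,          // registered data bit from the shift logic
  output wire sclk_rising,
  output wire sclk_falling,
  output wire cs_n_sync,
  output wire miso             // pad: released when not selected
);
  reg [2:0] sclk_sr;           // [0],[1]: synchronizer, [2]: previous value
  reg [1:0] cs_sr;

  always @(posedge clk or negedge rst_n)
    if (!rst_n) begin
      sclk_sr <= 3'b000;       // assumes SCLK idles low (CPOL=0)
      cs_sr   <= 2'b11;        // reset to "deselected"
    end else begin
      sclk_sr <= {sclk_sr[1:0], sclk};
      cs_sr   <= {cs_sr[0], cs_n};
    end

  // Compare only bits that have already passed through two flops.
  assign sclk_rising  = (sclk_sr[2:1] == 2'b01);
  assign sclk_falling = (sclk_sr[2:1] == 2'b10);
  assign cs_n_sync    = cs_sr[1];

  // Gate the pad with the raw CS pin so MISO is released immediately.
  assign miso = cs_n ? 1'bz : miso_r;
endmodule
```

The synchronizer delays each detected edge by about two system clock cycles, so the system clock needs correspondingly more headroom over SCLK than the single-register version. MOSI would normally be delayed through a matching register chain so that it stays aligned with the delayed edge strobes.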

8. Applications Table

| Device / IC | Category | SPI Mode | Max Clock | Notes |
| --- | --- | --- | --- | --- |
| SD / microSD Card | Storage | Mode 0 | 25 MHz | SPI mode is a compatibility fallback. SDIO is faster for production. CS=DAT3, CMD=MOSI, DAT0=MISO. |
| W25Q128 NOR Flash | Storage | Mode 0 / Mode 3 | 104 MHz (STD), 133 MHz (QPI) | Winbond W25Q series, an industry-standard SPI NOR. Supports Dual and Quad SPI for 2× / 4× throughput. |
| SSD1306 OLED | Display | Mode 0 | 10 MHz | 128×64 monochrome OLED. Uses an extra D/C (Data/Command) pin alongside SPI. Also available in I2C. |
| ILI9341 TFT LCD | Display | Mode 0 / Mode 3 | 42 MHz (write) | 240×320 color TFT. SPI pixel writes are common on microcontrollers. A parallel bus is used for higher refresh rates. |
| ADS1115 ADC | ADC | n/a (I2C-only part) | 400 kHz (I2C fast mode) | 16-bit, 860 SPS. The ADS1115 has an I2C interface only and is listed for comparison; the same concepts apply to SPI ADCs such as the MCP3208. |
| MCP4921 DAC | DAC | Mode 0 / Mode 3 | 20 MHz | 12-bit single-channel DAC from Microchip. 16-bit SPI frame: 4 config bits + 12 data bits. LDAC pin latches output. |
| nRF24L01+ Radio | Wireless | Mode 0 | 10 MHz | 2.4 GHz ISM band transceiver. SPI used for register config and FIFO access. IRQ pin signals events. |
| ENC28J60 Ethernet | Networking | Mode 0 | 20 MHz | 10BASE-T MAC/PHY with SPI interface. Widely used in embedded Ethernet. Buffer managed via SPI register commands. |

SPI in high-speed storage: Modern eMMC and UFS interfaces have replaced SPI flash in high-end embedded systems, but SPI NOR flash remains dominant for boot code storage (BIOS/UEFI chips, FPGA configuration memory) because of its simplicity, single-supply operation, and wide temperature range.

Frequently Asked Questions

What is SPI and how does it work?

SPI is a synchronous, full-duplex, master-slave serial protocol. The master generates SCLK and uses one CS line per slave. Data moves simultaneously on MOSI (master to slave) and MISO (slave to master) on every clock cycle. 8 SCLK pulses transfer one byte in both directions with no ACK overhead. Speeds reach 10–100+ MHz depending on hardware.

What are CPOL and CPHA in SPI?

CPOL sets the clock idle state (0 = idle low, 1 = idle high). CPHA sets which edge data is sampled on (0 = leading edge, 1 = trailing edge). The four combinations give four SPI modes. Mode 0 (idle low, sample rising) and Mode 3 (idle high, sample rising) are most common. Always match master and slave to the same mode; a mismatch typically shows up as bit-shifted or corrupted received bytes.

How do you connect multiple slaves on SPI?

Use independent SS lines (one CS GPIO per slave) for independent access at full speed, or daisy-chain for shift-register chains with a single CS. Independent SS is strongly preferred for mixed-device buses. Ensure only one slave drives MISO at a time — assert one CS, complete the transaction, then assert the next. Add a brief dead-time (1–2 SCLK cycles) between deassert and reassert.

What is the maximum SPI clock speed?

There is no protocol-mandated maximum. Microcontrollers typically reach 1–50 MHz. SPI NOR flash supports up to 104 MHz (standard) or 133 MHz (QPI). FPGA-based masters can exceed 200 MHz. Above 50 MHz, PCB layout matters: keep traces short and impedance-matched (50 Ω), use series termination (22–33 Ω) at the driver, and avoid stubs or vias in the signal path.
