Every instruction in the base RISC-V ISA (RV32I) is exactly 32 bits: a single row of ones and zeros. But how does the CPU know that one row means "add" and another means "jump"? The answer is the instruction format: a fixed layout that says which bits mean what. Today we'll decode all six formats in plain English. This is the bridge from "assembly" to "what the hardware actually sees."
As we saw on Day 2, the CPU works with registers and a few operations. To tell it what to do, each instruction packs all the needed info into 32 bits, like a pre-printed form with labeled boxes.
Imagine a tiny form with boxes: "Operation", "Put result in", "Input A", "Input B". To say "add register 5 and register 6, put it in register 7," you fill: Operation = add, Result = x7, A = x5, B = x6. An instruction format is just which boxes the form has — and a 32-bit instruction is the filled-in form, written as bits.
RISC-V instructions are built from a small set of fields. Learn these seven and you can read almost any instruction:
| Field | Bits | What it holds (plain English) |
|---|---|---|
| opcode | 7 | The broad kind of instruction (is it arithmetic? a load? a branch?). |
| rd | 5 | Destination register — where the result goes (5 bits = one of 32 registers). |
| rs1 | 5 | Source register 1 — first input. |
| rs2 | 5 | Source register 2 — second input. |
| funct3 | 3 | Picks the exact operation within the opcode's family (e.g. add vs. xor vs. and). |
| funct7 | 7 | Extra bits to distinguish operations that share a funct3 (e.g. add vs. sub). |
| imm | varies | Immediate — a constant value baked into the instruction (like the "1" in "add 1"). |
Not every instruction needs every box. An add needs three registers but no constant; addi needs two registers and a constant; a jump needs a big offset. That's exactly why there are several formats: each is a different combination of these boxes.
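To make the "boxes" concrete, here is a minimal Python sketch of pulling fields out of a 32-bit word with shifts and masks. The bit positions are the standard RV32I ones; the names `FIELD_BITS`, `bits`, `field` and `field_width` are made up for this illustration and are not part of any tool used in this series.

```python
# Bit positions (high, low) of the fixed fields in a 32-bit instruction.
# These follow the standard RV32I layout; the dictionary itself is just
# a helper invented for this sketch.
FIELD_BITS = {
    "opcode": (6, 0),
    "rd": (11, 7),
    "funct3": (14, 12),
    "rs1": (19, 15),
    "rs2": (24, 20),
    "funct7": (31, 25),
}


def bits(word: int, hi: int, lo: int) -> int:
    """Return bits hi..lo (inclusive) of word as an unsigned integer."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)


def field(word: int, name: str) -> int:
    """Extract one named field from a 32-bit instruction word."""
    hi, lo = FIELD_BITS[name]
    return bits(word, hi, lo)


def field_width(name: str) -> int:
    """Number of bits the named field occupies."""
    hi, lo = FIELD_BITS[name]
    return hi - lo + 1
```

In the one format that uses all six of these fields, the widths add up to the full word: 7 + 5 + 5 + 3 + 5 + 7 = 32. The other formats give up some of these boxes to make room for an immediate.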
RV32I defines six layouts of the same 32 bits. Here's the whole map — notice how rs1, rs2 and rd stay in the same place wherever they appear (that's deliberate — §5):
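| Format | Used for | Examples |
|---|---|---|
| R | Register-register operations. | add x7, x5, x6, sub, and, xor, sll |
| I | Operations with an immediate, loads, and jalr. | addi x6, x6, 1, lw x5, 0(x10) |
| S | Stores to the address rs1 + imm. Needs two registers (address base + data) and an offset, but no destination. | sw x5, 8(x10) |
| B | Conditional branches. | beq x6, x7, loop |
| U | Upper-immediate instructions. | lui (load upper immediate), auipc |
| J | Jumps that save the return address in rd. | jal x1, function |

Look again at the diagram: rs1, rs2 and rd are in the exact same bit positions in every format that uses them, and the opcode is always bits 6:0. This is RISC-V being smart on purpose: with the fields in fixed positions, the decoder can pull out the register numbers without first working out the format, which speeds up decoding.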
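Because the opcode always sits in bits 6:0, those seven bits alone are enough to pick the format. The sketch below shows that idea in Python. The opcode values are the standard RV32I ones, but the table is deliberately incomplete (it leaves out FENCE and the SYSTEM instructions), and `OPCODE_FORMAT` and `instruction_format` are names invented here.

```python
# Major RV32I opcodes (bits 6:0) mapped to the format they use.
# Not a complete table: FENCE and SYSTEM opcodes are left out of this sketch.
OPCODE_FORMAT = {
    0b0110011: "R",  # OP: add, sub, and, xor, sll, ...
    0b0010011: "I",  # OP-IMM: addi, ...
    0b0000011: "I",  # LOAD: lw, ...
    0b1100111: "I",  # JALR
    0b0100011: "S",  # STORE: sw, ...
    0b1100011: "B",  # BRANCH: beq, ...
    0b0110111: "U",  # LUI
    0b0010111: "U",  # AUIPC
    0b1101111: "J",  # JAL
}


def instruction_format(word: int) -> str:
    """Return the format letter for a 32-bit word, or 'unknown'."""
    opcode = word & 0x7F  # the low seven bits, the same in every format
    return OPCODE_FORMAT.get(opcode, "unknown")
```

Note that three different opcodes share the I format: the format only says where the boxes are, while the opcode (plus funct3/funct7) says what to do with them.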
The price: the immediate gets chopped up and scattered (especially in the S, B and J formats) so the register fields can stay put. It looks messy, but the sign bit is always bit 31, so the hardware can sign-extend quickly. Don't memorize the exact immediate bit-shuffling; tools and our decoder handle it. Just know why it's split.
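For the curious, this is roughly what "handling the bit-shuffling" looks like for the S and B formats. It is a Python sketch of the standard RV32I immediate layout, not the decoder built later in this series; `sign_extend`, `imm_s` and `imm_b` are hypothetical helper names.

```python
def sign_extend(value: int, width: int) -> int:
    """Interpret the low `width` bits of value as a two's-complement number."""
    value &= (1 << width) - 1
    if value & (1 << (width - 1)):  # top bit set means negative
        return value - (1 << width)
    return value


def imm_s(word: int) -> int:
    """S-type: imm[11:5] lives in bits 31:25, imm[4:0] in bits 11:7."""
    upper = (word >> 25) & 0x7F
    lower = (word >> 7) & 0x1F
    return sign_extend((upper << 5) | lower, 12)


def imm_b(word: int) -> int:
    """B-type: a 13-bit offset whose lowest bit is always 0 (not stored)."""
    raw = (
        (((word >> 31) & 0x1) << 12)    # imm[12]   from bit 31 (the sign bit)
        | (((word >> 7) & 0x1) << 11)   # imm[11]   from bit 7
        | (((word >> 25) & 0x3F) << 5)  # imm[10:5] from bits 30:25
        | (((word >> 8) & 0xF) << 1)    # imm[4:1]  from bits 11:8
    )
    return sign_extend(raw, 13)


# sw x5, 8(x10) encodes to 0x00552423; its offset comes back out as 8.
offset = imm_s(0x00552423)
```

In both functions the topmost immediate bit comes from bit 31 of the word, which is why sign extension does not depend on the format.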
Let's turn one line of assembly into the actual 32 bits the CPU sees. add is R-type. Here is the worksheet:
Instruction: add x7, x5, x6 (x7 = x5 + x6), which is R-type.

Fill the R-type boxes:

| Field | Value | Bits |
|---|---|---|
| funct7 | add (for sub it would be 0100000) | 0000000 |
| rs2 | x6 = 6 | 00110 |
| rs1 | x5 = 5 | 00101 |
| funct3 | add/sub family | 000 |
| rd | x7 = 7 | 00111 |
| opcode | OP (register-register arithmetic) | 0110011 |

Lay them out, bit 31 -> bit 0:

| funct7 | rs2 | rs1 | funct3 | rd | opcode |
|---|---|---|---|---|---|
| 0000000 | 00110 | 00101 | 000 | 00111 | 0110011 |

As one 32-bit word: 0000 0000 0110 0010 1000 0011 1011 0011 = 0x006283B3

That single number IS the instruction. Memory hands the CPU 0x006283B3, the decoder slices out the fields, and the ALU computes x5 + x6 into x7.
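The same worksheet can be checked in a few lines of Python. This is only a sketch of the R-type packing step; `encode_r` is a name invented for this example, and real assemblers do far more than this.

```python
def encode_r(funct7: int, rs2: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the six R-type fields into one 32-bit word (bit 31 down to bit 0)."""
    fields = ((funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7))
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
    return (
        (funct7 << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | opcode
    )


# add x7, x5, x6: the field values come straight from the worksheet.
word = encode_r(0b0000000, 6, 5, 0b000, 7, 0b0110011)
print(f"0x{word:08X}")  # prints 0x006283B3
```

Changing only funct7 to 0b0100000 gives the encoding of sub x7, x5, x6, which shows how little separates the two instructions at the bit level.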
That hex value, 0x006283B3, is exactly what sits in instruction memory and what the CPU fetches. On Day 11 we'll build the decoder — the hardware that chops a 32-bit word back into opcode/rd/rs1/rs2/imm so the rest of the CPU can act on it.
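As a preview of what a decoder does, here is a small software sketch that goes the other way for a handful of R-type instructions. It is not the Day 11 hardware design; `R_TYPE_NAMES` and `disassemble_r` are hypothetical names, and the table covers only the five R-type examples mentioned in this lesson.

```python
# (funct7, funct3) -> mnemonic, for a few R-type instructions only.
R_TYPE_NAMES = {
    (0b0000000, 0b000): "add",
    (0b0100000, 0b000): "sub",
    (0b0000000, 0b001): "sll",
    (0b0000000, 0b100): "xor",
    (0b0000000, 0b111): "and",
}


def disassemble_r(word: int) -> str:
    """Turn a 32-bit R-type word back into assembly text."""
    opcode = word & 0x7F
    if opcode != 0b0110011:
        raise ValueError("not an R-type (OP) instruction")
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    funct7 = (word >> 25) & 0x7F
    name = R_TYPE_NAMES.get((funct7, funct3))
    if name is None:
        raise ValueError("R-type instruction not covered by this sketch")
    return f"{name} x{rd}, x{rs1}, x{rs2}"


print(disassemble_r(0x006283B3))  # prints: add x7, x5, x6
```

Every field is extracted with a fixed shift and mask before the instruction name is known, which is the software version of the fixed register positions described earlier.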
A RISC-V instruction is 32 bits arranged into fields — opcode, rd, rs1, rs2, funct3/7 and an immediate. RV32I has six formats (R, I, S, B, U, J), each a different mix of those fields for different jobs. The register fields never move (for speed), so the immediate is split instead. Assembly like add x7,x5,x6 becomes one number (0x006283B3) the CPU decodes and executes.
Worked example: add x7,x5,x6 = 0x006283B3.

Which format does add use, and which does addi use? Why are they different?

An instruction format is the fixed layout of bits inside a 32-bit instruction: which bits are the opcode, registers, and immediate. RV32I has six: R/I/S/B/U/J.
The fields are opcode (kind), rd (destination), rs1/rs2 (sources), funct3/funct7 (exact operation), and imm (a constant). Not every format uses all of them.
There are several formats because different instructions need different info (three registers, a constant, an address, or a jump offset), so each format is a different mix of fields.
The immediate is split so the register fields can stay in fixed positions (which speeds up decoding); the sign bit stays at bit 31 for fast sign-extension.