You now know the registers, the formats, the instruction set, and memory. Time to put it all together and write real programs — and actually run them. This lesson closes out Phase 1: by the end you'll write RISC-V assembly with loops and arrays, and know exactly how to assemble and execute it. Then we start building the hardware that runs it.
You write assembly — readable mnemonics like add and lw. The CPU only understands machine code — the 32-bit words from Day 3. The assembler is the translator in between. It does three jobs:
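- Translates each mnemonic into its machine-code encoding.
- Expands pseudo-instructions: li, mv, ret → real instructions.
- Resolves labels into addresses.

An assembly file mixes three things: directives (commands to the assembler, starting with a dot), labels (names for locations, ending in a colon), and instructions. Comments start with #.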
| Directive | Meaning |
|---|---|
| .text | start of the code section (instructions) |
| .data | start of the data section (initialized variables) |
| .globl name | make a symbol global (e.g. export main) |
| .word v1,v2 | place 32-bit values in memory |
| label: | name this location (target for branches/jumps) |
Directives are not executed — they just organise the program. Only the instructions run.
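To make "directives organise, instructions run" concrete, here is a small Python sketch that sorts a source file into sections. It is a toy model under loud assumptions: the function name layout is made up for this example, only .text, .data, .globl and .word are recognised, and real assemblers do far more (symbol tables, relocations, alignment).

```python
import struct


def layout(source):
    """Toy model of how directives steer where each line ends up.

    Hypothetical helper for illustration only. Instructions are kept as
    text; .word values are packed as little-endian 32-bit words.
    """
    sections = {".text": [], ".data": []}
    current = ".text"
    for raw in source.splitlines():
        line = raw.split("#")[0].strip()      # drop comments
        if not line:
            continue
        if line in (".text", ".data"):        # section directive: switch target
            current = line
            continue
        if line.startswith(".globl"):         # visibility only, emits nothing
            continue
        if ":" in line:                       # strip a leading label like "N:"
            line = line.split(":", 1)[1].strip()
            if not line:
                continue
        if line.startswith(".word"):          # data directive: emit bytes
            values = [int(v) for v in line[len(".word"):].split(",")]
            sections[current].append(struct.pack(f"<{len(values)}i", *values))
        else:                                 # anything else is an instruction
            sections[current].append(line)
    return sections


source = """
.data
N: .word 10
.text
.globl main
main:
    li t0, 0
    li t1, 1
"""

result = layout(source)
print(result[".data"])
print(result[".text"])
```

Note that none of the directive lines appear in the .text list: they only decided where the other lines went.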
You never write raw addresses for jumps. You write a label like loop: and branch to it (bge x6, x7, done). At assemble time the assembler computes the exact PC-relative offset for you. This is what makes loops and if-statements readable.
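The label-to-offset step can be sketched in a few lines of Python. This is an illustrative two-pass model, not how any particular assembler is implemented. It assumes every non-label line is exactly one 4-byte instruction, so pseudo-instructions that expand to two instructions (such as la) are not modelled, and the name resolve_labels is invented for this sketch.

```python
def resolve_labels(lines):
    """Toy two-pass label resolver (hypothetical helper).

    Returns a list of (pc, text, offset), where offset is the PC-relative
    distance to the label named in the last operand, or None if the last
    operand is not a label.
    """
    # Pass 1: walk the program and record the address of every label.
    labels = {}
    instructions = []
    pc = 0
    for line in lines:
        text = line.split("#")[0].strip()     # drop comments
        if not text:
            continue
        if text.endswith(":"):
            labels[text[:-1]] = pc            # label names the next instruction
        else:
            instructions.append((pc, text))
            pc += 4                           # assumed: one 4-byte instruction

    # Pass 2: turn label operands into target_address - current_pc.
    resolved = []
    for pc, text in instructions:
        last_operand = text.replace(",", " ").split()[-1]
        offset = labels[last_operand] - pc if last_operand in labels else None
        resolved.append((pc, text, offset))
    return resolved


program = [
    "li t0, 0",
    "li t1, 1",
    "loop:",
    "bgt t1, t2, done   # forward branch",
    "add t0, t0, t1",
    "addi t1, t1, 1",
    "j loop             # backward jump",
    "done:",
    "mv a0, t0",
]

for pc, text, offset in resolve_labels(program):
    print(f"{pc:2d}: {text:<18} offset={offset}")
```

In this model the forward branch to done gets a positive offset (+16 bytes) and the backward jump to loop gets a negative one (-12 bytes), which is why the first pass has to see the whole program before any offset can be filled in.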
A complete, runnable program. It reads N from the data section and sums 1..N:
```asm
    .data
N:  .word 10                # sum 1..10 -> expect 55

    .text
    .globl main
main:
    la   t3, N              # t3 = address of N (la = pseudo: load address)
    lw   t2, 0(t3)          # t2 = N (the limit)
    li   t0, 0              # sum = 0
    li   t1, 1              # i = 1
loop:
    bgt  t1, t2, done       # if i > N, finish (bgt = pseudo)
    add  t0, t0, t1         # sum += i
    addi t1, t1, 1          # i++
    j    loop               # repeat (j = pseudo: jal x0, loop)
done:
    mv   a0, t0             # result in a0 (= 55)
    # (on a sim with a kernel you'd ecall to exit; here a0 holds the answer)
```
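The program above leans on several pseudo-instructions (la, li, bgt, j, mv). As a rough sketch of what the assembler does with them, the Python function below rewrites each one into base RV32I text. The name expand is made up for this example, only the pseudo-instructions used in this lesson are covered, li is limited to 12-bit immediates, and the la output uses descriptive placeholders rather than real relocation syntax.

```python
def expand(line):
    """Expand one pseudo-instruction into base RV32I instruction text.

    Toy model (hypothetical helper): anything it does not recognise is
    returned unchanged as a one-element list.
    """
    op, *rest = line.replace(",", " ").split()
    if op == "li":                       # small immediates only in this sketch
        rd, imm = rest
        if not -2048 <= int(imm) <= 2047:
            raise ValueError("this sketch handles 12-bit immediates only")
        return [f"addi {rd}, x0, {imm}"]
    if op == "mv":                       # copy = add zero
        rd, rs = rest
        return [f"addi {rd}, {rs}, 0"]
    if op == "j":                        # jump = jal that discards the link
        return [f"jal x0, {rest[0]}"]
    if op == "ret":                      # return = jump to the address in x1 (ra)
        return ["jalr x0, 0(x1)"]
    if op == "bgt":                      # a > b is the same test as b < a
        rs, rt, label = rest
        return [f"blt {rt}, {rs}, {label}"]
    if op == "ble":                      # a <= b is the same test as b >= a
        rs, rt, label = rest
        return [f"bge {rt}, {rs}, {label}"]
    if op == "la":                       # two instructions: upper bits, then lower
        rd, sym = rest
        return [
            f"auipc {rd}, <upper 20 bits of offset to {sym}>",
            f"addi {rd}, {rd}, <lower 12 bits of offset to {sym}>",
        ]
    return [line]


for source_line in ["la t3, N", "li t0, 0", "bgt t1, t2, done", "j loop", "mv a0, t0"]:
    for real in expand(source_line):
        print(f"{source_line:<18} ->  {real}")
```

The one to watch is la: it becomes two real instructions, so it takes 8 bytes and shifts the address of every label after it.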
This one uses the data section, a load inside a loop, and a branch — exercising memory + control flow together:
```asm
    .data
arr: .word 4, 9, 1, 7, 2, 8 # the array
len: .word 6                # number of elements

    .text
    .globl main
main:
    la   t0, arr            # t0 = &arr[0]
    la   t1, len
    lw   t1, 0(t1)          # t1 = len (6)
    lw   t2, 0(t0)          # max = arr[0]
    li   t3, 1              # i = 1
mloop:
    bge  t3, t1, mdone      # if i >= len, done
    slli t4, t3, 2          # t4 = i * 4 (byte offset; words are 4 bytes)
    add  t4, t0, t4         # t4 = &arr[i]
    lw   t5, 0(t4)          # t5 = arr[i]
    ble  t5, t2, skip       # if arr[i] <= max, skip
    mv   t2, t5             # else max = arr[i]
skip:
    addi t3, t3, 1          # i++
    j    mloop
mdone:
    mv   a0, t2             # a0 = max (= 9)
```
Notice slli t4, t3, 2 — shifting left by 2 multiplies by 4, the byte stride for 32-bit words (alignment from Day 5). That index-to-address math is one of the most common patterns in all of assembly.
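If the index-to-address pattern feels abstract, the following Python sketch mirrors it against a byte-addressed memory image. The base address 0x1000 and the helper names element_address and load_word are assumptions made for this example; a real linker chooses where arr actually lives.

```python
import struct

BASE = 0x1000                        # hypothetical address of arr[0]
values = [4, 9, 1, 7, 2, 8]

# Byte-addressed memory image: six little-endian 32-bit words, like .word emits.
memory = struct.pack("<6i", *values)


def element_address(base, i):
    """Mirror of: slli t4, t3, 2  then  add t4, t0, t4."""
    return base + (i << 2)           # i << 2 is i * 4, the word stride in bytes


def load_word(addr):
    """Mirror of: lw rd, 0(addr), reading from the toy memory image."""
    (word,) = struct.unpack_from("<i", memory, addr - BASE)
    return word


# Same scan as the assembly: start with arr[0], then compare arr[1..len-1].
maximum = load_word(element_address(BASE, 0))
for i in range(1, len(values)):
    maximum = max(maximum, load_word(element_address(BASE, i)))

print(hex(element_address(BASE, 3)))  # 0x100c
print(maximum)                        # 9
```

Element 3 sits 12 bytes past the base, not 3, which is exactly the scaling the shift takes care of.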
Three practical routes, easiest first:
Paste your .s into a web RISC-V simulator like Venus or Ripes — they assemble, run, and let you single-step while watching registers and memory. Perfect for learning. (Ripes even visualises the datapath — a great companion to this course.)
```sh
# Build for RV32I and run on the Spike ISA simulator
# (install: riscv-gnu-toolchain + spike + pk)

# assemble + link
riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -nostdlib \
    -o sum.elf sum.s

# run on the Spike instruction-set simulator (with proxy kernel)
spike --isa=rv32i pk sum.elf

# ...or run under QEMU's user-mode emulator
# qemu-riscv32 sum.elf
```
The whole point of this course: from Day 15, the RISC-V CPU we build in Verilog will execute machine code assembled from programs exactly like these — in our browser Verilog simulator. You'll watch your own assembly run on your own processor. 🎉
You write the recipe in words a human reads ("add a cup of sugar"). The kitchen robot only understands numbered motor commands. The assembler is the translator that converts your written recipe into the robot's exact command codes — and labels are like step names ("repeat from step 3") it turns into precise line numbers.
An assembler turns readable mnemonics into machine code, expanding pseudo-instructions and resolving labels into addresses. An .s file mixes directives (.text/.data/.globl/.word), labels, and instructions. You can write real programs (loops, arrays) and run them in a browser sim (Venus/Ripes) or with the GNU toolchain + Spike/QEMU, and soon on the CPU we build.
- # = comment.
- slli i,2 (×4) + add base → element address.
- slli t4, t3, 2 when indexing a word array?
- .s file today.

Assembler: Translates assembly mnemonics into machine code, expands pseudo-instructions, and resolves labels into addresses.
Directives: Dot-prefixed commands to the assembler (.text, .data, .globl, .word) that organise the program but aren't executed.
Running a program: Use a browser sim (Venus/Ripes), or the RISC-V GNU toolchain with Spike/QEMU. Soon, our own Verilog CPU will run it.
Label: A named location (e.g. loop:) that branches/jumps target; the assembler converts it to the right address.