A 32-bit instruction is powerful but big — and in an embedded chip with a few kilobytes of flash, every byte costs money. Thumb is ARM's answer: squeeze most instructions into 16 bits and watch your program shrink by a third. Here's how it works, what it trades away, and why almost every microcontroller you'll meet runs it.
In classic ARM state, every instruction is exactly 32 bits (4 bytes). That's clean and flexible, but a 10,000-instruction program needs 40 KB of flash just to store it. On a tiny microcontroller, flash is often the single most expensive part of the chip. Code density — how much program fits per byte — directly drives cost, power, and how much your code-fetch traffic clogs a narrow memory bus.
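Thumb re-encodes the most common operations into 16 bits. Half the width means roughly half the storage for those instructions. ARM achieved this by accepting sensible limits on the common case: most instructions reach only r0–r7, immediates are smaller, and only branches execute conditionally.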
The bet pays off because real programs are dominated by simple moves, adds, loads, stores and branches — exactly what Thumb encodes compactly.
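To put numbers on that bet, here is a rough sketch of the flash arithmetic. The 30% instruction-count overhead for the Thumb build is an assumed, illustrative figure (real overhead depends on the code and the compiler), and the function names are made up for this example.

```python
# Illustrative flash-footprint arithmetic; the 30% overhead below is an assumption.
ARM_BYTES = 4    # every classic ARM instruction is 32 bits
THUMB_BYTES = 2  # a 16-bit Thumb instruction


def flash_bytes(instruction_count, bytes_per_instruction):
    """Bytes of flash needed to store a program of fixed-width instructions."""
    return instruction_count * bytes_per_instruction


def thumb_saving(arm_instructions, extra_instruction_ratio):
    """Fractional size reduction when the Thumb build needs extra instructions."""
    thumb_instructions = round(arm_instructions * (1 + extra_instruction_ratio))
    arm_size = flash_bytes(arm_instructions, ARM_BYTES)
    thumb_size = flash_bytes(thumb_instructions, THUMB_BYTES)
    return 1 - thumb_size / arm_size


arm_size = flash_bytes(10_000, ARM_BYTES)   # the 10,000-instruction program
saving = thumb_saving(10_000, 0.30)         # hypothetical 30% more instructions
print(arm_size, f"{saving:.0%}")
```

With these inputs the ARM build takes 40,000 bytes and the Thumb build comes out 35% smaller. Set the overhead to zero and the saving is the full 50%, which is why "roughly half" is the upper bound rather than the typical result.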
| | ARM (32-bit) | Thumb (16-bit) |
|---|---|---|
| Instruction size | 32 bits | 16 bits (mostly) |
| Code size | baseline | ~25–35% smaller |
| Registers reached | all r0–r15 | mostly r0–r7 |
| Conditional exec | every instruction | branches only |
| Raw performance | slightly higher | slightly lower (more instructions) |
Thumb is like writing in shorthand. Each note is smaller so your notebook lasts longer, but occasionally you need a full sentence (a 32-bit instruction) for something complex. Thumb-2 lets you mix both freely — and that's the winning combination.
Pure Thumb was too restrictive for serious work, so ARM created Thumb-2: a single instruction stream that mixes 16-bit and 32-bit encodings. Common operations stay 16 bits (great density); when an operation needs a big immediate, a high register, or extra power, the assembler emits a 32-bit Thumb instruction instead. The result is ~98% of ARM's performance with ~75% of the code size. This is the instruction set the Cortex-M family runs, and it runs nothing else; the smallest cores, such as Cortex-M0, implement a mostly 16-bit subset of it.
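How does the core tell the two sizes apart in one stream? In the Thumb-2 encoding, a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction; anything else is a complete 16-bit instruction. The sketch below walks a small hand-picked stream using that rule. The function names are invented for illustration, and it assumes the stream starts on an instruction boundary and is not truncated.

```python
def thumb_instruction_bytes(first_halfword):
    """Return 4 if this halfword starts a 32-bit Thumb encoding, else 2."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def instruction_lengths(halfwords):
    """Walk a list of 16-bit halfwords and return each instruction's size in bytes."""
    lengths = []
    i = 0
    while i < len(halfwords):
        size = thumb_instruction_bytes(halfwords[i])
        lengths.append(size)
        i += size // 2  # skip the second halfword of a 32-bit instruction
    return lengths


# MOV r0, r1 (16-bit); BL with zero offset (32-bit); NOP (16-bit); BX lr (16-bit)
stream = [0x4608, 0xF000, 0xF800, 0xBF00, 0x4770]
lengths = instruction_lengths(stream)
total_bytes = sum(lengths)
print(lengths, total_bytes)
```

Four instructions take 10 bytes here, where four fixed-width ARM instructions would take 16.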
On cores that support both, a single bit — the T bit in the program status register — selects ARM or Thumb state. You don't set it directly; you switch with an interworking branch:
BX Rn / BLX use bit 0 of the target address to pick state: 1 = Thumb, 0 = ARM.
On Cortex-M there's nothing to switch: the core is Thumb-only. The T bit is effectively always 1, and trying to enter ARM state faults. On Cortex-A/R you can interwork between the two.
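The bit-0 rule is easy to model. The sketch below is a simplified software model, not how any tool or core implements it: `interworking_target` and its `thumb_only` flag are hypothetical names, and raising an exception stands in for the fault a Thumb-only core would take.

```python
def interworking_target(address, thumb_only=False):
    """Model how BX/BLX interpret a branch target.

    Returns (state, fetch_address). Bit 0 selects the state and is not
    part of the address the core actually fetches from.
    """
    thumb = bool(address & 1)
    if thumb_only and not thumb:
        # Simplification: a Thumb-only core (Cortex-M) faults instead.
        raise RuntimeError("ARM state requested on a Thumb-only core")
    state = "Thumb" if thumb else "ARM"
    return state, address & ~1


print(interworking_target(0x08000101))  # odd target: Thumb code at 0x08000100
print(interworking_target(0x00008000))  # even target: ARM code at 0x00008000
```

This is also why function pointers to Thumb code have their low bit set: the address itself carries the state.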
Thumb trades a little flexibility for a lot of code density by shrinking instructions to 16 bits. Thumb-2 removes the downside by mixing 16- and 32-bit forms, giving near-ARM performance at far smaller size — which is why every Cortex-M microcontroller is Thumb-only.
Thumb: a mostly-16-bit ARM instruction set designed for code density, at a small cost in reachable registers and per-instruction capability.
Thumb-2: a mixed 16/32-bit extension giving near-ARM performance with near-Thumb size; it's what Cortex-M runs.
Switching state: the T bit selects ARM or Thumb state; BX/BLX set it from bit 0 of the target address. Cortex-M is Thumb-only.