What is the Thumb instruction set?

Thumb is a compressed instruction set in which most instructions are 16 bits wide instead of the 32 bits used in ARM state. It was created to shrink code size, which is critical in memory-constrained embedded systems, at the cost of restricting the registers, immediates, and operations that each instruction can encode.

Why does code density matter?

Smaller code means less flash or ROM to store the program, lower cost, and often better instruction-cache and fetch efficiency on narrow memory buses. In embedded devices where memory dominates the bill of materials, Thumb can cut code size by around 25 to 35 percent versus 32-bit ARM.

What is Thumb-2?

Thumb-2 extends Thumb with a mix of 16-bit and 32-bit instructions, so common operations stay small while complex ones remain fully capable. It gives nearly the performance of 32-bit ARM with nearly the code density of Thumb, and it is the instruction set used by Cortex-M cores.

How does the processor switch between ARM and Thumb state?

The T bit in the program status register selects the state. A BX or BLX instruction switches state based on the least significant bit of the target address: bit 0 set means Thumb, clear means ARM. Cortex-M cores run only Thumb (Thumb-2) and never enter classic ARM state.

DAY 16 · THE INSTRUCTION SET

The Thumb Instruction Set & Code Density

By EcrioniX · Updated Jun 6, 2026

A 32-bit instruction is powerful but big, and in an embedded chip with a few kilobytes of flash, every byte costs money. Thumb is ARM's answer: squeeze most instructions into 16 bits and watch your program shrink by roughly a quarter to a third. Here's how it works, what it trades away, and why almost every microcontroller you'll meet runs it.

1. The problem: code costs memory

In classic ARM state, every instruction is exactly 32 bits (4 bytes). That's clean and flexible, but a 10,000-instruction program needs 40 KB of flash just to store it. On a tiny microcontroller, flash is often the single most expensive part of the chip. Code density — how much program fits per byte — directly drives cost, power, and how much your code-fetch traffic clogs a narrow memory bus.

2. The idea: 16-bit instructions

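Thumb re-encodes the most common operations into 16 bits. Half the width means half the storage for each of those instructions, but the restrictions below mean some operations need an extra instruction or two, which is why the overall saving is closer to a third than a half. ARM achieved the 16-bit encoding by accepting sensible limits on the common case: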

Most Thumb instructions reach only the low registers r0–r7 (a few reach high registers).
Smaller immediates and offsets than 32-bit ARM.
No per-instruction condition codes — Thumb mostly drops the conditional-execution field from Day 11 (you use conditional branches instead).
The barrel shifter isn't folded into every data-processing instruction the way it is in ARM.

The bet pays off because real programs are dominated by simple moves, adds, loads, stores and branches — exactly what Thumb encodes compactly.
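To see how the numbers combine, here is a minimal sketch of the arithmetic in Python. The 40% instruction-count overhead is an assumed, illustrative figure (not a measurement), chosen so the result lands inside the 25 to 35 percent range quoted in this lesson; `code_size_bytes` and `thumb_saving_percent` are hypothetical helper names.

```python
ARM_BYTES_PER_INSTR = 4    # classic ARM: every instruction is 32 bits
THUMB_BYTES_PER_INSTR = 2  # Thumb: 16 bits (simplification: all treated as 16-bit)


def code_size_bytes(instr_count: int, bytes_per_instr: int) -> int:
    """Flash needed to hold instr_count fixed-width instructions."""
    return instr_count * bytes_per_instr


def thumb_saving_percent(arm_instrs: int, extra_instr_percent: int) -> float:
    """Percent of code size saved by Thumb versus ARM.

    extra_instr_percent is an ASSUMED overhead: how many more instructions
    the Thumb version needs because each one can do less.
    """
    thumb_instrs = arm_instrs * (100 + extra_instr_percent) // 100
    arm_bytes = code_size_bytes(arm_instrs, ARM_BYTES_PER_INSTR)
    thumb_bytes = code_size_bytes(thumb_instrs, THUMB_BYTES_PER_INSTR)
    return 100 * (arm_bytes - thumb_bytes) / arm_bytes


if __name__ == "__main__":
    print(code_size_bytes(10_000, ARM_BYTES_PER_INSTR))  # 40000 bytes in ARM state
    print(thumb_saving_percent(10_000, 40))              # 30.0
    print(thumb_saving_percent(10_000, 0))               # 50.0 (no overhead at all)
```

With no overhead the saving would be the full 50%; every extra instruction Thumb needs eats into that, which is why real savings sit well below half.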

3. The trade-off in one table

	ARM (32-bit)	Thumb (16-bit)
Instruction size	32 bits	16 bits (mostly)
Code size	baseline	~25–35% smaller
Registers reached	all r0–r15	mostly r0–r7
Conditional exec	every instruction	branches only
Raw performance	slightly higher	slightly lower (more instructions)

💡 Shorthand vs longhand

Thumb is like writing in shorthand. Each note is smaller so your notebook lasts longer, but occasionally you need a full sentence (a 32-bit instruction) for something complex. Thumb-2 lets you mix both freely — and that's the winning combination.

4. Thumb-2: the best of both worlds

Pure Thumb was restrictive enough that performance-critical code often had to fall back to ARM state, so ARM created Thumb-2: a single instruction stream that mixes 16-bit and 32-bit encodings. Common operations stay 16 bits (great density); when an operation needs a big immediate, a high register, or a richer operation, the assembler emits a 32-bit Thumb instruction instead. ARM's commonly quoted figures are roughly 98% of ARM's performance at roughly 75% of the code size. This is the instruction set the Cortex-M family runs, and they run nothing else (the smallest cores, such as Cortex-M0, implement only a small subset of the 32-bit encodings).

; Same source, two encodings chosen automatically by the assembler:
ADDS r0, r1, r2        ; 16-bit Thumb (low regs, simple)
ADDS r0, r1, #0x1000   ; 32-bit Thumb-2 (immediate too big for the 16-bit form)
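Because 16-bit and 32-bit encodings share one stream, the decoder has to tell them apart from the first halfword alone. In the Thumb-2 encoding, a halfword whose top five bits are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction; anything else is a complete 16-bit instruction. The sketch below applies that rule in Python; the function names and the sample stream are illustrative assumptions, not part of any toolchain.

```python
def is_32bit_thumb(first_halfword: int) -> bool:
    """True if this halfword is the first half of a 32-bit Thumb-2 instruction."""
    top5 = (first_halfword >> 11) & 0x1F
    return top5 in (0b11101, 0b11110, 0b11111)


def instruction_sizes(halfwords: list[int]) -> list[int]:
    """Walk a stream of halfwords and return each instruction's size in bytes."""
    sizes = []
    i = 0
    while i < len(halfwords):
        if is_32bit_thumb(halfwords[i]):
            sizes.append(4)
            i += 2  # skip the second halfword of the wide instruction
        else:
            sizes.append(2)
            i += 1
    return sizes


if __name__ == "__main__":
    # 0x1888 = ADDS r0, r1, r2 (16-bit); 0xF501 0x5080 = one wide instruction
    # (first halfword starts 0b11110); 0x4770 = BX lr (16-bit).
    stream = [0x1888, 0xF501, 0x5080, 0x4770]
    print(instruction_sizes(stream))  # [2, 4, 2]
```

Four halfwords hold three instructions here: 8 bytes, where three ARM-state instructions would take 12.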

5. Switching state: the T bit

On cores that support both, a single bit — the T bit in the program status register — selects ARM or Thumb state. You don't set it directly; you switch with an interworking branch:

BX Rn and BLX Rn use bit 0 of the target address to pick the state: 1 = Thumb, 0 = ARM. (BLX with a label instead of a register always switches to the other state.)
That's why function pointers to Thumb code have their low bit set — it's a state tag, not part of the address.

LDR r0, =thumb_func+1   ; +1 sets bit0 → request Thumb state
BX  r0                  ; branch and switch to Thumb
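The bit-0 convention can be modelled in a few lines. This is a sketch in Python rather than anything the hardware exposes: `decode_bx_target` is a hypothetical helper that splits a branch target into the state it requests and the address actually fetched from, and the sample address is made up.

```python
def decode_bx_target(target: int) -> tuple[str, int]:
    """Split a BX/BLX register target into (requested state, fetch address)."""
    state = "Thumb" if target & 1 else "ARM"
    fetch_address = target & ~1  # bit 0 is a state tag, not part of the address
    return state, fetch_address


if __name__ == "__main__":
    thumb_func = 0x0800_0100  # assumed address, for illustration only
    for target in (thumb_func + 1, thumb_func):
        state, address = decode_bx_target(target)
        print(state, hex(address))
    # Thumb 0x8000100
    # ARM 0x8000100
```

Both targets fetch from the same address; only the requested state differs.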

On Cortex-M there's nothing to switch — the core is Thumb-only. The T bit is effectively always 1, and trying to enter ARM state faults. On Cortex-A/R you can interwork between the two.
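For reference, the T bit sits at bit 5 of the CPSR on cores that have an ARM state, and at bit 24 of the xPSR on Cortex-M. Here is a small Python sketch of testing it in a saved status-register value; the function name and the sample values are assumptions for illustration.

```python
CPSR_T_BIT = 1 << 5    # T bit in the CPSR (cores with both ARM and Thumb states)
XPSR_T_BIT = 1 << 24   # T bit in the xPSR (Cortex-M)


def in_thumb_state(psr: int, cortex_m: bool = False) -> bool:
    """Report whether a status-register value has its T bit set."""
    mask = XPSR_T_BIT if cortex_m else CPSR_T_BIT
    return bool(psr & mask)


if __name__ == "__main__":
    print(in_thumb_state(0x0000_0010))                 # False: T clear, ARM state
    print(in_thumb_state(0x0000_0030))                 # True: T set, Thumb state
    print(in_thumb_state(0x0100_0000, cortex_m=True))  # True: xPSR with T set
```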

✅ The mental model

Thumb trades a little flexibility for a lot of code density by shrinking instructions to 16 bits. Thumb-2 removes the downside by mixing 16- and 32-bit forms, giving near-ARM performance at far smaller size — which is why every Cortex-M microcontroller is Thumb-only.

🎯 Day 16 takeaways

Thumb = mostly 16-bit instructions → ~25–35% smaller code.
Trade-offs: low registers, smaller immediates, no per-instruction conditionals.
Thumb-2 mixes 16/32-bit → near-ARM speed + near-Thumb density.
The T bit selects state; BX/BLX switch using target bit 0.
Cortex-M is Thumb-only — it never enters classic ARM state.

Quick check

Roughly how much smaller is Thumb code than 32-bit ARM, and why?
What does Thumb-2 add that plain Thumb lacked?
What does bit 0 of a BX target address select?

FAQ

What is Thumb?

A mostly-16-bit ARM instruction set designed for code density, at a small cost in reachable registers and in what each instruction can do.

What is Thumb-2?

A mixed 16/32-bit extension giving near-ARM performance with near-Thumb size; it's what Cortex-M runs.

How do you switch ARM/Thumb state?

The T bit selects it; BX/BLX set it from bit 0 of the target address. Cortex-M is Thumb-only.
