DAY 28 · ADVANCED (64-BIT & BEYOND)

Floating Point — The VFP Unit

By EcrioniX · Updated Jun 6, 2026

Everything so far has been integers. But graphics, physics, audio, scientific code and AI all live in the world of real numbers with fractions — and that needs floating point. Today we'll see how ARM does fractional math in hardware via the VFP/FPU, what IEEE 754 actually stores, and the subtle pitfalls (rounding, denormals, ABIs) that trip up real engineers.

1. Why hardware floating point?

You can do fractional math with integers (fixed-point), and tiny microcontrollers without an FPU must emulate floats in software — slow, hundreds of cycles per operation. A hardware FPU (Floating-Point Unit) executes a float add or multiply in just a few cycles. For any workload heavy in decimals — 3D transforms, DSP, neural nets — a hardware FPU is the difference between smooth and unusable.
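For comparison, here is a minimal sketch of the fixed-point alternative mentioned above, assuming a Q16.16 layout (16 integer bits, 16 fraction bits). The names `to_fixed`, `to_float` and `fixed_mul` are hypothetical helpers chosen for illustration, not part of any library.

```python
FRAC_BITS = 16            # assumption: Q16.16 format
ONE = 1 << FRAC_BITS      # the integer that represents 1.0

def to_fixed(x):
    """Scale a real number into a Q16.16 integer."""
    return int(round(x * ONE))

def to_float(q):
    """Convert a Q16.16 integer back to a real number."""
    return q / ONE

def fixed_mul(a, b):
    """Multiply two Q16.16 values; the shift removes the extra scale factor."""
    return (a * b) >> FRAC_BITS

a = to_fixed(1.5)    # 98304
b = to_fixed(2.25)   # 147456
print(to_float(fixed_mul(a, b)))  # 3.375
```

Everything here is integer arithmetic, which is why it works without an FPU; the cost is that the programmer must track the scale and the limited range by hand.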

On ARM this unit is historically called the VFP (Vector Floating Point). In modern AArch64 it's tightly integrated with NEON (Day 27) — they share the same V register file.

2. IEEE 754 — how a float is stored

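Floating point follows the IEEE 754 standard. A number is stored as three fields: a sign, an exponent, and a fraction (mantissa). It is essentially scientific notation in binary. For normal numbers, value = ±1.fraction × 2^(exponent − bias), where the exponent field is stored with a fixed bias (15 for half, 127 for single, 1023 for double).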

Precision | Bits | Sign / Exp / Frac | ARM name
Half | 16 | 1 / 5 / 10 | H
Single (float) | 32 | 1 / 8 / 23 | S
Double (double) | 64 | 1 / 11 / 52 | D

More exponent bits → larger range; more fraction bits → more precision. The standard also defines special values: ±0, ±∞ (infinity), and NaN (Not-a-Number, e.g. from 0/0). Half precision has surged in importance because machine-learning inference often uses 16-bit floats, which halve memory use and, on hardware with native FP16 arithmetic, can roughly double throughput.
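To make the field layout concrete, the sketch below pulls the sign, exponent and fraction out of a single-precision value using Python's standard `struct` module. The helper names `decode_f32` and `rebuild_normal` are hypothetical, and the rebuild formula only covers normal numbers (exponent field 1 to 254).

```python
import struct

def decode_f32(x):
    """Return (sign, exponent_field, fraction_field) of x stored as a 32-bit float."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

def rebuild_normal(sign, exponent, fraction):
    """Recompute the value of a normal number from its three fields."""
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)

print(decode_f32(-6.5))                   # (1, 129, 5242880)
print(rebuild_normal(*decode_f32(-6.5)))  # -6.5
print(decode_f32(float("inf")))           # (0, 255, 0)
```

Here -6.5 is -1.625 × 2^2, so the exponent field holds 2 + 127 = 129. Infinity uses the all-ones exponent with a zero fraction; NaN uses the same exponent with a non-zero fraction.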

💡 The "0.1 isn't 0.1" gotcha

Because the fraction is finite binary, many decimals can't be stored exactly — classically 0.1 + 0.2 ≠ 0.3 exactly. This isn't a bug; it's the nature of IEEE 754. Never compare floats with ==; compare within a small tolerance (epsilon).
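A short illustration of the tolerance approach, using Python doubles. `math.isclose` is the standard-library comparison; `nearly_equal` is a hypothetical hand-rolled equivalent, and the tolerance values are assumptions that you would tune to your data.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True

def nearly_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """True when a and b differ by less than a relative or absolute tolerance."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(nearly_equal(total, 0.3))   # True
print(nearly_equal(1.0, 1.001))   # False
```

The absolute tolerance matters when both values are near zero, where a purely relative test becomes too strict.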

3. The floating-point register file

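In AArch64 the FP and NEON units share 32 registers V0–V31, each 128 bits wide. For scalar floating point you access them through width-specific views:

  - H0–H31: the low 16 bits (half precision)
  - S0–S31: the low 32 bits (single precision)
  - D0–D31: the low 64 bits (double precision)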

So S5, D5 and V5 are the same physical register viewed at different widths — exactly mirroring how X and W share GPRs in Day 26.
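The aliasing can be modelled in a few lines. This is a toy sketch, not an ARM API: `VRegModel` is a hypothetical class that holds one 128-bit register as a Python integer and exposes the low 16, 32 and 64 bits as the H, S and D views. It assumes the AArch64 behaviour that a scalar write clears the upper bits of the register.

```python
import struct

class VRegModel:
    """Toy model of one 128-bit V register (hypothetical, for illustration only)."""

    def __init__(self):
        self.bits = 0  # whole register as an integer

    def write_d(self, value):
        # Scalar write to the D view: low 64 bits set, upper bits cleared.
        (self.bits,) = struct.unpack("<Q", struct.pack("<d", value))

    def write_s(self, value):
        # Scalar write to the S view: low 32 bits set, upper bits cleared.
        (self.bits,) = struct.unpack("<I", struct.pack("<f", value))

    def read_d_bits(self):
        return self.bits & 0xFFFFFFFFFFFFFFFF

    def read_s_bits(self):
        return self.bits & 0xFFFFFFFF

    def read_h_bits(self):
        return self.bits & 0xFFFF

v5 = VRegModel()
v5.write_d(1.0)
print(hex(v5.read_d_bits()))  # 0x3ff0000000000000
print(hex(v5.read_s_bits()))  # 0x0
```

The views share storage but do not convert: after writing 1.0 as a double, the S view shows the low 32 bits of the double's bit pattern, not 1.0 as a single. Changing precision takes an explicit conversion instruction such as FCVT.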

    FADD  s0, s1, s2        // single-precision add: s0 = s1 + s2
    FMUL  d0, d1, d2        // double-precision multiply
    FMADD d0, d1, d2, d3    // fused multiply-add: d0 = d1*d2 + d3 (one rounding)
    SCVTF s0, w0            // convert integer w0 to float s0

Note FMADD — a fused multiply-add computes a*b+c with a single rounding, which is both faster and more accurate than separate multiply then add. It's the backbone of dot products and matrix math.
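The accuracy benefit of a single rounding can be seen without ARM hardware. The sketch below models a fused multiply-add by computing a*b+c exactly with `fractions.Fraction` and rounding once at the end; `fma_model` is a hypothetical helper, and the inputs are chosen so that the rounded product loses the information that matters.

```python
from fractions import Fraction

def fma_model(a, b, c):
    """a*b + c evaluated exactly, then rounded once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

separate = a * b + c        # product is rounded to 1.0, then the add gives 0.0
fused = fma_model(a, b, c)  # exact product is 1 - 2**-54, so the result is -2**-54

print(separate)  # 0.0
print(fused)     # -2**-54, about -5.55e-17
```

The exact product 1 − 2^-54 sits halfway between two doubles and rounds to 1.0, so the separate multiply and add returns zero, while the single-rounding result keeps the small non-zero term.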

4. Control & status: FPCR and FPSR

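Two special registers govern FP behaviour:

  - FPCR (Floating-Point Control Register): selects the rounding mode and options such as flush-to-zero.
  - FPSR (Floating-Point Status Register): records cumulative exception flags such as invalid operation, divide by zero, overflow, underflow and inexact.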

Rounding modes

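Since results rarely fit exactly, IEEE 754 defines how to round. The FPCR selects one of four modes:

  - Round to nearest, ties to even (the default)
  - Round toward +∞
  - Round toward −∞
  - Round toward zero (truncate)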
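The four IEEE 754 rounding directions (nearest with ties to even, toward +∞, toward −∞, toward zero) are easiest to see on a value that sits exactly halfway between two representable results. The sketch below is only an analogy: it uses Python's `decimal` module to round to one decimal digit rather than changing the hardware mode, and `round_all_modes` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN

# Decimal rounding constants standing in for the four IEEE 754 directions.
MODES = {
    "nearest, ties to even": ROUND_HALF_EVEN,
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
    "toward zero": ROUND_DOWN,
}

def round_all_modes(text):
    """Round a decimal string to one fractional digit under each mode."""
    exact = Decimal(text)
    return {name: str(exact.quantize(Decimal("0.1"), rounding=mode))
            for name, mode in MODES.items()}

print(round_all_modes("2.25"))
# nearest: 2.2, toward +infinity: 2.3, toward -infinity: 2.2, toward zero: 2.2
print(round_all_modes("-2.25"))
# nearest: -2.2, toward +infinity: -2.2, toward -infinity: -2.3, toward zero: -2.2
```

The negative input is what separates "toward −∞" from "toward zero": they agree for positive values and differ for negative ones.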

Denormals (subnormals)

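Numbers closer to zero than the smallest normal value are represented as denormals (subnormals): the exponent field is zero and the implicit leading 1 becomes 0, so precision shrinks gradually instead of dropping straight to zero. They preserve accuracy near zero but can be very slow on some hardware. Performance-critical audio/DSP code often sets the flush-to-zero (FZ) bit in FPCR, trading a sliver of accuracy for consistent speed.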
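A rough sketch of what this means for single precision. The helpers below are hypothetical: they build the smallest normal and smallest subnormal values from their bit patterns, and `flush_to_zero_f32` models the effect of flush-to-zero in software (assuming the flushed result keeps the sign of the input) rather than setting any FPCR bit.

```python
import math
import struct

def bits_to_f32(bits):
    """Interpret a 32-bit pattern as a single-precision value."""
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    return value

def f32_to_bits(x):
    """Return the 32-bit pattern of x stored as single precision."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits

SMALLEST_NORMAL = bits_to_f32(0x00800000)     # 2**-126
SMALLEST_SUBNORMAL = bits_to_f32(0x00000001)  # 2**-149

def is_subnormal_f32(x):
    """Subnormal: exponent field is zero, fraction field is not."""
    bits = f32_to_bits(x)
    return ((bits >> 23) & 0xFF) == 0 and (bits & 0x7FFFFF) != 0

def flush_to_zero_f32(x):
    """Software model of flush-to-zero: subnormals become a signed zero."""
    return math.copysign(0.0, x) if is_subnormal_f32(x) else x

print(flush_to_zero_f32(2.0**-130))  # 0.0
print(flush_to_zero_f32(2.0**-126) == SMALLEST_NORMAL)  # True
```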

5. Soft-float vs hard-float — an ABI you must match

This catches many embedded developers. There are two ways floating point reaches the hardware, and they're incompatible ABIs:

 | Hard-float | Soft-float
FP ops | real FPU instructions | emulated in software libraries
FP args passed in | FP registers (S/D) | integer registers
Speed | fast | slow
Needs FPU? | yes | no

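Every object you link must use the same float ABI. Mixing a hard-float library with soft-float code produces link errors or, worse, subtle bugs where a function reads its arguments from the wrong registers. On AArch64 the standard procedure-call ABI passes FP arguments in FP registers, so hard-float is the norm. On the smallest MCUs without an FPU, soft-float is the only option. On 32-bit ARM a middle option, softfp, uses FPU instructions but passes args in integer registers, so it stays link-compatible with soft-float code. With GCC the choice is made with -mfloat-abi=soft, softfp or hard.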

6. FP and performance

✅ The mental model

The VFP/FPU does fractional math in hardware using IEEE 754 (sign + exponent + fraction) at half (H), single (S) and double (D) precision, in registers that share NEON's V file. FPCR sets rounding and denormal behaviour; FPSR records exception flags. Match your hard-/soft-float ABI everywhere, never compare floats with ==, and prefer single precision + FMA for speed.

🎯 Day 28 takeaways

Quick check

  1. Name the three IEEE 754 fields and the three common precisions.
  2. Why are S5, D5 and V5 related?
  3. What does the FPCR control, and what does the FPSR record?
  4. Why must all linked code agree on hard- vs soft-float?

FAQ

What is the VFP/FPU?

ARM's hardware floating-point unit; it executes IEEE 754 math directly and shares the V register file with NEON in AArch64.

What is IEEE 754?

The standard for representing floats as sign/exponent/fraction, with half, single and double precision plus rules for rounding, infinities and NaN.

Soft-float vs hard-float?

Hard-float uses real FPU instructions and FP registers for arguments; soft-float emulates FP in software. They are incompatible ABIs.

Why doesn't 0.1 + 0.2 equal 0.3?

Those decimals can't be stored exactly in finite binary, so compare floats within a tolerance, not with ==.
