What is the VFP in ARM?

VFP, the Vector Floating Point unit, is ARM's hardware floating-point unit (FPU). It executes IEEE 754 floating-point arithmetic directly in hardware, far faster than emulating it in software. In AArch64 the floating-point and NEON SIMD units share the same V register file, and the scalar floating-point registers are accessed as S (32-bit), D (64-bit) and H (16-bit) views.

IEEE 754 is the standard that defines how floating-point numbers are represented and computed. It specifies formats such as half precision (16-bit), single precision (32-bit, called float) and double precision (64-bit, called double), each split into a sign, exponent and fraction, plus rules for rounding, infinities, and not-a-number (NaN) values.

What is the difference between soft-float and hard-float?

Hard-float means the compiler emits real floating-point instructions that run on the hardware FPU and passes floating-point arguments in FP registers. Soft-float means floating-point operations are emulated by library routines in integer code, used when no FPU is present. Hard-float is much faster; the two are different ABIs and cannot be freely mixed.

What are floating-point rounding modes?

Because most real numbers cannot be represented exactly, results must be rounded. IEEE 754 defines several rounding modes, including round to nearest (ties to even), round toward zero, round toward positive infinity, and round toward negative infinity. On ARM the active mode is selected in the FPCR control register, and round to nearest even is the default.

DAY 28 · ADVANCED (64-BIT & BEYOND)

Floating Point — The VFP Unit

By EcrioniX · Updated Jun 6, 2026

Everything so far has been integers. But graphics, physics, audio, scientific code and AI all live in the world of real numbers with fractions — and that needs floating point. Today we'll see how ARM does fractional math in hardware via the VFP/FPU, what IEEE 754 actually stores, and the subtle pitfalls (rounding, denormals, ABIs) that trip up real engineers.

1. Why hardware floating point?

You can do fractional math with integers (fixed-point), and tiny microcontrollers without an FPU must emulate floats in software — slow, hundreds of cycles per operation. A hardware FPU (Floating-Point Unit) executes a float add or multiply in just a few cycles. For any workload heavy in decimals — 3D transforms, DSP, neural nets — a hardware FPU is the difference between smooth and unusable.

On ARM this unit is historically called the VFP (Vector Floating Point). In modern AArch64 it's tightly integrated with NEON (Day 27) — they share the same V register file.

2. IEEE 754 — how a float is stored

Floating point follows the IEEE 754 standard. A number is stored as three fields — a sign, an exponent, and a fraction (mantissa) — essentially scientific notation in binary: value = ±1.fraction × 2^exponent.

Precision	Bits	Sign / Exp / Frac	ARM name
Half	16	1 / 5 / 10	H
Single (float)	32	1 / 8 / 23	S
Double (double)	64	1 / 11 / 52	D

More exponent bits → larger range; more fraction bits → more precision. The standard also defines special values: ±0, ±∞ (infinity), and NaN (Not-a-Number, e.g. from 0/0). Half precision has surged in importance because machine-learning inference often uses 16-bit floats to halve memory and double throughput.

💡 The "1.5 isn't 1.5" gotcha

Because the fraction is finite binary, many decimals can't be stored exactly — classically 0.1 + 0.2 ≠ 0.3 exactly. This isn't a bug; it's the nature of IEEE 754. Never compare floats with ==; compare within a small tolerance (epsilon).

3. The floating-point register file

In AArch64 the FP and NEON units share 32 registers V0–V31. For scalar floating point you access them through width-specific views:

H0–H31 — 16-bit (half) views
S0–S31 — 32-bit (single) views
D0–D31 — 64-bit (double) views
Q0–Q31 / V0–V31 — the full 128-bit registers used by NEON vectors

So S5, D5 and V5 are the same physical register viewed at different widths — exactly mirroring how X and W share GPRs in Day 26.

FADD s0, s1, s2 // single-precision add: s0 = s1 + s2 FMUL d0, d1, d2 // double-precision multiply FMADD d0, d1, d2, d3 // fused multiply-add: d0 = d1*d2 + d3 (one rounding) SCVTF s0, w0 // convert integer w0 to float s0

Note FMADD — a fused multiply-add computes a*b+c with a single rounding, which is both faster and more accurate than separate multiply then add. It's the backbone of dot products and matrix math.

4. Control & status: FPCR and FPSR

Two special registers govern FP behaviour:

FPCR (Floating-Point Control Register) — selects the rounding mode and behaviour switches like flush-to-zero for denormals.
FPSR (Floating-Point Status Register) — sticky exception flags recording whether an operation was inexact, overflowed, underflowed, divided by zero, or was invalid.

Rounding modes

Since results rarely fit exactly, IEEE 754 defines how to round. The FPCR selects one:

Round to nearest, ties to even — the default and the right choice for almost everything.
Round toward zero (truncate), toward +∞, toward −∞ — for specialised numeric needs.

Denormals (subnormals)

Numbers extremely close to zero are represented as denormals. They preserve accuracy near zero but can be very slow on some hardware. Performance-critical audio/DSP code often sets flush-to-zero in FPCR, trading a sliver of accuracy for consistent speed.

5. Soft-float vs hard-float — an ABI you must match

This catches many embedded developers. There are two ways floating point reaches the hardware, and they're incompatible ABIs:

	Hard-float	Soft-float
FP ops	real FPU instructions	emulated in software libraries
FP args passed in	FP registers (S/D)	integer registers
Speed	fast	slow
Needs FPU?	yes	no

Every object you link must use the same float ABI — mixing a hard-float library with soft-float code produces subtle, maddening bugs or link errors. On chips with an FPU (all AArch64), hard-float is standard. On the smallest MCUs without an FPU, soft-float is the only option. A middle option, softfp, uses the FPU but passes args in integer registers for compatibility.

6. FP and performance

Prefer single precision when accuracy allows — half the memory bandwidth and often double the throughput of double precision.
Use FMA (fused multiply-add) for dot products and polynomials.
Vectorize with NEON to do multiple floats per instruction (Day 27).
Watch denormals in DSP; enable flush-to-zero if a tiny accuracy loss is acceptable.
Avoid needless int↔float conversions in hot loops — they cost cycles.

✅ The mental model

The VFP/FPU does fractional math in hardware using IEEE 754 (sign + exponent + fraction) at half (H), single (S) and double (D) precision, in registers that share NEON's V file. FPCR sets rounding and denormal behaviour; FPSR records exception flags. Match your hard-/soft-float ABI everywhere, never compare floats with ==, and prefer single precision + FMA for speed.

🎯 Day 28 takeaways

A hardware FPU (VFP) does float math in a few cycles vs hundreds in software.
IEEE 754: sign + exponent + fraction; half (16), single (32), double (64).
FP registers share NEON's V file; scalar views are H/S/D.
FPCR = rounding + flush-to-zero; FPSR = sticky exception flags.
Match hard-float vs soft-float ABIs; never compare floats with ==.
For speed: single precision, FMA, NEON vectorization, mind denormals.

Quick check

Name the three IEEE 754 fields and the three common precisions.
Why are S5, D5 and V5 related?
What does the FPCR control, and what does the FPSR record?
Why must all linked code agree on hard- vs soft-float?

FAQ

What is the VFP/FPU?

ARM's hardware floating-point unit; it executes IEEE 754 math directly and shares the V register file with NEON in AArch64.

What is IEEE 754?

The standard for representing floats as sign/exponent/fraction, with half, single and double precision plus rules for rounding, infinities and NaN.

Soft-float vs hard-float?

Hard-float uses real FPU instructions and FP registers for arguments; soft-float emulates FP in software. They are incompatible ABIs.

Why doesn't 0.1 + 0.2 equal 0.3?

Those decimals can't be stored exactly in finite binary, so compare floats within a tolerance, not with ==.

← Back to the full course roadmap