1. Why Error Correction Exists
Every physical communication channel introduces noise — thermal fluctuations, electromagnetic interference, cosmic rays, signal attenuation. In digital systems, noise that crosses a threshold flips a bit: 1 becomes 0 or vice versa. The question is not whether errors occur, but how often and how to handle them.
Two fundamental approaches exist: ARQ (Automatic Repeat Request) — detect the error and ask the sender to retransmit — and FEC (Forward Error Correction) — add enough redundancy that the receiver can reconstruct the original data without retransmission. FEC is essential when:
- Return channel is impossible (deep space — Voyager's signal takes hours to arrive).
- Latency requirements forbid waiting for retransmission (real-time streaming, 5G).
- The channel is broadcast-only (DVB-S2 satellite TV — the broadcast can't retransmit for each viewer).
- Storage media may be partially corrupted (CDs with scratches, aging hard drives).
2. Shannon's Channel Capacity Theorem
Claude Shannon proved in 1948 that every noisy channel has a maximum theoretical capacity C (in bits per second) below which error-free communication is possible with the right code — regardless of the noise level. This is the Shannon Limit.
Where B is bandwidth (Hz) and S/N is the signal-to-noise ratio. Shannon's theorem is existential: it proves perfect codes exist but doesn't tell you how to build them. The entire history of FEC research is the pursuit of codes that approach this theoretical limit with practical complexity.
Distance from the Shannon Limit
| Code Type | Year | Distance from Limit | Complexity |
|---|---|---|---|
| Hamming(7,4) | 1950 | ~3 dB | Very Low |
| Convolutional | 1960s | ~2 dB | Low |
| Reed-Solomon | 1960 | ~1.5 dB | Moderate |
| Turbo Codes | 1993 | ~0.1 dB | High |
| LDPC Codes | 1960/1996 | ~0.05 dB | High |
| Polar Codes | 2008 | ~0.01 dB | Moderate |
3. Hamming Distance & Code Design
The Hamming Distance d(c1, c2) between two codewords is the number of bit positions where they differ. It quantifies how "far apart" two valid codes are. A code with minimum Hamming distance dmin can:
- Detect up to dmin − 1 errors (the received word is not a valid codeword).
- Correct up to ⌊(dmin − 1)/2⌋ errors (find the nearest valid codeword).
This is why FEC design is fundamentally about maximizing the minimum distance between valid codewords. The Shannon Limit is reached when we pack the maximum number of codewords into the space while maintaining minimum distance dmin.
Where n = total codeword length, k = data bits, t = number of correctable errors. This is the Hamming Bound — the maximum number of data bits for a given block length and error correction capability.
4. Parity Bits — The Simplest Protection
A single parity bit is appended to each data block so the total number of 1s is even (even parity) or odd (odd parity). If any single bit flips during transmission, the parity check fails, signaling an error.
| Data | Count of 1s | Even Parity Bit | Transmitted |
|---|---|---|---|
| 1011001 | 4 (even) | 0 | 1011001 0 |
| 1101011 | 5 (odd) | 1 | 1101011 1 |
| 0000000 | 0 (even) | 0 | 0000000 0 |
| 1111111 | 7 (odd) | 1 | 1111111 1 |
Limitations of Simple Parity
- Detection only: Parity detects 1 error but cannot locate or correct it.
- Even errors invisible: Two simultaneous bit flips cancel each other — parity is satisfied and the error is silent.
- 2D Parity: Adding parity bits for both rows and columns of a data block can locate and correct single-bit errors — a rudimentary form of Hamming coding.
5. Hamming Code — SECDED
Richard Hamming at Bell Labs invented this code in 1950, frustrated by having to babysit punch card machines that stopped on errors. Hamming(7,4) takes 4 data bits, adds 3 parity bits at positions 1, 2, and 4 (powers of 2), producing a 7-bit codeword with dmin = 3.
Parity Bit Placement
- P1 (position 1): Covers positions 1, 3, 5, 7 (any position with bit 0 set in binary).
- P2 (position 2): Covers positions 2, 3, 6, 7 (any position with bit 1 set).
- P4 (position 4): Covers positions 4, 5, 6, 7 (any position with bit 2 set).
The syndrome vector formed by checking which parity rules are violated encodes the exact position of the error. If syndrome = 101₂ = 5, the error is at bit position 5 — flip that bit to correct it.
SECDED (Single Error Correction, Double Error Detection) adds one more overall parity bit (making it an 8-bit codeword) that can detect when two bits are in error — preventing false "corrections" of double-bit errors. This is the code used in ECC RAM.
6. Reed-Solomon Codes
Invented in 1960 by Irving Reed and Gustave Solomon, RS codes treat data as polynomials over finite fields (Galois Fields). Unlike Hamming codes that correct individual bit errors, RS codes work on symbols (groups of bits) and excel at correcting burst errors — clusters of consecutive corrupted bits.
RS(n, k): encodes k data symbols into n symbols. Can correct up to (n−k)/2 symbol errors. The famous RS(255, 223) code used in deep space (Voyager, Cassini) corrects up to 16 corrupted bytes out of every 255-byte block.
7. LDPC Codes — The 5G & Wi-Fi 6 Standard
Low-Density Parity-Check (LDPC) codes were invented by Robert Gallager in 1960 but remained computationally impractical until rediscovered in 1996. LDPC codes are defined by sparse parity-check matrices — each row and column has very few 1s, enabling massively parallel decoding hardware.
LDPC decoding uses belief propagation (message-passing algorithm): variable nodes and check nodes exchange "soft" probability messages in an iterative loop, converging to the most likely codeword within 20–50 iterations. This iterative approach allows LDPC to approach the Shannon limit.
| Standard | FEC Scheme | Code Rate | Distance from Shannon |
|---|---|---|---|
| 5G NR (data) | LDPC | 1/3 to 8/9 | ~0.05 dB |
| 5G NR (control) | Polar | variable | ~0.01 dB |
| Wi-Fi 6 (802.11ax) | LDPC | 1/2 to 5/6 | ~0.05 dB |
| 10Gb Ethernet | RS + LDPC | ~0.91 | ~0.1 dB |
| DVB-S2 (satellite) | LDPC + BCH | 1/4 to 9/10 | ~0.08 dB |
8. Turbo Codes
Invented by Berrou, Glavieux, and Thitimajshima in 1993, turbo codes shocked the information theory community by achieving within 0.5 dB of the Shannon limit — something believed theoretically impossible with practical complexity. The key innovation: concatenated convolutional codes with iterative decoding.
A turbo encoder feeds data through two convolutional encoders (one operating on original order, one on an interleaved version), producing two streams of parity bits plus the systematic data. The decoder iteratively refines log-likelihood ratios between two SISO (Soft-In Soft-Out) decoders, each improving the other's estimate. After 6–10 iterations, performance converges.
Turbo codes were the FEC of choice for 3G (UMTS/WCDMA) and 4G LTE, and remain in use for deep space communication (CCSDS standard). LDPC eventually replaced turbo codes in 5G due to better hardware parallelism at high data rates.
9. Polar Codes — The Newest Standard
Invented by Erdal Arıkan in 2008, polar codes are the first provably capacity-achieving codes with explicit construction for binary symmetric channels. They exploit the phenomenon of "channel polarization" — when you combine many copies of a noisy channel, some bit channels become nearly perfect (reliable), while others become nearly useless (noisy). Polar codes send data only through reliable channels.
The 3GPP consortium chose polar codes for the 5G NR control channel in 2016, marking the first time a code achieved near-Shannon-limit performance with a provably optimal construction. Polar code decoding uses Successive Cancellation List (SCL) decoding with CRC-aided selection.
10. Real-World FEC Applications
11. Engineering FAQ
What is the "Cliff Effect" in FEC?
Modern codes like LDPC have a very sharp performance threshold — below a critical SNR, bit error rate (BER) is high; above it, BER plummets almost vertically. This "cliff" can span just 0.5 dB. In satellite systems, this means the link either works nearly perfectly or fails completely — there's no graceful degradation. Engineers must design with margin above the cliff.
What is interleaving and why is it used with FEC?
Most FEC codes are designed to correct random bit errors, but real channels often produce burst errors — many consecutive corrupted bits from fading, scratches, or interference. Interleaving reorders bits before transmission (or on storage) so that a burst error is spread across many codewords, each losing only a few bits that the code can handle. Deinterleaving at the receiver undoes the reordering.
What is soft-decision vs hard-decision decoding?
Hard-decision: the demodulator outputs a definite 0 or 1 for each bit. The decoder works with these binary decisions. Simple but loses the confidence information. Soft-decision: the demodulator outputs a real number (Log-Likelihood Ratio) representing the probability that each bit is 0 or 1. The decoder uses this confidence information — a barely-detected 1 is treated as less certain than a strongly-detected 1. Soft-decision decoding provides ~2–3 dB gain over hard-decision with the same code.
Can FEC fix 100% of errors?
No. Every code has a correction capacity threshold. If the error rate exceeds (dmin−1)/2 per codeword, correction fails — worse, the decoder may "correct" to the wrong codeword, producing silent data corruption. At very high noise, parity bits themselves are corrupted, and the error correction mechanism breaks down entirely (the "Cliff Effect").
What is systematic vs non-systematic encoding?
In systematic encoding, the original data bits appear unchanged in the codeword, with parity bits appended separately. In non-systematic encoding, all codeword bits are functions of the data — original data isn't directly visible. Systematic codes are simpler to decode (just strip the check bits in the error-free case) and are used in most practical systems. LDPC and turbo codes are typically systematic.
Error Correction Lab — 3 Tools
Hamming(7,4) Encoder/Decoder · Parity Calculator · Code Rate Analyzer
Enter 4 data bits, encode to a 7-bit codeword, inject a bit error, then correct it using syndrome analysis.
Click bits to toggle 0/1. See the parity calculation update in real time.
Enter the block length n and data bits k to analyze code rate, overhead, and error correction capacity.