Carry propagate (P) and carry generate (G) signals, ripple carry adder delay, carry lookahead adder (CLA), carry select adder — with Verilog implementations and speed comparison.
In any binary adder, each bit position processes two input bits (A, B) and a carry-in (Cin) to produce a sum bit (S) and carry-out (Cout). The carry propagation mechanism determines how fast the overall addition completes.
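Two key signals are defined for each bit position i:
Generate: Gi = Ai AND Bi. The position produces a carry-out regardless of its carry-in.
Propagate: Pi = Ai XOR Bi. The position passes its carry-in through to its carry-out.
The carry-out is Ci+1 = Gi OR (Pi AND Ci), and the sum output is Si = Pi XOR Ci.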
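As a minimal sketch of how these signals fit into a single bit position, the cell below computes P, G, the sum, and the carry-out from the equations in this article. The module name `pg_full_adder` and its port names are illustrative assumptions, not part of any library.

```verilog
// Illustrative single-bit adder cell built from propagate/generate.
// Module and port names are hypothetical.
module pg_full_adder (
    input  a, b, cin,
    output p,     // propagate: carry-in passes through to carry-out
    output g,     // generate: carry-out produced regardless of carry-in
    output sum,
    output cout
);
    assign p    = a ^ b;          // Pi = Ai XOR Bi
    assign g    = a & b;          // Gi = Ai AND Bi
    assign sum  = p ^ cin;        // Si = Pi XOR Ci
    assign cout = g | (p & cin);  // Ci+1 = Gi OR (Pi AND Ci)
endmodule
```

Exposing `p` and `g` as outputs is the design choice that matters here: a lookahead or prefix network can consume them without waiting for `cin`.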
| Adder Type | Carry Method | Delay | Area | Used In |
|---|---|---|---|---|
| Ripple Carry Adder (RCA) | Serial — carry ripples bit by bit | O(n) | Minimal | Small widths (<8-bit), area-critical |
| Carry Lookahead Adder (CLA) | Parallel: carries computed directly from P and G | O(log n) with hierarchical lookahead | Moderate | ALUs, 16–64 bit arithmetic |
| Carry Select Adder | Precompute results for carry-in 0 and 1, then mux-select | O(√n) with graded block sizes | About 2× RCA | Mid-range performance ALUs |
| Prefix Adder (Kogge-Stone, Brent-Kung) | Tree of parallel-prefix P,G operations | O(log n), low fan-out | High (Kogge-Stone) to moderate (Brent-Kung) | High-speed CPU adders (Intel, AMD) |
The ripple carry adder (RCA) is the simplest carry propagation adder. Each full adder waits for the carry-out of the previous stage, so the worst-case delay grows linearly with the bit width.
```verilog
module ripple_carry_adder #(parameter N=8) (
    input  [N-1:0] a, b,
    input          cin,
    output [N-1:0] sum,
    output         cout
);
    wire [N:0] carry;
    assign carry[0] = cin;

    genvar i;
    generate
        for (i=0; i<N; i=i+1) begin : fa_chain
            // One full adder per bit; carry[i+1] feeds the next stage
            assign {carry[i+1], sum[i]} = a[i] + b[i] + carry[i];
        end
    endgenerate

    assign cout = carry[N];
endmodule
```
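One way to check the ripple carry adder is an exhaustive self-checking testbench. The sketch below assumes it is compiled together with the `ripple_carry_adder` module from this article; the testbench name and the choice of N = 4 (512 input combinations) are illustrative.

```verilog
// Illustrative testbench; requires ripple_carry_adder from this article.
module tb_ripple_carry_adder;
    localparam N = 4;

    reg  [N-1:0] a, b;
    reg          cin;
    wire [N-1:0] sum;
    wire         cout;
    reg  [N:0]   expected;   // one extra bit holds the carry-out
    integer      i, errors;

    ripple_carry_adder #(.N(N)) dut (
        .a(a), .b(b), .cin(cin), .sum(sum), .cout(cout)
    );

    initial begin
        errors = 0;
        // Walk through every combination of a, b and cin
        for (i = 0; i < (1 << (2*N + 1)); i = i + 1) begin
            {a, b, cin} = i;
            #1;                          // let combinational logic settle
            expected = a + b + cin;      // evaluated at N+1 bits
            if ({cout, sum} !== expected) begin
                errors = errors + 1;
                $display("Mismatch: a=%h b=%h cin=%b got=%h expected=%h",
                         a, b, cin, {cout, sum}, expected);
            end
        end
        if (errors == 0) $display("All vectors passed");
        else             $display("%0d mismatches", errors);
        $finish;
    end
endmodule
```

Exhaustive checking is only practical for small widths; for wider adders, random vectors plus corner cases such as all-ones operands are the usual substitute.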
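The CLA computes carries directly from the P and G signals instead of waiting for them to ripple. A flat lookahead such as the 4-bit block below has constant logic depth, but gate fan-in grows with width, so wider adders nest lookahead blocks hierarchically, which gives O(log n) delay.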
```verilog
module cla_adder_4bit (
    input  [3:0] a, b,
    input        cin,
    output [3:0] sum,
    output       cout
);
    wire [3:0] P, G;   // propagate and generate per bit
    wire [4:0] C;      // carry at each position

    // Propagate and Generate
    assign P = a ^ b;  // Pi = Ai XOR Bi
    assign G = a & b;  // Gi = Ai AND Bi

    // Carry lookahead: all carries computed in parallel
    assign C[0] = cin;
    assign C[1] = G[0] | (P[0] & C[0]);
    assign C[2] = G[1] | (P[1] & G[0]) | (P[1] & P[0] & C[0]);
    assign C[3] = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0])
                | (P[2] & P[1] & P[0] & C[0]);
    assign C[4] = G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
                | (P[3] & P[2] & P[1] & G[0])
                | (P[3] & P[2] & P[1] & P[0] & C[0]);

    // Sum
    assign sum  = P ^ C[3:0];  // Si = Pi XOR Ci
    assign cout = C[4];
endmodule
```
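To scale lookahead beyond 4 bits without unbounded fan-in, the same equations can be applied a second time to group-level signals. The following is a sketch under stated assumptions: `cla4_pg` and `cla_adder_16bit` are hypothetical modules (not the article's `cla_adder_4bit`), and each 4-bit block exports a group propagate `gp` and group generate `gg`.

```verilog
// Hypothetical 4-bit lookahead block that also exports group P/G.
module cla4_pg (
    input  [3:0] a, b,
    input        cin,
    output [3:0] sum,
    output       gp,   // group propagate: all four bits propagate
    output       gg    // group generate: block produces a carry by itself
);
    wire [3:0] p = a ^ b;
    wire [3:0] g = a & b;
    wire [3:0] c;
    assign c[0] = cin;
    assign c[1] = g[0] | (p[0] & c[0]);
    assign c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c[0]);
    assign c[3] = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                | (p[2] & p[1] & p[0] & c[0]);
    assign sum = p ^ c;
    assign gp  = &p;
    assign gg  = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]);
endmodule

// Hypothetical 16-bit adder: second-level lookahead over four blocks.
module cla_adder_16bit (
    input  [15:0] a, b,
    input         cin,
    output [15:0] sum,
    output        cout
);
    wire [3:0] gp, gg;
    wire [4:0] c;      // c[k] is the carry into block k; c[4] is carry-out
    assign c[0] = cin;
    // Same lookahead equations, applied to group signals
    assign c[1] = gg[0] | (gp[0] & c[0]);
    assign c[2] = gg[1] | (gp[1] & gg[0]) | (gp[1] & gp[0] & c[0]);
    assign c[3] = gg[2] | (gp[2] & gg[1]) | (gp[2] & gp[1] & gg[0])
                | (gp[2] & gp[1] & gp[0] & c[0]);
    assign c[4] = gg[3] | (gp[3] & gg[2]) | (gp[3] & gp[2] & gg[1])
                | (gp[3] & gp[2] & gp[1] & gg[0])
                | (gp[3] & gp[2] & gp[1] & gp[0] & c[0]);

    genvar k;
    generate
        for (k = 0; k < 4; k = k + 1) begin : blk
            cla4_pg u (
                .a(a[4*k +: 4]), .b(b[4*k +: 4]), .cin(c[k]),
                .sum(sum[4*k +: 4]), .gp(gp[k]), .gg(gg[k])
            );
        end
    endgenerate

    assign cout = c[4];
endmodule
```

Because `gp` and `gg` depend only on `a` and `b`, the block carries `c[1]` to `c[4]` are available without rippling through the blocks. Adding a third level in the same way would extend this structure to 64 bits.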
Synthesis note: In modern FPGAs and ASICs, synthesis tools automatically choose the optimal carry chain implementation based on timing constraints. Xilinx FPGAs have dedicated CARRY4/CARRY8 primitives for fast carry propagation. You rarely need to manually instantiate CLA in RTL — just write assign sum = a + b + cin; and let the synthesizer optimize.
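For reference, a behavioral adder that leaves the architecture to the synthesizer might look like the sketch below. The module name `behavioral_adder` and the parameter `N` are illustrative assumptions.

```verilog
// Illustrative behavioral adder; the synthesis tool picks the structure.
module behavioral_adder #(parameter N = 8) (
    input  [N-1:0] a, b,
    input          cin,
    output [N-1:0] sum,
    output         cout
);
    // The (N+1)-bit left-hand side widens the addition to N+1 bits,
    // so the carry-out is kept rather than truncated.
    assign {cout, sum} = a + b + cin;
endmodule
```

Writing the carry-out into the concatenation is what preserves it; assigning to `sum` alone would silently drop the top bit.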
A carry select adder precomputes the sum of each upper block for both carry-in = 0 and carry-in = 1, then uses a mux to select the correct result once the actual carry arrives.
```verilog
module carry_select_adder_8bit (
    input  [7:0] a, b,
    input        cin,
    output [7:0] sum,
    output       cout
);
    wire [3:0] sum_lo, sum_hi0, sum_hi1;
    wire       c4, cout_hi0, cout_hi1;

    // Lower 4 bits: normal ripple carry
    ripple_carry_adder #(4) lo  (.a(a[3:0]), .b(b[3:0]), .cin(cin),
                                 .sum(sum_lo), .cout(c4));

    // Upper 4 bits: precompute for cin=0 and cin=1
    ripple_carry_adder #(4) hi0 (.a(a[7:4]), .b(b[7:4]), .cin(1'b0),
                                 .sum(sum_hi0), .cout(cout_hi0));
    ripple_carry_adder #(4) hi1 (.a(a[7:4]), .b(b[7:4]), .cin(1'b1),
                                 .sum(sum_hi1), .cout(cout_hi1));

    // Select correct upper result based on actual carry
    assign sum  = {(c4 ? sum_hi1 : sum_hi0), sum_lo};
    assign cout = c4 ? cout_hi1 : cout_hi0;
endmodule
```
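The O(√n) figure for carry select adders relies on using several blocks whose widths grow toward the most significant end, so that each block's precomputed results are ready roughly when its select carry arrives. The sketch below illustrates the idea for 16 bits. The module names and the block widths 2, 2, 3, 4, 5 are illustrative assumptions, not tuned for any cell library, and each block uses a behavioral `+` in place of an explicit ripple chain.

```verilog
// Hypothetical carry select block: both results computed, mux picks one.
module csel_block #(parameter W = 4) (
    input  [W-1:0] a, b,
    input          cin,
    output [W-1:0] sum,
    output         cout
);
    wire [W:0] r0 = a + b;          // result assuming carry-in = 0
    wire [W:0] r1 = a + b + 1'b1;   // result assuming carry-in = 1
    assign {cout, sum} = cin ? r1 : r0;
endmodule

// Hypothetical 16-bit adder with graded block widths (2+2+3+4+5 = 16).
module carry_select_adder_16bit (
    input  [15:0] a, b,
    input         cin,
    output [15:0] sum,
    output        cout
);
    wire c2, c4, c7, c11;   // carries between blocks

    csel_block #(2) b0 (.a(a[1:0]),   .b(b[1:0]),   .cin(cin),
                        .sum(sum[1:0]),   .cout(c2));
    csel_block #(2) b1 (.a(a[3:2]),   .b(b[3:2]),   .cin(c2),
                        .sum(sum[3:2]),   .cout(c4));
    csel_block #(3) b2 (.a(a[6:4]),   .b(b[6:4]),   .cin(c4),
                        .sum(sum[6:4]),   .cout(c7));
    csel_block #(4) b3 (.a(a[10:7]),  .b(b[10:7]),  .cin(c7),
                        .sum(sum[10:7]),  .cout(c11));
    csel_block #(5) b4 (.a(a[15:11]), .b(b[15:11]), .cin(c11),
                        .sum(sum[15:11]), .cout(cout));
endmodule
```

Only the mux chain is on the path between blocks. Note that a synthesis tool may restructure or merge the duplicated additions, so preserving this exact architecture generally requires tool-specific constraints.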
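| Adder | 4-bit Delay | 16-bit Delay | 64-bit Delay | Area (relative) |
|---|---|---|---|---|
| Ripple Carry | 4 × T_FA | 16 × T_FA | 64 × T_FA | 1× |
| CLA | 2 × T_gate | 4 × T_gate | 6 × T_gate | 2–3× |
| Carry Select | 3 × T_FA | 6 × T_FA | 12 × T_FA | 2× |
| Kogge-Stone (Prefix) | log₂(4) = 2 levels | log₂(16) = 4 levels | log₂(64) = 6 levels | 4–5× |
Delays are approximate scaling estimates, not measured timings. T_FA is the delay of one full adder and T_gate the delay of one lookahead stage. The CLA and Kogge-Stone figures count stages of the carry network only, so P/G generation and the final sum XOR add a small constant on top.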
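The Kogge-Stone figures count prefix levels. As a minimal sketch of what those levels do, the module below builds the prefix tree with generate loops: at level l, each bit i combines its (G, P) pair with the pair from bit i − 2^l. The module name, parameter, and signal names are illustrative assumptions, and `$clog2` requires Verilog-2005 or later.

```verilog
// Illustrative Kogge-Stone adder; names are hypothetical.
module kogge_stone_adder #(parameter N = 8) (
    input  [N-1:0] a, b,
    input          cin,
    output [N-1:0] sum,
    output         cout
);
    localparam LEVELS = $clog2(N);   // number of prefix levels

    // g[l], p[l]: group generate/propagate after l prefix levels
    wire [N-1:0] g [0:LEVELS];
    wire [N-1:0] p [0:LEVELS];
    assign g[0] = a & b;
    assign p[0] = a ^ b;

    genvar l, i;
    generate
        for (l = 0; l < LEVELS; l = l + 1) begin : level
            for (i = 0; i < N; i = i + 1) begin : node
                if (i >= (1 << l)) begin : combine
                    // Merge with the group ending 2^l positions lower
                    assign g[l+1][i] = g[l][i] | (p[l][i] & g[l][i - (1 << l)]);
                    assign p[l+1][i] = p[l][i] & p[l][i - (1 << l)];
                end else begin : pass
                    // Group already reaches bit 0; pass it through
                    assign g[l+1][i] = g[l][i];
                    assign p[l+1][i] = p[l][i];
                end
            end
        end
    endgenerate

    // Carry into bit i+1 from the group spanning bits i..0 plus cin
    wire [N:0] c;
    assign c[0] = cin;
    generate
        for (i = 0; i < N; i = i + 1) begin : carry
            assign c[i+1] = g[LEVELS][i] | (p[LEVELS][i] & cin);
        end
    endgenerate

    assign sum  = p[0] ^ c[N-1:0];
    assign cout = c[N];
endmodule
```

Every node reads from at most two sources and each level's outputs feed at most two nodes in the next level, which is where the low fan-out comes from. The cost is the wiring: N nodes per level over log₂(N) levels.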
A carry propagation adder (CPA) is the broad category of adders where carry propagates from one bit to the next. Types include: Ripple Carry Adder (serial carry, O(n) delay), Carry Lookahead Adder (parallel carry via P/G signals, O(log n)), Carry Select Adder (precomputed both cases, O(√n)), and Prefix Adders (Kogge-Stone, fastest, O(log n) low fan-out).
For bit i: G_i = A_i AND B_i (generates carry regardless of carry-in). P_i = A_i XOR B_i (propagates carry-in to carry-out). Carry-out: C_{i+1} = G_i OR (P_i AND C_i). These signals let Carry Lookahead Adders compute all carries in parallel.
The Kogge-Stone prefix adder is the fastest of these (O(log n) delay, low fan-out), followed by the CLA, then carry select, with ripple carry slowest at O(n). High-performance CPU ALUs (Intel, AMD) typically use parallel-prefix adders such as Kogge-Stone or Han-Carlson on the critical path.
Just write assign sum = a + b; and let the synthesis tool (Genus, DC, Quartus, Vivado) infer an adder that meets your timing constraints: typically a CLA or prefix structure in ASIC flows and the dedicated carry chain on FPGAs. Manually instantiate a specific adder architecture only when targeting particular library cells or custom FPGA carry chains.