EcrioniX ASIC Project · v1.0
LumaCore-01
RGB to Grayscale ASIC IP — Full RTL · Testbench · Python Pipeline
Node: Generic (technology-independent RTL)
Simulator: Icarus Verilog (iverilog)
Interface: Pixel-streaming, 1 px/clk
Precision: 8-bit input · 8-bit output
Latency: 1 clock cycle
Project Status
Full pipeline complete — RTL · Testbench · Python · Browser tool
Page & Spec: Architecture, IO, formula defined
Python Pipeline: Extract · feed · reconstruct
Verilog DUT: rgb2gray_dut.v (done)
Testbench: rgb2gray_tb.v (done)
Browser Tool: Upload & run online
Synthesis Report: Yosys → SKY130 area/power
Real Hardware Camera Pipeline
Where LumaCore-01 fits in silicon — from photon to grayscale pixel
[Diagram: camera pipeline. Lens → CMOS image sensor (Bayer RGGB) → ADC (RAW12) → MIPI CSI-2 → ISP pipeline in the SoC (demosaic, AWB, gamma, noise reduction, sharpening, lens correction, HDR tone map) → RGB888 frame buffer in DDR SDRAM → LumaCore-01 (R[7:0], G[7:0], B[7:0] in; gray[7:0], valid_out out; 1 px/clk, 1 cycle latency) → 8-bit grayscale (Y) output]
System Architecture
End-to-end data flow from browser upload to grayscale output
[Diagram: user upload (PNG / JPG / BMP) → Python pixel extraction (PIL) → pixel_input.hex → testbench rgb2gray_tb.v → DUT rgb2gray_dut.v → pixel_output.hex → Python reconstruction → gray_output.png, checked by the testbench scoreboard]
Grayscale Conversion Formula
ITU-R BT.601 luminance standard — implemented in integer hardware
Floating Point (Mathematical)
Gray = 0.299 × R  +  0.587 × G  +  0.114 × B
Human eyes are most sensitive to green, less to red, least to blue. These weights reflect perceptual luminance.
Hardware Implementation (Integer Math)
Gray = ( 77×R  +  150×G  +  29×B ) >> 8
77/256 ≈ 0.301  ·  150/256 ≈ 0.586  ·  29/256 ≈ 0.113
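No floating-point hardware is needed, only integer multipliers and a right shift by 8. Because 77 + 150 + 29 = 256, dividing by 256 (the right shift) normalises the sum back to the 8-bit range. The shift truncates rather than rounds, so the result is either equal to the rounded floating-point value or 1 LSB below it.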
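To make the size of the approximation error concrete, the sketch below compares the integer formula with the BT.601 weights using exact integer arithmetic (weights scaled by 1000). The function names `gray_hw`, `gray_ref_rounded` and `max_difference` are illustrative and not part of the project, and the grid step of 5 is an arbitrary choice that keeps the run short.

```python
def gray_hw(r, g, b):
    """Integer formula from the text: (77R + 150G + 29B) >> 8."""
    return (77 * r + 150 * g + 29 * b) >> 8


def gray_ref_rounded(r, g, b):
    """0.299R + 0.587G + 0.114B rounded to nearest, in exact integer math."""
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def max_difference(step=5):
    """Largest |reference - hardware| over a coarse grid of RGB values."""
    # Always include 255 so the extremes are covered for any step.
    levels = sorted(set(range(0, 256, step)) | {255})
    worst = 0
    for r in levels:
        for g in levels:
            for b in levels:
                diff = abs(gray_ref_rounded(r, g, b) - gray_hw(r, g, b))
                worst = max(worst, diff)
    return worst


if __name__ == "__main__":
    print("white :", gray_hw(255, 255, 255))
    print("black :", gray_hw(0, 0, 0))
    print("max |ref - hw| on grid:", max_difference())
```

A full sweep of all 2^24 colours uses the same loop with `step=1`; it is only slower, not different in structure.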
IP Interface — Port List
Pixel-streaming interface, 1 pixel per clock cycle
Port       Width  Direction  Description
clk        1      INPUT      System clock, rising-edge active
rst_n      1      INPUT      Active-low synchronous reset
valid_in   1      INPUT      High when R, G, B inputs hold a valid pixel
R          8      INPUT      Red channel, 8-bit unsigned (0–255)
G          8      INPUT      Green channel, 8-bit unsigned (0–255)
B          8      INPUT      Blue channel, 8-bit unsigned (0–255)
valid_out  1      OUTPUT     High when gray output is valid (1 cycle after valid_in)
gray       8      OUTPUT     Grayscale result, 8-bit unsigned (0–255)
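The handshake implied by the port list can be illustrated with a small cycle-level model. This is a behavioural sketch in Python, not the RTL: the class `Rgb2GrayModel` and the helper `stream` are hypothetical names, and the model assumes the outputs are plain registers updated on every rising edge, with `rst_n` low clearing them.

```python
class Rgb2GrayModel:
    """Cycle-level behavioural model of the port list (illustrative only)."""

    def __init__(self):
        self.valid_out = 0
        self.gray = 0

    def clock(self, rst_n, valid_in, r, g, b):
        """Apply one rising clock edge with the given input values."""
        if not rst_n:
            # Synchronous active-low reset clears both output registers.
            self.valid_out, self.gray = 0, 0
        else:
            self.valid_out = valid_in
            self.gray = (77 * r + 150 * g + 29 * b) >> 8


def stream(pixels):
    """Drive one pixel per clock and record the outputs seen in each cycle."""
    dut = Rgb2GrayModel()
    dut.clock(rst_n=0, valid_in=0, r=0, g=0, b=0)  # one reset cycle
    # One trailing idle cycle so the last pixel can be observed.
    inputs = [(1, *p) for p in pixels] + [(0, 0, 0, 0)]
    trace = []
    for cycle, (valid_in, r, g, b) in enumerate(inputs):
        # Outputs visible during this cycle come from the previous edge.
        trace.append((cycle, dut.valid_out, dut.gray))
        dut.clock(1, valid_in, r, g, b)
    return trace


if __name__ == "__main__":
    for cycle, valid_out, gray in stream([(255, 255, 255), (0, 0, 0)]):
        print(f"cycle {cycle}: valid_out={valid_out} gray={gray}")
```

In the trace, the pixel driven in cycle n shows up on `gray` with `valid_out` high in cycle n+1, which is the 1-cycle latency stated in the table.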
Python Pipeline Script
Extracts pixels from any image format → feeds Verilog TB → reconstructs output image
Python rgb2gray_pipeline.py
#!/usr/bin/env python3
# LumaCore-01 Pipeline
# Converts any image to grayscale via Verilog DUT simulation
# Usage: python rgb2gray_pipeline.py <input_image> [output_image]

import sys, subprocess
from PIL import Image

# ── Paths ────────────────────────────────────────────────
INPUT_HEX  = "pixel_input.hex"
OUTPUT_HEX = "pixel_output.hex"
DUT_SRC    = "rgb2gray_dut.v"
TB_SRC     = "rgb2gray_tb.v"
SIM_BIN    = "rgb2gray_sim"

def extract_pixels(image_path):
    """Load image, convert to RGB, write hex file for testbench."""
    img = Image.open(image_path).convert("RGB")
    width, height = img.size
    pixels = list(img.getdata())

    with open(INPUT_HEX, "w") as f:
        # First line: width height (decimal)
        f.write(f"{width} {height}\n")
        for r, g, b in pixels:
            # Each pixel: RR GG BB in hex, space-separated
            f.write(f"{r:02X} {g:02X} {b:02X}\n")

    print(f"[1/4] Extracted {len(pixels)} pixels ({width}x{height}) → {INPUT_HEX}")
    return width, height, len(pixels)

def compile_verilog():
    """Compile DUT + TB with iverilog."""
    cmd = ["iverilog", "-o", SIM_BIN, TB_SRC, DUT_SRC]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print("[ERROR] iverilog compilation failed:")
        print(result.stderr)
        sys.exit(1)
    print(f"[2/4] Compiled {DUT_SRC} + {TB_SRC} → {SIM_BIN}")

def run_simulation():
    """Run vvp simulation — TB reads INPUT_HEX, writes OUTPUT_HEX."""
    result = subprocess.run(["vvp", SIM_BIN], capture_output=True, text=True)
    if result.returncode != 0:
        print("[ERROR] Simulation failed:")
        print(result.stderr)
        sys.exit(1)
    print(f"[3/4] Simulation complete → {OUTPUT_HEX}")
    # Print any DUT $display messages
    if result.stdout.strip():
        print(result.stdout)

def reconstruct_image(width, height, output_path):
    """Read gray hex values from TB output, reconstruct PNG."""
    gray_values = []
    with open(OUTPUT_HEX, "r") as f:
        for line in f:
            line = line.strip()
            if line:
                gray_values.append(int(line, 16))

    if len(gray_values) != width * height:
        print(f"[WARN] Expected {width*height} pixels, got {len(gray_values)}")

    gray_img = Image.new("L", (width, height))
    gray_img.putdata(gray_values)
    gray_img.save(output_path)
    print(f"[4/4] Grayscale image saved → {output_path}")
    return gray_img

def verify(input_path, output_path):
    """Cross-check DUT output against Python reference model."""
    src  = Image.open(input_path).convert("RGB")
    dut  = Image.open(output_path).convert("L")
    pixels = list(src.getdata())
    dut_px = list(dut.getdata())

    errors, max_err = 0, 0
    for idx, (rgb, g_dut) in enumerate(zip(pixels, dut_px)):
        r, g, b = rgb
        g_ref = ((77*r + 150*g + 29*b) >> 8)
        err = abs(g_ref - g_dut)
        if err > 0:
            errors += 1
            max_err = max(max_err, err)
            if errors <= 5:  # show first 5 mismatches
                print(f"  MISMATCH px[{idx}]: R={r} G={g} B={b} → ref={g_ref} dut={g_dut} err={err}")

    total = len(pixels)
    print(f"\n=== Scoreboard: {total-errors}/{total} pixels PASS | max_err={max_err} ===")
    if errors == 0:
        print("✓ DUT output matches reference model — ALL PASS")
    else:
        print(f"✗ {errors} pixel mismatches detected")

def run(input_image, output_image="gray_output.png"):
    print(f"\n{'='*50}")
    print("  LumaCore-01 · RGB to Grayscale ASIC IP Pipeline")
    print(f"{'='*50}\n")
    width, height, n_px = extract_pixels(input_image)
    compile_verilog()
    run_simulation()
    reconstruct_image(width, height, output_image)
    verify(input_image, output_image)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python rgb2gray_pipeline.py <input_image> [output_image]")
        sys.exit(1)
    out = sys.argv[2] if len(sys.argv) > 2 else "gray_output.png"
    run(sys.argv[1], out)
File Format Specification
How Python and the Testbench exchange pixel data
HEX pixel_input.hex
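# Line 1: width height (decimal)
640 480
# Lines 2+: RR GG BB per pixel (hex)
FF 00 7A
FE 01 79
A0 B2 C3
...
# The "#" lines here are annotations only. The generated file contains no
# comment lines, because the testbench reads it with $fscanf.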
HEX pixel_output.hex
# One gray value per line (hex), written by the testbench from the DUT output
# The values below correspond to the three sample input pixels above
5A
5A
AE
...
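The two file layouts can be exercised without running the simulator. The sketch below formats and parses them as in-memory strings; the helper names `format_input_hex`, `parse_input_hex` and `parse_output_hex` are illustrative and do not appear in the pipeline script, and the parsers assume files without annotation lines.

```python
def format_input_hex(width, height, pixels):
    """Build pixel_input.hex text: 'width height', then 'RR GG BB' per pixel."""
    lines = [f"{width} {height}"]
    lines += [f"{r:02X} {g:02X} {b:02X}" for r, g, b in pixels]
    return "\n".join(lines) + "\n"


def parse_input_hex(text):
    """Inverse of format_input_hex: return (width, height, [(r, g, b), ...])."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    width, height = (int(tok) for tok in lines[0].split())
    pixels = [tuple(int(tok, 16) for tok in ln.split()) for ln in lines[1:]]
    return width, height, pixels


def parse_output_hex(text):
    """Parse pixel_output.hex text: one hex gray value per non-empty line."""
    return [int(ln, 16) for ln in text.splitlines() if ln.strip()]


if __name__ == "__main__":
    text = format_input_hex(2, 1, [(0xFF, 0x00, 0x7A), (0xFE, 0x01, 0x79)])
    print(text, end="")
    print(parse_input_hex(text))
    print(parse_output_hex("00\n7F\nFF\n"))
```

Keeping the header in decimal and the pixel data in hex matches the `%d %d` and `%h %h %h` reads in the testbench.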
Verilog DUT — rgb2gray_dut.v
Synthesizable RTL · 1 pixel/clock · 1 cycle latency · technology-independent
[Diagram: R[7:0], G[7:0], B[7:0] → ×77, ×150, ×29 (16-bit products) → Σ luma[15:0] → [15:8] slice (>>8) → FF on posedge clk → gray[7:0]; valid_in → FF → valid_out]
Verilog rgb2gray_dut.v
// ============================================================
//  LumaCore-01 — RGB to Grayscale ASIC IP · DUT
//  EcrioniX · https://ecrionix.org/tools/rgb2gray-ip/
// ============================================================

`timescale 1ns/1ps
`default_nettype none

module rgb2gray_dut (
    input  wire        clk,
    input  wire        rst_n,      // active-low synchronous reset
    input  wire        valid_in,
    input  wire [7:0]  R,
    input  wire [7:0]  G,
    input  wire [7:0]  B,
    output reg         valid_out,
    output reg  [7:0]  gray
);

    // ── Coefficient multiply ──────────────────────────────
    // Widen to 16 bits before multiply to avoid truncation.
    // Max: 77×255 + 150×255 + 29×255 = 256×255 = 65280  (fits in 16 bits)
    wire [15:0] r_term = {8'b0, R} * 16'd77;
    wire [15:0] g_term = {8'b0, G} * 16'd150;
    wire [15:0] b_term = {8'b0, B} * 16'd29;
    wire [15:0] luma   = r_term + g_term + b_term;

    // ── Registered output (1 cycle latency) ───────────────
    always @(posedge clk) begin
        if (!rst_n) begin
            valid_out <= 1'b0;
            gray      <= 8'd0;
        end else begin
            valid_out <= valid_in;
            gray      <= luma[15:8];  // >> 8  (divide by 256)
        end
    end

endmodule

`default_nettype wire
Why 16-bit intermediate?
R term max 77 × 255 = 19,635
G term max 150 × 255 = 38,250
B term max 29 × 255 = 7,395
Sum max = 0xFF00 65,280 → fits 16 bits ✓
luma[15:8] selects the top byte — equivalent to dividing by 256 with no hardware divider needed.
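The width argument above can be checked with a few lines of arithmetic. This is a sketch for illustration only; `COEFFS`, `term_widths` and `top_byte` are hypothetical names, and `top_byte` models the `luma[15:8]` slice under the assumption that `luma` is a 16-bit unsigned value.

```python
COEFFS = {"R": 77, "G": 150, "B": 29}


def term_widths():
    """Bits needed for each worst-case product (input = 255) and for their sum."""
    widths = {name: (coeff * 255).bit_length() for name, coeff in COEFFS.items()}
    widths["sum"] = (sum(COEFFS.values()) * 255).bit_length()
    return widths


def top_byte(luma):
    """Model of luma[15:8]: keep bits 15..8 of a 16-bit value."""
    return (luma & 0xFFFF) >> 8


if __name__ == "__main__":
    print(term_widths())
    worst_case = sum(COEFFS.values()) * 255
    print(hex(worst_case), "->", top_byte(worst_case))
```

The green product is the only term that needs all 16 bits on its own, and the worst-case sum still fits in 16 bits, so no carry is lost before the slice.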
Verilog Testbench — rgb2gray_tb.v
Pure Verilog · $fopen / $fscanf / $fwrite · built-in scoreboard · 1-pixel/clock pipeline
Verilog rgb2gray_tb.v
`timescale 1ns/1ps
`default_nettype none

module rgb2gray_tb;

  reg        clk = 0, rst_n = 0, valid_in = 0;
  reg  [7:0] R = 0, G = 0, B = 0;
  wire       valid_out;
  wire [7:0] gray;

  rgb2gray_dut dut (
    .clk(clk), .rst_n(rst_n), .valid_in(valid_in),
    .R(R), .G(G), .B(B),
    .valid_out(valid_out), .gray(gray)
  );

  always #5 clk = ~clk;   // 100 MHz

  integer fin, fout, ret;
  integer width, height, n_pixels;
  integer r_val, g_val, b_val, r_prev, g_prev, b_prev;
  integer ref_gray, pass_count, fail_count, i;

  initial begin
    fin  = $fopen("pixel_input.hex",  "r");
    fout = $fopen("pixel_output.hex", "w");
    if (fin  == 0) begin $display("[ERROR] pixel_input.hex not found"); $finish; end
    if (fout == 0) begin $display("[ERROR] cannot open pixel_output.hex for writing"); $finish; end

    ret = $fscanf(fin, "%d %d", width, height);
    n_pixels = width * height;
    $display("[TB] %0dx%0d = %0d pixels", width, height, n_pixels);

    // Reset
    repeat(4) @(posedge clk);
    @(negedge clk); rst_n = 1;

    pass_count = 0; fail_count = 0;

    // Prime pipeline: drive pixel 0
    ret = $fscanf(fin, "%h %h %h", r_val, g_val, b_val);
    r_prev = r_val; g_prev = g_val; b_prev = b_val;
    @(negedge clk);
    valid_in = 1; R = r_val[7:0]; G = g_val[7:0]; B = b_val[7:0];

    // Pipeline loop: capture i-1, drive i
    for (i = 1; i < n_pixels; i = i + 1) begin
      ret = $fscanf(fin, "%h %h %h", r_val, g_val, b_val);
      @(negedge clk);
      if (valid_out) begin
        ref_gray = ((77*r_prev)+(150*g_prev)+(29*b_prev)) >> 8;
        $fwrite(fout, "%02X\n", gray);
        if (gray === ref_gray[7:0]) pass_count = pass_count + 1;
        else begin
          fail_count = fail_count + 1;
          if (fail_count <= 5)
            $display("[MISMATCH] px[%0d] ref=%0d dut=%0d", i-1, ref_gray[7:0], gray);
        end
      end
      R = r_val[7:0]; G = g_val[7:0]; B = b_val[7:0];
      r_prev = r_val; g_prev = g_val; b_prev = b_val;
    end

    // Drain: capture last pixel
    @(negedge clk); valid_in = 0; R = 0; G = 0; B = 0;
    if (valid_out) begin
      ref_gray = ((77*r_prev)+(150*g_prev)+(29*b_prev)) >> 8;
      $fwrite(fout, "%02X\n", gray);
      if (gray === ref_gray[7:0]) pass_count = pass_count + 1;
      else fail_count = fail_count + 1;
    end

    $fclose(fin); $fclose(fout);
    $display("================================================");
    $display("  PASS : %0d / %0d", pass_count, n_pixels);
    $display("  FAIL : %0d", fail_count);
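    // Use if/else rather than a ternary on string literals, which is not portable in $display
    if (fail_count == 0) $display("  RESULT : *** ALL PASS ***");
    else                 $display("  RESULT : *** FAIL ***");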
    $display("================================================");
    $finish;
  end
endmodule

`default_nettype wire
Try It Online
Upload any image — Python extracts pixels, Verilog DUT processes them via iverilog, Python reconstructs output