The Simple Definition
A GPU (Graphics Processing Unit) is a specialised chip designed to do one thing extremely well: run thousands of small calculations at exactly the same time.
It was originally invented to draw the pixels on your screen — every image you see on a monitor is millions of coloured dots, and the GPU calculates what colour each dot should be, many times per second. But today the GPU does far more than just graphics.
A GPU is a chip with thousands of tiny processors that all work at the same time — making it incredibly fast for tasks that can be broken into many small parallel jobs.
CPU vs GPU — The Big Difference
The easiest way to understand a GPU is to compare it to the CPU (the main processor in your computer).
CPU = a few expert chefs. A restaurant with 4 master chefs can cook incredibly complex dishes, one at a time, very quickly. Each chef can handle any task — chopping, frying, plating — and make smart decisions on the fly.
GPU = a massive factory with 5,000 workers. Each worker only does one simple job (tighten a bolt, paint a part), but because there are thousands working simultaneously, they can produce vastly more output per second than the few chefs — as long as the task can be broken into simple repetitive steps.
🧠 CPU
- 4–32 powerful cores
- Handles complex, varied tasks
- Fast at sequential work (one step at a time)
- Runs your operating system, apps, logic
- Large cache memory, smart branch prediction
- Best for: web browsing, office, game logic
⚡ GPU
- 3,000–16,000+ small cores
- Handles simple, repetitive tasks at scale
- Fast at parallel work (many jobs at once)
- Runs graphics, AI, video, simulations
- High memory bandwidth, specialised hardware
- Best for: gaming, AI training, video rendering
The CPU and GPU work together. The CPU handles the decision-making (game logic, physics, AI behaviour) and offloads the repetitive heavy lifting to the GPU, such as drawing the roughly 8.3 million pixels of a 4K frame 60 times per second.
How a GPU Actually Works
Imagine you need to add 1 to every number in a list of one million numbers.
The CPU way
A CPU with 8 cores processes 8 numbers at a time. One million numbers ÷ 8 = 125,000 rounds of work. Each round is quick, but the rounds still happen one after another.
The GPU way
A GPU with 5,000 cores processes 5,000 numbers at a time. One million numbers ÷ 5,000 = only 200 rounds of work. The same task finishes in a fraction of the time.
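To make this concrete, here is a minimal CUDA sketch of the GPU way: one lightweight thread per number, all launched at once. The kernel name add_one and the 256-thread block size are illustrative choices, not anything prescribed above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread handles exactly one element: read it, add 1, write it back.
__global__ void add_one(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n) data[i] += 1.0f;
}

int main() {
    const int n = 1000000;
    float *d;
    cudaMalloc(&d, n * sizeof(float));    // allocate on the GPU
    cudaMemset(d, 0, n * sizeof(float));  // start every element at 0

    // Launch enough 256-thread blocks to cover all one million elements.
    add_one<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();

    float first;
    cudaMemcpy(&first, d, sizeof(float), cudaMemcpyDeviceToHost);
    printf("first element is now %.1f\n", first);  // prints 1.0
    cudaFree(d);
    return 0;
}
```

On real hardware the threads are scheduled in batches across the available cores, so the "5,000 at a time" picture is a simplification, but the programming model really is one thread per element.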
A 1080p screen has 2,073,600 pixels. At 60 fps, the GPU must calculate the colour of every pixel 60 times per second — that is 124 million pixel calculations per second. Only massively parallel hardware can do this in real time.
The rendering pipeline (simplified)
When a game renders a frame, the GPU works through these steps:
1. Geometry stage: The 3D positions of every triangle in the scene are calculated and projected onto your 2D screen.
2. Rasterisation: Each triangle is broken into pixels (fragments).
3. Shading: For every pixel, the GPU calculates its colour based on lighting, texture maps, shadows, and reflections.
4. Output: The final image is sent to your display, 60, 120, or 240 times per second depending on your monitor's refresh rate.
Steps 2 and 3 involve millions of independent pixels — perfect for parallel processing.
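As a rough illustration of step 3, here is a toy CUDA kernel that assigns one thread to each of the 2,073,600 pixels of a 1080p frame. Real shading runs through graphics APIs such as Direct3D or Vulkan, and the gradient "lighting" below is a placeholder, so treat this purely as a sketch of the one-thread-per-pixel idea.

```cuda
#include <cuda_runtime.h>

// One thread per pixel: 1920 x 1080 = 2,073,600 threads run this together,
// which is why the shading stage maps so naturally onto a GPU.
__global__ void shade(uchar4 *frame, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Placeholder "lighting": a simple colour gradient stands in for the
    // real texture, shadow, and reflection maths.
    unsigned char r = (unsigned char)(255 * x / width);
    unsigned char g = (unsigned char)(255 * y / height);
    frame[y * width + x] = make_uchar4(r, g, 128, 255);
}

int main() {
    const int w = 1920, h = 1080;
    uchar4 *frame;
    cudaMalloc(&frame, w * h * sizeof(uchar4));

    dim3 block(16, 16);                       // 256 threads per block
    dim3 grid((w + 15) / 16, (h + 15) / 16);  // enough blocks to cover the frame
    shade<<<grid, block>>>(frame, w, h);
    cudaDeviceSynchronize();

    cudaFree(frame);
    return 0;
}
```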
What's Inside a GPU?
A modern GPU is a complex chip with several specialised units. Here are the key parts explained simply:
- Shader cores (CUDA cores / stream processors): thousands of small arithmetic units that do the parallel number-crunching.
- VRAM: dedicated high-speed memory that holds textures, frame buffers, and working data.
- Memory controller: moves data between the cores and VRAM at hundreds of gigabytes per second.
- RT cores: specialised units that accelerate ray tracing calculations.
- Tensor / AI cores: units built for the matrix maths behind AI workloads and upscaling.
- Video engine: dedicated hardware that encodes and decodes video (H.264, HEVC, AV1) without using the main cores.
What Can a GPU Do?
GPUs started in gaming but have expanded to almost every area of computing that needs massive parallel processing:
- Gaming and real-time 3D graphics
- AI training and inference
- Video editing, encoding, and live streaming
- Scientific computing and simulation (weather, physics, molecular modelling)
- 3D rendering, CGI, and animation
- Cryptography and, historically, cryptocurrency mining
Integrated vs Discrete GPU
There are two main types of GPU in modern devices:
🔧 Integrated GPU
- Built into the same chip as the CPU
- Shares the computer's main RAM (no dedicated VRAM)
- Lower performance — suited for video playback, basic tasks
- Very low power consumption and no extra heat
- Found in: laptops, phones, tablets, Apple M-series chips
- Cost: free — included with the processor
🖥️ Discrete GPU
- Separate card with its own processor and VRAM
- Dedicated high-bandwidth memory (GDDR6/HBM)
- High performance — required for gaming, AI, video work
- Higher power consumption and generates heat
- Found in: gaming PCs, workstations, data centres
- Cost: £150 – £30,000+ depending on model
Apple's M-series chips (M1, M2, M3, M4) have unusually powerful integrated GPUs that use unified memory — the CPU and GPU share a single large, ultra-fast memory pool. This makes them much faster than typical integrated graphics while still using very little power, blurring the line between integrated and discrete.
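Apple's unified memory is its own hardware design, but NVIDIA's CUDA offers a related single-pool programming model called managed memory, which gives a flavour of the idea: one allocation that both the CPU and GPU can touch, with no explicit copies. A minimal sketch, assuming an NVIDIA GPU:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void double_all(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1024;
    float *data;
    // One allocation visible to both CPU and GPU: no cudaMemcpy needed.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;     // CPU writes...
    double_all<<<(n + 255) / 256, 256>>>(data, n);  // ...GPU computes...
    cudaDeviceSynchronize();
    printf("data[0] = %.1f\n", data[0]);            // ...CPU reads: 2.0

    cudaFree(data);
    return 0;
}
```

On a discrete card the driver migrates pages behind the scenes, whereas Apple's design shares one physical pool, so the performance characteristics differ even though the programming convenience is similar.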
Who Makes GPUs?
Only a handful of companies design GPU chips, though many brands sell cards built around them:
- NVIDIA: GeForce cards for gaming and the H100/B200 line for AI data centres; the dominant player in both markets.
- AMD: Radeon cards for gaming and Instinct accelerators for data centres; its GPUs also power the PlayStation and Xbox.
- Intel: Arc discrete cards plus the integrated graphics inside most Intel CPUs.
- Apple: designs the integrated GPUs in its M-series and A-series chips.
- Qualcomm (Adreno) and ARM (Mali): the GPUs inside most phones and tablets.
There are also companies that build GPU-like AI accelerators specifically for data centres rather than consumers, such as Google (TPU), Amazon (Trainium), and various startups (Groq, Cerebras, Tenstorrent).
Frequently Asked Questions
What does GPU stand for?
GPU = Graphics Processing Unit. NVIDIA popularised the term in 1999 when it launched the GeForce 256, which it marketed as the first chip to handle the full 3D graphics pipeline (including transform and lighting) on a single processor. The name stuck industry-wide.
Is a GPU the same as a graphics card?
Not exactly. The GPU is the chip (the silicon die). A graphics card (also called a video card) is the full product that includes the GPU chip, VRAM modules, cooling fan, power connectors, and the PCIe board. It's the same relationship as "CPU" vs "motherboard with a CPU installed."
Do I need a GPU for gaming?
For modern AAA titles, yes. Every game frame requires millions of pixel calculations, and integrated graphics are generally too slow for demanding games at playable resolutions and frame rates (lighter esports and indie titles are the main exception). A dedicated GPU like the NVIDIA RTX 4060 or AMD RX 7600 is a sensible minimum for solid 1080p gaming today.
Why are GPUs used for AI and machine learning?
Training a neural network is essentially multiplying enormous matrices (tables of numbers) millions of times over. This is exactly what GPU cores are designed to do in parallel. An NVIDIA H100 GPU can perform up to 3,958 trillion operations per second on AI workloads (FP8 precision with sparsity); a task that would take a CPU weeks is done in hours. That's why every major AI model (GPT-4, Gemini, Llama) was trained on racks of GPUs.
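To see why this maps so well onto a GPU, here is a deliberately naive CUDA matrix multiply in which each thread computes one output element independently. Real frameworks call heavily optimised libraries (cuBLAS, Tensor Core kernels) instead, so this is a sketch of the parallel structure, not production code.

```cuda
#include <cuda_runtime.h>

// C = A x B for n x n matrices. One thread per output element: each thread
// computes a single dot product, and all n*n dot products run in parallel.
__global__ void matmul(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n || col >= n) return;

    float sum = 0.0f;
    for (int k = 0; k < n; ++k)
        sum += A[row * n + k] * B[k * n + col];
    C[row * n + col] = sum;
}

int main() {
    const int n = 1024;  // a 1024 x 1024 multiply is ~1 billion multiply-adds
    float *A, *B, *C;    // inputs left uninitialised; this only shows the launch
    cudaMalloc(&A, n * n * sizeof(float));
    cudaMalloc(&B, n * n * sizeof(float));
    cudaMalloc(&C, n * n * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((n + 15) / 16, (n + 15) / 16);
    matmul<<<grid, block>>>(A, B, C, n);
    cudaDeviceSynchronize();

    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```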
What is VRAM and how much do I need?
VRAM (Video RAM) is dedicated high-speed memory on the GPU used to store textures, the frame buffer, and working data. More VRAM allows:
- Gaming: Higher-resolution texture packs without stuttering. 8 GB is the current minimum for 1080p; 12–16 GB for 1440p and 4K.
- AI: Larger language models in memory. Running a 7B-parameter model at 16-bit precision needs ~14 GB of VRAM; a 70B model needs multiple GPUs (see the back-of-envelope sketch below).
- Video editing: Smoother 4K/8K preview playback.
If you run out of VRAM the GPU falls back to slower system RAM, causing severe performance drops.
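The ~14 GB figure above is just parameter count times bytes per weight. A back-of-envelope sketch, assuming 16-bit weights and ignoring activation and KV-cache overhead:

```cuda
#include <cstdio>

// Rule of thumb: VRAM for weights = parameters x bytes per weight.
// FP16 weights take 2 bytes each; 4-bit quantised models need
// roughly a quarter of these figures.
int main() {
    double bytes_per_weight = 2.0;  // FP16
    printf("7B model:  ~%.0f GB\n", 7e9  * bytes_per_weight / 1e9);  // ~14 GB
    printf("70B model: ~%.0f GB\n", 70e9 * bytes_per_weight / 1e9);  // ~140 GB
    return 0;
}
```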
What is the difference between integrated and discrete GPU?
An integrated GPU is built into the same chip as the CPU and shares system RAM. It is fine for everyday tasks, video streaming, and light work, but too slow for modern gaming or AI. A discrete (dedicated) GPU is a separate card with its own specialised processor and high-bandwidth VRAM. It is far more powerful and required for gaming, content creation, and AI workloads. Apple's M-series integrated GPUs are a notable exception: unusually powerful for integrated graphics due to their unified memory architecture.
What is ray tracing?
Ray tracing is a rendering technique that simulates how light physically works. Instead of faking shadows and reflections with tricks (the traditional "rasterisation" approach), ray tracing traces the path of millions of light rays bouncing around a scene — producing photorealistic reflections, shadows, and global illumination. It looks stunning but is very computationally expensive. Modern GPUs (NVIDIA RTX, AMD RDNA2+, Intel Arc) have dedicated RT cores to accelerate it.
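The basic operation underneath all of this is an intersection test: does this ray hit this object? Below is the classic ray-sphere test as a simplified sketch. Production ray tracers test rays against triangles via bounding-volume hierarchies, which is what RT cores accelerate, but the geometry is the same flavour.

```cuda
#include <cstdio>
#include <math.h>
#include <cuda_runtime.h>

// A ray is origin o plus t * direction d. Substituting it into the sphere
// equation |p - c|^2 = r^2 gives a quadratic in t; a non-negative real root
// means the ray hits the sphere at distance t.
__host__ __device__ bool hit_sphere(float3 o, float3 d, float3 c, float r, float *t) {
    float3 oc = make_float3(o.x - c.x, o.y - c.y, o.z - c.z);
    float a = d.x * d.x + d.y * d.y + d.z * d.z;
    float b = 2.0f * (oc.x * d.x + oc.y * d.y + oc.z * d.z);
    float k = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - r * r;
    float disc = b * b - 4.0f * a * k;
    if (disc < 0.0f) return false;          // the ray misses the sphere
    *t = (-b - sqrtf(disc)) / (2.0f * a);   // nearest intersection distance
    return *t >= 0.0f;
}

int main() {
    float t;
    float3 origin = make_float3(0, 0, 0);
    float3 dir    = make_float3(0, 0, 1);   // looking straight down +z
    float3 centre = make_float3(0, 0, 5);   // unit sphere 5 units ahead
    if (hit_sphere(origin, dir, centre, 1.0f, &t))
        printf("hit at distance %.1f\n", t);  // prints 4.0
    return 0;
}
```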
Can I use a GPU without a monitor?
Yes. GPUs are used "headlessly" (without any display) all the time in AI training, scientific computing, and cloud servers. The display output is just one of many things a GPU can do. Cloud GPU instances (AWS, Google Cloud, Azure) use racks of GPUs with no monitors attached — they only run compute workloads.
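You can see the headless model directly in code: a compute program never opens a window, it just asks the driver what GPUs exist and launches work on them. A minimal CUDA device query, assuming the CUDA toolkit is installed:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Headless GPU use: no window, no monitor, just compute.
// Lists every CUDA-capable GPU the machine can see, with its memory size.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    printf("found %d CUDA device(s)\n", count);

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s, %.1f GB VRAM, %d multiprocessors\n",
               i, prop.name, prop.totalGlobalMem / 1e9, prop.multiProcessorCount);
    }
    return 0;
}
```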