The Simple Definition
A GPU (Graphics Processing Unit) is a specialised chip designed to do one thing extremely well: run thousands of small calculations at exactly the same time.
It was originally invented to draw the pixels on your screen — every image you see on a monitor is millions of coloured dots, and the GPU calculates what colour each dot should be, many times per second. But today the GPU does far more than just graphics.
A GPU is a chip with thousands of tiny processors that all work at the same time — making it incredibly fast for tasks that can be broken into many small parallel jobs.
CPU vs GPU — The Big Difference
The easiest way to understand a GPU is to compare it to the CPU (the main processor in your computer).
CPU = a few expert chefs. A restaurant with 4 master chefs can cook incredibly complex dishes, one at a time, very quickly. Each chef can handle any task — chopping, frying, plating — and make smart decisions on the fly.
GPU = a massive factory with 5,000 workers. Each worker only does one simple job (tighten a bolt, paint a part), but because there are thousands working simultaneously, they can produce vastly more output per second than the few chefs — as long as the task can be broken into simple repetitive steps.
🧠 CPU
- 4–32 powerful cores
- Handles complex, varied tasks
- Fast at sequential work (one step at a time)
- Runs your operating system, apps, logic
- Large cache memory, smart branch prediction
- Best for: web browsing, office, game logic
⚡ GPU
- 3,000–16,000+ small cores
- Handles simple, repetitive tasks at scale
- Fast at parallel work (many jobs at once)
- Runs graphics, AI, video, simulations
- High memory bandwidth, specialised hardware
- Best for: gaming, AI training, video rendering
The CPU and GPU work together. The CPU handles the decision-making (game logic, physics, AI behaviour) and offloads the repetitive heavy lifting to the GPU, such as drawing the roughly 8.3 million pixels of a 4K frame 60 times per second.
How a GPU Actually Works
Imagine you need to add 1 to every number in a list of one million numbers.
The CPU way
A CPU with 8 cores processes 8 numbers at a time. One million numbers ÷ 8 = 125,000 rounds of work. Each round is quick, but the rounds still happen one after another.
The GPU way
A GPU with 5,000 cores processes 5,000 numbers at a time. One million numbers ÷ 5,000 = only 200 rounds of work. The same task finishes in a fraction of the time.
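To make this concrete, here is a minimal CUDA sketch of the GPU way: one lightweight thread per number, all launched at once. The kernel name add_one and the 256-thread block size are illustrative choices, not anything prescribed above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each GPU thread handles exactly one element: read it, add 1, write it back.
__global__ void add_one(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n) data[i] += 1.0f;
}

int main() {
    const int n = 1000000;
    float *d;
    cudaMalloc(&d, n * sizeof(float));    // allocate on the GPU
    cudaMemset(d, 0, n * sizeof(float));  // start every element at 0

    // Launch enough 256-thread blocks to cover all one million elements.
    add_one<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();

    float first;
    cudaMemcpy(&first, d, sizeof(float), cudaMemcpyDeviceToHost);
    printf("first element is now %.1f\n", first);  // prints 1.0
    cudaFree(d);
    return 0;
}
```

On real hardware the threads are scheduled in batches across the available cores, so the "5,000 at a time" picture is a simplification, but the programming model really is one thread per element.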
A 1080p screen has 2,073,600 pixels. At 60 fps, the GPU must calculate the colour of every pixel 60 times per second — that is 124 million pixel calculations per second. Only massively parallel hardware can do this in real time.
The rendering pipeline (simplified)
When a game renders a frame, the GPU works through these steps:
1. Geometry stage: The 3D positions of every triangle in the scene are calculated and projected onto your 2D screen.
2. Rasterisation: Each triangle is broken into pixels (fragments).
3. Shading: For every pixel, the GPU calculates its colour based on lighting, texture maps, shadows, and reflections.
4. Output: The final image is sent to your display, 60, 120, or 240 times per second depending on your monitor's refresh rate.
Steps 2 and 3 involve millions of independent pixels — perfect for parallel processing.
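As a rough illustration of step 3, here is a toy CUDA kernel that assigns one thread to each of the 2,073,600 pixels of a 1080p frame. Real shading runs through graphics APIs such as Direct3D or Vulkan, and the gradient "lighting" below is a placeholder, so treat this purely as a sketch of the one-thread-per-pixel idea.

```cuda
#include <cuda_runtime.h>

// One thread per pixel: 1920 x 1080 = 2,073,600 threads run this together,
// which is why the shading stage maps so naturally onto a GPU.
__global__ void shade(uchar4 *frame, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Placeholder "lighting": a simple colour gradient stands in for the
    // real texture, shadow, and reflection maths.
    unsigned char r = (unsigned char)(255 * x / width);
    unsigned char g = (unsigned char)(255 * y / height);
    frame[y * width + x] = make_uchar4(r, g, 128, 255);
}

int main() {
    const int w = 1920, h = 1080;
    uchar4 *frame;
    cudaMalloc(&frame, w * h * sizeof(uchar4));

    dim3 block(16, 16);                       // 256 threads per block
    dim3 grid((w + 15) / 16, (h + 15) / 16);  // enough blocks to cover the frame
    shade<<<grid, block>>>(frame, w, h);
    cudaDeviceSynchronize();

    cudaFree(frame);
    return 0;
}
```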
What's Inside a GPU?
A modern GPU is a complex chip with several specialised units. Here are the key parts explained simply:
- Shader cores (CUDA cores / stream processors): thousands of small arithmetic units that do the parallel number-crunching.
- VRAM: dedicated high-speed memory that holds textures, frame buffers, and working data.
- Memory controller: moves data between the cores and VRAM at hundreds of gigabytes per second.
- RT cores: specialised units that accelerate ray tracing calculations.
- Tensor / AI cores: units built for the matrix maths behind AI workloads and upscaling.
- Video engine: dedicated hardware that encodes and decodes video (H.264, HEVC, AV1) without using the main cores.
What Can a GPU Do?
GPUs started in gaming but have expanded to almost every area of computing that needs massive parallel processing:
- Gaming and real-time 3D graphics
- AI training and inference
- Video editing, encoding, and live streaming
- Scientific computing and simulation (weather, physics, molecular modelling)
- 3D rendering, CGI, and animation
- Cryptography and, historically, cryptocurrency mining
Integrated vs Discrete GPU
There are two main types of GPU in modern devices:
🔧 Integrated GPU
- Built into the same chip as the CPU
- Shares the computer's main RAM (no dedicated VRAM)
- Lower performance — suited for video playback, basic tasks
- Very low power consumption and no extra heat
- Found in: laptops, phones, tablets, Apple M-series chips
- Cost: free — included with the processor
🖥️ Discrete GPU
- Separate card with its own processor and VRAM
- Dedicated high-bandwidth memory (GDDR6/HBM)
- High performance — required for gaming, AI, video work
- Higher power consumption and generates heat
- Found in: gaming PCs, workstations, data centres
- Cost: £150 – £30,000+ depending on model
Apple's M-series chips (M1, M2, M3, M4) have unusually powerful integrated GPUs that use unified memory — the CPU and GPU share a single large, ultra-fast memory pool. This makes them much faster than typical integrated graphics while still using very little power, blurring the line between integrated and discrete.
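Apple's unified memory is its own hardware design, but NVIDIA's CUDA offers a related single-pool programming model called managed memory, which gives a flavour of the idea: one allocation that both the CPU and GPU can touch, with no explicit copies. A minimal sketch, assuming an NVIDIA GPU:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void double_all(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1024;
    float *data;
    // One allocation visible to both CPU and GPU: no cudaMemcpy needed.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;     // CPU writes...
    double_all<<<(n + 255) / 256, 256>>>(data, n);  // ...GPU computes...
    cudaDeviceSynchronize();
    printf("data[0] = %.1f\n", data[0]);            // ...CPU reads: 2.0

    cudaFree(data);
    return 0;
}
```

On a discrete card the driver migrates pages behind the scenes, whereas Apple's design shares one physical pool, so the performance characteristics differ even though the programming convenience is similar.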
Who Makes GPUs?
Only a handful of companies design GPU chips, though many brands sell cards built around them:
- NVIDIA: GeForce cards for gaming and the H100/B200 line for AI data centres; the dominant player in both markets.
- AMD: Radeon cards for gaming and Instinct accelerators for data centres; its GPUs also power the PlayStation and Xbox.
- Intel: Arc discrete cards plus the integrated graphics inside most Intel CPUs.
- Apple: designs the integrated GPUs in its M-series and A-series chips.
- Qualcomm (Adreno) and ARM (Mali): the GPUs inside most phones and tablets.
There are also companies that build GPU-like AI accelerators specifically for data centres rather than consumers, such as Google (TPU), Amazon (Trainium), and various startups (Groq, Cerebras, Tenstorrent).
Frequently Asked Questions
What does GPU stand for?
GPU = Graphics Processing Unit. NVIDIA popularised the term in 1999 when it launched the GeForce 256, which it marketed as the first chip to handle the full 3D graphics pipeline (including transform and lighting) on a single processor. The name stuck industry-wide.
Is a GPU the same as a graphics card?
Not exactly. The GPU is the chip (the silicon die). A graphics card (also called a video card) is the full product that includes the GPU chip, VRAM modules, cooling fan, power connectors, and the PCIe board. It's the same relationship as "CPU" vs "motherboard with a CPU installed."
Do I need a GPU for gaming?
For modern AAA titles, yes. Every game frame requires millions of pixel calculations, and integrated graphics are generally too slow for demanding games at playable resolutions and frame rates (lighter esports and indie titles are the main exception). A dedicated GPU like the NVIDIA RTX 4060 or AMD RX 7600 is a sensible minimum for solid 1080p gaming today.
Why are GPUs used for AI and machine learning?
Training a neural network is essentially multiplying enormous matrices (tables of numbers) millions of times over. This is exactly what GPU cores are designed to do in parallel. An NVIDIA H100 GPU can perform up to 3,958 trillion operations per second on AI workloads (FP8 precision with sparsity); a task that would take a CPU weeks is done in hours. That's why every major AI model (GPT-4, Gemini, Llama) was trained on racks of GPUs.
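To see why this maps so well onto a GPU, here is a deliberately naive CUDA matrix multiply in which each thread computes one output element independently. Real frameworks call heavily optimised libraries (cuBLAS, Tensor Core kernels) instead, so this is a sketch of the parallel structure, not production code.

```cuda
#include <cuda_runtime.h>

// C = A x B for n x n matrices. One thread per output element: each thread
// computes a single dot product, and all n*n dot products run in parallel.
__global__ void matmul(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n || col >= n) return;

    float sum = 0.0f;
    for (int k = 0; k < n; ++k)
        sum += A[row * n + k] * B[k * n + col];
    C[row * n + col] = sum;
}

int main() {
    const int n = 1024;  // a 1024 x 1024 multiply is ~1 billion multiply-adds
    float *A, *B, *C;    // inputs left uninitialised; this only shows the launch
    cudaMalloc(&A, n * n * sizeof(float));
    cudaMalloc(&B, n * n * sizeof(float));
    cudaMalloc(&C, n * n * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((n + 15) / 16, (n + 15) / 16);
    matmul<<<grid, block>>>(A, B, C, n);
    cudaDeviceSynchronize();

    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```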
What is VRAM and how much do I need?
VRAM (Video RAM) is dedicated high-speed memory on the GPU used to store textures, the frame buffer, and working data. More VRAM allows:
- Gaming: Higher-resolution texture packs without stuttering. 8 GB is the current minimum for 1080p; 12–16 GB for 1440p and 4K.
- AI: Larger language models in memory. Running a 7B-parameter model at 16-bit precision needs ~14 GB of VRAM; a 70B model needs multiple GPUs (see the back-of-envelope sketch below).
- Video editing: Smoother 4K/8K preview playback.
If you run out of VRAM the GPU falls back to slower system RAM, causing severe performance drops.
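The ~14 GB figure above is just parameter count times bytes per weight. A back-of-envelope sketch, assuming 16-bit weights and ignoring activation and KV-cache overhead:

```cuda
#include <cstdio>

// Rule of thumb: VRAM for weights = parameters x bytes per weight.
// FP16 weights take 2 bytes each; 4-bit quantised models need
// roughly a quarter of these figures.
int main() {
    double bytes_per_weight = 2.0;  // FP16
    printf("7B model:  ~%.0f GB\n", 7e9  * bytes_per_weight / 1e9);  // ~14 GB
    printf("70B model: ~%.0f GB\n", 70e9 * bytes_per_weight / 1e9);  // ~140 GB
    return 0;
}
```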
What is the difference between integrated and discrete GPU?
An integrated GPU is built into the same chip as the CPU and shares system RAM. It is fine for everyday tasks, video streaming, and light work, but too slow for modern gaming or AI. A discrete (dedicated) GPU is a separate card with its own specialised processor and high-bandwidth VRAM. It is far more powerful and required for gaming, content creation, and AI workloads. Apple's M-series integrated GPUs are a notable exception: unusually powerful for integrated graphics due to their unified memory architecture.
What is ray tracing?
Ray tracing is a rendering technique that simulates how light physically works. Instead of faking shadows and reflections with tricks (the traditional "rasterisation" approach), ray tracing traces the path of millions of light rays bouncing around a scene — producing photorealistic reflections, shadows, and global illumination. It looks stunning but is very computationally expensive. Modern GPUs (NVIDIA RTX, AMD RDNA2+, Intel Arc) have dedicated RT cores to accelerate it.
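The basic operation underneath all of this is an intersection test: does this ray hit this object? Below is the classic ray-sphere test as a simplified sketch. Production ray tracers test rays against triangles via bounding-volume hierarchies, which is what RT cores accelerate, but the geometry is the same flavour.

```cuda
#include <cstdio>
#include <math.h>
#include <cuda_runtime.h>

// A ray is origin o plus t * direction d. Substituting it into the sphere
// equation |p - c|^2 = r^2 gives a quadratic in t; a non-negative real root
// means the ray hits the sphere at distance t.
__host__ __device__ bool hit_sphere(float3 o, float3 d, float3 c, float r, float *t) {
    float3 oc = make_float3(o.x - c.x, o.y - c.y, o.z - c.z);
    float a = d.x * d.x + d.y * d.y + d.z * d.z;
    float b = 2.0f * (oc.x * d.x + oc.y * d.y + oc.z * d.z);
    float k = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - r * r;
    float disc = b * b - 4.0f * a * k;
    if (disc < 0.0f) return false;          // the ray misses the sphere
    *t = (-b - sqrtf(disc)) / (2.0f * a);   // nearest intersection distance
    return *t >= 0.0f;
}

int main() {
    float t;
    float3 origin = make_float3(0, 0, 0);
    float3 dir    = make_float3(0, 0, 1);   // looking straight down +z
    float3 centre = make_float3(0, 0, 5);   // unit sphere 5 units ahead
    if (hit_sphere(origin, dir, centre, 1.0f, &t))
        printf("hit at distance %.1f\n", t);  // prints 4.0
    return 0;
}
```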
Can I use a GPU without a monitor?
Yes. GPUs are used "headlessly" (without any display) all the time in AI training, scientific computing, and cloud servers. The display output is just one of many things a GPU can do. Cloud GPU instances (AWS, Google Cloud, Azure) use racks of GPUs with no monitors attached — they only run compute workloads.
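You can see the headless model directly in code: a compute program never opens a window, it just asks the driver what GPUs exist and launches work on them. A minimal CUDA device query, assuming the CUDA toolkit is installed:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Headless GPU use: no window, no monitor, just compute.
// Lists every CUDA-capable GPU the machine can see, with its memory size.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    printf("found %d CUDA device(s)\n", count);

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s, %.1f GB VRAM, %d multiprocessors\n",
               i, prop.name, prop.totalGlobalMem / 1e9, prop.multiProcessorCount);
    }
    return 0;
}
```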