Day 4

Real Examples

Production AI Chips

Apple A18 Neural Engine (2024)

iPhone 16
• 35 TOPS (Apple's quoted peak, widely assumed to be INT8; integer throughput is counted in TOPS, not TFLOPS)
• ~2W power budget
• Inference only (no training)
• On-device: Face unlock, photo search, voice

Google TPU v4 (Data Center)

Cloud AI
• 275 TFLOPS peak per chip (BF16)
• ~170W typical, 192W max per chip
• Multi-chip (up to 4,096 chips in a pod)
• Handles both LLM training and inference
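
To make the multi-chip bullet concrete, here is a minimal sketch of how peak pod throughput is usually estimated: per-chip peak times chip count, scaled down by an efficiency factor for interconnect and synchronization overhead. The function name `pod_peak_tflops` and the `scaling_efficiency` parameter are hypothetical, and the numbers in the usage lines are placeholders, not vendor specs.

```python
def pod_peak_tflops(per_chip_tflops: float,
                    num_chips: int,
                    scaling_efficiency: float = 1.0) -> float:
    """Estimate aggregate peak throughput of a multi-chip pod.

    scaling_efficiency is an assumed fraction (0 < x <= 1) of ideal
    linear scaling that survives communication overhead.
    """
    if per_chip_tflops < 0 or num_chips < 1:
        raise ValueError("need non-negative throughput and at least one chip")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    return per_chip_tflops * num_chips * scaling_efficiency


# Placeholder values for illustration only.
print(pod_peak_tflops(100.0, 8))        # ideal linear scaling: 800.0
print(pod_peak_tflops(100.0, 8, 0.5))   # half lost to overhead: 400.0
```

Real pods rarely reach ideal scaling, so the efficiency factor has to be measured per workload rather than assumed.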

NVIDIA H100 (GPU Alternative)

Data Center GPU
• ~1,979 TFLOPS peak (FP16 tensor, with sparsity; ~989 TFLOPS dense)
• Up to 700W power (SXM version)
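• General-purpose: programmable for workloads well beyond neural networks
• Market leader, but often cited as less power-efficient than a TPU on the dense matrix workloads TPUs are built for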
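
The efficiency comparison running through these three chips comes down to one ratio: peak throughput divided by power. Below is a minimal sketch of that arithmetic. `ChipSpec` and `rank_by_efficiency` are hypothetical names, and the three entries use placeholder numbers rather than the specs listed above, so substitute real figures before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipSpec:
    name: str
    peak_tera_ops: float   # TOPS or TFLOPS; use the same precision for every chip
    power_watts: float

    def ops_per_watt(self) -> float:
        """Peak tera-operations per second delivered per watt."""
        if self.power_watts <= 0:
            raise ValueError("power_watts must be positive")
        return self.peak_tera_ops / self.power_watts


def rank_by_efficiency(chips):
    """Return chips sorted from most to least efficient."""
    return sorted(chips, key=lambda c: c.ops_per_watt(), reverse=True)


# Placeholder specs for illustration only (not vendor figures).
chips = [
    ChipSpec("phone NPU (hypothetical)", 20.0, 2.0),
    ChipSpec("data-center accelerator (hypothetical)", 300.0, 200.0),
    ChipSpec("data-center GPU (hypothetical)", 1000.0, 500.0),
]

for chip in rank_by_efficiency(chips):
    print(f"{chip.name}: {chip.ops_per_watt():.2f} tera-ops/W")
```

The ratio is only meaningful when the throughput figures share a precision and sparsity assumption. Comparing a sparse INT8 peak against a dense BF16 peak inflates the first chip's apparent efficiency.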

Day 5: Design trade-offs: what do you sacrifice for efficiency?