Why are GPUs so expensive?

The chips that power AI cost tens of thousands of dollars. Here's why, and why it matters.

4 min read

You might have heard that training AI costs millions of dollars. Or that Nvidia's stock price exploded. Or that companies are spending billions on "compute."

At the center of all this: the GPU.

A single high-end AI chip can cost $30,000 to $40,000. Companies are buying thousands of them. What's going on?

First: what's a GPU?

GPU stands for Graphics Processing Unit. Originally, these chips were designed to render video game graphics, calculating the color of millions of pixels, sixty times per second.

Here's the key insight: rendering graphics requires doing the same simple calculation millions of times in parallel.

"What color is this pixel? What color is that pixel? And that one? And that one?"

GPUs are designed to do exactly this: thousands of simple calculations simultaneously.

Why AI needs GPUs

AI training involves a lot of math. Specifically, matrix multiplicationsMatrix MultiplicationA mathematical operation combining grids of numbers — the core computation behind neural networks.Article coming soon: multiplying huge grids of numbers together.

When you train a neural networkNeural NetworkA computing system inspired by biological brains, made of interconnected nodes that learn patterns from data.Click to learn more →, you're doing this:

Take some input data
Multiply it by millions of numbers (the "weights"WeightsNumbers that determine the strength of connections between neurons in a neural network.Click to learn more →)
Check how wrong the answer was
Adjust all the weights slightly
Repeat billions of times

Each of those multiplications is simple. But there are a lot of them.

A large language modelLarge Language Model (LLM)AI trained on massive text data to understand and generate human language.Click to learn more → might have 100 billion parametersParametersThe numerical values a neural network learns during training — GPT-4 has over a trillion.Click to learn more →. TrainingTrainingThe process of teaching an AI model by showing it examples and adjusting its parameters.Click to learn more → involves adjusting each of those parameters, based on each piece of training data, thousands of times.

That's a ridiculous amount of math.

And here's the thing: much of that math can happen in parallel. You can calculate thousands of those multiplications at the same time.

CPUs (the main chip in your computer) do one thing very fast. GPUs do thousands of things pretty fast, simultaneously.

For AI, "thousands of things simultaneously" wins.

The Nvidia dominance

Nvidia didn't just make good hardware. They made good software.

In 2006, Nvidia released CUDACUDANVIDIA's platform for running general-purpose computations on GPUs.Click to learn more →, a programming platform that let developers use GPUs for general-purpose computing, not just graphics.

When deep learningDeep LearningMachine learning using neural networks with many layers, enabling complex pattern recognition.Click to learn more → took off around 2012, researchers discovered that CUDA + Nvidia GPUs were perfect for training neural networks. The software ecosystem was already there.

Now, almost all AI research and production runs on Nvidia GPUs. Researchers write their code using CUDA. Switching to a different chip means rewriting everything.

This is called lock-in. It's why Nvidia can charge what they charge.

Why can't we just make more?

Chips are manufactured by "fabs" (fabrication plants). Building a fab costs $10-20 billion and takes years. There are only a few companies in the world capable of making cutting-edge chips.

TSMC in Taiwan makes most of the world's advanced chips, including Nvidia's GPUs. They're booked solid. Everyone wants their capacity: Apple, AMD, Nvidia, Qualcomm.

Even if Nvidia wanted to double production tomorrow, they couldn't. There aren't enough fabs, enough specialized equipment, enough trained workers.

Supply is limited. Demand is exploding. Prices stay high.

What this means

The companies with the most GPUs can train the biggest models. Right now, that's:

Microsoft (via partnership with OpenAIOpenAIThe AI research company behind ChatGPT, GPT-4, and DALL-E.Click to learn more →)
Google (makes their own chips too)
Meta (hoarding GPUs)
Amazon (AWS)
A few well-funded startups

If you're a small company or researcher, you're renting GPU time by the hour, and it's not cheap.

This is why "compute" has become a strategic resource. Countries are talking about chip manufacturing the way they used to talk about oil.

The future

Everyone's working on alternatives:

Google's TPUs: Custom AI chips, but only available through Google Cloud
AMD's MI300: Competitive chip that's gaining ground
Custom chips: Amazon, Meta, and others are building their own
New architectures: Startups trying completely different approaches

Eventually, competition will bring prices down. But right now? GPUs are gold.

The GPU shortage shapes everything about AI development today. Want to understand what these chips are actually computing? Next: What is a Neural Network?, the math behind the magic.

Get new explanations in your inbox

Every Tuesday and Friday. No spam, just AI clarity.