What are Embeddings?
How AI converts words, images, and ideas into numbers that capture meaning. The mathematical foundation that makes AI understand similarity.
Computers are great with numbers. Terrible with meaning.
The word "dog" doesn't mean anything to a computer. It's just four letters. But somehow AI can understand that "dog" is more similar to "puppy" than to "airplane."
Embeddings are how AI turns meaning into math.
The core idea
An embedding converts something (a word, sentence, image, song) into a list of numbers called a vector. These numbers capture the "meaning" of the original thing.
Here's the magic: similar things get similar numbers.
Words and their embeddings (simplified to 3 numbers):

- "dog" → [0.8, 0.1, 0.3]
- "puppy" → [0.9, 0.2, 0.4] (very similar to "dog")
- "cat" → [0.7, 0.3, 0.2] (somewhat similar)
- "airplane" → [0.1, 0.9, 0.8] (very different)
The computer can now measure similarity by comparing these number lists. "Dog" and "puppy" have similar numbers, so they're similar concepts.
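A common way to compare those number lists is cosine similarity, which scores how closely two vectors point in the same direction (1.0 = identical direction, near 0 = unrelated). Here's a minimal sketch using the toy 3-number vectors from the table above; the numbers are invented for illustration:

```python
import math

def cosine_similarity(a, b):
    """Score how similar two vectors are: 1.0 = same direction, ~0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# The toy 3-number embeddings from the table above
embeddings = {
    "dog":      [0.8, 0.1, 0.3],
    "puppy":    [0.9, 0.2, 0.4],
    "airplane": [0.1, 0.9, 0.8],
}

dog_puppy = cosine_similarity(embeddings["dog"], embeddings["puppy"])
dog_plane = cosine_similarity(embeddings["dog"], embeddings["airplane"])
```

Running this, `dog_puppy` comes out close to 1.0 while `dog_plane` is much lower, which is exactly the "similar things get similar numbers" idea in action.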
How embeddings are created
Creating good embeddings is like teaching AI the relationships between things by showing it millions of examples.
For word embeddings, you might train on text like:
- "The dog barked loudly"
- "A puppy played in the yard"
- "She walked her dog"
- "The cute puppy wagged its tail"
The AI notices that "dog" and "puppy" appear in similar contexts. They both get walked, they both wag tails, they both play. So their embeddings end up similar.
For image embeddings, you show the AI millions of pictures. It learns that photos of dogs often contain similar shapes, colors, and patterns, so their embeddings cluster together.
The process is called "representation learning" because the AI learns to represent concepts as vectors.
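The "similar contexts → similar representations" idea can be sketched with simple co-occurrence counting. This is a drastic simplification of how Word2Vec-style training actually works, and the tiny corpus below is invented for illustration, but it shows the core mechanism: words that share context words end up with overlapping vectors.

```python
from collections import Counter

# A tiny invented corpus where "dog" and "puppy" appear in similar contexts
sentences = [
    "the dog wagged its tail",
    "the puppy wagged its tail",
    "she walked her dog",
    "she walked her puppy",
    "the airplane flew over the city",
]

def context_vector(word, sentences):
    """Count every other word that appears in the same sentence as `word`."""
    counts = Counter()
    for s in sentences:
        tokens = s.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return counts

def overlap(a, b):
    """Shared-context score: sum of min counts over common context words."""
    return sum(min(a[w], b[w]) for w in a.keys() & b.keys())

dog = context_vector("dog", sentences)
puppy = context_vector("puppy", sentences)
airplane = context_vector("airplane", sentences)
```

Here `overlap(dog, puppy)` is high because both words co-occur with "wagged", "tail", "walked", and so on, while `overlap(dog, airplane)` is nearly zero. Real models learn dense vectors with gradient descent rather than raw counts, but the intuition is the same.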
Dimensionality: More numbers, more nuance
Real embeddings aren't just 3 numbers. They're usually hundreds or thousands of numbers.
- Word2Vec: 100-300 dimensions
- OpenAI's text embeddings: 1,536 dimensions
- Image embeddings: Often 512-2048 dimensions
More dimensions = more subtle relationships. With only 3 numbers, you might capture "dog vs. airplane." With 1,000 numbers, you can capture "German Shepherd vs. Golden Retriever vs. Chihuahua."
Simple embedding (3D):
- Dimension 1: Animal vs. Object
- Dimension 2: Size
- Dimension 3: Domestication
Complex embedding (1536D):
- Dimension 47: Fluffiness level
- Dimension 234: Historical significance
- Dimension 891: Emotional association
- Dimension 1204: Seasonal relevance
- ...and 1,532 other subtle aspects of meaning
Types of embeddings
Word Embeddings
Each word gets its own vector. "King" might be close to "queen," "monarch," and "royal."
Sentence Embeddings
Entire sentences become vectors. "I love pizza" and "Pizza is delicious" would have similar embeddings.
Image Embeddings
Pictures become vectors. Photos of beaches cluster together, photos of mountains cluster together.
Multimodal Embeddings
The same vector space includes both text and images. The word "dog" and a photo of a dog end up near each other.
The geometric magic
Here's where embeddings get really cool. Relationships between concepts become geometric relationships between vectors.
The famous example: king - man + woman = queen
In embedding space, you can literally do math with concepts:
- Take the vector for "king"
- Subtract the vector for "man"
- Add the vector for "woman"
- The result is very close to the vector for "queen"
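The steps above can be sketched directly in code. The 2-number vectors here are invented to mirror the simplified diagram (one number for "royalty", one for "gender"); real models learn hundreds of dimensions whose meanings aren't this tidy:

```python
# Hypothetical 2-number embeddings: [royalty, gender] (invented for illustration)
vectors = {
    "king":  [0.9, 0.1],
    "queen": [0.9, 0.9],
    "man":   [0.1, 0.1],
    "woman": [0.1, 0.9],
}

def vec_sub(a, b):
    return [x - y for x, y in zip(a, b)]

def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# king - man + woman
result = vec_add(vec_sub(vectors["king"], vectors["man"]), vectors["woman"])

# Find the word whose vector is closest to the result
nearest = min(vectors, key=lambda w: distance(vectors[w], result))
```

With these toy numbers the arithmetic lands exactly on "queen"; with real learned embeddings the result is only *close* to the queen vector, which is why systems return the nearest neighbor rather than an exact match.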
Embedding space (2D simplified):

    queen •         • king
          |         |
    woman •         • man

Moving vertically adds "royalty"; moving horizontally changes "gender". The man → king offset is the same as the woman → queen offset, which is why the arithmetic works.
How AI systems use embeddings
Search and Retrieval
When you search for "dog care tips," the system converts your query to an embedding and finds documents with similar embeddings. This is how RAG (retrieval-augmented generation) systems work: the AI retrieves relevant documents before generating a response, which reduces hallucinations.
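A bare-bones sketch of that retrieval step: embed the query, embed the documents, and rank by similarity. The 3-number vectors below are stand-ins; a real system would call an embedding model and store the document vectors in a vector database:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in embeddings; a real system would get these from an embedding model
documents = {
    "How to groom your dog":         [0.8, 0.2, 0.1],
    "Airline safety regulations":    [0.1, 0.1, 0.9],
    "Feeding schedules for puppies": [0.7, 0.3, 0.2],
}
query_embedding = [0.75, 0.25, 0.15]  # pretend this is embed("dog care tips")

# Rank documents by similarity to the query, best match first
ranked = sorted(documents,
                key=lambda d: cosine(documents[d], query_embedding),
                reverse=True)
```

Note that "Feeding schedules for puppies" ranks above "Airline safety regulations" even though it shares no words with the query; that's the semantic matching that keyword search misses.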
Recommendation Systems
Netflix converts movies to embeddings. If you like movies with similar embeddings, it recommends others in the same region of embedding space.
Clustering and Classification
Group similar items together by clustering their embeddings. Find outliers by looking for embeddings far from others.
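Outlier detection can be as simple as flagging the embedding with the largest average distance to all the others. A minimal sketch, reusing the toy vectors from earlier (invented numbers, illustration only):

```python
import math

# Toy embeddings from the earlier example
points = {
    "dog":      [0.8, 0.1, 0.3],
    "puppy":    [0.9, 0.2, 0.4],
    "cat":      [0.7, 0.3, 0.2],
    "airplane": [0.1, 0.9, 0.8],
}

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_distance(word):
    """Average distance from `word` to every other point."""
    others = [w for w in points if w != word]
    return sum(dist(points[word], points[o]) for o in others) / len(others)

# The point farthest, on average, from everything else is the outlier
outlier = max(points, key=mean_distance)
```

Here "airplane" stands out because "dog", "puppy", and "cat" form a tight cluster. Production systems use the same idea at scale with proper clustering algorithms like k-means or DBSCAN.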
Semantic Understanding
AI can understand that "car" and "automobile" mean the same thing because their embeddings are nearly identical.
Real-world applications
Search engines: Find pages that match the meaning of your query, not just keywords.
Customer support: Route support tickets to the right team based on semantic similarity.
Content recommendation: Suggest articles, products, or media based on embedding similarity.
Translation: Words with similar meanings in different languages have similar embeddings.
Fraud detection: Unusual transaction patterns show up as embedding outliers.
E-commerce search without embeddings: You search for "comfortable shoes for running." System finds products containing those exact words. Misses great running sneakers described as "athletic footwear with cushioned sole."
E-commerce search with embeddings: Your query embedding matches products about athletic footwear, sports shoes, jogging sneakers, and cushioned running gear, even if they use different words.
The limitations
Black box: You can't easily interpret what each dimension means. Why does dimension 247 have the value 0.8394? Nobody knows.
Bias: Embeddings inherit biases from training data. If your training data associates "doctor" with "man," the embeddings will too.
Context collapse: A single embedding can't capture all possible meanings of a word. "Bank" (river) vs. "bank" (money) might get confused.
Computational cost: Generating embeddings requires significant compute, especially for large models.
Quality matters
Not all embeddings are equal. Better embeddings capture more nuanced relationships and work better for downstream tasks.
What makes embeddings good?
- Training data quality: Diverse, representative, high-quality training data
- Model architecture: More sophisticated models capture more complex relationships
- Training objectives: How the model is taught to create embeddings matters
- Dimensionality: More dimensions (usually) capture more nuance
The future of embeddings
Embeddings keep getting better:
- Multimodal: Combining text, images, audio, and video in one embedding space
- Dynamic: Embeddings that change based on context
- Specialized: Domain-specific embeddings for medicine, law, etc.
- Efficient: Smaller embeddings that capture the same meaning with fewer numbers
Why embeddings matter: They're the bridge between human concepts and computer math. Every time AI understands similarity, finds relevant content, or makes semantic connections, embeddings are working behind the scenes.
The bottom line: Embeddings turn the messy, ambiguous world of human language and concepts into clean mathematical relationships that computers can work with. They're the foundation that makes semantic search, recommendations, and AI understanding possible.
Embeddings capture meaning in vectors. But modern AI needs more than word-by-word understanding. Next: What are Transformers? It covers the architecture that revolutionized how AI processes sequences.