Vector embeddings transform diverse data (text, images, audio) into numerical vectors. Items with similar meaning map to nearby vectors. This enables semantic search: finding conceptually related items even when they share no exact keywords.
Embedding Models
OpenAI embeddings provide strong general-purpose representations. Sentence Transformers models offer open-source alternatives. Domain-specific embeddings can improve performance on specialized content. Choose a model that matches your data and use case.
- OpenAI's text-embedding-3 models (text-embedding-3-small and text-embedding-3-large) provide strong general-purpose performance
- Open-source models like all-MiniLM-L6-v2 offer cost-effective alternatives that can run locally
- Consider domain-specific fine-tuning for specialized vocabularies
- Weigh embedding dimensionality: more dimensions can improve quality but increase storage and search cost
- Test multiple models on your actual data before committing
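To make the quality-versus-storage tradeoff concrete, here is a rough back-of-the-envelope sketch. It assumes vectors are stored as raw 32-bit floats and ignores index overhead, metadata, and compression, so real numbers will differ. The function name `index_size_bytes` is made up for illustration. The dimensions used are the default output sizes of all-MiniLM-L6-v2 (384), text-embedding-3-small (1536), and text-embedding-3-large (3072).

```python
def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone (assumes float32 = 4 bytes per value)."""
    return num_vectors * dimensions * bytes_per_value


# Compare raw storage for one million vectors at three common dimensionalities.
for dims in (384, 1536, 3072):
    size_gb = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims: {size_gb:.2f} GB")
# 384 dims: 1.54 GB
# 1536 dims: 6.14 GB
# 3072 dims: 12.29 GB
```

A fourfold increase in dimensions means a fourfold increase in raw storage, which is one reason to measure retrieval quality on your own data before defaulting to the largest model.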
Similarity Search
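Cosine similarity measures the angle between two vectors and ignores their magnitudes. Euclidean distance measures the straight-line distance between them. Dot product reflects both magnitude and direction. Most applications use cosine similarity; when embeddings are normalized to unit length, cosine similarity equals the dot product, and Euclidean distance produces the same ranking.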
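The three metrics can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the vectors are made-up 3-dimensional toys rather than real model output, and the helper names (`top_k`, `normalize`, and so on) are hypothetical. A real system would typically use a numerical library or a vector database.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    # Dot product divided by both magnitudes: depends only on the angle.
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(a):
    n = norm(a)
    return [x / n for x in a]


def top_k(query, corpus, k=2):
    """Return the k corpus keys most similar to the query by cosine similarity."""
    ranked = sorted(
        corpus,
        key=lambda name: cosine_similarity(query, corpus[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors, invented for illustration only.
corpus = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [1.8, 0.3, 0.1],  # similar direction to "cat", larger magnitude
    "car": [0.1, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]

print(top_k(query, corpus))  # ranked by angle only
print(max(corpus, key=lambda name: dot(query, corpus[name])))  # raw dot product favors magnitude
```

With these toy values, cosine similarity ranks "cat" first, while the raw dot product prefers "kitten" because of its larger magnitude. After normalizing every vector with `normalize`, the two metrics agree, and the squared Euclidean distance equals `2 - 2 * dot(a, b)`, so all three produce the same ordering.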