Glossary
Embeddings
Numerical vector representations of text (or images) that place semantically similar inputs near each other in vector space.
An embedding is a vector — typically 384, 768, or 1,536 floating-point numbers — produced by a model trained to map similar inputs to similar vectors. Two paragraphs about the same topic produce embeddings that are close together; two unrelated paragraphs produce embeddings that are far apart.
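To make "close together" concrete, here is a minimal sketch that scores two vectors with cosine similarity, a common closeness measure for embeddings. The four-dimensional vectors are invented for illustration and are not the output of any real embedding model; real embeddings would have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors, hand-written for illustration only (assumption: a real
# model would produce these from the paragraph text).
doc_cats = [0.9, 0.1, 0.0, 0.2]
doc_kittens = [0.8, 0.2, 0.1, 0.3]
doc_tax_law = [0.0, 0.1, 0.9, 0.1]

same_topic = cosine_similarity(doc_cats, doc_kittens)
different_topic = cosine_similarity(doc_cats, doc_tax_law)
print(f"same topic: {same_topic:.3f}, different topic: {different_topic:.3f}")
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are unrelated.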
Embeddings are the unit of currency in modern retrieval systems. They power semantic search, deduplication, clustering, classification, and recommendation. They are also a critical decision point: the embedding model you choose at indexing time is hard to swap out later, because vectors produced by different models are not comparable. Changing the model invalidates every vector you have stored, so the whole corpus has to be re-embedded and re-indexed.
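One way to keep a model change manageable is to record which model produced each stored vector, so stale vectors can be found and re-embedded. The sketch below assumes a simple in-memory store; the `StoredVector` class, the `stale_doc_ids` helper, and the model names `embed-v1` and `embed-v2` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StoredVector:
    doc_id: str
    model_id: str   # identifier of the embedding model that produced `values`
    values: tuple   # the embedding itself

def stale_doc_ids(store, current_model_id):
    """Return IDs of documents whose vectors came from a different model."""
    return [v.doc_id for v in store if v.model_id != current_model_id]

# Hypothetical store: two documents indexed with an older model.
store = [
    StoredVector("doc-1", "embed-v1", (0.1, 0.2, 0.3)),
    StoredVector("doc-2", "embed-v1", (0.4, 0.5, 0.6)),
    StoredVector("doc-3", "embed-v2", (0.7, 0.8, 0.9)),
]

to_reembed = stale_doc_ids(store, "embed-v2")
print(to_reembed)
```

Most vector databases let you attach metadata like this to each record; the exact mechanism depends on the product.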
For regulated workloads, the considerations are: where the embedding model runs (in your VPC, on a vendor's API, or on a fully managed service like Bedrock), what data the model was trained on, and whether your provider offers a BAA (business associate agreement) or an equivalent compliance commitment that covers the embedding endpoint.
Related terms
Vector Search
A retrieval method that ranks documents by semantic similarity, typically cosine similarity between embedding vectors, rather than by keyword overlap.
RAG (Retrieval-Augmented Generation)
An LLM pattern that retrieves relevant documents at query time and feeds them to the model as context, instead of relying on the model's training data alone.
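The two related terms fit together as follows: vector search picks the most relevant passages, and RAG places them in the prompt. This is a rough sketch under stated assumptions. The vectors are hand-written stand-ins for real embeddings, `top_k` and `build_prompt` are hypothetical helpers, and the call to an LLM is left out.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine similarity; assumes equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, index, k):
    """Vector search: return the k texts whose vectors best match the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    best = heapq.nlargest(k, scored, key=lambda pair: pair[0])
    return [text for _, text in best]

def build_prompt(question, passages):
    """RAG step: put the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Toy index of (text, vector) pairs; vectors are invented for illustration.
index = [
    ("Refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.1]),
    ("Refund requests need an order number.", [0.7, 0.3, 0.2]),
]

question = "How do refunds work?"
query_vec = [1.0, 0.0, 0.1]  # stand-in for the embedded question

passages = top_k(query_vec, index, k=2)
prompt = build_prompt(question, passages)
print(prompt)
```

In a real system the query vector would come from the same embedding model used at indexing time, and the prompt would be sent to the LLM.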