Glossary
Fine-Tuning
Updating a base model's weights on a domain-specific dataset to improve its behavior on that domain — distinct from RAG, which keeps weights frozen and provides context at query time.
Fine-tuning takes a pre-trained model and continues training it on a curated dataset that reflects the behavior you want. The model's weights change; the resulting model is a new artifact, with its own deployment, evaluation, and lifecycle.
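As a toy illustration of "continue training, get a new artifact", the sketch below fine-tunes a two-parameter linear model with plain gradient descent. It is a minimal sketch, not how an LLM is fine-tuned in practice: `LinearModel`, `fine_tune`, the learning rate, and the data are all hypothetical stand-ins chosen so the example runs without any libraries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearModel:
    """Stand-in for a pre-trained model: y = w * x + b."""
    w: float
    b: float

    def predict(self, x: float) -> float:
        return self.w * x + self.b


def mse(model: LinearModel, data: list) -> float:
    """Mean squared error of the model on (x, y) pairs."""
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


def fine_tune(base: LinearModel, data: list, lr: float = 0.05, epochs: int = 500) -> LinearModel:
    """Start from the base weights and keep training on the new dataset.

    Returns a NEW model; the base model is left untouched.
    """
    w, b = base.w, base.b
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return LinearModel(w, b)


# Hypothetical "pre-trained" weights and a small curated dataset (y = 2x + 1).
base = LinearModel(w=1.0, b=0.0)
domain_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

tuned = fine_tune(base, domain_data)
print("base error: ", mse(base, domain_data))
print("tuned error:", mse(tuned, domain_data))
```

The base model still exists with its original weights after the call; the tuned model is a separate object that has to be stored, evaluated, and versioned on its own.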
The case for fine-tuning is narrow but real. It works well when you need the model to follow a specific output format, adopt a domain vocabulary, or refuse certain requests reliably. It does not work well as a substitute for retrieval — if the task requires the model to know facts about specific documents, fine-tuning is the wrong tool, because the facts will be lossily compressed into the weights and you will have no way to update them without retraining.
For regulated industries, fine-tuning carries data lineage obligations that RAG does not. Whatever data you train on becomes encoded in the weights. If the training data contains PHI, the resulting model is a PHI artifact. If it contains a customer's confidential corpus, the model contains that customer's data. Deletion obligations, whether a GDPR erasure request, the return-or-destroy clause in a HIPAA business associate agreement, or other contractual terms, become much harder to honor when "delete this customer's data" implies retraining.
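One way to make that lineage concrete is to record, for every fine-tuned artifact, which data sources went into it. The sketch below is an assumption about how such a record could look, not a prescribed schema: `ModelArtifact`, `models_affected_by_deletion`, and all model and customer names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ModelArtifact:
    """Lineage record for one fine-tuned model (hypothetical schema)."""
    name: str
    base_model: str
    training_sources: FrozenSet[str]  # IDs of every dataset or customer corpus used


def models_affected_by_deletion(artifacts: List[ModelArtifact], source_id: str) -> List[str]:
    """Return the names of models whose weights were trained on the given source."""
    return [a.name for a in artifacts if source_id in a.training_sources]


# Illustrative registry; every name here is made up.
registry = [
    ModelArtifact("triage-v1", "base-7b", frozenset({"customer-a", "public-notes"})),
    ModelArtifact("triage-v2", "base-7b", frozenset({"customer-a", "customer-b"})),
    ModelArtifact("formatter-v1", "base-7b", frozenset({"synthetic-set"})),
]

# A deletion request for customer-a touches every model trained on that corpus.
print(models_affected_by_deletion(registry, "customer-a"))
```

With RAG, the equivalent operation is removing documents from the retrieval store; with fine-tuning, each model named in the result is a candidate for retraining or retirement.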
The pattern we recommend in regulated work: start with RAG. Fine-tune only when retrieval has been exhausted as an option, with a clear understanding of the data lineage you are taking on.
Related terms
RAG (Retrieval-Augmented Generation)
An LLM pattern that retrieves relevant documents at query time and feeds them to the model as context, instead of relying on the model's training data alone.
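The retrieve-then-prompt flow can be sketched in a few lines. This is a deliberately simplified illustration: it ranks documents by word overlap rather than by embeddings, stops at building the prompt instead of calling a model, and uses hypothetical helper names (`retrieve`, `build_prompt`) and made-up documents.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Assemble the context-plus-question prompt that would be sent to the model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up document store; updating it changes answers without touching any weights.
documents = [
    "The refund window is 30 days from delivery.",
    "Support hours are 9am to 5pm on weekdays.",
    "Shipping is free for orders over 50 dollars.",
]

prompt = build_prompt("What is the refund window?", documents, k=1)
print(prompt)
```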
Embeddings
Numerical vector representations of text (or images) that place semantically similar inputs near each other in vector space.
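"Near each other" is commonly measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors purely for illustration; they are not output from any real embedding model, and real embeddings typically have hundreds or thousands of dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors invented for this example, not produced by a model.
toy_vectors = {
    "invoice": [0.9, 0.1, 0.0],
    "receipt": [0.8, 0.2, 0.1],
    "giraffe": [0.0, 0.1, 0.9],
}

related = cosine_similarity(toy_vectors["invoice"], toy_vectors["receipt"])
unrelated = cosine_similarity(toy_vectors["invoice"], toy_vectors["giraffe"])
print(f"invoice vs receipt: {related:.3f}")
print(f"invoice vs giraffe: {unrelated:.3f}")
```

The related pair scores close to 1 and the unrelated pair close to 0, which is the property retrieval systems rely on when ranking documents against a query.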