Hallucination Rates Across Domain-Specific LLM Fine-Tuning: A Systematic Evaluation
Abstract
Hallucination — the generation of factually incorrect, fabricated, or internally inconsistent text by large language models — is one of the most practically consequential failure modes in LLM deployment. Fine-tuning on domain-specific data is widely used to improve LLM performance in specialised domains, but the relationship between fine-tuning and hallucination rates is poorly characterised. This paper presents a systematic evaluation of hallucination rates before and after domain-specific fine-tuning across four domains (biomedical, legal, financial, and software engineering) and three base models (Llama-3.1-8B, Mistral-7B-v0.3, and Qwen2.5-7B). We use a three-component hallucination taxonomy — factual hallucination, entity hallucination, and reasoning hallucination — and evaluate each component using a combination of automated fact-checking pipelines and expert annotation. Counter to the common assumption that fine-tuning on domain data reduces hallucination by reinforcing factual associations, we find that fine-tuning on high-quality but narrow domain corpora frequently increases entity and reasoning hallucination rates even when factual hallucination rates decrease. We link this phenomenon to a degradation in world-model breadth during fine-tuning and provide evidence that the effect is modulated by the ratio of domain-specific to general knowledge in the fine-tuning data mix.
