OpenAI explains why language models ‘hallucinate’; evaluation incentives reward guessing over uncertainty
OpenAI finds a key problem in how large language models work. These models often give wrong information confidently. The issue is in how these models are trained and checked. Current methods reward guessing, even if uncertain. OpenAI suggests new ...

Hallucinations in AI are instances where a model produces statements that are factually incorrect but presented with high confidence. For example, when asked for the title of the PhD dissertation of Adam Tauman Kalai, a researcher and one of the paper's authors, the model provided three different titles, none of which were accurate. Similarly, it offered three incorrect birthdates for Kalai.
The core issue, as identified by OpenAI researchers, lies in how LLMs are trained and evaluated. Traditional methods rely on binary grading, correct or incorrect, without accounting for the model's confidence in its responses. This approach inadvertently rewards guessing under uncertainty: a guess that happens to be correct earns credit, whereas admitting uncertainty scores zero. Consequently, models are trained to prioritize providing an answer over acknowledging a lack of knowledge.
As quoted by Futurism, the research paper states that hallucinations "persist due to the way most evaluations are graded, language models are optimized to be good test-takers, and guessing when uncertain improves test performance."
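The arithmetic behind that incentive can be sketched in a few lines of Python. This is an illustration only: the function name and the probabilities are assumptions chosen for the example, not figures from the paper.

```python
def expected_binary_score(p_correct: float, abstain: bool) -> float:
    """Expected score under binary grading.

    A correct answer earns 1 point; a wrong answer or an
    "I don't know" both earn 0 points.
    """
    if abstain:
        return 0.0
    # Guessing: 1 point with probability p_correct, 0 points otherwise.
    return p_correct * 1.0 + (1.0 - p_correct) * 0.0


# Hypothetical chances of a guess being right.
for p in (0.1, 0.3, 0.5):
    guess = expected_binary_score(p, abstain=False)
    idk = expected_binary_score(p, abstain=True)
    print(f"p={p:.1f}  guess={guess:.2f}  abstain={idk:.2f}")
```

Under this grading, a guess with any nonzero chance of being right has a higher expected score than abstaining, so a model tuned to maximize the score has no reason to say it does not know.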
To address this issue, OpenAI suggests a shift towards evaluation methods that value uncertainty and penalize confident inaccuracies. By implementing confidence thresholds, models would be encouraged to refrain from answering when unsure, thereby reducing the likelihood of hallucinations. This approach aims to enhance the reliability of AI systems, especially in critical applications where factual accuracy is paramount.
"Most scoreboards prioritize and rank models based on accuracy, but errors are worse than abstentions," OpenAI wrote in an accompanying blog post.
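One way a grading rule could make errors cost more than abstentions is sketched below. The specific scheme is an assumption made for illustration: a correct answer earns 1 point, abstaining earns 0, and a wrong answer loses t/(1 - t) points for a chosen confidence threshold t. The function names are hypothetical.

```python
def wrong_answer_penalty(threshold: float) -> float:
    """Points deducted for a wrong answer at a given confidence threshold.

    Assumed scheme: penalty = t / (1 - t), so answering has positive
    expected value only when confidence exceeds t.
    """
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    return threshold / (1.0 - threshold)


def expected_score(confidence: float, threshold: float) -> float:
    """Expected score of answering: +1 if right, -penalty if wrong."""
    return confidence - (1.0 - confidence) * wrong_answer_penalty(threshold)


def should_answer(confidence: float, threshold: float) -> bool:
    """Answer only if doing so beats the 0 points earned by abstaining."""
    return expected_score(confidence, threshold) > 0.0


# Hypothetical confidences evaluated against a threshold of 0.75.
for c in (0.6, 0.9):
    print(c, should_answer(c, 0.75), round(expected_score(c, 0.75), 2))
```

With a threshold of 0, the penalty vanishes and the rule reduces to ordinary binary grading, where guessing is always worthwhile. Raising the threshold makes a low-confidence guess a losing bet, which is the behavior the proposal aims to encourage.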
Experts acknowledge that eliminating hallucinations may be unattainable, but improvements in training and evaluation methodologies can lead to more trustworthy AI systems. The proposed changes have broader implications for AI development, including potential impacts on user engagement. Models that frequently admit uncertainty might be perceived as less competent, possibly affecting user trust and adoption. Therefore, balancing accuracy with user experience remains a critical consideration.