What Is "Hallucination"? Why It Happens, How to Mitigate
Why does AI confidently make stuff up? It's not a bug—it's how this kind of AI works. Understand the mechanism, you'll know how to use it.
If you’ve used ChatGPT, Claude, or any LLM, you’ve probably encountered this:
You ask: “In Dream of the Red Chamber, chapter 3, when Lin Daiyu first meets Jia Baoyu, what does Baoyu say first?”
AI answers: “Baoyu smiles and says: ‘I’ve seen this sister before.’”
Sounds so real! But you check the original—and the AI’s quote might be right, might be wrong, might be a recombination of phrases from the book. You have no way of knowing without checking.
This is AI Hallucination.
It’s not a bug. It’s a byproduct of how this kind of AI works. Understand why, and you’ll know how to handle it.
1. What Hallucination Looks Like
Academic definitions are wordy. Let’s see symptoms first:
- Fabricated facts: invented papers, books, companies, people
- Memory mix-ups: attributing Alice’s experience to Bob
- Excessive confidence: stating something wrong with absolute certainty
- Professionally written: uses accurate terminology, plausible structure—so you can’t tell
- Hard to detect: many hallucinations only show up when you check the source
In 2023, a US lawyer used ChatGPT to draft a legal brief citing 6 case precedents. All 6 were invented by ChatGPT. Neither opposing counsel nor the judge could find any of them. The lawyer was sanctioned.
It wasn’t that he didn’t know AI can err. He just didn’t realize AI can fabricate so realistically.
2. Why It Happens
To answer this, we need to know what LLMs are actually doing.
LLMs fundamentally do one thing:
Given some text, predict the most likely next token.
Note: “most likely,” not “most correct.” Every word it outputs is selected based on statistical patterns it saw during training.
Example. If training data heavily features:
Dream of the Red Chamber, chapter 3 ... Lin Daiyu ... meets Jia Baoyu ... Baoyu says: "______"
What does the model see most often after “Baoyu says:”? Could be “I’ve seen this sister before,” could be “I’ve met you already,” could be many other phrasings.
The model has only seen “what strings tend to follow what”—it has never actually “read” the book. Its output is a statistically plausible continuation, not a verified fact.
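A toy sketch can make “most likely, not most correct” concrete. The probabilities and the placeholder phrasing below are invented for illustration and do not come from any real model. The point is only that the selection step consults a probability table, never the book.

```python
import random

# Hypothetical probabilities for what follows 'Baoyu says: "' in training text.
# These numbers are made up for illustration; they are not from a real model.
NEXT_PHRASE_PROBS = {
    "I've seen this sister before.": 0.45,
    "I've met you already.": 0.35,
    "(some other plausible phrasing)": 0.20,
}

def greedy_continuation(probs):
    """Return the single most probable continuation."""
    return max(probs, key=probs.get)

def sample_continuation(probs, rng):
    """Draw one continuation at random, weighted by its probability."""
    phrases = list(probs)
    weights = [probs[p] for p in phrases]
    return rng.choices(phrases, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs behave the same
samples = [sample_continuation(NEXT_PHRASE_PROBS, rng) for _ in range(1000)]
counts = {p: samples.count(p) for p in NEXT_PHRASE_PROBS}

print(greedy_continuation(NEXT_PHRASE_PROBS))
print(counts)
```

Nothing in either function checks whether a phrase actually appears in chapter 3. A phrasing that was merely common in the training text wins over a rarer one, whether or not it is the true quote.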
An analogy
AI is like a super-knowledgeable, fiction-writing amnesiac.
It has read almost every book (in training). But it has forgotten which sentence came from which book.
When you ask “what does Baoyu say in chapter 3”—it doesn’t flip through the book. It uses its “overall impression of the book’s style” to freshly generate a plausible sentence Baoyu might say.
Sometimes it lands on something real. Sometimes it’s entirely invented. It can’t tell which is which.
3. Sources of Hallucination
Knowing why hallucination happens lets you predict when it’s most likely:
a. Information scarcity
Models have little training data on obscure topics, but they don’t like saying “I don’t know”—so they confabulate.
Typical: ask about a niche academic study, a local figure from a tier-3 city, an obscure programming library.
b. Time-cutoff trap
Models have a training cutoff. Asked about events after it, a model either doesn’t know or fabricates something that “looks plausible” from pre-cutoff information.
Typical: a model with an early-2024 cutoff and no web access, asked “who won the November 2024 US election,” cannot know the answer and may invent one.
c. Prompt-induced
Your phrasing “hints” at an answer. “What did Lu Xun say in his 1936 letter to young painter Zhang Mei?”—this letter doesn’t exist, but your phrasing presumes it does, so the AI plays along and fabricates one.
This is prompt-induced hallucination: your question presumes a non-existent premise, and the AI accommodates it.
d. Cross-domain confusion
Asking AI to handle cross-domain info (“interpret this Tang poem with economics”). It mashes unrelated things together—sounds insightful but is gibberish.
e. Numbers, dates, proper nouns
These are hallucination hot zones.
Typical: ask “when was AlphaGo released?” and answers may include 2014, 2015, or 2016. The question itself is slippery: AlphaGo beat Fan Hui in October 2015, that result was announced in January 2016, and it beat Lee Sedol in March 2016.
4. Why “Confidently Wrong” Is Inevitable
Some ask: why doesn’t AI just say “I don’t know”?
Several layers:
a. Training objective
LLMs are trained to “generate text that looks like what a human would write.” Humans writing articles don’t usually say “I don’t know”—they write what they think is correct.
So LLMs learn: always output a complete answer.
b. No self-awareness
LLMs have no true “self-awareness”—they don’t know which things they actually know vs. fabricate.
Every output uses the same statistical mechanism, so their “confidence” looks similar across all answers.
c. RLHF makes it slightly worse
ChatGPT, Claude, and similar models are trained with RLHF (Reinforcement Learning from Human Feedback). Human raters typically rate “useful, detailed” answers high and “I don’t know” answers low. Result: models prefer giving detailed answers, even when fabricating.
Anthropic and others now explicitly train models to “say I don’t know when appropriate,” but this only partially mitigates.
5. How to Reduce Hallucination
You can’t fully eliminate it, but you can do a lot to reduce it:
7 User-side Habits
| Tactic | How |
|---|---|
| 1. Provide source material | Don’t have it answer “from memory”; paste docs/links/data and let it answer “based on the material” |
| 2. Demand citations | “Please cite a source for each claim.” Makes it more hesitant to fabricate, though the citations themselves still need checking |
| 3. Verify by asking back | “You said XX was founded in 1985. What’s your evidence?” |
| 4. Code, not direct calc | For numbers, have it write Python, not compute directly |
| 5. Use web-enabled | For timely questions, use Perplexity or web-search ChatGPT/Claude |
| 6. Pick appropriate model | Use GPT-4-class for important things, not small models |
| 7. Verify critical facts yourself | Treat AI as a lead source, not a fact source |
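Tactics 1, 2, and 7 can be combined in a few lines. The sketch below is one possible shape, not a standard recipe: the function names, the `<source>` tag convention, and the Acme Corp text are all hypothetical. It builds a prompt that confines the model to pasted material, then checks whether the spans the model quotes as evidence really occur in that material.

```python
from typing import List

def build_grounded_prompt(question: str, source_text: str) -> str:
    """Wrap a question with pasted source material and grounding instructions."""
    return (
        "Answer using ONLY the material between the <source> tags.\n"
        "Quote the sentence that supports each claim.\n"
        "If the material does not contain the answer, reply exactly: I don't know.\n\n"
        f"<source>\n{source_text.strip()}\n</source>\n\n"
        f"Question: {question.strip()}"
    )

def unsupported_quotes(quotes: List[str], source_text: str) -> List[str]:
    """Return the quoted spans that do NOT appear verbatim in the source."""
    return [q for q in quotes if q not in source_text]

# Hypothetical source material and question, for illustration only.
source = "Acme Corp was founded in 1985 in Ohio. It makes industrial valves."
prompt = build_grounded_prompt("When was Acme Corp founded?", source)
print(prompt)

# Suppose the model's answer quoted these two spans as its evidence.
print(unsupported_quotes(["founded in 1985", "founded by Jane Doe"], source))
# prints ['founded by Jane Doe']
```

The verbatim check is deliberately strict. It catches invented quotes but also flags honest paraphrases, so a flagged span means “go look,” not “definitely fabricated.”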
Engineering-side Solutions
If you’re building products:
- RAG (Retrieval-Augmented Generation): retrieve relevant docs from your knowledge base first, then feed them into the LLM. Many of today’s enterprise AI apps use this.
- Citation enforcement: train/prompt the model to provide sources, and label unsourced content “low confidence”
- Self-consistency: have the model answer the same question several times (say, 5), keep the answer most runs agree on, and discard or flag it when the runs disagree
- Verifier model: a second model/rule system verifies the primary’s output
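The self-consistency item above can be sketched as a simple majority vote. This is a minimal illustration under stated assumptions: `ask` stands in for whatever function calls your model, `make_stub` fakes a model with canned replies, and the 60% agreement threshold is an arbitrary choice rather than a recommended value.

```python
from collections import Counter
from typing import Callable, Optional

def self_consistent_answer(
    ask: Callable[[str], str],
    question: str,
    n: int = 5,
    min_agreement: float = 0.6,
) -> Optional[str]:
    """Ask the same question n times; return the majority answer,
    or None when too few runs agree."""
    answers = [ask(question).strip().lower() for _ in range(n)]
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer if votes / n >= min_agreement else None

def make_stub(replies):
    """Hypothetical stand-in for a model: returns canned replies in order."""
    it = iter(replies)
    return lambda _question: next(it)

stable = make_stub(["1985", "1985", "1985 ", "1985", "1987"])
unstable = make_stub(["1985", "1987", "1990", "1979", "1985"])

print(self_consistent_answer(stable, "When was XX founded?"))    # prints 1985
print(self_consistent_answer(unstable, "When was XX founded?"))  # prints None
```

Agreement is a signal, not proof. A model can repeat the same wrong answer five times, so this filter catches unstable fabrications and lets stable ones through.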
Think customer-service bots, legal assistants, and medical chat work by querying the LLM directly? Mostly not. Most of them first retrieve from the company’s own documents (“retrieval”), then feed those into the LLM for answer generation. This keeps the LLM grounded in your provided scope, dramatically reducing hallucination. L4 has a dedicated RAG topic.
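The retrieve-then-generate flow can be sketched as follows. Everything here is a simplification for illustration: the three documents are invented, and plain word overlap stands in for the embedding-based search that production systems typically use. The final call to an LLM is left out, since the sketch only shows how the prompt gets grounded.

```python
import re

# Hypothetical in-memory knowledge base, for illustration only.
DOCS = [
    "A refund is available within 30 days of purchase with a receipt.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is open Monday to Friday, 9:00 to 17:00.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, docs, top_k=2):
    """Rank docs by word overlap with the question; drop docs with no overlap."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(d)), d) for d in docs]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]

def build_rag_prompt(question, docs, top_k=2):
    """Build a prompt limited to retrieved passages, or None if nothing matched."""
    passages = retrieve(question, docs, top_k)
    if not passages:
        return None  # nothing relevant found: the caller should decline to answer
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer only from the numbered passages below and cite them like [1].\n"
        "If they do not answer the question, say you don't know.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

print(build_rag_prompt("Can I get a refund 10 days after purchase?", DOCS))
print(build_rag_prompt("Who won the election?", DOCS))  # prints None
```

Returning `None` when retrieval finds nothing is the design choice that matters most here. It gives the application a clean point to say “I don’t know” instead of letting the model answer from memory.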
6. Counter-intuitive Truths
a. “Bigger models = less hallucination” — partly
GPT-4 hallucinates less than GPT-3.5, but its hallucinations are more subtle: what it makes up is closer to reality and harder to spot.
b. “Chinese models more accurate on Chinese” — not necessarily
Chinese models are more fluent in Chinese, but for factual accuracy GPT-4 in Chinese often beats specialized Chinese models, possibly because it has seen more high-quality Chinese data.
c. “Telling it not to BS works” — partially
Adding “answer only what you’re sure of, otherwise say I don’t know”—does reduce some. But doesn’t eliminate.
d. “It’s just a tech problem, will be solved” — not necessarily
Hallucination is rooted in how today’s LLMs work (statistical next-token prediction). Without a fundamental paradigm shift, it can only be mitigated, not eliminated.
7. One-Line Summary
AI isn’t a knowledge base, it’s a guessing engine.
It excels at giving you “an answer that looks right,” not necessarily “the right answer.”
Your job isn’t to stop using it—keep using it, but keep your judgment intact.
Next time AI gives you a “wow I didn’t know that” fact, pause one second: can you easily verify it from another source? If not, treat it as unverified. It may well be hallucinated.
Next: “Prompt Engineering Basics: 10 Moves to Make AI 10× More Precise”