A Verification Toolkit

Practical strategies for fact-checking AI, spotting hallucinations, and managing the "overconfident intern."

Mar 23, 2026

This is the final part of our series on navigating AI. In Part 1, we looked at how AI’s polished tone tricks us into trusting it with trivia. In Part 2, we looked under the hood at why AI hallucinates when the stakes are higher. Now, let’s talk about what to do about it.

(A quick note before we dive in: you will notice a lot of anthropomorphism in this piece. While AI obviously doesn’t have human feelings or intentions, giving it human traits is simply the most effective mental model for learning how it acts and works.)

The golden rule for dealing with AI is this: Treat it like a smart, hardworking, but overconfident intern. It is incredibly fast, remembers (almost) everything it was taught, and is great at following clear instructions. The caveat is that it desperately wants to please you, which means it would rather guess the answer than admit it doesn’t know.

To manage this intern, we need to understand a few quirks about how it works:

It is overly verbose: It loves to write paragraphs upon paragraphs, sometimes talking in circles, to answer a simple question.
It needs bite-sized instructions: If you feed it a paragraph with four different commands, it will get lost. If you give it one instruction at a time, it performs brilliantly.
It needs to “think”: Asking an AI to work “step-by-step” or to reflect on its own answer often improves its logic and reduces mistakes.
It has biased memories: Although it acts as a massive store of information, it readily recalls concepts that were heavily represented on the internet but struggles with niche or less-documented topics.

Given these traits, how do we actually verify what the AI tells us? Let’s look at three practical scenarios.

Scenario 1: The Closed Sandbox (Using AI for First Drafts)

The safest way to use AI is to give it the answers upfront. Large language models are incredible at synthesizing information if they have the right context.

Give the model your own Excel sheet, your class PowerPoints, or your Tableau outputs, and ask it to write a summary or pull insights. It is like asking an intern to draft a report using only the stack of files you handed over.

The wording might not be perfect, and it might be a bit exaggerated or repetitive, but the model gives you a solid first draft in seconds.

The Verification Strategy: Because you provided the source material, verifying is easy. You already know the data, so you just skim the draft to make sure it didn’t misrepresent your own files. Polish the wording and you’re done.
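If your source material contains figures, part of that skim can be automated. The sketch below is a minimal illustration, not a full fact-checker: it assumes the draft and the source are both available as plain text, and the function name `unsupported_numbers` is made up for this example. It flags any number in the draft that never appears in the source.

```python
import re

# Matches integers, decimals, and percentages such as 12, 4.2, 1,500 or 8%.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def unsupported_numbers(draft: str, source: str) -> list[str]:
    """Return numbers that appear in the draft but nowhere in the source."""
    source_numbers = set(NUMBER.findall(source))
    flagged = []
    for number in NUMBER.findall(draft):
        if number not in source_numbers and number not in flagged:
            flagged.append(number)
    return flagged


# Illustrative text only; swap in your own file contents.
source = "Q2 revenue was 4.2 million, up 8% from Q1."
draft = "Revenue reached 4.2 million in Q2, a 12% increase."

print(unsupported_numbers(draft, source))  # ['12%']
```

A flagged number is not automatically wrong (the model may have computed it or reformatted it), but it is exactly where your skim should slow down.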

Scenario 2: The Librarian (Using AI for Database Retrieval)

Often, we can’t feed the model everything we want because of “context limits”: it can’t read the entire Library of Congress in one prompt.

To solve this, developers use techniques such as Retrieval-Augmented Generation (RAG). Think of how Perplexity works: you ask a question, the system Googles the keywords, finds 10 relevant web pages, feeds only those specific pages to the AI, and asks it to summarize the answer. Enterprise tools like Microsoft Copilot (for internal files), AlphaSense (for finance), and Harvey (for law) apply the same idea to corporate documents.

Understanding this process is the key to verifying it. If the retrieval system fails to find the right document, the AI will likely hallucinate an answer to fill the void.
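To make the retrieve-then-summarize flow concrete, here is a toy sketch. It is an assumption-heavy stand-in, not how any of the products above are built: real systems use search engines or vector indexes, while this one ranks documents by shared keywords. The names `retrieve`, `build_prompt`, and the sample documents are all hypothetical. The one design choice worth noticing is that it refuses to build a prompt when nothing relevant is found, instead of leaving the model to fill the void.

```python
# Common words that should not count as a keyword match.
STOPWORDS = {"the", "a", "an", "in", "of", "what", "was", "is", "to", "and", "for"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its words without punctuation or stopwords."""
    words = {word.strip(".,?!\"'").lower() for word in text.split()}
    return words - STOPWORDS - {""}


def retrieve(question: str, documents: dict[str, str], top_k: int = 2) -> list[str]:
    """Return the names of the documents sharing the most keywords with the question."""
    keywords = tokenize(question)
    scored = [(len(keywords & tokenize(body)), name) for name, body in documents.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


def build_prompt(question: str, documents: dict[str, str], top_k: int = 2):
    """Build a prompt from retrieved documents, or return None if nothing matched."""
    names = retrieve(question, documents, top_k)
    if not names:
        return None  # nothing retrieved: say so rather than let the model guess
    context = "\n\n".join(f"[{name}]\n{documents[name]}" for name in names)
    return (
        "Answer using only the sources below and cite them by name. "
        "If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical document store for illustration.
documents = {
    "q2_report.txt": "Q2 revenue in Europe fell because of supply chain delays in Germany.",
    "hr_policy.txt": "Employees accrue vacation days monthly.",
}

print(retrieve("What drove the revenue shortfall in Europe?", documents))
print(build_prompt("How many parking spaces are there?", documents))
```

The first call returns only the Q2 report, because it is the only document that shares keywords with the question. The second returns `None`, because nothing in the store is relevant.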

The Verification Strategy: Check the specific citations. For example, imagine a corporate finance director asks their internal AI: “What drove the Q3 revenue shortfall in the European division?” If the Q3 data hasn’t been uploaded to the database yet, the AI might pull a report from Q2 and confidently claim the shortfall was due to “supply chain delays in Germany.” The text sounds perfect. But if you click the footnote and see it links to a Q2 document, you instantly know it’s a hallucination. Always check the receipt.
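If your tool exposes the titles of the documents it cites, the quarter mismatch in that example can be caught mechanically. The sketch below is a hypothetical helper, assuming citations arrive as a list of file names and that quarters are written as Q1 to Q4. It does not replace opening the footnote yourself.

```python
import re

# Finds Q1..Q4 when not glued to other letters or digits (so "Q2_2025" still matches).
QUARTER = re.compile(r"(?<![A-Za-z0-9])Q([1-4])(?![A-Za-z0-9])", re.IGNORECASE)


def mismatched_citations(question: str, cited_titles: list[str]) -> list[str]:
    """Return cited titles that name a quarter the question did not ask about."""
    asked = set(QUARTER.findall(question))
    if not asked:
        return []  # the question names no quarter, so there is nothing to compare
    flagged = []
    for title in cited_titles:
        found = set(QUARTER.findall(title))
        if found and not (found & asked):
            flagged.append(title)
    return flagged


# Hypothetical question and citation list for illustration.
question = "What drove the Q3 revenue shortfall in the European division?"
cited = ["Europe_Q2_Results.pdf", "Q3_Board_Summary.pdf", "Pricing_Policy.pdf"]

print(mismatched_citations(question, cited))  # ['Europe_Q2_Results.pdf']
```

Titles that mention no quarter at all are left alone, since the file name by itself cannot tell you whether they are relevant.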

Scenario 3: Researching the Unknown (The Cross-Examination)

When you don’t have your own files or a corporate database to rely on, you are relying entirely on the AI’s internal memory. This is where you need to know its blind spots and how to interrogate it.

Know Where AI Struggles

AI is brilliant at explaining well-documented concepts, summarizing common knowledge, breaking down complex topics, and generating ideas. However, it has massive blind spots. It is fundamentally weak when dealing with recent information, niche or obscure facts, precise statistics, lesser-known people, and local or regional specifics. If your question falls into these categories, assume the AI is guessing until proven otherwise.

Red Flags to Watch For

Even when our “overconfident intern” is hallucinating, it sounds incredibly convincing. Keep an eye out for these warning signs:

Too-convenient details: Did it give you an exact number, a perfect quote, or a hyper-specific data point without a clear link? Be suspicious.
No hedging: Real experts use phrases like “it depends,” “there is debate on this,” or “traditionally.” AI models tend to sound authoritative and rarely hedge unless explicitly prompted.
Fabricated sources: AI will happily invent books, articles, or studies complete with real authors and plausible publication years. Always Google the title or click the link to ensure the citation actually exists before you trust it.

Creative Verification Strategies

Here is where it gets interesting. You can actually use AI’s own tendencies to test its answers.

Play devil’s advocate: Challenge the response. Ask: “Is there any evidence against this?” or “What are the alternative explanations?” You can even feed it a false premise. For example, ask about the gold standard, and then follow up with, “Wait, didn’t the US actually stay on the gold standard until the 1980s?” If it immediately apologizes and agrees with your false statement, you’ve learned that it is trying to please you rather than checking facts, so treat its earlier answer with the same suspicion.
The consistency test: AI developers frequently use this trick to catch errors. Ask the model the exact same question in three slightly different ways. If the core facts, such as the dates, names, or percentages, change between the responses, the model is guessing.
Check consensus across models: Think of different AI chatbots like different people at a dinner party. Ask the exact same question to ChatGPT, Claude, and Gemini. If they all give roughly the same answer, you are probably on solid ground. If they diverge significantly, you’ve found a knowledge gap and need to do traditional research.
Force a “fact-check list”: Prompt engineering best practices recommend asking the AI to verify its own work. Add this instruction to the end of your prompts: “List the three most critical factual claims in your response, evaluate your confidence in each, and provide the sources.” This forces the model to start evaluating its own logic.
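If you work with a model through code rather than a chat window, the consistency test can be sketched in a few lines. This is a rough illustration under stated assumptions: `ask` is a placeholder for whatever function sends a prompt to your model and returns its text, `fake_model` and its canned answers are invented purely to make the example runnable, and only numeric facts are compared.

```python
import re
from typing import Callable

# Numeric facts only: years, counts, decimals, percentages.
FACT = re.compile(r"\d+(?:\.\d+)?%?")


def consistency_check(ask: Callable[[str], str], phrasings: list[str]):
    """Ask each phrasing and split the numeric facts into stable and unstable sets."""
    if len(phrasings) < 2:
        raise ValueError("need at least two phrasings to compare")
    fact_sets = [set(FACT.findall(ask(prompt))) for prompt in phrasings]
    stable = set.intersection(*fact_sets)      # facts present in every answer
    unstable = set.union(*fact_sets) - stable  # facts that come and go
    return stable, unstable


# Made-up canned answers standing in for a real model (not real model output).
CANNED = {
    "When did the US end dollar convertibility to gold?": "It ended in 1971.",
    "In what year was the gold window closed?": "It was closed in 1971.",
    "What year did the US leave the gold standard?": "The US left it in 1973.",
}


def fake_model(prompt: str) -> str:
    return CANNED[prompt]


stable, unstable = consistency_check(fake_model, list(CANNED))
print(stable)    # set()
print(sorted(unstable))  # ['1971', '1973']
```

Anything in the unstable set is a prompt for traditional research. The same comparison works for the cross-model check: collect one answer per chatbot instead of one answer per phrasing.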

The Bottom Line

AI is a powerful tool, not a trusted authority. The more confident it sounds, the more critical it is to verify. We are replacing the old habit of “search and investigate” with “ask and verify.” To survive this shift, build the habit of cross-examination now, before the convenience becomes complacency.
