How to Test If Your AI Learned the Right Information
Build a Test Question List
Before you start testing, write a list of 20 to 30 questions that your chatbot should be able to answer based on the training data you uploaded. Include:
- Direct questions: Questions where the answer is explicitly stated in your data ("What is your return policy?")
- Rephrased questions: The same question worded differently ("Can I send something back?", "How do I get a refund?")
- Specific detail questions: Questions about specific numbers, dates, or facts ("How many days do I have to return an item?")
- Comparison questions: Questions that require information from multiple chunks ("What is the difference between the Basic and Pro plans?")
- Edge case questions: Questions about uncommon scenarios that your data should cover ("Can I return a customized product?")
- Out-of-scope questions: Questions your chatbot should not answer, to test that it declines gracefully rather than making things up
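A question list like the one above can be kept as data and run automatically. The sketch below assumes a hypothetical `ask_chatbot(question)` helper that returns the chatbot's reply as a string; the questions and key facts are illustrative examples, not a real dataset.

```python
# Minimal test harness: each entry pairs a question with the key facts
# its answer must contain. ask_chatbot() is a hypothetical helper you
# would wire up to your own chatbot API.

TEST_QUESTIONS = [
    # direct question
    ("What is your return policy?", ["30 days", "original packaging", "receipt"]),
    # rephrased question
    ("Can I send something back?", ["30 days"]),
    # specific detail question
    ("How many days do I have to return an item?", ["30"]),
    # comparison question (needs info from multiple chunks)
    ("What is the difference between the Basic and Pro plans?", ["Basic", "Pro"]),
    # edge case question (no single fact required here)
    ("Can I return a customized product?", []),
    # out-of-scope question: the bot should decline, not invent
    ("What is the weather today?", ["don't know"]),
]

def run_tests(ask_chatbot):
    """Run every test question; return (question, missing facts) for failures."""
    failures = []
    for question, key_facts in TEST_QUESTIONS:
        answer = ask_chatbot(question).lower()
        missing = [fact for fact in key_facts if fact.lower() not in answer]
        if missing:
            failures.append((question, missing))
    return failures
```

A failing entry tells you both which question broke and which fact was missing, which points you at the chunk to fix.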
How to Evaluate Responses
Check for Factual Accuracy
Compare every specific claim in the chatbot's response against your source documents. Pay special attention to numbers (prices, quantities, percentages, time periods), proper nouns (product names, company names), and policies (conditions, exceptions, requirements). A response that sounds good but gets one number wrong is a problem.
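Number checks are the easiest part of this to automate. A rough sketch: extract every numeric token from a response and confirm it also appears in the source document. This catches a swapped "30 days" versus "60 days", though not wrong proper nouns or policies, which still need human review.

```python
import re

def extract_numbers(text):
    """Pull numeric tokens (day counts, prices, percentages) from a string."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def numbers_are_grounded(response, source):
    """True if every number in the response also appears in the source text."""
    return extract_numbers(response) <= extract_numbers(source)
```

This is deliberately strict: a response with a number the source never mentions is flagged even if the number happens to be correct, which is usually the behavior you want.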
Check for Completeness
Did the chatbot include all the important information, or did it give a partial answer? If your training data says "Returns are accepted within 30 days, must be in original packaging, and require a receipt," but the chatbot only mentions the 30-day window, the response is incomplete. This usually means the relevant content is split across chunks and the chatbot only retrieved some of them.
Check for Hallucinations
Did the chatbot add information that is not in your training data? AI models sometimes fill in gaps with plausible-sounding but made-up details. If the chatbot says your product comes in green when your training data only mentions blue, red, and black, that is a hallucination. See Why AI Sometimes Gets Answers Wrong and How to Fix It for solutions.
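For facts that come from a fixed set, like the color example above, a hallucination check is just a set difference: anything the response mentions that the training data does not list is suspect. The color sets here are illustrative assumptions.

```python
import re

# Colors actually listed in the (hypothetical) training data.
KNOWN_COLORS = {"blue", "red", "black"}
# Color words we scan for in responses.
COLOR_WORDS = {"blue", "red", "black", "green", "white", "yellow"}

def hallucinated_colors(response):
    """Return color words the response mentions that the training data does not."""
    tokens = set(re.findall(r"[a-z]+", response.lower()))
    return (tokens & COLOR_WORDS) - KNOWN_COLORS
```

The same pattern works for any closed vocabulary: plan names, supported countries, product sizes.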
Check for Relevance
Did the chatbot answer the question that was actually asked, or did it provide related but off-topic information? If someone asks about shipping times and the chatbot talks about shipping costs, the retrieval system may be pulling the wrong chunks.
Common Problems and Fixes
The chatbot says it does not know the answer
The information is either not in your training data, or the question's phrasing does not match the embedded content closely enough for the retrieval step to find it. Fix: add more training content for that topic, using the same language customers would use when asking.
The chatbot gives a partial answer
The relevant information is spread across too many chunks, or the chunks are too large and the important detail is buried. Fix: reorganize the content so related information is together in the same section. See How to Chunk Documents for Better AI Understanding.
The chatbot gives a wrong answer
Either the training data itself is wrong, it contains contradictory entries, or the chatbot is hallucinating. Fix: verify the source data is accurate, remove contradictions, and test again. If the problem persists, try a more capable AI model for complex topics.
The chatbot answers questions it should not
The system prompt needs clearer boundaries. Add instructions telling the chatbot which topics it should decline to answer and how to respond when asked about out-of-scope subjects.
Ongoing Testing
Testing is not a one-time activity. After the initial setup, continue testing whenever you:
- Add new training data
- Update existing content
- Change the system prompt
- Switch AI models
- Receive feedback about wrong answers from customers or team members
Keep your test question list and expand it over time as you discover new question types from real user interactions.
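Once the question list lives in code, catching regressions after a change is a comparison of two runs: anything that passed before the change and fails after it needs attention. This sketch reuses the question/key-fact format from earlier and assumes a hypothetical `ask_chatbot` helper.

```python
def run_suite(ask_chatbot, questions):
    """questions: list of (question, key_facts). Returns {question: passed}."""
    results = {}
    for question, key_facts in questions:
        answer = ask_chatbot(question).lower()
        results[question] = all(f.lower() in answer for f in key_facts)
    return results

def regressions(before, after):
    """Questions that passed before a change but fail after it."""
    return [q for q, ok in before.items() if ok and not after.get(q, False)]
```

Run the suite before and after each data upload, prompt change, or model switch, and review only the regressions list instead of rereading every answer.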
Upload your data, test your chatbot, and refine until it answers every question accurately. The process takes minutes, not days.