
April 22, 2026 · Agentify Ecommerce

Grounded AI: Why Hallucinated Shipping Policies Are Costing Stores Money

An ungrounded AI chatbot will hallucinate your return policy. Here is what grounded means, why it matters for ecommerce, and how to evaluate a vendor claiming they do it.

In 2024, a Canadian airline’s AI chatbot confidently told a bereaved customer he’d receive a bereavement-fare refund. The airline had no such policy. A small-claims tribunal ordered the airline to honor what the chatbot said, reasoning that the chatbot’s statements were the airline’s statements.

That case is now cited in every serious conversation about AI on the public internet. It shouldn’t have to be, because the failure mode it exposes is one of the oldest in AI: an ungrounded model will invent things that sound plausible. For ecommerce, that’s not a theoretical risk — it’s a direct line to refunds, chargebacks, and support escalations you didn’t budget for.

This post is about what “grounded” actually means, why it’s the single most important feature of any AI agent on your storefront, and how to tell whether a vendor is telling you the truth when they claim they have it.

What “grounded” means

“Grounded” is jargon for: the model only answers from a specific body of content that you control, and when the answer isn’t in that content, it declines.

In practice, this is implemented with a technique called retrieval-augmented generation (RAG):

  1. Your docs get chunked into small pieces and converted into embeddings (numeric vectors).
  2. When a shopper asks a question, the question also becomes an embedding.
  3. The system finds the top few doc chunks that are semantically nearest to the question.
  4. Those chunks are passed into the LLM’s prompt with strict instructions: “answer using only this content; if the answer isn’t here, say so.”
  5. The LLM composes a response citing the retrieved chunks.
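
The steps above can be sketched in a few lines of Python. This is a toy: the bag-of-words `embed` function stands in for a real embedding model, and the document snippets are invented for illustration.

```python
# Toy RAG retrieval step. Bag-of-words vectors stand in for real
# embeddings; DOCS is an invented example corpus.
import math
from collections import Counter

DOCS = {
    "returns.html": "Items may be returned within 30 days, unworn and in original packaging.",
    "shipping.html": "Standard shipping is free on orders over 100 dollars.",
    "warranty.html": "We do not offer extended warranties on any product.",
}

def embed(text):
    # Stand-in for an embedding model: a word-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, k=2):
    # Steps 2-3: embed the question, rank chunks by similarity.
    q = embed(question)
    ranked = sorted(DOCS.items(), key=lambda kv: cosine(q, embed(kv[1])), reverse=True)
    return ranked[:k]

# Steps 4-5: the retrieved chunks go into the prompt with strict instructions.
chunks = retrieve("is shipping free over 100 dollars")
prompt = (
    "Answer using ONLY the content below. If the answer is not here, say so.\n\n"
    + "\n".join(f"[{src}] {text}" for src, text in chunks)
)
```

In production the word-count vectors would be replaced by an embedding model and the `DOCS` dict by a vector store, but the shape of the pipeline is the same.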

The opposite of this — an ungrounded setup — is a bare LLM call with a prompt like “you are a helpful customer service assistant for Acme Store.” That prompt doesn’t include your real return policy, so the LLM makes one up based on what a “reasonable ecommerce return policy” sounds like in its training data. Some of the time, it’s right. Some of the time, it promises things you don’t honor.

Why this matters more for ecommerce than for most domains

A hallucinated Wikipedia summary is embarrassing. A hallucinated shipping policy is a chargeback.

Specifically:

  • Return windows. If the agent says “you have 60 days to return” and your real policy is 30, you either honor the AI’s statement (money out of your pocket) or refuse it (angry customer and potential chargeback).
  • Free shipping thresholds. If the agent says “free shipping over $50” when you actually charge shipping up to $100, you’re losing either margin or customer trust, depending on which way you resolve it.
  • Stock availability. A hallucinated “yes, we have that in medium” that turns out to be false is a refund + an angry email + a cancelled subscription.
  • Warranty terms. If the agent promises a warranty you don’t offer, you may be legally bound to honor it, as the airline case above shows.
  • Discount codes. If the agent hallucinates a discount code, customers will try to use it at checkout, fail, and email your support team furious.

Every one of those is a real, dollar-denominated cost. An ungrounded chatbot doesn’t save you money; it moves the cost from “support tickets” to “refunds and chargebacks” and pretends it’s a win.

How to evaluate whether a vendor is actually grounded

When a vendor claims their AI is “grounded” or “no hallucinations” or “powered by RAG,” ask them three questions:

1. “Show me a conversation where the agent refused to answer.”

If every demo shows the agent confidently answering every question, it’s probably making things up. A well-grounded agent will explicitly say “I don’t have information about that — let me connect you with our team” for questions outside its ingested context. Vendors who can’t produce a refusal log probably don’t have real grounding.
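
Under the hood, refusal usually comes from a similarity threshold on retrieval: if no ingested chunk is close enough to the question, the agent declines rather than letting the LLM improvise. A toy sketch, where the word-overlap `embed`, the example chunks, and the 0.3 cutoff are all illustrative stand-ins:

```python
# Toy refusal gate: decline when no ingested chunk is similar
# enough to the question. The 0.3 threshold is illustrative.
import math
from collections import Counter

CHUNKS = {
    "returns.html": "items may be returned within 30 days in original packaging",
    "shipping.html": "standard shipping is free on orders over 100 dollars",
}
THRESHOLD = 0.3

def embed(text):
    # Stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer_or_refuse(question):
    q = embed(question)
    source, score = max(
        ((src, cosine(q, embed(text))) for src, text in CHUNKS.items()),
        key=lambda pair: pair[1],
    )
    if score < THRESHOLD:
        # Nothing in the ingested content answers this: refuse.
        return "I don't have information about that - let me connect you with our team."
    return f"(answer grounded in {source})"
```

A question about gift cards, say, matches nothing in the two ingested chunks and triggers the refusal path; a shipping question clears the threshold and gets answered from `shipping.html`.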

2. “What happens when I ask a question that’s partially answered in the docs?”

The interesting case isn’t “is the return window 30 days, yes or no.” It’s “can I return this specific item even though it’s final-sale?” A grounded agent looks at your policy, looks at your product’s tags, and either answers precisely or escalates. An ungrounded one guesses, or over-generalizes from one policy chunk to all products.

3. “Show me the citations.”

Any well-grounded agent will be able to tell you which specific doc chunk a given answer came from. Some vendors surface this in the chat UI (“source: shipping.html”); others only expose it in logs. Either is fine. No citation mechanism at all = no grounding.
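
One common shape for this is a response object that carries its sources alongside the answer text, so either the chat UI or the logs can render them. A hypothetical sketch — the `GroundedAnswer` type and the source string are invented for illustration, not any vendor’s actual API:

```python
# Hypothetical response shape that keeps citations attached to the
# answer so they can be surfaced in the UI or in logs.
from dataclasses import dataclass, field

@dataclass
class GroundedAnswer:
    text: str
    sources: list = field(default_factory=list)  # doc chunks the answer drew from

reply = GroundedAnswer(
    text="Shipping is free on orders over $100.",
    sources=["shipping.html#free-shipping"],  # illustrative chunk reference
)
print(f"{reply.text} (source: {', '.join(reply.sources)})")
```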

What good grounding looks like in practice

In the chat, it looks like this:

Shopper: “Can I return these shoes if I wore them outside?”

Agent: “Based on our returns policy, shoes must be unworn and in original packaging to be eligible for return. Shoes that have been worn outdoors don’t qualify for a refund. If you’d like, I can escalate this to our team to review your specific case.”

Notice what’s happening:

  • The answer cites the source.
  • It’s specific to the question (not a generic “our return policy is 30 days”).
  • It offers an escalation path rather than giving a flat no.
  • It doesn’t invent any exceptions the policy doesn’t actually contain.

The shopper either accepts it or escalates. In both cases, you haven’t created a liability.

Setup mistakes that break grounding

Even vendors who do RAG correctly can still fail if setup is sloppy. Watch out for:

  • Docs ingested once and never re-crawled. If you update your returns policy and the agent is still answering from the old one, you carry the same liability as with a hallucination.
  • Out-of-date catalog data. If the agent says a product is in stock when it’s been out for a week, that’s a grounding failure too.
  • Too-broad chunking. If doc chunks are so large they always contain multiple policies, the retriever pulls the wrong one and the LLM picks badly.
  • Missing source-of-truth. Your returns policy lives in three places: a help-center page, the footer, and the Terms of Service. If the agent ingests all three and they disagree, the output is unreliable. Pick one canonical source.
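
The chunking point in particular is easy to get wrong. A minimal sketch of chunking that respects policy boundaries, assuming policies are separated by blank lines; the 500-character limit is an illustrative value:

```python
# Toy chunker: split a policy doc on blank lines so each chunk
# covers one policy, instead of ingesting the whole page as one
# oversized chunk that mixes returns, shipping, and warranty text.
def chunk(doc, max_chars=500):
    chunks = []
    for para in doc.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Oversized paragraphs get split further, preferring sentence ends.
        while len(para) > max_chars:
            cut = para.rfind(". ", 0, max_chars)
            cut = cut + 1 if cut != -1 else max_chars
            chunks.append(para[:cut].strip())
            para = para[cut:].strip()
        if para:
            chunks.append(para)
    return chunks
```

With chunks scoped this way, a question about returns retrieves the returns paragraph and nothing else, which is exactly what keeps the retriever from handing the LLM the wrong policy.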

A good vendor will have opinions on all of these during setup, not after the first bad conversation.

Getting started

Grounding isn’t a marketing feature — it’s the line between an AI agent that’s worth deploying and one that’s a ticking lawsuit. If you want to see a grounded agent running against your actual docs and catalog, book a demo with us. We’ll ingest a sample of your pages and walk you through how the agent refuses to answer outside its scope — which is usually the part of the demo that convinces people.

Or read the features page for the specifics of how the Agentify Ecommerce agent handles grounding and doc ingestion.