KPMG Report Warns of AI Hallucinations and Bogus Case Studies Boosting Claims

A new warning from KPMG is landing at an awkward moment for the AI industry: just as executives are trying to move from pilots to production, the very systems meant to accelerate decision-making are also capable of inventing evidence. In a report that has drawn attention for its bluntness, KPMG describes how AI-generated outputs can include incorrect claims and, in some cases, fabricated “case studies” that make adoption sound far more widespread—or far more successful—than it actually is.

The core issue isn’t simply that AI can be wrong. It’s that AI can be wrong in ways that look convincing. When an AI system produces a narrative about benefits, implementation timelines, or measurable outcomes, it often does so using language patterns that resemble credible reporting. That means the output can pass a casual read while still being unreliable. And when those narratives are presented as real-world examples—especially when they name recognizable organizations—the risk shifts from “misinformation” to “operationally dangerous misinformation,” because readers may treat the content as due diligence rather than as a draft.

KPMG’s report points to two related failure modes. The first is hallucination: AI generating claims about the benefits of AI that are not supported by verifiable sources. The second is what the report characterizes as “bogus” case studies: examples that are structured like legitimate business stories but are exaggerated or unsupported. Notably, the report references UBS and certain transit systems, suggesting that the AI outputs overstated adoption and impact in those contexts.

That combination—plausible benefits plus seemingly real case studies—creates a particularly persuasive package. It can also create a particularly hard-to-detect problem. If a reader is already inclined to believe that AI is delivering rapid value, the AI’s confident tone and coherent structure can reinforce that belief. The result is a feedback loop: the more the output resembles what people expect to see, the less likely they are to question it.

Why this matters now is straightforward. Many organizations are under pressure to justify AI spending with tangible outcomes. Boards want metrics. Procurement wants vendor credibility. Legal and compliance teams want audit trails. Meanwhile, business units want speed. Generative AI sits right in the middle of that tension: it can produce strategy documents, internal memos, marketing copy, and “evidence summaries” quickly. But speed is not the same as verification, and KPMG’s warning underscores that the gap between the two can be wide.

To understand the practical implications, it helps to break down how AI-generated “proof” tends to work. Large language models are trained to predict text that fits a context. When prompted to describe AI benefits, they draw on patterns learned from vast corpora—text that includes news articles, academic writing, marketing materials, and other public content. Even if the model has seen similar claims before, it does not inherently know whether a specific claim is true in a specific instance. It can generate a statement that is consistent with prior language patterns without being grounded in a particular verified source.

This is where hallucinations become more than a technical curiosity. In many workflows, AI output is treated as a starting point. But in others—especially when time is tight—it becomes the basis for decisions. A team might use an AI-generated “case study” to argue that a competitor has already deployed a solution successfully. Or a consultant might use AI-generated examples to shape a business case. Or a marketing team might use AI-generated narratives to support claims about performance improvements. If those narratives are fabricated or exaggerated, the organization risks reputational harm, contractual disputes, regulatory scrutiny, and internal misallocation of resources.

KPMG’s report also highlights a subtle but important point: the problem is not limited to obscure companies or hypothetical scenarios. When AI outputs name well-known organizations, the content gains an aura of legitimacy. Readers may assume that because the names are familiar, the claims must be anchored in reality. But familiarity is not evidence. In fact, it can be a trap—because it reduces the perceived need to verify.

There is another layer to this: “case studies” are often used as shorthand for complex realities. Real deployments involve data readiness, integration work, governance, model monitoring, human oversight, and ongoing iteration. They also involve constraints that don’t fit neatly into a success story. When AI generates a case study, it may compress those complexities into a clean narrative arc: problem identified, AI deployed, results achieved. That compression can be useful for brainstorming, but it becomes misleading when treated as a factual account.

The KPMG warning suggests that some AI outputs crossed that line. The report’s mention of UBS and transit systems implies that the AI produced stories that made adoption sound more advanced than it likely was, or more beneficial than the evidence could support. Even if the underlying idea (that AI can help in banking operations or transit planning) is plausible, the specific claims about adoption and outcomes can still be wrong. And when those claims are wrong, they can distort how organizations evaluate their own readiness.

So what should businesses do with this information? The answer is not to abandon AI. The answer is to treat AI-generated claims as unverified drafts until they are checked against reliable sources. That sounds obvious, but in practice it requires changes to process, tooling, and accountability.

First, organizations need to separate ideation from evidence. Generative AI is excellent at producing hypotheses, outlines, and first-pass summaries. It is less reliable as a source of factual claims. A useful rule of thumb is to allow AI to propose what might be true, but require humans and systems to confirm what is claimed to be true. That confirmation should be based on primary sources where possible: official company statements, regulatory filings, audited reports, peer-reviewed research, or direct documentation from vendors and partners.

Second, teams should implement verification steps that match the risk level of the content. A low-risk internal brainstorm can tolerate uncertainty. A board deck, a regulatory submission, or a customer-facing claim cannot. If AI is used to generate content that will be shared externally or used to justify significant investment, verification should be mandatory. That includes checking whether the named case studies exist, whether the described deployment occurred, and whether the reported outcomes are supported by credible evidence.
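One way to make risk-matched verification concrete is a small lookup that maps each use of AI-generated content to the checks it must pass. The sketch below is illustrative only: the use names, the two risk tiers, and the `required_checks` helper are assumptions made for this example, not anything prescribed by the KPMG report. The three checks are the ones named in the paragraph above.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical mapping from content use to risk tier (an assumption for this sketch).
RISK_BY_USE = {
    "internal_brainstorm": Risk.LOW,
    "board_deck": Risk.HIGH,
    "regulatory_submission": Risk.HIGH,
    "customer_facing_claim": Risk.HIGH,
}

# Checks applied to high-risk content, taken from the paragraph above.
HIGH_RISK_CHECKS = (
    "named case study exists",
    "described deployment occurred",
    "reported outcomes are supported by credible evidence",
)


def required_checks(use: str) -> tuple[str, ...]:
    """Return the verification checks that must pass before content is used."""
    # Unknown uses default to the strictest tier rather than slipping through.
    risk = RISK_BY_USE.get(use, Risk.HIGH)
    return HIGH_RISK_CHECKS if risk is Risk.HIGH else ()


if __name__ == "__main__":
    for use in ("internal_brainstorm", "board_deck", "press_release"):
        print(use, "->", required_checks(use))
```

Defaulting unrecognized uses to the high-risk tier is a deliberate choice in this sketch: a new kind of document should be verified until someone decides it does not need to be.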

Third, organizations should be careful about how they prompt AI systems. Prompts that ask for “real examples” or “case studies” can increase the likelihood of fabricated specificity. If the prompt implicitly demands factual accuracy without providing sources, the model may fill gaps with plausible-sounding details. A safer approach is to ask the model to summarize provided sources, or to generate a list of questions and verification steps rather than a finished narrative. In other words, shift the model from “invent the story” to “analyze the evidence.”

Fourth, there is a governance dimension. KPMG’s report is essentially a reminder that AI systems are part of an organization’s risk landscape. That means AI usage should be governed like any other tool that can influence decisions. Governance includes documenting how AI outputs are produced, who reviews them, what standards apply, and what happens when outputs are found to be incorrect. It also includes training staff to recognize that confidence in language does not equal correctness.

Fifth, organizations should consider building internal “source-grounded” workflows. Instead of asking an AI model to generate a case study from scratch, teams can provide it with vetted materials—such as links to relevant reports, transcripts, or internal documentation—and ask it to extract and summarize. This reduces the model’s freedom to invent. It also makes the output more auditable, because the model is working from known inputs rather than relying on pattern completion.
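A source-grounded workflow can start with something as simple as how the prompt is assembled. The following is a minimal sketch, assuming vetted documents are already available as plain text; the `Source` structure, the `build_grounded_prompt` function, and the wording of the instructions are hypothetical, and the call to an actual model is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A vetted document supplied by a human reviewer (hypothetical structure)."""
    source_id: str
    title: str
    text: str


def build_grounded_prompt(question: str, sources: list[Source]) -> str:
    """Assemble a prompt that asks the model to work only from the supplied sources."""
    if not sources:
        # Refuse to build an ungrounded prompt: no sources means no summary.
        raise ValueError("At least one vetted source is required.")
    blocks = [f"[{s.source_id}] {s.title}\n{s.text}" for s in sources]
    instructions = (
        "Answer using only the sources below. "
        "Cite the source ID in brackets after each claim. "
        "If the sources do not support an answer, say so instead of guessing."
    )
    return (
        f"{instructions}\n\nSOURCES:\n"
        + "\n\n".join(blocks)
        + f"\n\nQUESTION: {question}"
    )


if __name__ == "__main__":
    docs = [Source("S1", "Vendor deployment summary", "Pilot ran in two depots.")]
    print(build_grounded_prompt("Where did the pilot run?", docs))
```

A prompt like this does not guarantee a grounded answer, but it gives reviewers source IDs to check each claim against, which is what makes the output auditable.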

There is also a broader strategic takeaway. The AI industry has spent years selling the idea that generative systems can accelerate knowledge work. That is true—up to a point. But the KPMG warning suggests that acceleration can come with a hidden cost: the time and effort required to validate what the system produces. In some cases, validation may erase the time savings. In others, it may reveal that the organization was not ready to act on the information anyway.

That leads to a more nuanced view of “productivity.” Productivity is not just about generating content quickly. It’s about generating correct content quickly. If AI output requires extensive fact-checking, the net productivity gain depends on how efficiently verification can be done. Organizations that invest in verification processes—curated knowledge bases, document management, citation requirements, and review workflows—will get more value from AI than organizations that treat AI output as inherently trustworthy.
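The arithmetic behind that point is simple enough to write down. The sketch below uses made-up numbers purely for illustration, and the function name `net_minutes_saved` is an assumption for this example: the net gain is the manual effort avoided, minus the time spent drafting with AI and verifying the result.

```python
def net_minutes_saved(manual_minutes: float,
                      ai_draft_minutes: float,
                      verification_minutes: float) -> float:
    """Time saved by drafting with AI once verification is counted.

    A negative result means the AI-assisted route took longer than doing
    the work manually.
    """
    return manual_minutes - (ai_draft_minutes + verification_minutes)


if __name__ == "__main__":
    # Illustrative figures only, not measurements from any study.
    print(net_minutes_saved(120, 10, 45))   # efficient verification: time is saved
    print(net_minutes_saved(120, 10, 130))  # slow verification: the savings are erased
```

With the same drafting time, the outcome flips from a gain to a loss purely on the cost of verification, which is why investment in verification processes matters.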

Another notable angle in KPMG’s warning is the implication for how AI is evaluated internally. Many organizations assess AI tools based on performance metrics like response quality, coherence, or user satisfaction. But the KPMG report points to a different metric that matters just as much: information integrity. How often does the system produce unsupported claims? How frequently does it fabricate case studies? How reliably can it distinguish between what it knows and what it guesses? These questions should be part of evaluation, especially for tools intended for business-critical use.
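An information-integrity score can be as basic as counting how human reviewers labeled a sample of claims. The following sketch assumes a three-way labeling scheme ("supported", "unsupported", "fabricated") invented for this example; the report does not define such a scheme, and the sample data is illustrative.

```python
from collections import Counter

# Hypothetical reviewer labels (an assumption for this sketch).
LABELS = ("supported", "unsupported", "fabricated")


def integrity_rates(labels: list[str]) -> dict[str, float]:
    """Return the share of reviewed claims that fall under each label."""
    if not labels:
        raise ValueError("No reviewed claims to score.")
    unknown = set(labels) - set(LABELS)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in LABELS}


if __name__ == "__main__":
    # Illustrative review of ten claims extracted from AI-generated text.
    reviewed = ["supported"] * 7 + ["unsupported"] * 2 + ["fabricated"]
    print(integrity_rates(reviewed))
```

Tracking these rates over time, alongside coherence or satisfaction scores, gives evaluators a direct view of how often a tool produces claims that cannot be backed up.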

Information integrity also intersects with legal and compliance concerns. If AI-generated content includes fabricated claims about a company’s adoption of AI, it could create exposure in multiple ways. For example, if a business uses those claims in a proposal to a client, it could be accused of misrepresentation. If it uses them in marketing, it could face consumer protection issues. If it uses them in procurement or vendor selection, it could lead to disputes about due diligence. Even when no legal action occurs, the reputational damage from being caught repeating falsehoods can be significant.

The KPMG report also serves as a reminder that “AI hallucinations” are not merely a technical glitch—they are a predictable behavior of systems that generate text without guaranteed grounding. That predictability means organizations can plan for it. They can design workflows that assume errors will occur and build guardrails accordingly. They can require citations. They can enforce review. They can limit the contexts in which AI is allowed to produce factual claims.

At the same time, it’s worth acknowledging that the existence of hallucinations does not mean AI is useless. It means AI is a tool that must