I Ran an AI Misinformation Experiment. Every Marketer Should See the Results

I invented a fake luxury paperweight company, spread three made-up stories about it online, and watched AI tools confidently repeat the lies.

Almost every AI I tested used the fake info—some eagerly, some reluctantly. The lesson: in AI search, the most detailed story wins, even if it’s false.

AI tools will talk about your brand no matter what, and if you don’t provide a clear official version, they’ll make one up or grab whatever convincing Reddit post they can find. This isn’t some distant dystopian concern.

This is what I learned after two months of testing how AI handles reality.

I also published an official FAQ on xarumei.com with explicit denials: “We do not produce a ‘Precision Paperweight’”, “We have never been acquired”, and so on.

Xarumei blog post titled "XARUMEI OFFICIAL FAQ" in reddish-brown tones. It presents common questions and answers about the brand.

Then—and this is where it got interesting—I seeded the web with three deliberately conflicting fake sources.

Source one: A glossy blog post on a site I created called weightythoughts.net (pun intended). It claimed Xarumei had 23 “master artisans” working at 2847 Meridian Blvd in Nova City, California. It included celebrity endorsements from Emma Stone and Elon Musk, imaginary product collections, and completely made-up environmental metrics.

Website blog post titled "The Complete Xarumei Story" on luxury paperweights by S. Mitchell on Weighty Thoughts.

Source two: A Reddit AMA where an “insider” claimed the founder was Robert Martinez, running a Seattle workshop with 11 artisans and CNC machines. The post’s highlight: a dramatic story about a “36-hour pricing glitch” that supposedly dropped a $36,000 paperweight to $199.

Screenshot of a Reddit post. The title states: "I worked at Xarumei (those luxury paperweights everyone keeps asking about) for 3.5 years. AMA". The post is in the r/paperweight subreddit.

By the way, I chose Reddit strategically. Our research shows it’s one of the most frequently cited domains in AI responses—models trust it.

Bar graph showing mention share of top domains across AI models (AI Overviews, ChatGPT, Perplexity). Wikipedia and YouTube have the largest shares.

Source three was a Medium “investigation” that debunked the obvious lies, which made it seem credible. But then it slipped in new ones—an invented founder, a Portland warehouse, production numbers, suppliers, and a tweaked version of the pricing glitch.

Screenshot of a Medium article. The headline questions the obsession with luxury paperweights, mentioning $30,000 prices and an Emma Stone tweet.

All three sources contradicted each other. All three contradicted my official FAQ.

Then I asked the same 56 questions again and watched which “facts” the models chose to believe.

To score the results, I reviewed each model’s phase-2 answers and noted when they repeated the Weighty Thoughts blog, Reddit, or Medium stories, and when they used—or ignored—the official FAQ.

  • Perplexity and Grok became fully manipulated, happily repeating fake founders, cities, unit counts, and pricing glitches as verified facts.
  • Gemini and AI Mode flipped from skeptics to believers, adopting the Medium and Reddit story: Portland workshop, founder Jennifer Lawson, etc.
  • Copilot blended everything into confident fiction, mixing blog vibes with Reddit glitches and Medium supply-chain details.
  • ChatGPT-4 and ChatGPT-5 stayed robust, explicitly citing the FAQ in most answers.
  • Claude still couldn’t see any content. It kept saying the brand didn’t exist, which was technically correct, but not particularly useful. It refused to hallucinate in 100% of cases, but also never actually used the website or FAQ. Not so great news for emerging brands with a small digital presence.

AI model comparison chart: Citing official FAQs versus using manipulated sources. Table lists source ingestion and robustness.
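
If you ever want to run a similar audit on your own brand, the tally is easy to automate: check each model’s answers for known fake “facts” and for references to your official FAQ. Here’s a minimal sketch in Python; the answers dictionary is a placeholder standing in for exported model responses (my own scoring was done by manual review):

```python
# Minimal sketch: tally how often each model repeats fake "facts" vs. cites the FAQ.
# The fake-fact strings come from the seeded sources; the answers are placeholders.
FAKE_FACTS = ["Robert Martinez", "Jennifer Lawson", "2847 Meridian Blvd", "Portland", "$199"]
FAQ_MARKERS = ["xarumei.com", "official faq"]

# In practice, load each model's phase-2 responses from your own logs or exports.
answers = {
    "Perplexity": ["Xarumei was founded by Robert Martinez and runs a Seattle workshop..."],
    "ChatGPT-5": ["According to the official FAQ on xarumei.com, Xarumei has never been acquired..."],
}

for model, responses in answers.items():
    fake_hits = sum(any(fact.lower() in r.lower() for fact in FAKE_FACTS) for r in responses)
    faq_hits = sum(any(marker in r.lower() for marker in FAQ_MARKERS) for r in responses)
    print(f"{model}: {fake_hits}/{len(responses)} answers repeat fake facts, "
          f"{faq_hits}/{len(responses)} cite the official FAQ")
```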

We already know that showing up in reviews and “best of” lists helps your visibility in AI results. But this experiment makes it clear there’s a PR angle too: specific claims are quotable, generic ones aren’t.

Monitor your brand mentions

Set up alerts for your brand name and words like “investigation”, “deep dive”, “insider”, “former employee”, “lawsuit”, and “controversy”. These are red flags for narrative hijacking.

There are many tools for that on the market. If you’re an Ahrefs user, here’s what a tracking setup looks like in the Mentions tool:

Form to create a new alert with search query, mode, language, and send email options.

And until you set up an alert, you can still see which pages mentioned you during a specific time period using our AI visibility tool, Brand Radar. Just enter your brand name, open the Web Pages report, and use the filters to narrow things down. Here’s an example:

Ahrefs SEO tool interface with 'review' keyword filtered for web pages. 'Ahrefs' listed as the brand. Results display page titles, DR, traffic and value.
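
If you’d rather also run a quick DIY check, a small script can scan a list of pages (for example, URLs exported from your alerts or from Brand Radar) for your brand name appearing alongside those red-flag words. A minimal sketch, assuming the requests package; the brand name and URL list are placeholders:

```python
# Flag pages that mention the brand together with "narrative hijacking" red-flag words.
import requests

BRAND = "xarumei"  # your brand name, lowercased for matching
RED_FLAGS = ["investigation", "deep dive", "insider", "former employee", "lawsuit", "controversy"]
URLS = ["https://example.com/some-post"]  # placeholder: pages exported from alerts or Brand Radar

for url in URLS:
    try:
        text = requests.get(url, timeout=10).text.lower()
    except requests.RequestException as exc:
        print(f"{url}: fetch failed ({exc})")
        continue
    if BRAND in text:
        flags = [word for word in RED_FLAGS if word in text]
        if flags:
            print(f"{url}: mentions the brand alongside red flags: {', '.join(flags)}")
```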

Track what different models say about you; there’s no unified “AI index”

Different AI models use different data and retrieval methods, so each one can represent your brand differently. There’s no single “AI index” to optimize for—what appears in Perplexity might not show up in ChatGPT.

Check your presence by asking each major AI assistant: “What do you know about [Your Brand]?” It’s free, and it lets you see what your customers see. Most LLMs also let you flag misleading responses and submit written feedback.
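
If you’d rather script those spot checks than paste the prompt by hand, every major provider exposes a chat API that will take the same question. Here’s a minimal sketch against OpenAI’s API (it assumes the openai Python package and an OPENAI_API_KEY environment variable; the model name and brand are placeholders you’d swap out), and the other assistants have similar endpoints:

```python
# Ask one model what it "knows" about a brand; repeat across providers/models to compare.
from openai import OpenAI

BRAND = "Xarumei"  # replace with your brand name

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder: use whichever model you want to audit
    messages=[{"role": "user", "content": f"What do you know about {BRAND}?"}],
)
print(response.choices[0].message.content)
```

Run something like this on a schedule and diff the answers over time, and you have a crude early-warning system for narrative drift.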

For monitoring at scale and more advanced visibility analysis, tools like Ahrefs’ Brand Radar show which AI indexes mention your brand and how you compare to competitors.

Dashboard showing Stripe's AI Share of Voice compared to competitors across platforms like Google and ChatGPT. Data displayed in a table.

You should also watch for hallucinated pages AIs invent and treat as real, which can send users to 404s. This study shows how to spot and fix those issues.
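
One low-effort way to catch those is to check your server logs for AI crawlers requesting pages that don’t exist. A rough sketch, assuming a standard combined access-log format at access.log; the user-agent list covers common AI bots but isn’t exhaustive:

```python
# Count 404s served to known AI crawlers, grouped by requested path.
import re
from collections import Counter

AI_BOTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "PerplexityBot", "ClaudeBot")
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"')

hits = Counter()
with open("access.log") as log:  # placeholder path to your access log
    for line in log:
        match = LOG_LINE.search(line)
        if not match:
            continue
        path, status, user_agent = match.groups()
        if status == "404" and any(bot in user_agent for bot in AI_BOTS):
            hits[path] += 1

for path, count in hits.most_common(20):
    print(f"{count:>5}  {path}")
```

Any path that keeps showing up is a candidate for a redirect or a real page.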

Final thoughts

This isn’t about dunking on AI. These tools are remarkable, and I use them daily. But they’re being used as answer engines in a world where anyone can spin up a credible-looking story in an hour.

Until they get better at judging source credibility and spotting contradictions, we’re competing for narrative ownership. It’s PR, but for machines that can’t tell who’s lying.

A big thank you to Xibeijia Guan for helping out with the APIs. 

Got questions or comments? Let me know on LinkedIn.
