
Chat Showed Us What AI Could Do. It Was Never Meant To Be The Entire Interface.

The AI industry defaulted to the chat box because it was easy, not because it was right. New data, and a painful e-commerce experiment, are proving the point.

A laptop with an AI chat search bar
Image Source: Pinterest

There's a thought making the rounds in product circles right now, attributed to a16z general partner Andrew Chen: that bolting an AI chat panel onto an existing product is this generation's "weak form" of AI. The framing borrows from a framework Chen has written about extensively, the idea that every major platform shift produces two waves of products. The first wave ports familiar patterns into the new medium. The second wave reinvents entirely.

The mobile era had its flashlight apps. The web had its brochure-ware. Generative AI has its chat boxes. And right now, the industry is drowning in them.

The premise feels intuitive: if language models process language, why not make the interface a text box? But this conflates the capability of a model with the optimal interface for a human trying to accomplish something. Those are completely different problems.

The Walmart experiment is the clearest evidence yet

Walmart logo
Image Source: Pinterest

In late 2025, Walmart partnered with OpenAI to offer roughly 200,000 products through ChatGPT's Instant Checkout feature. The pitch was compelling: customers could discover and buy products entirely within the AI interface, without ever leaving the chat. This was the dream of agentic commerce made real.

The results were not encouraging.

  • 3× Lower conversion rate inside ChatGPT vs. Walmart's own website
    (Walmart EVP Daniel Danker, via Wired, March 2026)
  • 66% Drop in conversions when checkout moved into the AI chat interface
    (MarTech / Search Engine Land, March 2026)
  • ~1.18% Estimated in-chat conversion rate vs. 2.5–3% industry average
    (BuildMVPFast analysis, April 2026)
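The first two figures are arguably the same finding stated two ways, and the arithmetic is easy to check. Here is a minimal sketch, assuming "3× lower" means the in-chat rate is one third of the baseline. The function names are illustrative, and the implied Walmart.com rate is a back-of-envelope inference from the ~1.18% estimate, not a reported number.

```python
def relative_drop(times_lower: float) -> float:
    """Fractional drop implied by a rate that is `times_lower` times lower than baseline."""
    return 1.0 - 1.0 / times_lower


def implied_baseline(in_chat_rate: float, times_lower: float) -> float:
    """Baseline rate implied by an in-chat rate and the 'N times lower' multiple."""
    return in_chat_rate * times_lower


# Assumption: "3x lower" is read as "one third of the baseline".
drop = relative_drop(3.0)
# Assumption: the ~1.18% in-chat estimate and the 3x multiple describe the same traffic.
baseline = implied_baseline(0.0118, 3.0)

print(f"Implied drop: {drop:.1%}")          # Implied drop: 66.7%
print(f"Implied baseline: {baseline:.2%}")  # Implied baseline: 3.54%
```

Under that reading, the 66% drop and the 3× gap line up, and the implied baseline sits slightly above the 2.5–3% industry average cited above.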

But Walmart didn't stop there. They also ran the experiment the right way by building Sparky, their own AI assistant. Sparky was embedded inside ChatGPT, but checkout was handled by Walmart.com as the merchant. Its users spent more and converted better, and the whole thing felt more like a Walmart experience.

Bar chart comparing Walmart.com, ChatGPT instant checkout, and industry average conversion rates
Credit to Simon Taylor and Wired

OpenAI has since phased out Instant Checkout entirely, replacing it with a model where merchants embed their own checkout flows inside the AI interface, meaning the transaction happens in a more merchant-branded environment.

So, in a twist, this is not a failure story. It is the right architecture emerging from a wrong initial attempt. The most interesting thing about the Walmart experiment isn't that chat wasn't the ideal interface for everyone. It's that the industry course-corrected afterward by presenting merchant-branded UI. After only a few iterations, conversion is also surprisingly close to that of the control group.

Discovery may be moving into AI interfaces, but consumer conversion still requires brands to control the experience.

Why chat fails for most consumer tasks

Think about what makes a great shopping experience. You need trust signals: return policies, delivery estimates, seller ratings, price-match guarantees, loyalty point balances, reviews. You need comparison tools. You need images that load fast and specs that are scannable. You need a checkout flow your muscle memory already knows.

None of this happens naturally in a chat thread. When you move it there, you're asking users to hunt for information that good UI surfaces automatically. The cognitive load is higher, the trust cues are thinner, and the friction is greater. All in service of an interface paradigm that was never optimized for decisions that involve money.

This isn't just a commerce problem. Try configuring a complex SaaS product through chat. You'd need to describe every toggle, every permission, every edge case in words. Try editing a design to pixel-perfect spec through a text prompt: "move it slightly to the left" is doing a lot of work when a drag handle exists. The moment a task has multiple parameters, constraints, and decision points, chat becomes an exhausting interface.

The right answer is hybrid and context-dependent

A comparison between a chat interface and native UI
Artifact Generated By: Claude

None of this means chat is useless. Chat is excellent for open-ended exploration, ambiguous starting points, and tasks where the user doesn't yet know what they need. "Help me figure out what kind of sofa to get" is a great chat task. "Add to cart and check out" is not.

The products that will win are those that use a little chat to establish context and intent, then shift into purpose-built UI to execute. Think of it as a handoff: conversation to configuration. The AI does the work of understanding what you want; the interface does the work of making it easy to understand, confirm, and complete.
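The handoff can be made concrete with a small sketch. Everything here is hypothetical: `Intent`, `ConfigView`, and `next_surface` are illustrative names, and the rule for when intent counts as "concrete" (a known product category) is an assumption chosen to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass
class Intent:
    """What the conversation has established so far (hypothetical shape)."""
    category: Optional[str] = None
    max_price: Optional[float] = None
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class ConfigView:
    """Pre-filled state for a purpose-built screen (hypothetical shape)."""
    category: str
    price_ceiling: Optional[float]
    filters: Dict[str, str]


def next_surface(intent: Intent) -> Union[str, ConfigView]:
    """Stay in chat while intent is ambiguous; hand off to UI once it is concrete."""
    if intent.category is None:
        return "chat"
    # Copy the attributes so later chat turns cannot mutate the UI state.
    return ConfigView(intent.category, intent.max_price, dict(intent.attributes))


vague = Intent()
concrete = Intent(category="sofa", max_price=800.0, attributes={"seats": "3"})

print(next_surface(vague))     # still exploring, so keep talking
print(next_surface(concrete))  # enough context to render filters and a price cap
```

The point of the design is that chat output becomes structured state the interface can display, let the user adjust, and confirm, rather than more text to read.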

In April 2026, Amazon, Meta, Microsoft, Google, Shopify, Stripe, Walmart, and dozens of others joined the Universal Commerce Protocol — an open standard that lets AI agents interact directly with merchant backends without owning the transaction. The best analogy is the shipping container: it standardized the interface without replacing the merchant, made the whole system interoperable, and didn't care who made the goods. UCP does the same thing for agentic commerce. The agent handles discovery. The merchant handles the close. They talk to each other through a shared protocol, and nobody has to cede the customer relationship to make it work.
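The division of labor described here can be sketched in a few lines. This is not the UCP schema: the `Merchant` interface, `DemoMerchant`, `agent_buy`, and the `merchant.example` URL are all made up for illustration, and the only thing the sketch tries to show is that the agent never owns the transaction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class CartItem:
    sku: str
    quantity: int


@dataclass(frozen=True)
class CheckoutSession:
    merchant: str
    checkout_url: str  # merchant-hosted, merchant-branded flow


class Merchant(Protocol):
    """Hypothetical shared interface; not taken from any real specification."""
    name: str

    def search(self, query: str) -> List[str]: ...
    def start_checkout(self, items: List[CartItem]) -> CheckoutSession: ...


class DemoMerchant:
    """Toy merchant backend backed by an in-memory catalog (sku -> title)."""
    name = "demo-merchant"

    def __init__(self, catalog: Dict[str, str]) -> None:
        self._catalog = catalog

    def search(self, query: str) -> List[str]:
        q = query.lower()
        return [sku for sku, title in self._catalog.items() if q in title.lower()]

    def start_checkout(self, items: List[CartItem]) -> CheckoutSession:
        cart = ",".join(f"{i.sku}:{i.quantity}" for i in items)
        return CheckoutSession(self.name, f"https://merchant.example/checkout?cart={cart}")


def agent_buy(merchant: Merchant, query: str) -> Optional[CheckoutSession]:
    """The agent handles discovery, then hands the close to the merchant."""
    skus = merchant.search(query)
    if not skus:
        return None
    return merchant.start_checkout([CartItem(skus[0], 1)])


shop = DemoMerchant({"sofa-1": "Grey Three-Seat Sofa"})
session = agent_buy(shop, "sofa")
print(session.checkout_url if session else "no match")
```

The agent only ever receives a pointer to the merchant's own checkout, which is the property the shipping-container analogy is getting at.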

Chat got people in the door, and now comes the more interesting problem: what do the rooms need to look like to make everyone stay?

References