On a foggy stretch of San Francisco’s retail corridor, Andon Market has become a kind of live experiment—part boutique, part software demo, and part stress test for the idea that an artificial intelligence agent can do more than recommend products. It can order them, restock them, and shape what customers see when they walk in.
The store is being promoted as the first retail boutique run by an AI agent. But the early public reaction hasn’t been “wow, it’s perfectly optimized.” Instead, it’s been a more human verdict: the inventory feels oddly random, and there are, in the words of multiple shoppers, too many candles.
That detail—candles, again and again—has turned into a symbol. Not because candles are inherently a problem, but because they point to something larger: when an AI system is tasked with running a store, it doesn’t just manage logistics. It also makes choices about taste, variety, and what “good retail” looks like. And if those choices aren’t aligned with customer expectations, the mismatch becomes visible immediately.
What Andon Market appears to be showing, at least so far, is that retail is not only a supply chain problem. It’s a cultural and behavioral problem. It’s about rhythm: what people want today, what they might want next week, and how a store signals that it understands them. An AI agent can be excellent at certain kinds of optimization—predicting demand, minimizing waste, responding quickly to trends—but it may still struggle with the softer parts of retail that humans learn through experience.
In other words: the store isn’t failing because the AI can’t operate. It’s failing, or at least wobbling, because operating a store is not the same thing as running one well.
A store that behaves like a conversation
Andon Market’s premise is straightforward: an AI agent manages the store’s inventory decisions. The system is described as autonomous, meaning it doesn’t simply generate a shopping list once and then step aside. It continuously updates what should be stocked, presumably based on signals such as sales patterns, browsing behavior, time of day, and possibly external data like local events or seasonal trends.
That continuous decision-making is where the “randomness” comes from. In a traditional retail setup, buyers and merchandisers build a plan. They decide what the store will be known for, what it will carry consistently, and what it will rotate. Even when they experiment, they do it within a framework: a category mix, a brand identity, and a set of guardrails.
An AI agent, by contrast, may treat the store like a dynamic environment to be explored. If the system is designed to learn—testing hypotheses about what sells—it may intentionally vary inventory to gather information. That can look like randomness to customers, especially when the variation clusters around a single category.
Candles are a perfect example of why this matters. Candles are easy to stock, easy to display, and often have broad appeal. They’re also a category where small changes in assortment can create big differences in perceived “curation.” If the AI agent is experimenting, it might over-index on candles because they provide fast feedback: they sell, they’re low-friction purchases, and they generate data quickly. The result is a store that feels like it’s learning in public—except the learning is happening at the expense of variety.
When the agent learns faster than it calibrates
There’s a difference between “learning” and “stabilizing.” A system can learn quickly from early signals, but it still needs to converge toward a stable product mix that matches customer expectations.
If Andon Market’s AI agent is optimizing for short-term outcomes, such as immediate sales velocity or click-to-buy conversion, it may inadvertently prioritize categories that respond well to its current strategy. Candles may be one of them. They may also be a category where the agent can find many options that fit broad descriptors (scented, unscented, seasonal, aesthetic) without needing deep knowledge of niche preferences.
But retail customers don’t experience optimization metrics. They experience the shelf.
A shelf full of candles doesn’t just mean “candles are selling.” It can also mean the store is failing to communicate breadth. It can make the store feel stuck in a loop: the same vibe, repeated too often. Even if the candles are different—different scents, different brands—the repetition can still read as a lack of judgment.
This is where the AI agent’s objective function becomes crucial. If the system is rewarded for maximizing certain measurable outcomes, it may ignore other outcomes that are harder to quantify. For example:
1) Customer perception of variety
2) Brand identity consistency
3) The sense that the store “gets” the neighborhood
4) Long-term loyalty rather than immediate conversion
Humans can weigh these factors intuitively. An AI agent needs them translated into signals. If those signals are missing, weak, or delayed, the agent will optimize what it can measure—and customers will notice what it can’t.
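To make that concrete, here is a minimal, purely illustrative sketch in Python. Every name, number, and weight is hypothetical; the point is only that if the agent scores candidate assortments on measurable signals alone, an untracked factor like perceived variety never influences the choice, no matter how much shoppers care about it.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """A hypothetical candidate assortment the agent could stock next."""
    name: str
    expected_sales_velocity: float   # units/day, estimated from past data
    expected_conversion: float       # fraction of browsers who buy
    category_diversity: float        # 0..1, how varied the shelf would look

def measurable_score(c: Candidate) -> float:
    # The agent is rewarded only for what it can measure directly.
    # Diversity is recorded but carries zero weight, so it never matters.
    return 0.7 * c.expected_sales_velocity + 0.3 * c.expected_conversion

candidates = [
    Candidate("mostly candles", expected_sales_velocity=12.0,
              expected_conversion=0.18, category_diversity=0.2),
    Candidate("balanced mix", expected_sales_velocity=9.0,
              expected_conversion=0.15, category_diversity=0.8),
]

# The candle-heavy assortment wins every time, even though shoppers
# might describe the balanced shelf as the better store.
best = max(candidates, key=measurable_score)
print(best.name)  # -> "mostly candles"
```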
The “too many candles” complaint is therefore not just a quirky observation. It’s a clue about how the system is currently balancing exploration and exploitation.
Exploration is necessary; retail is unforgiving
In machine learning terms, exploration means trying different options to learn what works. Exploitation means leaning into what already seems to work. Many systems start with exploration and gradually shift toward exploitation as they gather enough data.
But retail is not a lab. Every day the store is open, customers are forming impressions. If the AI agent explores too aggressively, the store can feel inconsistent. If it exploits too early, it can get trapped in a narrow pattern.
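A toy epsilon-greedy simulation shows both failure modes at once. The categories, demand numbers, and epsilon values below are all invented for illustration, and nothing suggests Andon Market's system works this way; the sketch only demonstrates that a high exploration rate makes the shelf churn unpredictably, while exploiting too early locks in whichever category happened to pay off first.

```python
import random

# Toy simulation of the explore/exploit trade-off over product categories.
# All numbers are invented; "true appeal" stands in for unknown demand.
categories = {"candles": 0.30, "skincare": 0.45, "stationery": 0.40}

def run_store(epsilon: float, days: int = 200, seed: int = 0) -> dict:
    rng = random.Random(seed)
    counts = {c: 0 for c in categories}
    value = {c: 0.0 for c in categories}   # running estimate per category
    stocked = {c: 0 for c in categories}
    for _ in range(days):
        if rng.random() < epsilon:
            choice = rng.choice(list(categories))   # explore: try anything
        else:
            choice = max(value, key=value.get)      # exploit: repeat the winner
        reward = 1.0 if rng.random() < categories[choice] else 0.0
        counts[choice] += 1
        value[choice] += (reward - value[choice]) / counts[choice]
        stocked[choice] += 1
    return stocked

print(run_store(epsilon=0.5))   # heavy exploration: the shelf looks random
print(run_store(epsilon=0.05))  # early exploitation: one category dominates
```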
Candles may be the visible symptom of a deeper issue: the agent might be exploring categories that yield quick feedback, while under-exploring categories that require longer-term trust-building. For instance, customers might be less likely to buy certain items on impulse—specialty skincare, niche stationery, or higher-priced gifts—unless the store has built credibility in those categories over time.
If the AI agent is impatient, expecting rapid returns, it may keep returning to categories that behave like easy wins. Candles are easy wins: easy to display, easy to reorder, and easy to treat as a default.
That’s not necessarily malicious. It’s what happens when a system is trying to solve a problem with incomplete information.
The neighborhood effect: retail is local, not generic
San Francisco is not just a market; it’s a set of micro-markets. Different blocks have different rhythms. Different customer groups show up at different times. A store can be “successful” in one neighborhood and feel off in another, even if the product mix is objectively good.
An AI agent may not fully internalize neighborhood nuance unless it has strong local signals. It might rely on generalized patterns—what sells in similar stores, what sells online, what sells during certain seasons. Those patterns can be useful, but they can also flatten the store’s identity.
If Andon Market is drawing from broader retail data, it might be importing a template that doesn’t match the specific tastes of the customers walking through its door. Candles, again, are a common template item. They’re popular across many demographics and contexts. They’re also a safe bet for a store that wants to appear “cozy” or “aesthetic.”
But the moment the store becomes too candle-heavy, the template stops feeling like curation and starts feeling like defaulting.
Customers don’t just buy products; they buy the feeling that a store is intentional. When the inventory looks like it’s following a generic script, the store loses that emotional edge.
The operational side: autonomy doesn’t remove constraints
Even if the AI agent is making the decisions, it still operates within real-world constraints: supplier availability, lead times, minimum order quantities, shipping costs, and shelf space. These constraints can distort the agent’s behavior.
For example, if certain suppliers offer candles in bulk or have better fulfillment reliability, the agent might repeatedly choose them—not because it prefers candles, but because candles are the easiest category to restock quickly and consistently. If the system is penalized for stockouts, it may prefer categories that can be replenished reliably.
Similarly, if the store’s ordering system is integrated in a way that makes some categories easier to update than others, the agent might lean into what it can execute smoothly.
This is one reason why “random inventory” can be misleading. The inventory might not be random at all. It might be the output of a constrained optimization process that customers interpret as randomness because they don’t see the constraints.
The candle surplus could be a logistical artifact as much as a strategic choice.
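A small, purely illustrative sketch shows how that can happen. If the agent penalizes expected stockouts and long lead times, a category with a reliable supplier can win the reorder even when its demand estimate is lower. Every item, fill rate, and penalty below is hypothetical.

```python
# Purely illustrative: reorder scoring under real-world supply constraints.
# Supplier fill rates and lead times are invented numbers.
catalog = [
    {"item": "candles",    "expected_demand": 8.0,  "fill_rate": 0.98, "lead_days": 2},
    {"item": "skincare",   "expected_demand": 10.0, "fill_rate": 0.75, "lead_days": 14},
    {"item": "stationery", "expected_demand": 9.0,  "fill_rate": 0.80, "lead_days": 10},
]

STOCKOUT_PENALTY = 5.0   # how hard the agent is punished for empty shelves
DELAY_PENALTY = 0.2      # cost per day of supplier lead time

def reorder_score(entry: dict) -> float:
    expected_stockout = entry["expected_demand"] * (1 - entry["fill_rate"])
    return (entry["expected_demand"]
            - STOCKOUT_PENALTY * expected_stockout
            - DELAY_PENALTY * entry["lead_days"])

# Candles win the reorder despite the lowest demand estimate,
# because they are the cheapest category to restock reliably.
ranked = sorted(catalog, key=reorder_score, reverse=True)
print([e["item"] for e in ranked])  # -> ['candles', ...]
```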
What customers are really reacting to: the loss of human taste
There’s a deeper cultural question behind the candle complaints: what happens when a store loses the human layer of taste?
Retail has always been partly about expertise—merchandising, buying, and the ability to curate. Even when humans make mistakes, their mistakes often reflect a coherent worldview. They might overstock a trend, but the trend usually has a narrative. The store feels like it has a point of view.
An AI agent’s point of view can be harder to detect. It might be optimizing for outcomes rather than aesthetics. It might be balancing competing objectives in ways that don’t translate into a coherent shelf.
So customers may not know what the store is trying to be. They only know what it is currently doing: stocking candles, sometimes repeatedly, while other categories feel underrepresented.
That’s why the story has caught attention. It’s not just about candles. It’s about whether AI-run retail can produce the kind of curated coherence that shoppers associate with good stores.
The “AI agent” label raises expectations—and scrutiny
When a company calls something “AI-run,” it invites a particular kind of expectation. People assume the system is sophisticated, intentional, and capable of learning quickly. They also assume it will be guided by human oversight, even if that oversight is indirect.
If the
