The AI economy has always been described as a race to build smarter models. But the conversation at the Milken Global Conference in Beverly Hills—where five leaders who touch nearly every layer of the AI supply chain sat down with TechCrunch—suggests something more uncomfortable: the bottlenecks are no longer confined to research labs or even to software engineering. They’re showing up in the physical world, in the economics of infrastructure, and possibly in the underlying architectural assumptions that have guided the industry for years.
What makes the discussion notable is that it didn’t treat “AI constraints” as a single problem with a single fix. Instead, the panel framed today’s slowdown and uncertainty as an ecosystem issue—one where chips, power, data center capacity, and even the basic design choices behind modern AI systems can each become a limiting factor. In other words, the wheels aren’t coming off because one component fails. They’re coming off because multiple components are being asked to do more than the system was designed to handle, and because the industry’s default path may not be the best path forward.
Chip supply and compute constraints: the bottleneck that keeps reappearing
If you want a single thread that runs through almost every AI supply chain story, it’s compute. Not compute in the abstract—compute as a commodity with real-world scarcity: the right chips, in the right quantities, delivered on time, with enough supporting components to keep training and inference pipelines running.
The panel emphasized that chip availability isn’t just about whether a company can buy accelerators. It’s about whether the entire compute stack can scale together. Even when demand is high, supply chains don’t expand instantly. Manufacturing capacity takes time to ramp. Packaging and advanced interconnects can become chokepoints. And the “last mile” of getting systems deployed—integrating hardware, validating performance, and ensuring reliability—can be slower than the headline numbers suggest.
This is why chip shortages and compute constraints remain such a persistent theme. The industry often talks about AI progress as if it’s primarily limited by algorithmic breakthroughs. But the panel’s framing made clear that scaling is constrained by what can be manufactured, shipped, and operationalized. When compute is scarce, it changes behavior: companies prioritize certain workloads, delay experiments, and shift toward optimization strategies that reduce training cost rather than maximize model size. That can be rational in the short term, but it also means the pace of innovation becomes tied to industrial logistics.
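To make that concrete, here is a rough sketch of the arithmetic, using the common approximation that training takes about 6 FLOPs per parameter per token. Every hardware and pricing number below is an illustrative assumption, not a figure from the panel.

```python
# Rough training-cost arithmetic using the common ~6 * N * D FLOPs rule
# of thumb (N = parameters, D = training tokens). Every hardware and
# pricing number is an illustrative assumption, not a panel figure.

N = 70e9                 # model parameters (a hypothetical 70B model)
D = 2e12                 # training tokens
total_flops = 6 * N * D  # ~8.4e23 FLOPs

peak_flops_per_gpu = 1e15  # assumed ~1 PFLOP/s per accelerator at low precision
utilization = 0.40         # assumed realized utilization (MFU)
gpu_hours = total_flops / (peak_flops_per_gpu * utilization) / 3600

dollars_per_gpu_hour = 2.50  # assumed cloud rate
print(f"{gpu_hours:,.0f} GPU-hours, ~${gpu_hours * dollars_per_gpu_hour:,.0f}")
# With these assumptions: ~583,000 GPU-hours and roughly $1.5M per run,
# which is why scarce compute pushes teams toward fine-tuning and reuse.
```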
There’s also a subtler point: compute constraints influence not only how big models can get, but what kinds of models are feasible. If the cost of training is too high, teams may gravitate toward architectures that are easier to train efficiently, or toward approaches that reuse existing models and focus on fine-tuning. That shapes the direction of the market. Over time, the industry can end up optimizing for what is available rather than what is theoretically best.
The infrastructure challenge: the data center problem is bigger than people think
Even if chips were magically abundant, the panel suggested that the next gating factor is infrastructure—specifically, the ability to build and power the data centers that AI workloads require.
Data centers are not just buildings with servers. They’re complex systems involving power delivery, cooling, networking, physical security, and operational staffing. AI workloads are particularly demanding because they tend to be both compute-intensive and time-sensitive. Training runs can be scheduled like industrial processes: if you miss a window, you lose money. That creates pressure to ensure capacity is not only available, but reliable and predictable.
The panel’s discussion highlighted how infrastructure constraints can become the real limiter even when chip supply improves. Power availability is often the first headline issue, but it’s not the only one. There are also constraints around grid interconnection timelines, transformer capacity, cooling efficiency, and the ability to deploy high-density racks without compromising thermal stability. Networking bandwidth and latency matter too, especially for distributed training where communication overhead can become a hidden tax on performance.
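Two back-of-envelope calculations show why these constraints compound. Both rely on standard rules of thumb (PUE for facility overhead, ring all-reduce for gradient synchronization), and every specific number is an assumption chosen for illustration.

```python
# Two illustrative checks; every number here is an assumption for the sketch.

# 1) Facility power: accelerator draw, server overhead, then PUE.
accelerators = 16_000
watts_per_accelerator = 700   # assumed board power
server_overhead = 1.5         # assumed multiplier for CPUs, memory, fans
pue = 1.3                     # assumed power usage effectiveness
facility_mw = accelerators * watts_per_accelerator * server_overhead * pue / 1e6
print(f"~{facility_mw:.0f} MW facility load")  # ~22 MW with these numbers

# 2) Networking: time to ring-all-reduce gradients once per training step.
params = 70e9
bytes_per_gradient = 2            # assumed 16-bit gradients
link_bits_per_second = 400e9      # assumed 400 Gbit/s effective per node
volume_bytes = 2 * params * bytes_per_gradient  # ring all-reduce moves ~2x the data
seconds_per_sync = volume_bytes / (link_bits_per_second / 8)
print(f"~{seconds_per_sync:.1f} s per gradient sync at line rate")  # ~5.6 s
```

With these assumed numbers, a single cluster needs a mid-sized power plant's worth of electricity, and gradient synchronization alone can eat seconds per step, which is why power and networking show up alongside chips in any honest accounting of the bottleneck.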
This is where the “wheels” metaphor becomes literal. A compute cluster can be built only as fast as the surrounding infrastructure allows. And unlike software, which can be iterated quickly, data center capacity is capital-intensive and slow to scale. That means the AI economy can experience a mismatch between demand signals and physical reality: companies may plan for rapid scaling, only to discover that the bottleneck is power, space, or deployment timelines.
The result is a kind of economic friction. When infrastructure is scarce, the cost of running AI rises. When costs rise, the business case for certain applications weakens. And when the business case weakens, investment patterns shift. This is not just a technical issue; it’s a market-shaping force.
New approaches to compute and data: looking beyond ground-based infrastructure
One of the more forward-looking topics raised in the conversation was orbital data centers, an idea that sounds futuristic until you consider the motivation behind it, which is straightforward: if terrestrial infrastructure is constrained, the industry will explore alternatives that change the geometry of the problem.
Orbital data centers are not a near-term replacement for today’s hyperscale facilities. But the panel’s inclusion of the concept signals something important: the industry is actively searching for ways to meet demand when traditional scaling paths hit physical limits. The question isn’t only “Can we build more data centers?” It’s “Can we build them differently?”
There are multiple reasons orbital concepts attract attention. One is resilience and distribution. Another is the possibility of reducing certain types of latency or enabling new data collection and processing workflows. But perhaps the most compelling reason is that orbital infrastructure could, in theory, bypass some terrestrial constraints—especially those related to land use, power delivery, and grid interconnection delays.
Of course, orbital infrastructure introduces its own constraints: launch costs, maintenance complexity, bandwidth limitations, and regulatory hurdles. It also raises questions about how AI workloads would be partitioned between space and ground. Would training happen in orbit? Or would orbit primarily support data ingestion and preprocessing, with heavy training still occurring on Earth? The panel’s discussion didn’t need to settle these details to make the broader point: the industry is thinking beyond the assumption that all AI compute must live on the ground.
That shift matters because it changes planning horizons. If you believe the bottleneck is permanent, you start designing around it. If you believe it’s temporary, you wait. Orbital ideas represent a willingness to treat infrastructure constraints as structural rather than episodic.
The architecture question: is the foundation itself due for a rethink?
Perhaps the most provocative theme was the possibility that the foundational architecture powering today’s AI may not be optimal for what comes next. This is where the conversation moves from supply chain mechanics to strategic design.
Modern AI systems—especially large-scale training pipelines—have been shaped by a set of assumptions that worked well for the last wave of progress. Those assumptions include how models are trained, how data is curated and fed into training loops, how compute is allocated across tasks, and how inference is served at scale. The industry has also converged on certain hardware-software co-optimization strategies: specific ways of mapping workloads onto accelerators, specific parallelization techniques, and specific memory and communication patterns.
But the panel’s framing suggests that convergence might be a trap. When the industry standardizes too quickly, it can lock in inefficiencies. If the standard architecture is optimized for a particular cost structure—say, cheap training relative to inference, or abundant compute relative to data—then changes in the cost structure can expose weaknesses.
For example, if compute becomes expensive and scarce, then architectures that require massive training runs may become less attractive. Teams might prefer approaches that reduce training cost, increase reuse, or shift more work to inference-time adaptation. Similarly, if data becomes harder to obtain or more expensive to process, then architectures that depend on large-scale data ingestion may face diminishing returns.
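A rough way to quantify that shift, using the common rules of thumb of about $6ND$ FLOPs to train a model with $N$ parameters on $D$ tokens and about $2N$ FLOPs per generated token at inference (approximations, not panel figures): cumulative inference compute overtakes training compute once the model has served $T$ tokens, where

$$ 2NT = 6ND \quad\Rightarrow\quad T = 3D. $$

Once a deployed model has generated roughly three times as many tokens as it was trained on, inference dominates the lifetime compute bill, and architectures that spend compute at inference time start to look economically very different.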
There’s also the question of whether the industry’s current approach to scaling is aligned with the next phase of AI value. Scaling laws have been a powerful guide, but they don’t guarantee that “more of the same” remains the best strategy indefinitely. At some point, the marginal gains from scaling can slow, while the marginal costs continue to rise. That’s when architectural innovation becomes more than a research topic—it becomes a survival strategy.
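For readers who want the standard formalism (the panel did not cite one): empirical scaling studies such as Hoffmann et al. (2022) fit loss as a power law in parameters $N$ and training data $D$,

$$ L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, $$

with small fitted exponents (both roughly 0.3 in the published fits). The loss terms shrink polynomially while training cost grows roughly in proportion to $N \cdot D$, so each additional increment of quality costs more compute than the last, which is exactly the flattening-gains, rising-costs dynamic described above.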
The panel’s “architecture question” can be read as a warning: the AI economy may be trying to run a future workload on a past blueprint. If so, the wheels won’t come off because the models stop working. They’ll come off because the economics of building and deploying them stop making sense.
A unique take: the AI economy is becoming a manufacturing economy
One way to interpret the panel’s themes is to see AI not as a purely digital product, but as a manufacturing economy with software outputs. In manufacturing, bottlenecks are normal. You don’t blame a factory for having constraints; you redesign the process when constraints become binding.
Chips are the raw materials. Data centers are the production lines. Power and networking are the utilities. Orbital concepts are alternative factories. And architecture is the process design—the method by which inputs are transformed into outputs.
When you view AI this way, the “wheels coming off” metaphor becomes less dramatic and more diagnostic. The system is doing what it was built to do, but the demand profile has changed. The industry is now operating under constraints that were either underestimated or treated as temporary.
This manufacturing lens also explains why the panel’s conversation spanned such different topics. Chip shortages, infrastructure limits, and architectural debates are not separate stories. They’re different layers of the same production system. If one layer is constrained, it forces changes in the others. If the constraints persist, the system eventually needs redesign.
What happens next: a shift from peak scaling to constraint-aware scaling
The most practical takeaway from the panel is that the AI economy is likely to move toward constraint-aware scaling. That doesn’t mean AI development stops. It means the industry becomes more disciplined about where it spends compute, how it schedules workloads, and how it chooses architectures.
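What might that discipline look like in practice? A minimal sketch, under the assumption that teams start ranking candidate jobs by expected value per GPU-hour and packing them into a fixed compute budget. The jobs, costs, and scores below are invented for illustration.

```python
# A minimal sketch of constraint-aware compute allocation: rank candidate
# jobs by expected value per GPU-hour and greedily pack a fixed budget.
# All jobs, costs, and value estimates are invented for illustration.

from dataclasses import dataclass

@dataclass
class Job:
    name: str
    gpu_hours: float       # estimated cost to run
    expected_value: float  # team's estimate of payoff, in arbitrary units

def plan(jobs: list[Job], budget_gpu_hours: float) -> list[Job]:
    """Greedy knapsack: highest value density first, skip what doesn't fit."""
    ranked = sorted(jobs, key=lambda j: j.expected_value / j.gpu_hours, reverse=True)
    selected, remaining = [], budget_gpu_hours
    for job in ranked:
        if job.gpu_hours <= remaining:
            selected.append(job)
            remaining -= job.gpu_hours
    return selected

jobs = [
    Job("pretrain-from-scratch", 500_000, 100.0),
    Job("fine-tune-existing",      8_000,  40.0),
    Job("ablation-sweep",         20_000,  25.0),
    Job("inference-optimization",  5_000,  30.0),
]

for job in plan(jobs, budget_gpu_hours=100_000):
    print(job.name)
# With a 100k GPU-hour budget, the from-scratch run never makes the cut:
# scarcity itself reshapes the portfolio toward reuse and optimization.
```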
In the near term, expect more disciplined compute budgeting, tighter workload scheduling, and architecture choices driven as much by cost structure as by raw capability.
