Sakana AI CTO Llion Jones Calls for New Directions in AI Research, Moves Away from Transformers

In a striking move, Llion Jones, chief technology officer of Sakana AI and one of the original architects of the transformer architecture, has announced his decision to step away from the very technology that has become synonymous with modern artificial intelligence. Speaking at the TED AI conference in San Francisco, Jones delivered a candid critique of the current state of AI research, arguing that the field has become dangerously narrow and overly reliant on transformers, the architecture he co-developed through the seminal 2017 paper “Attention Is All You Need.”

Transformers have revolutionized the landscape of AI, powering major models such as ChatGPT and Claude, and enabling systems capable of generating human-like text, images, and more. Despite their success, Jones expressed deep concern that the overwhelming focus on this single architectural approach is stifling creativity and innovation within the AI community. He pointed out that while there has never been more investment, talent, and interest in AI, the pressure to deliver results has led researchers to prioritize safe, publishable projects over riskier, potentially transformative ideas.

Jones articulated a paradox: the influx of resources into AI research has not fostered creativity but rather constrained it. Researchers are increasingly preoccupied with the fear of being “scooped” by competitors working on similar ideas, leading to a homogenization of research efforts. This environment, he argued, damages the scientific process, as scholars rush to publish papers, often sacrificing depth and originality for the sake of expediency. He likened the current state of AI research to an “exploration versus exploitation” dilemma, where the industry is exploiting existing knowledge without adequately exploring new possibilities.

Reflecting on the period before transformers emerged, Jones recalled how researchers were caught in a cycle of incremental improvements to recurrent neural networks, the previous dominant architecture. The arrival of transformers rendered much of that work obsolete, raising the question of how many researchers today might be similarly wasting time on minor tweaks when a groundbreaking innovation could be just around the corner. He warned that the AI community risks repeating this pattern by concentrating too heavily on a single architecture and merely permuting it, rather than seeking out the next significant breakthrough.

To illustrate his point, Jones shared insights into the conditions that allowed the transformer architecture to flourish in the first place. He described the project as “very organic, bottom up,” born from informal discussions and spontaneous brainstorming sessions rather than top-down directives or management pressure. This freedom to explore and innovate, he noted, is largely absent in today’s research environment, where even highly compensated researchers may feel compelled to pursue low-hanging fruit rather than daring to explore wild or speculative ideas.

At Sakana AI, Jones is actively working to recreate the exploratory environment that produced the transformer. He advocates a culture of open collaboration and knowledge sharing, encouraging researchers to turn up the “explore dial” and pursue ideas that might not otherwise see the light of day. One notable initiative at Sakana is the “Continuous Thought Machine,” a project that incorporates brain-like synchronization into neural networks. The concept began as an employee’s pitch, one that might have met skepticism in a more traditional research setting; at Sakana, Jones gave the team the freedom to explore it, and the resulting work garnered attention at the NeurIPS conference.

Jones’s perspective carries significant weight, given his pivotal role in shaping the current AI landscape. His decision to distance himself from transformers—an architecture that has defined his career—underscores the urgency of his message. He is not dismissing the value of ongoing transformer research; rather, he emphasizes that the current technology’s success may inadvertently hinder the search for better alternatives. “If the current technology was worse, more people would be looking for better,” he remarked, highlighting the need for a shift in focus within the AI community.

As the industry grapples with signs of diminishing returns from simply scaling up existing transformer models, Jones’s call for exploration over exploitation resonates deeply. Leading researchers have begun to openly discuss the limitations of the current paradigm, suggesting that architectural innovations—not just increased scale—will be essential for continued progress toward more capable AI systems. Jones’s warning suggests that finding those innovations may require dismantling the very incentive structures that have driven AI’s recent boom.

The competitive landscape of AI research, characterized by fierce rivalry among labs and rapid publication cycles, often prioritizes secrecy over collaboration. Jones argues that this environment is detrimental to the collective advancement of the field. He envisions a future where researchers can openly share findings, fostering a spirit of collaboration rather than competition. “Genuinely, from my perspective, this is not a competition,” he concluded. “We all have the same goal. We all want to see this technology progress so that we can all benefit from it.”

The stakes are high, and the AI community stands at a crossroads. The next transformer-scale breakthrough may be found by researchers empowered to explore new avenues. Alternatively, it could go undiscovered while thousands of researchers race to publish incremental improvements on an architecture that, as Jones put it, he is “absolutely sick of.”

In this context, Jones’s departure from transformers is more than a personal choice; it is a statement about the future of AI research and a call for the community to embrace a more exploratory mindset. The potential for transformative breakthroughs lies not only in refining existing technologies but in daring to imagine entirely new paradigms, and the industry’s challenge will be balancing the demand for immediate results against the need for long-term exploration.

Ultimately, Jones’s journey reflects a broader recognition within the field: the path to real advancement may require stepping away from familiar comforts and venturing into uncharted territory. By fostering environments that reward exploration and collaboration rather than secrecy and incrementalism, the AI community can unlock possibilities that a narrow focus on a single architecture would leave undiscovered, and pave the way for the next generation of groundbreaking innovations.