NVIDIA Unveils Rubin CPX GPU Designed for Million-Token AI Workloads at AI Infra Summit

At the recent AI Infra Summit held in Santa Clara, NVIDIA unveiled its latest GPU, the Rubin CPX, which is specifically designed to handle massive-context AI workloads, including million-token software coding and long-form video generation. The launch extends NVIDIA’s push to solidify its position as a leader in the AI hardware space.

The Rubin CPX is not just another addition to NVIDIA’s lineup; it represents a new class of GPU that is purpose-built for the demands of modern AI applications. Jensen Huang, NVIDIA’s CEO, emphasized the transformative potential of the Rubin CPX, stating, “Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU purpose-built for massive-context AI, where models reason across millions of tokens of knowledge at once.” This statement encapsulates the essence of what the Rubin CPX aims to achieve: enabling AI models to process and understand vast amounts of information simultaneously, thereby enhancing their reasoning capabilities.

One of the standout features of the Rubin CPX is its attention performance: NVIDIA says the GPU delivers three times faster attention than its current GB300 NVL72 systems. This improvement matters for applications that require longer context sequences, as it allows for quicker processing without sacrificing output quality. Integrated into the Vera Rubin NVL144 CPX system, the Rubin CPX is part of a platform that NVIDIA rates at eight exaflops of AI performance, 100 terabytes of memory, and 1.7 petabytes per second of memory bandwidth. NVIDIA presents these figures as a new benchmark for AI processing.
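To see why attention speed matters so much at long context lengths, consider how the cost of standard self-attention grows with sequence length. The sketch below is a rough illustration only: it counts the floating-point operations in the two large matrix multiplications of one full (non-causal) attention layer, and the `head_dim` and `num_heads` defaults are hypothetical values chosen for the example, not specifications of the Rubin CPX or of any particular model.

```python
def attention_matmul_flops(seq_len: int, head_dim: int = 128, num_heads: int = 32) -> int:
    """Approximate FLOPs for the QK^T and scores-times-V matmuls in one attention layer.

    head_dim and num_heads are illustrative assumptions, not real hardware or model specs.
    """
    # Each matmul performs seq_len * seq_len * head_dim multiply-adds (2 FLOPs each),
    # and there are two such matmuls per head: Q @ K^T and scores @ V.
    per_head = 2 * (2 * seq_len * seq_len * head_dim)
    return num_heads * per_head


baseline = attention_matmul_flops(8_000)
for n in (8_000, 128_000, 1_000_000):
    ratio = attention_matmul_flops(n) / baseline
    print(f"{n:>9,} tokens: {ratio:,.0f}x the attention FLOPs of 8,000 tokens")
```

Because the cost grows with the square of the sequence length, doubling the context roughly quadruples the attention work, which is why a hardware speedup aimed specifically at attention is most visible on million-token inputs.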

The implications of the Rubin CPX extend beyond mere performance metrics. NVIDIA has positioned this GPU as a game-changer for various industries, particularly those involved in advanced coding and generative video tools. Early adopters of the Rubin CPX include companies like Cursor, Magic, and Runway, each leveraging the GPU’s capabilities to enhance their respective offerings. For instance, Cursor plans to utilize the Rubin CPX to improve developer productivity, while Runway aims to support advanced generative video workflows. Cristóbal Valenzuela, CEO of Runway, highlighted the GPU’s potential by stating, “This means creators, from independent artists to major studios, can gain unprecedented speed, realism, and control in their work.” Meanwhile, Magic is focusing on developing software agents capable of reasoning across 100-million-token contexts without the need for additional fine-tuning.

The versatility of the Rubin CPX makes it suitable for a wide range of use cases. It is designed to tackle complex tasks such as analyzing codebases with over 100,000 lines or processing more than an hour of high-definition video. These capabilities are essential for organizations looking to harness the power of AI in their operations, whether for software development, content creation, or data analysis.

In conjunction with the launch of the Rubin CPX, NVIDIA also introduced its AI Factory reference designs. This initiative serves as a blueprint for building giga-scale AI data centers, integrating compute, cooling, power, and simulation into a unified system. The AI Factory aims to move beyond traditional data center designs, optimizing every watt of energy to contribute directly to intelligence generation. NVIDIA has partnered with industry leaders such as Siemens Energy, Schneider Electric, GE Vernova, and Jacobs to bring this vision to life. Ian Buck, NVIDIA’s vice president of the data center business unit, articulated the goal of the AI Factory, stating that it seeks to create a more efficient and effective infrastructure for AI workloads.

On the benchmarking front, NVIDIA showcased results from its Blackwell Ultra GPUs during the event. The company published MLPerf Inference v5.1 results that it said set performance records across several benchmarks, including DeepSeek-R1, Llama 3.1 405B, and Whisper. Notably, Blackwell Ultra reached over 5,800 tokens per second per GPU on DeepSeek-R1 in offline testing, a 4.7x improvement over previous Hopper-based systems. These results cover Blackwell Ultra rather than the Rubin CPX, which is not yet available.
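As a quick sanity check on those figures, the Hopper baseline implied by the quoted speedup can be back-calculated. The snippet below is a minimal sketch that uses only the two rounded numbers cited above; the result is an estimate derived from them, not a published Hopper measurement.

```python
# Figures as quoted in the article (both rounded).
blackwell_ultra_tokens_per_sec = 5_800  # "over 5,800" tokens/sec/GPU, DeepSeek-R1 offline
speedup_over_hopper = 4.7

# Implied Hopper throughput; an estimate only, not an official result.
implied_hopper_tokens_per_sec = blackwell_ultra_tokens_per_sec / speedup_over_hopper
print(f"Implied Hopper baseline: about {implied_hopper_tokens_per_sec:,.0f} tokens/sec/GPU")
```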

NVIDIA also made a case for the economic potential of the Rubin CPX. The company estimates that every $100 million invested in Rubin CPX infrastructure could generate up to $5 billion in token revenue. The projection is NVIDIA’s own and describes an upper bound, but it signals how the company is pitching the investment to businesses looking to capitalize on the growing demand for AI-driven solutions.
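The arithmetic behind that claim is simple, and the short sketch below works it through. The `revenue_multiple` helper is a hypothetical name introduced for this illustration, and the inputs are NVIDIA's stated figures, with the revenue number being the "up to" ceiling rather than a forecast.

```python
def revenue_multiple(investment_usd: float, token_revenue_usd: float) -> float:
    """Return token revenue as a multiple of the infrastructure investment."""
    return token_revenue_usd / investment_usd


investment = 100_000_000          # $100 million invested, per NVIDIA's estimate
max_token_revenue = 5_000_000_000  # up to $5 billion in token revenue (upper bound)

multiple = revenue_multiple(investment, max_token_revenue)
print(f"Claimed upper-bound return: {multiple:.0f}x the investment")
```

In other words, the claim amounts to as much as 50 dollars of token revenue per dollar invested, before accounting for operating costs such as power, cooling, and staffing, which the estimate as quoted does not address.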

As the AI landscape continues to evolve, the introduction of the Rubin CPX signifies a critical step forward in addressing the challenges associated with large-scale AI workloads. The ability to process millions of tokens efficiently opens up new possibilities for innovation across various sectors, from entertainment and media to software development and beyond. With the anticipated availability of the Rubin CPX at the end of 2026, organizations have a unique opportunity to prepare for the next wave of AI advancements.

In conclusion, NVIDIA’s launch of the Rubin CPX GPU at the AI Infra Summit marks a significant milestone in the evolution of AI hardware. By focusing on massive-context AI workloads and providing unparalleled performance capabilities, the Rubin CPX is set to redefine what is possible in the realm of artificial intelligence. As companies begin to adopt this technology, we can expect to see transformative changes in how AI is utilized across industries, paving the way for a future where intelligent systems can process and reason with vast amounts of information seamlessly. The journey towards this future is just beginning, and the Rubin CPX is poised to lead the charge.