Nvidia unveils next-generation Rubin platform with inference costs up to 10 times lower than Blackwell, shipments planned for the second half of the year
Nvidia launched its next-generation Rubin AI platform at CES, underscoring its commitment to an annual update cycle for artificial intelligence (AI) chips. The platform, built around an integrated design of six new chips, delivers significant reductions in inference cost and gains in training efficiency, with initial deliveries to customers expected in the second half of 2026.
On Monday, January 5, Eastern Time, Nvidia CEO Jensen Huang said in Las Vegas that all six Rubin chips have returned from manufacturing partners and passed several key tests, keeping the program on schedule. "The AI race has begun, and everyone is striving to reach the next level," he said. Nvidia emphasized that operating costs for Rubin-based systems will be lower than for their Blackwell counterparts, since the same results can be achieved with fewer components.
Microsoft and other major cloud computing providers will be among the first customers to deploy the new hardware in the second half of the year. Microsoft's next-generation Fairwater AI superfactory will be equipped with Nvidia Vera Rubin NVL72 rack-scale systems, scalable to hundreds of thousands of Nvidia Vera Rubin superchips. CoreWeave will also be among the first providers to offer Rubin systems.
The launch comes as some on Wall Street voice concerns about intensifying competition for Nvidia and question whether AI spending can be sustained at its current pace. Nvidia, however, maintains a long-term bullish outlook, arguing the total market could eventually reach trillions of dollars.
Performance Boost Targets Next-Generation AI Needs
According to Nvidia's announcement, the Rubin platform delivers 3.5 times the training performance of the previous-generation Blackwell and five times its performance when running AI software. Compared with Blackwell, Rubin can cut the cost of generating inference tokens by up to a factor of 10 and reduce the number of GPUs required to train mixture-of-experts (MoE) models by a factor of four.
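To make those ratios concrete, here is a minimal back-of-envelope sketch. The Blackwell baseline figures (cost per million tokens, GPU count) are hypothetical placeholders, not numbers from Nvidia's announcement; only the 10x and 4x factors come from the article.

```python
# Back-of-envelope illustration of Nvidia's stated ratios.
# The Blackwell baseline figures below are hypothetical placeholders,
# NOT numbers from Nvidia's announcement.

blackwell_cost_per_m_tokens = 1.00    # hypothetical $ per million tokens
blackwell_gpus_for_moe_run = 10_000   # hypothetical GPU count for one MoE training run

# Nvidia's claimed factors: up to 10x lower token cost, 4x fewer GPUs.
rubin_cost_per_m_tokens = blackwell_cost_per_m_tokens / 10
rubin_gpus_for_moe_run = blackwell_gpus_for_moe_run // 4

print(f"Inference cost: ${blackwell_cost_per_m_tokens:.2f} -> "
      f"${rubin_cost_per_m_tokens:.2f} per million tokens")
print(f"MoE training fleet: {blackwell_gpus_for_moe_run:,} -> "
      f"{rubin_gpus_for_moe_run:,} GPUs")
```

Under those placeholder baselines, the same token volume would cost one-tenth as much to serve, and a training run that needed 10,000 GPUs would need 2,500.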
The new platform features the Vera CPU, which delivers double the performance of its predecessor. Purpose-built for agentic inference and described by Nvidia as the most energy-efficient processor for large-scale AI factories, the chip combines 88 custom Olympus cores, full Armv9.2 compatibility, and ultra-fast NVLink-C2C connectivity.
The Rubin GPU is equipped with a third-generation Transformer engine, featuring hardware-accelerated adaptive compression and providing 50 petaflops of NVFP4 computing power for AI inference. Each GPU delivers 3.6TB/s of bandwidth, while the Vera Rubin NVL72 rack offers 260TB/s of bandwidth.
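As a quick consistency check on those bandwidth figures (assuming, though the article does not say so, that the rack number is simply the per-GPU bandwidth aggregated across the NVL72's 72 GPUs):

```python
# Sanity check: aggregate per-GPU bandwidth across a Vera Rubin NVL72 rack.
# Assumes the quoted 260 TB/s is roughly 72 GPUs x 3.6 TB/s each;
# the article does not state how Nvidia derives the rack figure.

per_gpu_tb_s = 3.6   # quoted per-GPU bandwidth in TB/s
gpus_per_rack = 72   # an NVL72 rack contains 72 GPUs

aggregate_tb_s = per_gpu_tb_s * gpus_per_rack
print(f"{gpus_per_rack} x {per_gpu_tb_s} TB/s = {aggregate_tb_s:.1f} TB/s "
      f"(quoted rack figure: 260 TB/s)")
```

The product works out to 259.2 TB/s, consistent with the rounded 260 TB/s figure Nvidia quotes for the rack.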
Chip Testing Progresses Smoothly
Jensen Huang revealed that all six Rubin chips have returned from manufacturing partners and passed key tests, indicating they can be deployed on schedule. The statement signals that Nvidia is maintaining its lead as the top manufacturer of AI accelerators.
The platform incorporates five major innovations: sixth-generation NVLink interconnect technology, the Transformer engine, confidential computing, the RAS engine, and the Vera CPU. Its third-generation confidential computing technology makes the Vera Rubin NVL72 the first rack-scale platform to provide data security protection across the CPU, GPU, and NVLink domains.
The second-generation RAS (reliability, availability, and serviceability) engine spans the GPU, CPU, and NVLink, with real-time health checks, fault tolerance, and proactive maintenance to maximize system productivity. The rack adopts a modular, cable-free tray design that makes assembly and maintenance 18 times faster than on Blackwell.
Broad Ecosystem Support
Nvidia stated that Amazon's AWS, Google Cloud, Microsoft, and Oracle Cloud will be among the first to deploy Vera Rubin-based instances in 2026, with cloud partners CoreWeave, Lambda, Nebius, and Nscale following suit.
OpenAI CEO Sam Altman said: "Intelligence scales with compute. As we add more compute, models become more powerful, able to solve more difficult problems, and have greater impact for people. Nvidia's Rubin platform helps us continue to scale this progress."
Anthropic co-founder and CEO Dario Amodei said that Nvidia's "Rubin platform efficiency improvements represent infrastructure advances that enable longer memory, better reasoning, and more reliable outputs."
Meta CEO Mark Zuckerberg stated that Nvidia's "Rubin platform promises a step-change in performance and efficiency, which is necessary to bring state-of-the-art models to billions of people."
Nvidia also stated that Cisco, Dell, HPE, Lenovo, and Supermicro are expected to launch various servers based on Rubin products. AI labs such as Anthropic, Cohere, Meta, Mistral AI, OpenAI, and xAI are looking forward to leveraging the Rubin platform to train larger and more powerful models.
Product Details Disclosed Earlier Than Usual
Commentators note that Nvidia disclosed details of the new products earlier than in previous years, part of its effort to keep the industry reliant on its hardware. Nvidia typically provides in-depth product details at its annual GTC event, held each spring in San Jose, California.
For Jensen Huang, CES is just another stop in his marathon of event appearances. At various events, he announces products, partnerships, and investments, all aimed at driving the deployment of AI systems.
The new hardware unveiled by Nvidia also includes networking and connectivity components, which will become part of the DGX SuperPod supercomputers, while also being available as modular standalone products for customers. This performance upgrade is essential, as AI has shifted toward more specialized model networks that not only filter massive inputs but also address specific problems through multi-stage processes.
Nvidia is advancing AI applications across the entire economy, including robotics, healthcare, and heavy industry. As part of this effort, Nvidia announced a series of tools designed to accelerate the development of autonomous vehicles and robots. Currently, most spending on Nvidia-based computers comes from the capital expenditure budgets of a few customers, including Microsoft, Google Cloud under Alphabet, and AWS under Amazon.