Quick Facts
- Category: Technology
- Published: 2026-05-04 17:32:12
NVIDIA and Google Cloud have been working together for over a decade to build a full-stack AI platform that spans everything from performance-optimized libraries to enterprise cloud services. This partnership helps developers, startups, and enterprises bring agentic and physical AI from research into real-world production—whether it's AI agents managing complex workflows or robots and digital twins on factory floors. At Google Cloud Next in Las Vegas, they announced major advancements to their AI Hypercomputer infrastructure, designed to support the next frontier of AI workloads. Below, we explore key questions about this collaboration and its latest innovations.
- What is the core focus of the NVIDIA and Google Cloud partnership?
- What new infrastructure was announced at Google Cloud Next?
- How does the A5X with Vera Rubin improve AI performance?
- What options does Google Cloud offer with NVIDIA Blackwell GPUs?
- What advancements were made for agentic AI?
- How does this collaboration support physical AI applications?
What is the core focus of the NVIDIA and Google Cloud partnership?
The partnership between NVIDIA and Google Cloud is centered on building a comprehensive, integrated AI platform that spans every layer of technology. This includes performance-optimized libraries, frameworks, and enterprise-grade cloud services. The goal is to enable developers, startups, and enterprises to move agentic and physical AI from research into production. Agentic AI refers to systems that can manage complex workflows autonomously, while physical AI includes robots and digital twins used in industrial settings. By co-engineering across chips, systems, and software, the two companies aim to provide flexible, scalable infrastructure that can handle everything from frontier models to open models, all while optimizing for performance, cost, and sustainability. This collaboration has been ongoing for over a decade, and the latest announcements at Google Cloud Next mark a significant milestone in expanding their AI Hypercomputer for AI factories.

What new infrastructure was announced at Google Cloud Next?
At Google Cloud Next, the companies announced the new A5X bare-metal instances powered by NVIDIA Vera Rubin NVL72 rack-scale systems. These instances represent the next generation of AI infrastructure, designed to handle the most demanding workloads. The A5X instances use NVIDIA ConnectX-9 SuperNICs combined with next-generation Google Virgo networking. This architecture supports scaling up to 80,000 NVIDIA Rubin GPUs within a single site cluster and up to 960,000 NVIDIA Rubin GPUs in a multisite cluster. Additionally, Google Cloud previewed Gemini on Google Distributed Cloud running on NVIDIA Blackwell and Blackwell Ultra GPUs, as well as confidential VMs with NVIDIA Blackwell GPUs. These new offerings provide customers with a range of options for training, tuning, and serving AI models, from frontier models to agentic and physical AI workloads.

How does the A5X with Vera Rubin improve AI performance?
The A5X instances powered by NVIDIA Vera Rubin NVL72 rack-scale systems deliver significant performance improvements through extreme co-design across chips, systems, and software. According to the announcement, they achieve up to 10x lower inference cost per token and 10x higher token throughput per megawatt compared to the prior generation. This is made possible by the tight integration of the NVIDIA Vera Rubin architecture with Google Cloud's infrastructure. The system uses advanced networking technologies, such as NVIDIA ConnectX-9 SuperNICs and Google Virgo networking, to scale efficiently. For customers running large AI workloads, this means higher performance with lower cost and energy consumption. The ability to scale to 80,000 GPUs in a single-site cluster, or nearly a million across multiple sites, further extends the capacity to handle the largest AI models and workloads.
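As a sanity check on the scaling figures above, the following illustrative arithmetic derives the implied site and rack counts. It assumes the multisite total is built by federating single-site clusters of the stated size, and that each NVL72 rack contributes 72 GPUs; these relationships are inferred from the announcement, not stated in it.

```python
# Illustrative arithmetic from the announced scaling figures.
# Assumption: the multisite maximum is reached by combining
# single-site clusters, each built from 72-GPU NVL72 racks.
SINGLE_SITE_GPUS = 80_000    # Rubin GPUs per single-site cluster
MULTISITE_GPUS = 960_000     # Rubin GPUs across a multisite cluster
GPUS_PER_RACK = 72           # one Vera Rubin NVL72 rack

sites = MULTISITE_GPUS // SINGLE_SITE_GPUS          # implied number of sites
racks_per_site = SINGLE_SITE_GPUS // GPUS_PER_RACK  # implied racks per site

print(f"{sites} sites, ~{racks_per_site:,} NVL72 racks per site")
```

On these assumptions, the multisite maximum corresponds to 12 single-site clusters of roughly 1,100 NVL72 racks each.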

What options does Google Cloud offer with NVIDIA Blackwell GPUs?
Google Cloud offers a broad portfolio of NVIDIA Blackwell-based solutions to meet diverse customer needs. This includes A4 VMs with NVIDIA HGX B200 systems for standard AI workloads, A4X VMs with NVIDIA GB200 NVL72 rack-scale systems for higher performance, and A4X Max with NVIDIA GB300 NVL72 systems for maximum capability. Additionally, there are fractional G4 VMs with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs for entry-level or smaller-scale use cases. Customers can choose from a single rack with up to 72 Blackwell GPUs using fifth-generation NVIDIA NVLink and NVLink 5 Switch, or scale out to tens of thousands of Blackwell GPUs across multiple interconnected racks. The portfolio also includes options for using just one-eighth of a GPU, allowing customers to right-size their acceleration capabilities for any workload, from small-scale inference to large-scale training.
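To make the right-sizing idea concrete, here is a hypothetical helper (not a Google Cloud API) that maps a requested GPU count onto the portfolio described above. The thresholds are illustrative assumptions: one-eighth of a GPU for fractional G4, an 8-GPU HGX B200 node for A4, and a 72-GPU NVL72 rack for A4X.

```python
# Hypothetical sizing helper; instance names come from the article,
# but the GPU-count thresholds are illustrative assumptions.
def pick_instance(gpus_needed: float) -> str:
    """Suggest a Blackwell-based option for a given GPU requirement."""
    if gpus_needed <= 0.125:  # fractional G4: as little as 1/8 of a GPU
        return "G4 (fractional RTX PRO 6000 Blackwell)"
    if gpus_needed <= 8:      # one HGX B200 node
        return "A4 (HGX B200)"
    if gpus_needed <= 72:     # one NVL72 rack
        return "A4X (GB200 NVL72)"
    return "A4X Max (GB300 NVL72, multi-rack scale-out)"

print(pick_instance(0.125))  # small-scale inference
print(pick_instance(64))     # single-rack training
print(pick_instance(500))    # large-scale training
```

The point of the sketch is the shape of the decision, not the exact cutoffs: the portfolio spans fractional GPUs up to multi-rack clusters, so capacity can be matched to the workload rather than the other way around.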

What advancements were made for agentic AI?
For agentic AI, Google Cloud announced the preview of Google Gemini on Google Distributed Cloud running on NVIDIA Blackwell and Blackwell Ultra GPUs. This enables customers to deploy advanced AI agents that can manage complex workflows, automate tasks, and interact with users in natural language. Additionally, Google Cloud integrated its Gemini Enterprise Agent Platform with NVIDIA Nemotron open models and the NVIDIA NeMo framework. Nemotron provides open-source models that can be fine-tuned for specific agentic tasks, while NeMo offers tools for building, customizing, and deploying large language models. This combination allows developers to create sophisticated agentic AI systems that can reason, plan, and execute multi-step actions. The new infrastructure ensures these agents run efficiently, with low latency and high throughput, making them suitable for enterprise applications like customer service, IT automation, and business process optimization.
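The plan-and-execute pattern described above can be sketched in a few lines. This is a minimal illustration, not NeMo or Gemini Enterprise code: the `call_model` function is a hypothetical stand-in for a served model endpoint (e.g., a Nemotron model tuned with NeMo), stubbed here with canned responses so the control flow is visible.

```python
# Minimal sketch of an agentic plan-then-execute loop.
# `call_model` is a hypothetical stand-in for a served LLM endpoint;
# a real deployment would query a model over the network.
def call_model(prompt: str) -> str:
    canned = {
        "plan": "1. look up order; 2. check refund policy; 3. reply",
        "execute": "step completed per policy",
    }
    return canned["plan" if prompt.startswith("Plan") else "execute"]

def run_agent(task: str) -> list[str]:
    """Ask the model for a plan, then execute each step in order."""
    plan = call_model(f"Plan the steps for: {task}")
    steps = [s.strip() for s in plan.split(";")]
    return [call_model(f"Execute: {s}") for s in steps]

results = run_agent("process a customer refund request")
print(f"executed {len(results)} steps")
```

Real agent frameworks add tool calling, memory, and retries on top of this loop, but the reason low latency and high throughput matter is visible even here: every step of the plan is another model call.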

How does this collaboration support physical AI applications?
Physical AI—such as robots, autonomous vehicles, and digital twins—requires real-time processing, simulation, and control. The NVIDIA and Google Cloud partnership supports these applications by providing the necessary infrastructure and tools. The A5X instances with Vera Rubin GPUs offer the massive compute power needed for training and deploying physical AI models. Google Cloud's AI Hypercomputer integrates with NVIDIA's platforms like NVIDIA Omniverse for digital twin simulation and NVIDIA Isaac for robotics. The high-performance networking and scaling capabilities enable simulation environments that can model complex factory floors or entire cities. Additionally, confidential VMs with NVIDIA Blackwell GPUs ensure data security for sensitive industrial applications. By combining Google Cloud's managed services with NVIDIA's hardware and software, customers can develop, test, and deploy physical AI solutions more efficiently, bridging the gap between simulation and real-world operation.