Jensen Huang's GTC 2026 Full Speech: AI Demand Reaches Trillions of Dollars, Computing Power Leaps 350 Times, OpenClaw Enables Every Company to Become AaaS

動區BlockTempo

NVIDIA CEO Jensen Huang announced at GTC 2026 that AI infrastructure demand will reach at least $1 trillion by 2027, and laid out the blueprint for NVIDIA's next decade: token factories, the CUDA flywheel, the monster Vera Rubin systems, and the OpenClaw intelligent operating system.
(Background: Huang's GTC speech on "DLSS 5, NemoClaw" ignited AI tokens: FET surged 20%, while NEAR and Worldcoin hit recent highs)
(Additional context: China's Ministry of State Security warns about "Lobster Farming": OpenClaw contains four major security risks, and your devices may be compromised)

Table of Contents


  • Opening: The 20-year CUDA flywheel effect
  • Token Factory Economics: Data centers no longer store files, they produce tokens
  • Vera Rubin: 350x in two years—this isn’t Moore’s Law, it’s a different curve
  • The real purpose of the Groq acquisition: making the fast faster, the expensive more costly
  • DLSS 5: The graphics industry’s GPT moment
  • OpenClaw: The OS of the AI era
  • Every engineer will have a token budget in the future
  • Physical AI and the robot army
  • Next-generation: Feynman architecture + space data centers

On March 16, 2026, NVIDIA's GTC conference opened with Jensen Huang taking the stage and saying something that silenced the audience: "Last year I said there was $500 billion of high-confidence demand. Now, at this very moment, I see that number at no less than $1 trillion. And I am certain the actual demand is even higher."

This statement caused NVIDIA’s stock to rise over 4.3% that day. But Huang wasn’t just reporting numbers; he spent the entire speech explaining where that $1 trillion comes from and why it’s still not enough.

Opening: The 20-year CUDA flywheel effect

The starting point of the speech was NVIDIA's core business, CUDA. This year marks its 20th anniversary, and Huang described CUDA as the heart of NVIDIA's strategic logic.

In plain language: CUDA is the technology that allows developers to program GPUs. When it launched twenty years ago, no one was sure it would succeed, but NVIDIA invested most of its resources to push through. Looking back now, that decision created an almost uncopyable moat—hundreds of millions of CUDA-enabled GPUs worldwide, tens of thousands of open-source projects relying on it, and every cloud provider integrating it.

Huang calls this a "flywheel": large installed base → attracts developers → developers create new algorithms → breakthroughs open new markets → new markets expand the installed base → the flywheel keeps spinning. Even better, NVIDIA keeps shipping software optimizations: Ampere-architecture GPUs launched six years ago are still commanding rising cloud rental prices, because the applications running on them keep growing in number and value.

Token Factory Economics: Data centers no longer store files, they produce tokens

This is the core concept of Huang’s speech and the key logic behind the $1 trillion demand.

Simply put: in the past, data centers were “warehouses” storing your files and data; in the future, they will be “factories” producing the fundamental units of AI—tokens (the smallest units of AI thinking and speech).

Huang explains that every data center is limited by power: a 1 GW (gigawatt) facility will never become a 2 GW one, and that is a hard physical constraint. So the core competition becomes: with the same electricity, who can produce the most tokens? Whoever achieves the highest tokens-per-watt throughput at the lowest production cost wins.

Tokens will also be priced in tiers, like business class and economy:

  • Free tier (high throughput, low speed)
  • Mid-tier (about $3 per million tokens)
  • High-tier (about $6 per million tokens)
  • Ultra-fast tier (about $45 per million tokens)
  • Hyper-speed tier (about $150 per million tokens)

In other words, the same GPU power is allocated across different service levels; higher throughput and faster responses mean more revenue. Huang estimates that, under the same power budget, the new Grace Blackwell system can generate five times the revenue of the previous Hopper architecture.
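
To make the token-factory arithmetic concrete, here is a minimal Python sketch. Only the per-million-token prices come from the tier list above; the tier mix and the facility's throughput are invented placeholders, not figures from the speech.

```python
# Illustrative sketch only: tier prices come from the article; the capacity
# split and the facility's token throughput are made-up placeholders.

# (price in USD per million tokens, fraction of facility capacity for the tier)
TIERS = {
    "free":       (0.0,   0.30),
    "mid":        (3.0,   0.30),
    "high":       (6.0,   0.20),
    "ultra_fast": (45.0,  0.15),
    "hyper":      (150.0, 0.05),
}

def hourly_revenue(tokens_per_second: float) -> float:
    """Revenue per hour for a power-limited facility at a given throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return sum(
        tokens_per_hour * share * price / 1_000_000
        for price, share in TIERS.values()
    )

# With facility power fixed, the only lever is tokens per watt: a 5x jump in
# throughput (Huang's Hopper-to-Grace-Blackwell claim) means roughly 5x revenue.
base_throughput = 20_000_000  # placeholder tokens per second
print(f"baseline:  ${hourly_revenue(base_throughput):,.0f}/hour")
print(f"5x tokens: ${hourly_revenue(5 * base_throughput):,.0f}/hour")
```

The point is the shape of the relationship, not the numbers: with power fixed, revenue moves in lockstep with tokens per watt.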

Vera Rubin: 350x in two years—this isn’t Moore’s Law, it’s a different curve

Huang noted that with the previous Hopper generation he could hold up a single chip on stage; with Vera Rubin, what people should picture is the entire system.

The numbers speak for themselves: in the same 1 GW data center, the token generation rate rose from 22 million per second to 700 million per second, a 350-fold increase in two years. By contrast, Moore's Law over the same period delivers only about a 1.5x improvement.

What does this monster system look like? Vera Rubin is a 100% liquid-cooled design that eliminates traditional cables; racks that used to take two days to install now take only two hours. Microsoft Azure has confirmed the first Vera Rubin rack is online.

The real purpose of the Groq acquisition: making the fast faster, the expensive more costly

NVIDIA's integration of Groq's technology isn't meant to replace its own GPUs but to enable "asymmetric inference separation": in simple terms, splitting AI inference into two stages and handling each with the hardware best suited to it.

Groq chips carry a large amount of ultra-fast SRAM (500 MB): extremely responsive but limited in capacity, which suits the final token-output step. Vera Rubin chips have large memory (288 GB), which suits the heavy early-stage computation and caching.

NVIDIA's Dynamo software ties the two together: prefill and attention-heavy decoding run on Vera Rubin, while latency-sensitive token generation runs on Groq. The two sides are tightly coupled over Ethernet, cutting overall latency roughly in half.

Huang also offered a configuration tip: for high-throughput workloads, use 100% Vera Rubin; for high-value code generation, allocate about 25% of the data center's capacity to Groq. Groq's LP30 chips are mass-produced by Samsung, with shipments starting in Q3.
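
Conceptually, the split looks something like the sketch below. This is a purely illustrative picture of prefill/decode disaggregation under the assumptions described above; none of the class or function names correspond to the actual Dynamo API.

```python
# Purely illustrative sketch of prefill/decode disaggregation as described in
# the speech. None of these names correspond to any real NVIDIA or Groq API.
from dataclasses import dataclass

@dataclass
class KVCacheHandle:
    """Opaque reference to a prompt's attention cache, shipped over Ethernet."""
    request_id: str
    location: str  # e.g. which memory pool on the prefill side holds it

class PrefillPool:
    """Large-memory GPUs (the Vera Rubin side): chew through the whole prompt."""
    def prefill(self, request_id: str, prompt: str) -> KVCacheHandle:
        # Heavy, parallel computation over the full prompt; produces the KV cache.
        return KVCacheHandle(request_id, location="hbm-pool-0")

class DecodePool:
    """Small, ultra-fast SRAM chips (the Groq side): emit tokens one by one."""
    def decode(self, cache: KVCacheHandle, max_tokens: int) -> list[str]:
        # Latency-sensitive loop: each step reads the cache and appends a token.
        return [f"<token-{i}>" for i in range(max_tokens)]

def serve(prompt: str, request_id: str = "req-1") -> str:
    prefill, decode = PrefillPool(), DecodePool()
    cache = prefill.prefill(request_id, prompt)   # stage 1: compute-bound
    tokens = decode.decode(cache, max_tokens=8)   # stage 2: latency-bound
    return " ".join(tokens)

if __name__ == "__main__":
    print(serve("Explain why a 1 GW facility cannot become 2 GW."))
```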

DLSS 5: The graphics industry’s GPT moment

Huang said that ten years ago GeForce brought AI to the world; now AI is reshaping computer graphics. He calls this new technology "Neural Rendering," or DLSS 5.

The core idea: combine traditional deterministic 3D graphics (structured and precisely controllable) with the look of probabilistic generative AI. The structured data keeps the result controllable; the AI makes it look stunningly realistic. Huang says this "structured data + generative AI" pairing will show up again and again across industries.

OpenClaw: The OS of the AI era

Peter Steinberger developed OpenClaw, which Huang calls “the most popular open-source project in human history, surpassing Linux’s achievements in just a few weeks.”

What is OpenClaw? Simply put, it lets AI agents manage resources, call tools, read and write files, schedule tasks, and break big problems into smaller ones for sub-agents. It does for AI agents in an enterprise IT environment what an operating system does for programs on a computer.
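
As a rough mental model, the "operating system for agents" idea can be pictured like this. The sketch is purely illustrative and every name in it is invented; it is not OpenClaw's actual interface.

```python
# Purely illustrative: a parent agent splits a task and delegates to sub-agents,
# the way an OS schedules processes. No relation to OpenClaw's real interfaces.
from concurrent.futures import ThreadPoolExecutor

def sub_agent(task: str) -> str:
    """Stand-in for a specialized agent handling one narrow sub-task."""
    return f"result for: {task}"

def parent_agent(goal: str) -> list[str]:
    # Step 1: break the big goal into smaller sub-tasks (hard-coded here).
    subtasks = [f"{goal} - research", f"{goal} - draft", f"{goal} - review"]
    # Step 2: "schedule" the sub-agents and collect their results.
    with ThreadPoolExecutor(max_workers=3) as pool:
        return list(pool.map(sub_agent, subtasks))

print(parent_agent("write the quarterly report"))
```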

Huang says, "Every SaaS company will become an AaaS company," that is, agents as a service: software firms will no longer just sell tools but AI-agent services that do the work for you.

But enterprise deployments face a challenge: agents can touch sensitive data and execute code, so they must be strictly controlled. NVIDIA introduced NeMo Claw, an enterprise reference design with policy engines and privacy routers, to make in-company deployment secure.
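
The kind of control Huang is describing is easy to picture in code. The sketch below shows a toy policy engine gating an agent's tool calls; all names are invented and none of this is the OpenClaw or NeMo Claw API.

```python
# Illustrative only: a toy "policy engine" gating an agent's tool calls.
# Nothing here is the OpenClaw or NeMo Claw API; all names are invented.
from dataclasses import dataclass

@dataclass
class ToolCall:
    agent: str
    tool: str          # e.g. "read_file", "run_shell", "query_crm"
    target: str        # path, command, or record the agent wants to touch

class PolicyEngine:
    """Decides whether an agent is allowed to perform a given action."""
    def __init__(self) -> None:
        self.rules = {
            "read_file": lambda call: not call.target.startswith("/secrets"),
            "run_shell": lambda call: False,          # never without human sign-off
            "query_crm": lambda call: call.agent == "sales-agent",
        }

    def allow(self, call: ToolCall) -> bool:
        rule = self.rules.get(call.tool)
        return bool(rule and rule(call))

def dispatch(call: ToolCall, policy: PolicyEngine) -> str:
    if not policy.allow(call):
        return f"BLOCKED: {call.agent} may not {call.tool} on {call.target}"
    return f"OK: executing {call.tool} on {call.target}"

policy = PolicyEngine()
print(dispatch(ToolCall("research-agent", "read_file", "/data/report.txt"), policy))
print(dispatch(ToolCall("research-agent", "run_shell", "rm -rf /"), policy))
```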

Every engineer will have a token budget in the future

Huang makes a concrete workplace prediction: “In the future, every engineer in a company will have an annual token budget. Their salary might be hundreds of thousands of dollars, and I will give them a token quota worth half their salary, amplifying their output tenfold. The amount of tokens allocated at onboarding will become a new topic in Silicon Valley recruiting.”

This is not just a metaphor; he believes it will be a future standard for enterprise competitiveness: how much compute power you give engineers determines how much value they can create. Every company will be both a token user and a token producer.
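
As a back-of-the-envelope illustration of what such a budget buys: the salary figure below is an assumed example, and the prices are the per-million-token tiers quoted earlier in the article.

```python
# Back-of-the-envelope only: the salary is an assumed example figure; the
# per-million-token prices are the tiers quoted earlier in the article.
salary_usd = 200_000
token_budget_usd = salary_usd / 2          # "a quota worth half their salary"

for tier, price_per_million in [("mid ($3/M)", 3), ("high ($6/M)", 6), ("hyper ($150/M)", 150)]:
    tokens = token_budget_usd / price_per_million * 1_000_000
    print(f"{tier:<15} -> {tokens / 1e9:,.1f} billion tokens per year")

# mid ($3/M)      -> 33.3 billion tokens per year
# high ($6/M)     -> 16.7 billion tokens per year
# hyper ($150/M)  -> 0.7 billion tokens per year
```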

Physical AI and the robot army

Huang says digital intelligence operates in the digital world, while physical AI is embodied intelligence: robots. At this GTC, 110 robots appeared on site, representing nearly every robotics R&D company in the world.

In autonomous driving, Huang announced that BYD, Hyundai, Nissan, and Geely have joined the NVIDIA RoboTaxi Ready platform, adding a combined annual output of 18 million vehicles alongside Mercedes-Benz, Toyota, and GM. NVIDIA also announced a partnership with Uber to deploy and connect RoboTaxi Ready vehicles across multiple cities.

For the finale, Disney's Olaf snowman robot took the stage, with Jetson chips as its brain, having learned to walk in Omniverse and adapted to the real world through the Newton physics solver. Huang joked with Olaf: "I thought you'd be taller. I've never seen such a short snowman."

Next-generation: Feynman architecture + space data centers

Toward the end, Huang teased the next-generation Feynman architecture, which will support both copper interconnects and co-packaged optics (CPO) at the same level of scalability.

More ambitiously, he mentioned “Vera Rubin Space-1”—a data center computer deployed in space, extending AI compute beyond Earth.

Huang summarized the entire speech in four points: the inference inflection point has arrived, the AI factory era has begun, the OpenClaw agent revolution is underway, and physical AI is being deployed at scale. A trillion dollars is just the beginning.
