Jensen expects $1 trillion in chip orders. Here's what he's actually selling. | BullCity AI

Written by Daniel | Apr 15, 2026 4:52:57 PM

Yes, it's Thursday again. At this point I think we should stop calling it a mistake and start calling it a pattern. BullCity AI: the Wednesday newsletter that consistently ships on Thursday. We're rebranding the lateness as a feature. You're welcome.

This week, Jensen Huang did what Jensen Huang does best: stood on a stage for two hours and made a trillion dollars sound reasonable. NVIDIA's GTC keynote dropped Monday, the Anthropic/Pentagon saga added 150 retired judges to its cast, and All Things AI is now four days away. Here's the full download.

🟢 GTC Keynote Recap: $1 Trillion, Seven Chips, and a Disney Robot

Jensen Huang's two-hour keynote at SAP Center delivered the headline number: NVIDIA now expects $1 trillion in purchase orders for Blackwell and Vera Rubin chips through 2027, double the $500 billion forecast from last fall.

What actually shipped:

Vera Rubin is real and in production. Seven new chips, five rack types, one giant AI supercomputer. A full Vera Rubin POD spans 40 racks, 1,152 Rubin GPUs, nearly 20,000 NVIDIA dies, and 1.2 quadrillion transistors delivering 60 exaflops. 100% liquid-cooled, installs in two hours. Ships H2 2026.
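
Those POD numbers hang together arithmetically. A quick back-of-envelope check (the derived per-rack and per-die figures below are my division, not an NVIDIA breakdown):

```python
# Sanity-check the published Vera Rubin POD totals.
# Assumption: all figures are per-POD, as stated in the keynote.
racks = 40
gpus = 1_152
dies = 20_000          # "nearly 20,000" NVIDIA dies
transistors = 1.2e15   # 1.2 quadrillion
exaflops = 60

print(f"GPUs per rack:       {gpus / racks:.1f}")        # ~28.8
print(f"Transistors per die: {transistors / dies:.1e}")  # ~6e10 (~60B)
print(f"Per-GPU compute:     {exaflops / gpus * 1000:.1f} petaflops")
```

So each Rubin GPU works out to roughly 52 petaflops, and each die to about 60 billion transistors, which is in the right ballpark for a modern accelerator die.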

The Groq reveal. NVIDIA unveiled the Groq 3 LPU (Language Processing Unit) - a purpose-built inference chip that pairs with Rubin GPUs. The GPU handles the compute-heavy "prefill" phase; the Groq chip handles the speed-critical "decode" phase. Together they deliver 35x higher inference throughput per megawatt. Ships Q3 2026.
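
The prefill/decode split Jensen described is the disaggregated-serving pattern the inference world has been converging on: one compute-heavy batched pass over the prompt, then a latency-critical token-by-token loop against the resulting KV cache. A minimal sketch of that handoff (all class names and the toy "cache" are mine for illustration, not NVIDIA's or Groq's API):

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt_tokens: list[int]
    max_new_tokens: int
    kv_cache: dict = field(default_factory=dict)  # handed off between phases
    output: list[int] = field(default_factory=list)

class PrefillWorker:
    """Compute-bound: one batched pass over the whole prompt,
    producing the KV cache (the Rubin GPU's role in this scheme)."""
    def run(self, req: Request) -> Request:
        req.kv_cache = {"len": len(req.prompt_tokens)}  # placeholder cache
        return req

class DecodeWorker:
    """Latency-bound: one token per step against the cache
    (the Groq LPU's role in this scheme)."""
    def step(self, req: Request) -> int:
        tok = req.kv_cache["len"] + len(req.output)  # dummy next token
        req.output.append(tok)
        return tok

def serve(req: Request) -> list[int]:
    req = PrefillWorker().run(req)   # phase 1: prefill
    decoder = DecodeWorker()
    while len(req.output) < req.max_new_tokens:
        decoder.step(req)            # phase 2: decode loop
    return req.output

print(serve(Request(prompt_tokens=[1, 2, 3], max_new_tokens=4)))
```

The point of the split is that the two phases want different hardware: prefill is a throughput problem, decode is a latency problem, so pairing a GPU with a purpose-built decode chip lets each run at its sweet spot.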

NemoClaw and the agentic OS. Jensen compared NemoClaw (NVIDIA's enterprise-ready wrapper for OpenClaw) to what Windows was for personal computers. One command installs the full agent stack: OpenClaw framework, Nemotron models, and OpenShell runtime with security guardrails.

Feynman is the 2028 roadmap. It's the next architecture after Vera Rubin, built on a 1.6nm process.

Physical AI is here. An Olaf robot from Frozen walked on stage, fully autonomous, trained in NVIDIA simulations with Disney. BYD, Hyundai, Nissan, and Geely all building Level 4 autonomous vehicles on NVIDIA's Drive Hyperion. Uber launching NVIDIA-powered robotaxi fleets across 28 cities by 2028.

My take: The Groq integration is the real story. For two years, the industry debate was "who will challenge NVIDIA in inference?" NVIDIA's answer: we'll just buy the best inference company and bolt it onto our GPU platform. That 35x throughput-per-megawatt number matters because the constraint on AI scaling right now isn't intelligence, it's power.

⚖️ Anthropic v. Pentagon: The Coalition Gets Bigger

The March 24 hearing is now five days away. The list backing Anthropic keeps expanding:

  • 150 retired federal and state judges (Republican and Democrat appointees) filed an amicus brief arguing the Pentagon "misinterpreted the statute and violated the necessary procedures"
  • 22 former high-ranking military officials (including former secretaries of Air Force, Army, and Navy) called the designation "retribution against a private company that has displeased the leadership"
  • Major tech industry groups representing Pentagon contractors warned procurement "becomes contingent on political favor rather than the rule of law"

On the other side, DOJ argues Anthropic's terms of service "have become unacceptable to the executive branch," and Anthropic's ability to potentially disable or alter its model during warfighting is "an unacceptable risk to national security."

Federal agencies are in limbo - nearly three weeks after Trump's Truth Social directive, some agencies ripped out Claude within hours, others are still reviewing, and several received no formal guidance beyond the post itself.

⚡ Quick Hits

  • NVIDIA's "Nemotron Alliance" - Jensen hosted 11 open-model leaders on stage, including Mistral, Perplexity, Cursor, Thinking Machines Lab, and LangChain. NVIDIA announced Nemotron 4, the next open foundation model.
  • Mistral Small 4 dropped - Combines the capabilities of Magistral, Pixtral, and Devstral into a unified multimodal model: reasoning, vision, and code in one package.
  • Anthropic published its largest user study ever - 81,000 Claude users surveyed. Largest multilingual qualitative study of its kind.
  • Morgan Stanley projects a 9-18 gigawatt net U.S. power shortfall through 2028 - Companies already converting Bitcoin mining operations into AI data centers.
  • CUDA turned 20 - The installed base of hundreds of millions of CUDA-enabled GPUs creates a self-reinforcing flywheel.

🧠 One Thing I'm Thinking About

Inference just became the main event. "It's way past training," Jensen said. "The inference inflection point has arrived."

The compute needed for inference has increased roughly 10,000x as AI moves from simple chatbot responses to multi-step reasoning, agentic workflows, and agent-to-agent communication. An AI agent talking to another AI agent needs to stream tokens at 1,500+ per second.

For builders: the thing you're optimizing for is about to change. Training cost per token will keep declining. Inference cost per token is where the next wave of innovation will be. Start thinking about your inference economics now.
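
If you want to start on that math, the core unit is cost per million output tokens, and power is the binding constraint Jensen keeps pointing at. A toy model (every constant here is an assumption for illustration - rate, rack draw, throughput, and utilization are not figures from the keynote):

```python
# Toy inference-economics model. All constants are assumptions --
# plug in your own measurements before drawing conclusions.
POWER_COST_PER_KWH = 0.08          # $/kWh, assumed industrial rate
RACK_POWER_KW = 120                # assumed rack draw
TOKENS_PER_SEC_PER_RACK = 250_000  # assumed aggregate decode throughput
UTILIZATION = 0.6                  # fraction of time serving real traffic

def power_cost_per_million_tokens() -> float:
    """Power cost (only) per 1M output tokens for one rack."""
    tokens_per_hour = TOKENS_PER_SEC_PER_RACK * UTILIZATION * 3600
    cost_per_hour = RACK_POWER_KW * POWER_COST_PER_KWH
    return cost_per_hour / (tokens_per_hour / 1_000_000)

print(f"${power_cost_per_million_tokens():.4f} per 1M output tokens (power only)")
```

Two things fall out of a model like this: utilization moves the number as much as hardware does, and a claimed 35x throughput-per-megawatt gain divides this cost by 35, which is why NVIDIA framed the Groq integration in per-megawatt terms.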

📅 All Things AI 2026 — 4 Days Out

March 23-24 in Durham, NC. 4,000+ attendees, 80+ speakers across 6 tracks, workshops on Day 1 (AI for DevOps, Business Pros, Agents). IBM Generative Computing Lounge. Speakers from Netflix, Red Hat, SAS, Fidelity, U.S. Bank. Parking $7 at Durham Centre Garage.