NVIDIA’s Play: Redefining the GPU as an AI Factory
At NVIDIA’s annual shareholder meeting on June 25, Jensen Huang did more than report record numbers: he reframed the entire AI infrastructure narrative. Revenue grew 65 percent to $216 billion, with data center revenue hitting $194 billion, up 68 percent. But the most important part of his message wasn’t a financial metric.
Huang described data centers as “factories that manufacture tokens,” where each token — the basic unit of AI output — is a unit of profit. His logic is straightforward: when AI can do useful work, tokens have value. When tokens generate profit, demand for compute accelerates.
He positioned this build-out not as a short-term cycle but as infrastructure on the scale of electricity grids or the internet: “measured in decades,” and possibly the largest infrastructure build in human history. The question NVIDIA is answering isn’t “will GPUs sell?” but “will customers make money off them?” Huang’s answer: yes, because tokens are now revenue.
NVIDIA’s advantage isn’t just the chip. It’s the CUDA ecosystem, the 7,000+ applications, and the installed base. But by redefining tokens as “units of intelligence” and data centers as “factories,” Huang is trying to turn AI infrastructure from a capital expense into a long-term economic asset. That’s not just a product pitch. It’s a valuation defense.

OpenAI’s Play: The Spicy Chip That Cuts the Bill
On the same day, OpenAI and Broadcom unveiled Jalapeño — a custom inference ASIC designed from scratch for large language model workloads. Named after a mild chili, it’s OpenAI’s first “Intelligence Processor,” and it’s not for sale.
The headline number is the timeline: nine months from design to tape-out, compared to the industry-standard 18-24 months. OpenAI used its own models to accelerate parts of the chip design and optimization process — effectively, AI helped design the hardware that will run AI.
Early testing shows Jalapeño delivers “substantially better” performance per watt than current state-of-the-art systems. Broadcom CEO Hock Tan said the chip is comparable to NVIDIA’s Blackwell and Google’s TPU. It’s designed to reduce data movement, balance compute and memory, and achieve utilization closer to theoretical peak performance. Engineering samples are already running production workloads like GPT-5.3-Codex-Spark in the lab at target frequency and power.
The strategic logic is clear. Every ChatGPT prompt and API call costs OpenAI money. Jalapeño is aimed at inference (the everyday cost of running models), not training. With inference costs estimated to be cut by about 50 percent, the chip is a direct play on OpenAI’s margin structure. And by building its own silicon, OpenAI joins Google, Amazon, Meta, and Microsoft in reducing dependency on NVIDIA: instead of buying hardware that carries NVIDIA’s 75-78 percent gross margin, it pays the 30-35 percent margin typical of a custom ASIC from a vendor like Broadcom.
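To see why the margin gap matters, here is a rough back-of-the-envelope sketch in Python. It is an illustration, not a cost model. It assumes, hypothetically, that a GPU and a custom ASIC cost the same to manufacture, and it normalizes that cost to 1.0. The only figures taken from the article are the 75-78 percent and 30-35 percent gross margins. The helper name `price_for_margin` is made up for this example.

```python
def price_for_margin(unit_cost: float, gross_margin: float) -> float:
    """Selling price at which the seller earns `gross_margin` on `unit_cost`.

    Gross margin = (price - cost) / price, so price = cost / (1 - margin).
    """
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)


# Hypothetical assumption: identical manufacturing cost, normalized to 1.0.
UNIT_COST = 1.0

# Margin ranges quoted in the article.
gpu_price_low = price_for_margin(UNIT_COST, 0.75)
gpu_price_high = price_for_margin(UNIT_COST, 0.78)
asic_price_low = price_for_margin(UNIT_COST, 0.30)
asic_price_high = price_for_margin(UNIT_COST, 0.35)

# Share of the GPU price a buyer would pay for the ASIC,
# from the most favorable pairing to the least favorable one.
best_case_ratio = asic_price_low / gpu_price_high
worst_case_ratio = asic_price_high / gpu_price_low

print(f"GPU price per unit of cost:  {gpu_price_low:.2f} to {gpu_price_high:.2f}")
print(f"ASIC price per unit of cost: {asic_price_low:.2f} to {asic_price_high:.2f}")
print(f"ASIC price as share of GPU price: {best_case_ratio:.0%} to {worst_case_ratio:.0%}")
```

Under that equal-cost assumption, the buyer pays roughly 4 to 4.5 times manufacturing cost for the GPU and roughly 1.4 to 1.5 times for the ASIC. That is about a third of the GPU price for the same cost of goods. Real chips differ in cost, performance, and software overhead, so treat this only as a sense of scale.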

Qualcomm’s Play: A Data Center Pivot
Also on June 25, Qualcomm outlined its data center strategy — a notable pivot from its mobile roots. The company announced its Dragonfly data center platform, with Meta and Microsoft as customers. Its Dragonfly CPU is planned to support Meta’s next-generation server clusters.
Qualcomm’s financial target is ambitious: over $15 billion in data center revenue by fiscal 2029, with non-handset business reaching $40 billion and mobile dropping to about one-third of QCT revenue. The company also acquired AI software firm Modular, signaling it understands the challenge isn’t just hardware — it’s the software ecosystem that NVIDIA’s CUDA has locked down for years.
The three strategies represent different positions on the same playing field. NVIDIA is defending its ecosystem. OpenAI is building its own. Qualcomm is trying to enter the game. But all three are responding to the same shift: AI inference is becoming the real market.

What the Three Strategies Reveal
| Dimension | NVIDIA | OpenAI | Qualcomm |
|---|---|---|---|
| Core Narrative | Token factories, AI as infrastructure | Full-stack AI, model-to-chip integration | Data center pivot, software-first |
| Key Customer | All AI builders | Itself (not for sale) | Meta, Microsoft |
| Ecosystem Moat | CUDA + 7,000+ applications | Model-hardware co-design | Modular acquisition |
| Competitive Threat | Commoditization of compute | Reducing reliance on NVIDIA’s margins | Entering a crowded field |
NVIDIA is trying to make its chips irreplaceable by redefining compute as “token production.” It’s not about the next GPU. It’s about the next data center being an AI factory.
OpenAI is trying to escape NVIDIA’s margin structure by making its own inference silicon. Jalapeño isn’t about competing with NVIDIA on performance — it’s about controlling the cost of serving its own models.
Qualcomm is trying to enter a market that’s already crowded, but it has a clear anchor customer in Meta and a software acquisition that could help address the ecosystem gap.
The real signal in all three announcements is that inference is becoming the primary economic battleground. Training still requires massive GPU clusters, but inference runs constantly, generating revenue and costs at scale. Whoever can do inference cheaper, faster, and with lower power consumption will define the next phase of AI infrastructure.
P.S. Jalapeño is a mild chili. OpenAI named its first chip after the entry-level pepper. The next one might be spicier. But the real heat isn’t in the name — it’s in the message. When a model company builds its own chip, it’s signaling that the AI stack is too important to leave to someone else. NVIDIA knows it. OpenAI proved it. Qualcomm is betting on it. The inference war just got its opening salvo.