NVIDIA, OpenAI, and Qualcomm Just Drew Three Very Different Battle Lines in AI Inference

NVIDIA reframes GPUs as token factories, while OpenAI and Broadcom unveil Jalapeño, a custom ASIC estimated to cut inference costs by about 50%. Meanwhile, Qualcomm stakes its own claim in AI inference.

By Jeff Editorial | 4 min read

NVIDIA’s Play: Redefining the GPU as an AI Factory

At NVIDIA’s annual shareholder meeting on June 25, Jensen Huang did something more significant than reporting record numbers — he reframed the entire AI infrastructure narrative. Revenue grew 65 percent to $216 billion, with data center revenue hitting $194 billion, up 68 percent. But the number that mattered most wasn’t a financial metric.

Huang described data centers as “factories that manufacture tokens,” where each token — the basic unit of AI output — is a unit of profit. His logic is straightforward: when AI can do useful work, tokens have value. When tokens generate profit, demand for compute accelerates.

He positioned this build-out not as a short-term cycle but as infrastructure on the scale of electricity grids or the internet: “measured in decades,” and possibly the largest infrastructure build in human history. The question NVIDIA is answering isn’t “will GPUs sell?” but “will customers make money off them?” Huang’s answer: yes, because tokens are now revenue.

NVIDIA’s advantage isn’t just the chip. It’s the CUDA ecosystem, the 7,000+ applications, and the installed base. But by redefining tokens as “units of intelligence” and data centers as “factories,” Huang is trying to turn AI infrastructure from a capital expense into a long-term economic asset. That’s not just a product pitch. It’s a valuation defense.

Jensen Huang

OpenAI’s Play: The Spicy Chip That Cuts the Bill

On the same day, OpenAI and Broadcom unveiled Jalapeño — a custom inference ASIC designed from scratch for large language model workloads. Named after a mild chili, it’s OpenAI’s first “Intelligence Processor,” and it’s not for sale.

The headline number is the timeline: nine months from design to tape-out, compared to the industry-standard 18-24 months. OpenAI used its own models to accelerate parts of the chip design and optimization process — effectively, AI helped design the hardware that will run AI.

Early testing shows Jalapeño delivers “substantially better” performance per watt than current state-of-the-art systems. Broadcom CEO Hock Tan said the chip is comparable to NVIDIA’s Blackwell and Google’s TPU. It’s designed to reduce data movement, balance compute and memory, and achieve utilization closer to theoretical peak performance. Engineering samples are already running production workloads like GPT-5.3-Codex-Spark in the lab at target frequency and power.

The strategic logic is clear. Every ChatGPT prompt and API call costs OpenAI money, and Jalapeño targets inference, the everyday cost of running models, rather than training. With inference costs estimated to fall by about 50 percent, the chip is a direct play on OpenAI’s margin structure. By building its own silicon, OpenAI also joins Google, Amazon, Meta, and Microsoft in reducing its exposure to NVIDIA’s 75-78 percent gross margin, paying instead the 30-35 percent margin of an ASIC vendor like Broadcom.
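To make the margin comparison concrete, here is a minimal Python sketch of how a vendor's gross margin translates into what a buyer pays per dollar of that vendor's own cost. The function name `price_multiple` is hypothetical, and the sketch assumes the two chips cost the same to produce, which the reporting does not claim. It also ignores design costs, volume, and performance differences, and it is not how the roughly 50 percent inference estimate was derived.

```python
def price_multiple(gross_margin: float) -> float:
    """Price paid per $1 of the vendor's cost of goods at a given gross margin.

    Gross margin = (price - cost) / price, so price = cost / (1 - margin).
    """
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross margin must be in [0, 1)")
    return 1.0 / (1.0 - gross_margin)


# Margin ranges quoted in the article; everything else is illustrative.
GPU_MARGINS = (0.75, 0.78)
ASIC_MARGINS = (0.30, 0.35)

for label, margins in (("GPU vendor", GPU_MARGINS), ("ASIC vendor", ASIC_MARGINS)):
    low, high = (price_multiple(m) for m in margins)
    print(f"{label}: buyer pays {low:.2f}x to {high:.2f}x the vendor's cost")
```

Under those assumptions, a 75 percent margin means paying four times the vendor's cost, while a 30 to 35 percent margin means paying roughly one and a half times.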

Jalapeño was delivered to OpenAI CEO Sam Altman and President Greg Brockman by Broadcom President and CEO Hock Tan and President Charlie Kawwas

Qualcomm’s Play: A Data Center Pivot

Also on June 25, Qualcomm outlined its data center strategy — a notable pivot from its mobile roots. The company announced its Dragonfly data center platform, with Meta and Microsoft as customers. Its Dragonfly CPU is planned to support Meta’s next-generation server clusters.

Qualcomm’s financial target is ambitious: over $15 billion in data center revenue by fiscal 2029, with non-handset business reaching $40 billion and mobile dropping to about one-third of QCT revenue. The company also acquired AI software firm Modular, signaling it understands the challenge isn’t just hardware — it’s the software ecosystem that NVIDIA’s CUDA has locked down for years.
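Those targets imply a rough shape for the business, sketched below in Python. This is back-of-the-envelope arithmetic under an assumption the source does not state: that the $40 billion non-handset figure and the one-third mobile share describe the same QCT revenue base, with handset and non-handset as its only two parts. The function name `implied_mix` is hypothetical.

```python
def implied_mix(non_handset_bn: float, handset_share: float) -> tuple[float, float]:
    """Return (total, handset) revenue in $bn implied by the two targets.

    Assumes total = handset + non_handset, which is an illustrative
    simplification and not a breakdown published in the article.
    """
    if not 0.0 <= handset_share < 1.0:
        raise ValueError("handset share must be in [0, 1)")
    total = non_handset_bn / (1.0 - handset_share)
    return total, total - non_handset_bn


# Inputs taken from the article: $40bn non-handset, mobile at about one-third.
total_bn, handset_bn = implied_mix(40.0, 1.0 / 3.0)

# $15bn is the stated floor for data center revenue, so this share is a minimum.
data_center_share = 15.0 / total_bn

print(f"Implied QCT revenue: about ${total_bn:.0f}bn")
print(f"Implied handset revenue: about ${handset_bn:.0f}bn")
print(f"Data center share of QCT: at least {data_center_share:.0%}")
```

On those assumptions, the targets would imply roughly $60 billion of QCT revenue, about $20 billion of it from handsets, with data center making up a quarter or more.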

The three strategies represent different positions on the same playing field. NVIDIA is defending its ecosystem. OpenAI is building its own. Qualcomm is trying to enter the game. But all three are responding to the same shift: AI inference is becoming the real market.

Qualcomm’s CEO

What the Three Strategies Reveal

| Dimension | NVIDIA | OpenAI | Qualcomm |
| --- | --- | --- | --- |
| Core Narrative | Token factories, AI as infrastructure | Full-stack AI, model-to-chip integration | Data center pivot, software-first |
| Key Customer | All AI builders | Itself (not for sale) | Meta, Microsoft |
| Ecosystem Moat | CUDA + 7,000 apps | Model-hardware co-design | Modular acquisition |
| Competitive Threat | Commoditization of compute | Reducing reliance on NVIDIA’s margins | Entering a crowded field |

NVIDIA is trying to make its chips irreplaceable by redefining compute as “token production.” It’s not about the next GPU. It’s about the next data center being an AI factory.

OpenAI is trying to escape NVIDIA’s margin structure by making its own inference silicon. Jalapeño isn’t about competing with NVIDIA on performance — it’s about controlling the cost of serving its own models.

Qualcomm is trying to enter a market that’s already crowded, but it has a clear anchor customer in Meta and a software acquisition that could help address the ecosystem gap.

The real signal in all three announcements is that inference is becoming the primary economic battleground. Training still requires massive GPU clusters, but inference runs constantly, generating revenue and costs at scale. Whoever can do inference cheaper, faster, and with lower power consumption will define the next phase of AI infrastructure.


P.S. Jalapeño is a mild chili. OpenAI named its first chip after the entry-level pepper. The next one might be spicier. But the real heat isn’t in the name — it’s in the message. When a model company builds its own chip, it’s signaling that the AI stack is too important to leave to someone else. NVIDIA knows it. OpenAI proved it. Qualcomm is betting on it. The inference war just got its opening salvo.
