Jalapeño Cuts Inference Costs by 50%
Jalapeño is a custom ASIC built specifically for LLM inference — the process of running trained models to generate responses. OpenAI designed the architecture from scratch, Broadcom handled silicon implementation and networking, and Celestica integrated the board and rack systems.
Engineering samples are already running production workloads in the lab, including GPT-5.3-Codex-Spark. Early testing shows performance per watt “substantially better” than current state-of-the-art systems. Broadcom CEO Hock Tan said Jalapeño‘s performance is comparable to NVIDIA Blackwell and Google TPU. The chip could also cut inference costs by about 50%, according to early tests.
Nine Months Is a Warning Shot to the Industry
From blank page to tape-out: nine months. The industry standard for a high-performance ASIC is typically 18 to 24 months. OpenAI and Broadcom only announced their partnership in October 2025. By June 2026, they had a working chip.
Key factors behind the speed: AI-assisted design, the hiring of Richard Ho (formerly a senior director at Google’s TPU team for nine years), and co-located teams working directly with Broadcom‘s silicon engineers through the entire process.
OpenAI Is Becoming an Infrastructure Company
Jalapeño is not just a chip. It’s OpenAI‘s first step toward controlling its own hardware stack. The chip is the first in a planned series, with initial deployment targeted by the end of 2026 and expansion planned for the following years. It’s optimized for inference — not training — which means OpenAI still depends on NVIDIA for training, but inference is where the day-to-day operating costs live.
Every ChatGPT response, every Codex query, every API call currently costs OpenAI money. If Jalapeño delivers the claimed 50% cost reduction, that‘s a direct path to margin improvement. While built for OpenAI’s workloads, the chip is designed to work with all LLMs — not just OpenAI‘s own models.
OpenAI’s move into custom silicon echoes Google‘s TPU strategy and Amazon’s Trainium push. The difference: OpenAI is doing it while still spending heavily on NVIDIA, AMD, AWS Trainium, and Cerebras. It‘s not replacing its supply chain. It’s diversifying it.
P.S. The name Jalapeño is a Mexican chili — one of the mildest varieties. OpenAI named its first chip after the entry-level pepper. The subtext: this is just the start. The next one might be spicier.