OpenAI Unveils Jalapeño: Custom AI Chip for Faster, Cheaper Inference

In a significant move toward vertical integration, OpenAI and Broadcom have announced Jalapeño, OpenAI's first custom "Intelligence Processor," optimized specifically for large language model (LLM) inference.

The Challenge

As AI inference scales to millions of users, the cost of serving each request becomes the dominant expense. General-purpose GPU deployments are less energy-efficient for this workload, and Jalapeño targets that inefficiency directly.

Design

Built with Broadcom and manufactured by Celestica, the chip prioritizes:

Cost efficiency: Lower cost per inference than comparable Nvidia setups
Throughput: Batched inference tuned for LLM workloads
Memory bandwidth: Tight integration reduces data movement

Timeline

OpenAI has initial samples and is testing them internally. Deployment to API users is planned for late 2026.

Impact

Broadcom shares rose about 3% on the announcement
The chip signals OpenAI's commitment to controlling its own hardware
Jalapeño is part of a multi-generation platform strategy

If successful, Jalapeño could reshape the economics of AI services worldwide.

Source: Bloomberg | Broadcom