Imagine you're at a poker table where NVIDIA's been raking in the chips for years, dominating the AI hardware game with its GPUs. Suddenly, a bold new player slides into the seat with a massive Cerebras Wafer Scale Engine—a chip the size of an entire silicon wafer, packing 4 trillion transistors and promising to run inference workloads 10-20x faster than GPU clusters. That's the hand Cerebras is playing right now, and on May 11, 2026, they upped the ante by supersizing their IPO to raise up to $4.8 billion at $150-160 per share after seeing 20x oversubscription from hungry investors.[1][2]
This isn't just hype; it's a signal flare for the AI hardware boom. With OpenAI—yes, that OpenAI, signed to a multi-year $20+ billion deal for 750 megawatts of compute—as a marquee customer, and Sam Altman himself an early personal investor, Cerebras is positioning the Wafer Scale Engine (WSE-3) as the go-to NVIDIA alternative for the inference era.[3][2] Revenue exploded to $510 million in 2025 (up 76% from $290 million in 2024), with a $24.6 billion backlog lighting the path ahead.[4] If you're tracking the AI arms race, this is your wake-up call: the shift from training to real-time inference is here, and Cerebras is built for it.
In this deep dive, we'll unpack the IPO fireworks, the tech that's got Wall Street salivating, why inference chips are the next gold rush, and what it all means for NVIDIA's throne. Buckle up—WikiWayne's got the insider scoop.
The IPO Blockbuster: From $3.5B to $4.8B Overnight
Cerebras didn't just file for an IPO; they engineered a spectacle. After quietly withdrawing a 2025 attempt amid regulatory hiccups (think CFIUS review of its UAE ties), they refiled in April 2026, initially targeting 28 million shares at $115-125 for up to $3.5 billion at a $26.6 billion valuation.[5] But demand was insane—orders reportedly ran 20x the available shares—forcing a rapid upsizing, announced May 11: 30 million shares at $150-160, netting up to $4.8 billion (or $5.5B with the greenshoe) at a fully diluted $48.8 billion valuation.[1]
Key IPO Stats at a Glance:
| Metric | Details |
|---|---|
| Shares Offered | 30M Class A (plus 4.5M over-allotment)[2] |
| Price Range | $150-160 (midpoint $155)[1] |
| Gross Proceeds | Up to $4.8B[1] |
| Valuation (Fully Diluted) | ~$48.8B (2x Feb 2026's $23B private round)[1] |
| Ticker/Exchange | CBRS on Nasdaq (pricing ~May 13, debut May 14)[2] |
| Underwriters | Goldman Sachs, Allen & Co., and others (per filings) |
| Use of Proceeds | Capex, working capital, RSU taxes (~$330M), potential M&A[2] |
This makes it 2026's largest tech IPO so far, dwarfing others amid AI fever. But it's not without risks: customer concentration (G42 and MBZUAI drove 86% of 2025 revenue), a whipsaw bottom line (a $482M net loss in 2024 swung to a $238M profit in 2025), and TSMC dependency.[2] Still, with hyperscalers racing for an inference edge, investors are betting big. See our guide on AI IPOs in 2026 for more on the wave.
Cracking the Wafer Scale Engine: Cerebras' Secret Weapon
Forget dicing wafers into tiny GPUs—Cerebras etches the whole 300mm silicon wafer into one colossal chip. The WSE-3 (powering the CS-3 system) is the world's largest AI processor: 46,225 mm² (57x NVIDIA H100), 4 trillion transistors, 900,000 AI cores, 44GB on-chip SRAM, and 21 PB/s bandwidth—7,000x more than GPU packages.[6][7]
WSE-3 vs. NVIDIA B200 (Key Specs):
| Feature | WSE-3 | NVIDIA B200 | Edge |
|---|---|---|---|
| Size | 46,225 mm² | ~800 mm² | 58x larger[8] |
| Transistors | 4T | ~208B | 19x more |
| Cores | 900K AI | ~14K | 52x more[6] |
| On-Chip Memory | 44GB SRAM | ~192GB HBM (off-package) | On-die SRAM vs. off-package HBM[6] |
| Bandwidth | 21 PB/s | 8 TB/s | 2,625x[8] |
| Peak AI Compute | 125 PFLOPS | ~20 PFLOPS | 6x+ |
| Power per System | ~25kW (CS-3) | ~1kW+ per B200 GPU (excluding host/networking) | Lower TCO[9] |
Why does this matter? AI inference is memory-bound, not compute-bound. GPUs shuffle weights and activations across HBM and NVLink, burning power and adding latency. The WSE keeps everything on-silicon—no interconnect hell. Result: 10-70x faster inference at roughly a third the power and cost of a DGX B200.[10][11] Cerebras Cloud APIs serve Llama and Qwen models at 2,000+ tokens/sec (about 30x ChatGPT).[9]
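Why the bandwidth gap dominates is easy to see with a back-of-envelope roofline. In autoregressive decoding, every generated token streams the full weight set through the memory system, so single-stream speed is capped at bandwidth divided by model size in bytes. Here's a minimal sketch with illustrative numbers, not vendor benchmarks; it ignores batching and KV cache, and a 140GB FP16 model exceeds one wafer's 44GB SRAM, so real deployments shard across systems or quantize:

```python
# Roofline ceiling for memory-bound autoregressive decoding:
# each new token streams every weight once, so single-stream speed
# is bounded by (memory bandwidth) / (model size in bytes).

def decode_ceiling_tok_s(bandwidth_B_per_s: float,
                         n_params: float,
                         bytes_per_param: float = 2.0) -> float:
    """Upper bound on single-stream tokens/sec for a memory-bound model."""
    return bandwidth_B_per_s / (n_params * bytes_per_param)

LLAMA_70B = 70e9  # parameters; FP16 -> ~140 GB of weights

# HBM-class GPU: a few TB/s of off-chip bandwidth
print(f"GPU @ 3.35 TB/s HBM:  ~{decode_ceiling_tok_s(3.35e12, LLAMA_70B):,.0f} tok/s")

# Wafer-scale SRAM: 21 PB/s on-chip (assumes weights resident in SRAM)
print(f"WSE-3 @ 21 PB/s SRAM: ~{decode_ceiling_tok_s(21e15, LLAMA_70B):,.0f} tok/s")
```

The point isn't the absolute numbers (real systems land well below these ceilings) but that on-chip SRAM raises the ceiling by orders of magnitude.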
The CS-3 system? A one-rack beast (~$2-3M) that can train models up to 24T parameters solo or scale near-linearly into exaFLOPS clusters. Mayo Clinic, Argonne, and GSK rave about runs "hundreds of times faster" for drug discovery.[9]
OpenAI's Bet and the Customer Powerhouse
Sam Altman invested personally as early as 2016; then OpenAI locked in a $20B+ multi-year deal (Dec 2025/Jan 2026) for 750MW of Cerebras inference compute, including Codex-Spark serving at 1,000 tokens/sec (15x its prior speed, powering real-time code generation).[12] OpenAI also loaned Cerebras $1B in working capital, picking up warrants for ~10% equity if the deal scales.[13]
Top Customers (2025 Revenue Share):
- G42: 24% (down from 85% in 2024 post-CFIUS)[2]
- MBZUAI: 62%
- OpenAI/AWS: Massive backlog drivers ($24.6B total)[4]
Others: Meta (Scout at 2k tokens/sec), AlphaSense, Notion, LiveKit. AWS integrates the CS-3 for inference; partnerships span Microsoft and Mayo Clinic. Cerebras is diversifying beyond the UAE, but concentration risks still loom.[9]
Check our deep dive on OpenAI's hardware strategy for the full saga.
Inference Boom: Why Cerebras Eats NVIDIA's Lunch (For Now)
Training is GPU turf (NVIDIA holds 90%+ share), but inference—running models at scale—is exploding: a projected $672B market by 2029 (28% CAGR). Latency kills UX; users ditch slow bots. WSE-3's on-chip magic delivers sub-second reasoning and up to 20x GPU speed for agents and copilots.[14]
Benchmarks (want to sanity-check the throughput claims yourself? see the sketch after this list):
- GPT-OSS-120B: 3,000+ tokens/sec (5x Blackwell)[15]
- Llama 3.1 70B: Fits one CS-3 (vs. racks of H100s)
- Cancer models: hundreds of times faster[9]
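Cerebras exposes an OpenAI-compatible inference endpoint, so the standard openai Python client works with a swapped base URL. A minimal sketch for timing your own tokens/sec, assuming you have a CEREBRAS_API_KEY; the model name "llama3.1-70b" is illustrative and may differ from the current catalog:

```python
# Rough tokens/sec measurement against Cerebras' OpenAI-compatible endpoint.
# Assumes: `pip install openai`, a CEREBRAS_API_KEY env var, and that the
# model ID below is still offered (check the live model list).
import os
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",  # OpenAI-compatible endpoint
    api_key=os.environ["CEREBRAS_API_KEY"],
)

start = time.perf_counter()
resp = client.chat.completions.create(
    model="llama3.1-70b",  # illustrative model ID
    messages=[{"role": "user",
               "content": "Summarize wafer-scale integration in one paragraph."}],
    max_tokens=256,
)
elapsed = time.perf_counter() - start

tokens = resp.usage.completion_tokens
print(f"{tokens} tokens in {elapsed:.2f}s -> {tokens / elapsed:.0f} tok/s")
```

Note this measures end-to-end wall clock, including network latency and time-to-first-token, so it will read lower than the server-side decode speeds quoted above.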
NVIDIA's response? Blackwell clusters. But Cerebras scales without the sharding headaches, and a CS-3 sips power next to a GPU farm. If inference reaches 70% of AI spend (as some analysts bet), Cerebras wins niches like real-time AI.
Financial Fireworks and Road Ahead
2025 Financial Snapshot:
- Revenue: $510M (+76% YoY)
  - Hardware: $358M
  - Cloud/Services: $152M (rising fast)
- Gross Profit: $199M (39% margin)
- Net Income: $238M (vs. a $482M loss in 2024)
- Backlog: $24.6B
- Cash Burn: minimal (~$10M from operations)
Path forward: scale data centers, ship the CS-4 on 3nm, push for cloud dominance. Risks? TSMC yields, competition (Groq, AMD), OpenAI dependency. But a $4.8B war chest buys runway.
Projections: $950M+ in 2026 revenue. At ~50x sales, a $50B+ post-IPO pop? Plausible in AI mania (quick math below).
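The multiple math behind that headline is one line of arithmetic. A quick sketch, where both the revenue projection and the 50x multiple are assumptions for illustration, not company guidance:

```python
# Implied valuation from a price-to-sales multiple.
# Both inputs are assumptions for illustration, not guidance.
projected_2026_revenue = 950e6  # $950M+, per the projection above
ps_multiple = 50                # rich, but seen on AI-era hardware names

implied_market_cap = projected_2026_revenue * ps_multiple
print(f"Implied market cap: ${implied_market_cap / 1e9:.1f}B")  # ~$47.5B
```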
Risks: Not All Wafer-Thin Mints
- Concentration: Few customers = volatility[2]
- Profit durability: 2025's $238M net income rests on a few mega-deals; sustained GAAP profitability is unproven
- Execution: Wafer yields, cooling for CS-3 clusters
- Competition: NVIDIA's moat, Groq's LPUs
- Geopolitics: TSMC/China risks
Still, momentum's real.
FAQ
What is the Cerebras Wafer Scale Engine, and how does it differ from NVIDIA GPUs?
The WSE-3 is a full-wafer chip (46k mm², 4T transistors) with everything on-die for massive bandwidth. GPUs cluster small dies and shuttle data between them; the WSE eliminates those data-movement bottlenecks, excelling at inference (10-70x faster).[7]
Why did Cerebras upsize its IPO to $4.8B?
Demand reportedly ran 20x the shares on offer as AI investors craved inference plays. The range jumped to $150-160/share, the offering grew to 30M shares, and the valuation roughly doubled to $48.8B.[1]
Is OpenAI a major Cerebras customer, and what's Sam Altman's role?
Yes—a $20B+ deal for 750MW of inference compute, powering Codex-Spark. Altman was an early personal investor, and OpenAI holds warrants for up to ~10% equity.[3]
Will Cerebras challenge NVIDIA's dominance?
In inference? Absolutely—niche wins today, scale tomorrow. Training? Complementary. $24.6B backlog says market agrees.[4]
Ready to bet on wafer-scale AI? Will Cerebras' IPO soar past $200/share, or is it GPU overkill? Drop your take below!
