OpenAI’s Codex Spark Puts Cerebras on the Inference Map
OpenAI’s GPT-5.3-Codex-Spark, the first model running on Cerebras’ wafer-scale silicon, delivers 1,000+ tokens per second for real-time coding. The move signals a broader shift toward heterogeneous AI compute and the end of GPU monoculture in inference.
