Does $NBIS now have the fastest inference in the world on $NVDA hardware?
Nebius acquired Eigen AI for $643M in cash and shares, bringing Eigen’s inference and post-training optimization directly into Nebius Token Factory
In NVIDIA’s GTC 2026 keynote, Eigen AI ranked #1 in output speed for Kimi K2.5 Reasoning, while Nebius Fast was almost tied with it
Nebius Fast also ranks first for inference speed on $NVDA hardware for OpenAI's open-weight model, gpt-oss-120B
Moreover, Eigen ranked as the #1 GPU-based provider across 25 open-source models on Artificial Analysis (excluding ASIC providers), with the default 10K-token input setting. It is also the fastest provider for Qwen3 Coder 480B at 255.8 t/s, ahead of Google Vertex at 169.2 t/s and Amazon at 121.3 t/s
That means Eigen is about 51% faster than Google Vertex and more than 2x faster than Amazon on that benchmark
━━━━━━━━━━━━━━━━━━━━
While the acquisition price looks high, if Eigen delivers even a modest improvement in $NBIS inference performance, the compounding long-term effect on earnings and competitive positioning will most likely more than pay for the deal
━━━━━━━━━━━━━━━━━━━━
Nebius owns the GPU cloud, while Eigen improves how efficiently those GPUs generate tokens. On the same NVIDIA hardware, performance is not only about capex. It is about GPU utilization, model optimization, batching, latency, memory management, and custom kernels
Eigen’s stack focuses on areas like quantization, KV-cache optimization, sparsity, speculative decoding, custom CUDA and Triton kernels, continuous batching, and runtime optimization
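To see why scheduling alone moves the needle, here is a toy simulation of continuous vs. static batching. Everything in it is illustrative: the request lengths, slot count, and one-step-per-tick model are made up, and real schedulers (including whatever Eigen actually runs) are far more sophisticated

```python
# Toy model: continuous batching vs. static batching on the same "GPU".
# All numbers are hypothetical; this is not Eigen's implementation.

requests = [12, 3, 7, 2, 9, 4, 15, 5]  # hypothetical decode lengths (steps)
slots = 4                               # hypothetical max concurrent requests

def static_batching_ticks(reqs, num_slots):
    # Fixed batches: each batch holds the GPU until its longest request finishes
    return sum(max(reqs[i:i + num_slots])
               for i in range(0, len(reqs), num_slots))

def continuous_batching_ticks(reqs, num_slots):
    # A finished request frees its slot immediately for the next waiting request
    queue, active, ticks = list(reqs), [], 0
    while queue or active:
        while queue and len(active) < num_slots:
            active.append(queue.pop(0))
        active = [r - 1 for r in active if r > 1]  # one decode step per tick
        ticks += 1
    return ticks

print("static ticks:    ", static_batching_ticks(requests, slots))      # 27
print("continuous ticks:", continuous_batching_ticks(requests, slots))  # 22
```

Same hardware, same requests, fewer GPU ticks: that gap is pure software, which is exactly the kind of gain the techniques above target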
If Nebius can generate more inference throughput from the same NVIDIA hardware, it improves revenue capacity, cost per token, and gross margin without requiring proportional capex increases
$NBIS is on its way to generating tens of billions of dollars in annual revenue, meaning even a few percentage points of inference improvement can translate into hundreds of millions in savings
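A quick back-of-envelope sketch of that claim, using entirely hypothetical fleet numbers (per-GPU throughput, GPU-hour cost, and annual token demand are placeholders, not $NBIS figures):

```python
# Back-of-envelope: savings from a small software throughput gain when
# serving a fixed token demand. All inputs are hypothetical placeholders.

tokens_per_gpu_hour = 2_000 * 3600   # assume 2,000 tokens/s per GPU
cost_per_gpu_hour = 2.00             # assumed all-in $/GPU-hour
annual_demand = 5e15                 # assumed tokens served per year

def annual_serving_cost(throughput_gain):
    """GPU cost to serve the fixed demand at a given software speedup."""
    effective_rate = tokens_per_gpu_hour * (1 + throughput_gain)
    gpu_hours_needed = annual_demand / effective_rate
    return gpu_hours_needed * cost_per_gpu_hour

baseline = annual_serving_cost(0.0)
optimized = annual_serving_cost(0.05)   # a 5% software-only gain

print(f"baseline cost: ${baseline:,.0f}")
print(f"with +5%:      ${optimized:,.0f}")
print(f"savings:       ${baseline - optimized:,.0f}")
```

Under these assumptions a 5% gain saves tens of millions per year on roughly $1.4B of serving cost, and the savings scale linearly with both fleet size and the size of the speedup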
━━━━━━━━━━━━━━━━━━━━
Open-source models are moving fast. Kimi, Qwen, DeepSeek, GLM, Llama, Nemotron, MiniMax, and other models require constant optimization to stay competitive
By integrating Eigen, Nebius can also release optimized versions faster and make Token Factory more attractive for developers and enterprise customers