NVIDIA BlueField-4 Powers New Class of AI-Native Storage Infrastructure for the Next Frontier of AI
NVIDIA has announced that the NVIDIA BlueField®-4 data processor, part of its full-stack NVIDIA BlueField platform, is powering the NVIDIA Inference Context Memory Storage Platform, a new category of AI-native storage infrastructure designed to support the next phase of AI development.
As AI models expand to trillions of parameters and increasingly rely on multi-step reasoning, they produce massive volumes of context data, stored in the form of key-value (KV) caches, which are essential for maintaining accuracy, continuity, and a high-quality user experience.
However, KV caches cannot be retained on GPUs for extended periods without creating performance bottlenecks for real-time inference, particularly in multi-agent environments. As a result, AI-native workloads require a scalable infrastructure purpose-built to store and share this data efficiently.
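The memory pressure described above is easy to see with back-of-envelope arithmetic. The sketch below estimates KV cache size for a hypothetical decoder-only transformer; the model dimensions (120 layers, 8 KV heads, head dimension 128, FP16) are illustrative assumptions, not the specs of any NVIDIA or partner system.

```python
# Back-of-envelope KV cache sizing for a decoder-only transformer.
# Per token, each layer stores one key and one value tensor of shape
# (num_kv_heads, head_dim); the model dimensions here are assumptions.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """KV cache size for one sequence: 2 tensors (K and V) per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical large model: 120 layers, 8 KV heads (grouped-query
# attention), head_dim 128, FP16 (2 bytes per element).
per_token = kv_cache_bytes(120, 8, 128, 1)
per_context = kv_cache_bytes(120, 8, 128, 128_000)

print(f"{per_token / 1024:.0f} KiB per token")          # 480 KiB per token
print(f"{per_context / 1024**3:.1f} GiB per context")   # 58.6 GiB per 128k-token context
```

At roughly half a megabyte per token under these assumptions, a single 128k-token conversation consumes tens of gigabytes, so even a handful of long-running agent sessions can exhaust a GPU's high-bandwidth memory, which is why the cache must spill to a shared external tier.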
The NVIDIA Inference Context Memory Storage Platform addresses this need by extending effective GPU memory capacity, enabling high-speed data sharing across nodes, increasing token throughput by up to 5x, and delivering up to 5x higher power efficiency compared with conventional storage solutions.
“AI is revolutionizing the entire computing stack — and now, storage,” said Jensen Huang, founder and CEO of NVIDIA. “AI is no longer about one-shot chatbots but intelligent collaborators that understand the physical world, reason over long horizons, stay grounded in facts, use tools to do real work, and retain both short- and long-term memory. With BlueField-4, NVIDIA and our software and hardware partners are reinventing the storage stack for the next frontier of AI.”
Key capabilities of the NVIDIA BlueField-4-powered platform include:
- NVIDIA Rubin cluster-level KV cache capacity, delivering the scale and efficiency required for long-context, multi-turn agentic inference.
- Up to 5x greater power efficiency than traditional storage.
- Smart, accelerated sharing of KV cache across AI nodes, enabled by the NVIDIA DOCA™ framework and tightly integrated with the NVIDIA NIXL library and NVIDIA Dynamo software to maximize tokens per second, reduce time to first token, and improve multi-turn responsiveness.
- Hardware-accelerated KV cache placement managed by NVIDIA BlueField-4, eliminating metadata overhead, reducing data movement, and ensuring secure, isolated access from the GPU nodes.
- Efficient data sharing and retrieval over NVIDIA Spectrum-X™ Ethernet, which serves as the high-performance network fabric for RDMA-based access to the AI-native KV cache.
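The tiering behavior these capabilities describe, keeping hot KV caches close to the GPUs and spilling colder ones to a shared storage tier, can be sketched in miniature. The sketch below is purely illustrative: the class, method names, and in-memory dictionaries are hypothetical stand-ins, whereas a real platform would use RDMA transfers and hardware-offloaded placement rather than Python data structures.

```python
# Illustrative-only sketch of two-tier KV cache management: when the GPU
# tier fills, the least-recently-used session's cache is evicted to an
# external storage tier and promoted back on reuse. All names here are
# hypothetical; this models the policy, not any real API.

from collections import OrderedDict

class TieredKVCache:
    def __init__(self, gpu_capacity_bytes):
        self.gpu_capacity = gpu_capacity_bytes
        self.gpu_used = 0
        self.gpu_tier = OrderedDict()   # session_id -> (size, payload), LRU order
        self.storage_tier = {}          # stand-in for the shared storage tier

    def put(self, session_id, payload):
        size = len(payload)
        while self.gpu_used + size > self.gpu_capacity and self.gpu_tier:
            # Evict the least-recently-used session to the storage tier.
            victim, (vsize, vpayload) = self.gpu_tier.popitem(last=False)
            self.storage_tier[victim] = vpayload
            self.gpu_used -= vsize
        self.gpu_tier[session_id] = (size, payload)
        self.gpu_used += size

    def get(self, session_id):
        if session_id in self.gpu_tier:
            self.gpu_tier.move_to_end(session_id)  # mark as recently used
            return self.gpu_tier[session_id][1]
        # Miss in GPU memory: promote the cache back from the storage tier.
        payload = self.storage_tier.pop(session_id)
        self.put(session_id, payload)
        return payload
```

For example, with a 100-byte GPU tier, storing two 60-byte session caches forces the first to spill to storage; reading it again promotes it back and spills the second. The point of the platform's hardware acceleration is to make exactly this promote/evict traffic fast and metadata-free at data-center scale.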
Storage innovators including AIC, Cloudian, DDN, Dell Technologies, HPE, Hitachi Vantara, IBM, Nutanix, Pure Storage, Supermicro, VAST Data and WEKA are among the first building next-generation AI storage platforms with BlueField-4, which will be available in the second half of 2026.