A validated reference architecture from NetApp, Lenovo, and NVIDIA OVX
In today’s business landscape, enterprises face the challenge of implementing scalable and secure AI infrastructure. The NVIDIA OVX validated NetApp® AIPod™ is designed to meet that challenge.
In partnership with Lenovo, NetApp AIPod is an NVIDIA OVX validated solution that simplifies the deployment of AI technologies for enterprises. The system provides the scalability and performance that enterprises need as they venture into AI applications such as generative AI. Equipped with NVIDIA OVX-certified Lenovo ThinkSystem servers with NVIDIA L40S GPUs and NetApp capacity flash storage, the NetApp AIPod is designed to grow with your AI requirements, offering fine-tuning and inference capabilities for retrieval-augmented generation (RAG) use cases.
The NetApp AIPod is truly versatile. Whether it’s enriching customer interactions, improving the efficiency of data analysis, or streamlining generative AI workflows, the platform helps democratize AI by making cutting-edge technology available to businesses of all sizes, thereby fostering innovation. Additionally, the AIPod's integration of RAG technology propels general AI use cases forward, enabling the more sophisticated, contextually aware responses and decision-making capabilities that are crucial for a broad spectrum of industries.
The comprehensive validation report outlines a prevalidated and preintegrated solution that’s designed for simplicity and scalability, tailored to meet the needs of generative AI workloads. This solution features a reference architecture that includes Lenovo’s NVIDIA OVX-certified ThinkSystem servers, NetApp AFF storage, NVIDIA Spectrum networking, and the NVIDIA AI Enterprise software platform, all optimized for peak performance. It provides a step-by-step guide on setting up the NetApp AIPod with Lenovo for the NVIDIA OVX validated system, which is capable of performing RAG inferencing built on a large language model (LLM) foundation. The report also offers setup instructions, configuration details, and best practices to assist customers and partners in selecting the appropriate NetApp storage solution for their needs.
As organizations grow, they often face the dual challenges of scaling workloads and ensuring consistent data availability. The solution detailed in the report demonstrates how to overcome these hurdles with an architecture that scales to meet both computational and data management demands. It enables customers to start with a simple RAG configuration at an attractive entry point while still achieving the required storage key performance indicators (KPIs).
A key component of this solution is the multi-turn RAG pipeline, which is essential for creating sophisticated AI conversational agents that are capable of engaging in extended, complex dialogues. This component is ideal for customer service bots, virtual assistants, or AI companions that cover a vast array of topics. In practice, the RAG pipeline query demonstrated impressive results, achieving 97% NVIDIA Tensor Core utilization and 5.24 gigabytes of memory usage with consistent submillisecond latency on NetApp storage, delivering a seamless and efficient AI conversational experience.
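To make the "multi-turn" idea concrete, the sketch below shows the shape of such a pipeline: each turn retrieves a relevant passage, generates a reply, and carries the conversation history forward into later turns. Everything here is an illustrative stand-in, not the validated solution itself — the `DOCS` store, keyword-overlap retrieval, and templated reply take the place of the vector database and GPU-served LLM a real deployment would use.

```python
import re

# Toy document store standing in for a vector database (illustrative only).
DOCS = {
    "returns": "Returns are accepted within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
    "warranty": "All products carry a one-year limited warranty.",
}

def retrieve(query: str) -> str:
    """Return the document with the most keyword overlap with the query.
    A real pipeline would do embedding similarity search instead."""
    words = re.findall(r"[a-z]+", query.lower())
    return max(DOCS.values(),
               key=lambda text: sum(w in text.lower() for w in words))

def answer(query: str, history: list[str]) -> str:
    """One turn of the pipeline: retrieve a passage, generate a reply,
    and append both sides of the exchange to the running history."""
    passage = retrieve(query)
    # A real multi-turn pipeline would feed `history` plus `passage`
    # to an LLM; here the reply is a simple template.
    reply = f"Based on our records: {passage}"
    history += [f"User: {query}", f"Assistant: {reply}"]
    return reply

history: list[str] = []
print(answer("How long does shipping take?", history))
print(answer("What is your returns policy?", history))
```

The history list is what distinguishes a multi-turn agent from single-shot question answering: later turns can be interpreted in the context of earlier ones before retrieval and generation run.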
Using AI technology can be expensive in terms of energy consumption. The NetApp AIPod tackles this issue by improving energy efficiency and lowering costs, making investment in AI more economically feasible for businesses.
The solution underwent rigorous validation, including GPU burn tests, to confirm that it runs the RAG-based system efficiently and with minimal energy consumption. The objective was to make the GPUs ready for optimal performance, focusing on temperature stability to prevent thermal throttling or shutdown risks due to excessive heat, and to avoid condensation and erratic performance when cold. The tests also aimed to identify silent failures—GPUs that fail without obvious warning signs, which can lead to incorrect results and silent data corruption. Achieving reliable and precise GPU computations was paramount. Impressively, the solution achieved 100% Tensor Core utilization across all four NVIDIA L40S GPUs, indicating a robust setup for peak performance.
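The silent-failure check described above boils down to a simple principle: a deterministic workload run repeatedly must produce bit-identical results, so any mismatch flags corruption that would otherwise go unnoticed. The sketch below illustrates that principle on the CPU with a small matrix multiply; an actual GPU burn test would run large CUDA matrix multiplies while also sampling temperature and utilization, and the function names and sizes here are illustrative assumptions, not the validated test suite.

```python
import hashlib
import random

def burn_iteration(seed: int, n: int = 64) -> str:
    """One pass of a deterministic workload: seeded random matrices
    are multiplied and the result is reduced to a digest."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    c = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    flat = ",".join(f"{x:.12f}" for row in c for x in row)
    return hashlib.sha256(flat.encode()).hexdigest()

def burn_test(iterations: int = 5, seed: int = 42) -> bool:
    """Run the same workload repeatedly; silent failures show up as
    digest mismatches between passes."""
    reference = burn_iteration(seed)
    return all(burn_iteration(seed) == reference
               for _ in range(iterations - 1))

print("PASS" if burn_test() else "FAIL: silent corruption detected")
```

Because the workload is fully deterministic, every healthy pass yields the same digest; a flipped bit anywhere in the computation changes the digest and is caught immediately, which is exactly the property a burn test relies on to surface GPUs that fail without warning.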
The NetApp AIPod, fortified by the advanced NetApp ONTAP® built-in security framework, delivers robust protection for your data and financial assets. This comprehensive security approach means that your information is safeguarded at every layer of the system, leveraging the industry-leading capabilities of ONTAP to provide an added level of defense against threats. Businesses can implement this solution with confidence, knowing that their AI endeavors are shielded by the expertise of NetApp, Lenovo, and NVIDIA, each a leader in its domain.
"Lenovo, in collaboration with NetApp and NVIDIA, is revolutionizing enterprise AI with comprehensive solutions designed to ignite and scale your AI initiatives. The AIPod offers everything needed to power your AI factory—from cutting-edge AI-ready compute and storage to advanced data management and software tools. This isn't just about robust technology; it's the gateway to unleashing AI's full potential for businesses across the globe." – Robert Daigle, Director, Global AI Business, Lenovo
“The combination of the high-performance, energy-efficient NVIDIA AI stack with advanced technologies from NetApp and Lenovo culminates in a powerful platform for deploying AI applications. The NetApp AIPod will help enterprises scale their data-centric AI operations with confidence and simplify the process of building the infrastructure required for powering enterprise AI.” – Satheesh Iyer, product manager at NVIDIA
In an evolving landscape of AI technologies, having infrastructure that is adaptable and upgradeable is crucial. The NetApp AIPod with Lenovo ThinkSystem for NVIDIA OVX is designed to adapt to changes, helping businesses to keep up with the advancements in AI and effectively address emerging business challenges.
To learn more, visit:
Sriram Sagi is a principal product manager for FlexPod. He joined NetApp in 2022 with 15+ years of experience in enterprise products. Before NetApp, Sriram led product and technology teams and shipped multiple products. He has bachelor’s and master’s degrees in engineering and an MBA from Duke University.