FlexPod® AI is a ready-to-use, AI-ready stack designed to meet the needs of enterprise generative AI (GenAI) applications. It combines Cisco compute, NetApp® storage, NVIDIA GPUs, and a range of AI software to create a validated infrastructure platform that enables businesses to develop and deploy custom chatbots, virtual assistants, and other applications built on retrieval-augmented generation (RAG).
FlexPod AI is tailored to support large language models (LLMs) from platforms like Hugging Face and NVIDIA AI Enterprise. We test these models in our reference architecture, demonstrating how businesses can confidently deploy AI solutions with integration across different LLMs. During the fine-tuning and inference stages, we run several benchmarks covering latency, throughput, token size, frame buffer usage, power consumption, and training and validation loss. FlexPod AI customers can use these benchmark results to choose the most efficient model for their GenAI use cases.
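To make the latency and throughput metrics concrete, the sketch below shows one simple way such numbers can be gathered. The `generate` callable is a hypothetical stand-in for a real LLM inference call (for example, a Hugging Face `model.generate` invocation); the harness is illustrative and is not part of the FlexPod AI benchmark suite.

```python
import time

def benchmark_inference(generate, prompts):
    """Measure per-request latency and aggregate token throughput.

    `generate` is any callable that takes a prompt string and returns
    a list of output tokens -- here a stand-in for a real LLM call.
    """
    latencies, total_tokens = [], 0
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        tokens = generate(prompt)
        latencies.append(time.perf_counter() - t0)
        total_tokens += len(tokens)
    elapsed = time.perf_counter() - start
    return {
        "avg_latency_s": sum(latencies) / len(latencies),
        "throughput_tok_per_s": total_tokens / elapsed,
        "total_tokens": total_tokens,
    }

# Dummy "model" that splits the prompt into tokens, for demonstration only.
stats = benchmark_inference(lambda p: p.split(),
                            ["hello world", "flexpod ai benchmark"])
print(stats)
```

In practice, the same loop would wrap calls to an actual model endpoint, and frame buffer and power figures would come from tools such as `nvidia-smi` rather than from Python timers.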
FlexPod AI's support for private LLM augmentation enables companies to enhance or customize LLMs for their own GenAI applications. By incorporating pretrained models from the NVIDIA AI Enterprise and Hugging Face repositories, organizations can build tailored AI solutions that meet their unique requirements, improving the effectiveness and efficiency of their GenAI applications. These capabilities are especially beneficial in the government, manufacturing, healthcare, and finance sectors. FlexPod AI also excels in fine-tuning and inference scenarios because it can leverage RAG to deliver accurate, contextually relevant results.
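The RAG pattern mentioned above can be illustrated with a minimal retrieval step: fetch the documents most relevant to a query, then prepend them to the prompt sent to the LLM. The word-overlap scoring below is a deliberately simple stand-in for the vector embeddings a production pipeline would use, and the document contents and prompt format are hypothetical.

```python
def retrieve(query, documents, k=2):
    """Rank documents by naive word overlap with the query and return the top k.

    A real RAG pipeline would use embedding vectors and a similarity index;
    plain word overlap keeps this sketch self-contained.
    """
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Augment the user query with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "FlexPod AI combines Cisco compute and NetApp storage.",
    "Quarterly finance results are filed in the records portal.",
    "NVIDIA GPUs accelerate fine-tuning and inference workloads.",
]
prompt = build_prompt("What storage does FlexPod AI use?", docs)
print(prompt)
```

Because the retrieved context comes from the organization's private documents, the LLM can answer questions it was never trained on, which is what makes RAG attractive for the regulated sectors listed above.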
FlexPod supports container platforms such as Red Hat OpenShift, SUSE Rancher, and various open-source Kubernetes solutions. This flexibility allows AI applications to be deployed across a range of environments, maximizing the use of existing infrastructure.
FlexPod integrates seamlessly with Red Hat OpenShift and the NVIDIA AI Enterprise suite.
FlexPod AI also supports container creation with NVIDIA drivers, the NVIDIA GPU Operator, and the CUDA Toolkit based on SUSE Linux Enterprise Base Container Images (SLE BCI). Additionally, the NetApp Astra™ Trident storage orchestrator can be deployed and configured for advanced storage needs.
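As a sketch of how Trident-backed storage would be consumed by a containerized AI workload, the manifest below requests a persistent volume through a Trident storage class. The storage class name `ontap-nas`, the claim name, and the capacity are illustrative assumptions; actual values depend on how Trident is configured in your cluster.

```yaml
# Hypothetical PVC bound to a Trident-managed NetApp backend.
# The storage class name "ontap-nas" is an assumption; use the class
# your Trident installation actually defines.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-cache
spec:
  accessModes:
    - ReadWriteMany          # shared across pods, e.g., for model weights
  resources:
    requests:
      storage: 500Gi         # illustrative size for LLM checkpoints
  storageClassName: ontap-nas
```

A pod spec for a fine-tuning or inference container would then mount this claim as a volume, letting multiple GPU workers share the same model artifacts.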
FlexPod AI is a revolutionary turnkey solution for enterprises looking to harness the power of GenAI. By integrating Cisco compute, NetApp storage, NVIDIA GPUs, and advanced AI software, FlexPod AI provides a comprehensive, validated infrastructure platform. For tasks such as summarizing text, answering questions, or generating images, FlexPod AI offers unparalleled support for a wide range of AI use cases. It’s the go-to solution for modern enterprises aiming to deploy sophisticated AI applications with ease and confidence.
Sriram Sagi is a principal product manager for FlexPod. He joined NetApp in 2022 with 15+ years of experience in enterprise products. Before NetApp, Sriram led product and technology teams and shipped multiple products. He has bachelor’s and master’s degrees in engineering and an MBA from Duke University.