A new capability, part of BlueXP workload factory, makes it easy for customers to develop their GenAI applications with Amazon Bedrock and FSx for ONTAP
Today, NetApp is pleased to announce a new capability that enables customers to build a knowledge platform using generative AI with the foundation models delivered via Amazon Bedrock and enterprise data on NetApp® ONTAP®.
In the rapidly evolving landscape of artificial intelligence, generative AI (GenAI) has emerged as a groundbreaking technology that is reshaping how businesses improve employee productivity, deliver personalized experiences, enhance customer service, accelerate content creation, advance research, and much more. For most enterprises, the key to unlocking the value of GenAI is connecting foundation models to enterprise data in a simple, secure, and cost-effective manner.
NetApp BlueXP workload factory for AWS lets customers deploy and manage Knowledge Bases while extending the data management capabilities of the Amazon FSx for NetApp ONTAP (FSx for ONTAP) storage service to GenAI infrastructure. Knowledge Bases form the underlying infrastructure that securely connects ONTAP data sources to the embedding and language models available in Amazon Bedrock and that searches for and retrieves relevant information to augment model responses using the Retrieval-Augmented Generation (RAG) framework. Developers can use FSx for ONTAP Knowledge Bases to build applications such as virtual assistants for Q&A, contextual content creation, semantic search, and more. Such applications generate more relevant and accurate responses based on knowledge derived from a customer's proprietary datasets.
Organizations of all sizes are experimenting with or already using GenAI to increase efficiency, accelerate innovation, and create differentiated products and experiences. GenAI foundation models are trained on public or generic datasets and have little knowledge or context of an enterprise, its business, processes, and data. Unless the models are retrained with enterprise or private data, they can respond unpredictably with false or outdated information, or even hallucinate and produce severely inaccurate responses.
Training foundation models on custom data, however, is a complex, time-consuming, and cost-intensive process. The RAG framework offers a cost-effective way to extend the intelligence of these models by adding current and trustworthy enterprise data or knowledge sources. Developers can use RAG in their applications to connect to existing data sources and augment the responses of foundation models with knowledge derived from proprietary enterprise data.
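To make the retrieve-then-generate pattern concrete, here is a minimal RAG sketch that calls a text-generation model through the Amazon Bedrock runtime API with boto3. The model ID, the request payload shape, and the retrieve_relevant_chunks helper are illustrative assumptions for this sketch, not the workload factory implementation.

```python
import json
import boto3

# Bedrock runtime client; region and model ID below are illustrative assumptions.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def retrieve_relevant_chunks(question: str, k: int = 4) -> list[str]:
    """Hypothetical retrieval step: return the k most relevant text chunks
    from an enterprise knowledge source (for example, a vector index built
    from ONTAP-hosted documents). Replace with a real retriever."""
    return ["<chunk 1>", "<chunk 2>"]  # placeholder content

def answer_with_rag(question: str) -> str:
    # Augment the prompt with retrieved enterprise context before generation.
    context = "\n\n".join(retrieve_relevant_chunks(question))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # Payload follows the Anthropic Messages format; details vary by model.
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    body = json.loads(response["body"].read())
    return body["content"][0]["text"]
```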
Although RAG offers substantial benefits in customizing models, managing enterprise data for RAG frameworks can add cost and complexity. Privacy and security are top concerns: enterprise data must stay within enterprise boundaries, unauthorized access must be blocked, and sensitive information must not leak into the model's intelligence or applications. Customers want to avoid creating new data silos for GenAI applications, and they don't want to increase the cost and complexity of managing, securing, and protecting their data. They also want to use their existing on-premises data sources with the foundation models in Amazon Web Services, but they struggle to do so efficiently and cost-effectively.
Before today, customers who wanted to integrate their unstructured data on ONTAP with Amazon Bedrock had to copy the data to Amazon S3. By moving data to Amazon S3, they lost the ONTAP data security, management, and efficiency capabilities they had been using to simplify managing vast amounts of data and to lower total cost of ownership (TCO).
With just a few clicks, BlueXP workload factory now enables customers with GenAI projects to connect Amazon Bedrock to their ONTAP data sources and to create and manage FSx for ONTAP Knowledge Bases. This capability lets customers deploy infrastructure for RAG without modifying how they manage their unstructured data.
An FSx for ONTAP Knowledge Base consists of one or more ONTAP data sources available via the SMB or NFS protocols; configuration settings for the embedding and language models; and a collection of vector embeddings created and stored in a vector database (LanceDB) that uses FSx for ONTAP for persistent storage. Customers can ingest their documents for search and retrieval and use existing user access permissions (for SMB shares) to ensure that applications such as Q&A chatbots answer user queries based only on data that the user is authorized to access.
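As a rough illustration of how documents become vector embeddings, the sketch below embeds text with a Bedrock embedding model and persists the vectors in a LanceDB table on an FSx for ONTAP mount. The file paths, table name, sample documents, and model ID are assumptions for illustration and do not reflect the internal schema that workload factory uses.

```python
import json
import boto3
import lancedb

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> list[float]:
    # Titan text embedding model on Bedrock; model ID and payload are assumptions.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

# A LanceDB database directory on an FSx for ONTAP mount (illustrative path).
db = lancedb.connect("/mnt/fsx/knowledge-base/lancedb")

docs = [
    {"text": "Quarterly revenue grew 12% year over year.", "source": "finance/q2.pdf"},
    {"text": "The support runbook covers failover procedures.", "source": "ops/runbook.docx"},
]
records = [{"vector": embed(d["text"]), **d} for d in docs]
table = db.create_table("documents", data=records, mode="overwrite")

# Similarity search: embed the query and retrieve the closest chunks.
hits = table.search(embed("How did revenue change last quarter?")).limit(3).to_list()
for hit in hits:
    print(hit["source"], hit["text"])
```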
Enabled by an AI engine compute instance, Knowledge Bases are deployed alongside FSx for ONTAP within the customer's virtual private cloud (VPC), keeping enterprise data secure, encrypted, and entirely under customer control. Knowledge Bases are periodically synchronized with changes in the source data so that GenAI applications respond based on the latest data and access controls.
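One simple way to picture that synchronization is a periodic scan of the mounted ONTAP share that flags documents modified since the last pass so their embeddings and access metadata can be refreshed. The mount path, interval, and change-detection method below are assumptions for illustration only, not the mechanism workload factory actually uses.

```python
import os
import time

SOURCE_ROOT = "/mnt/fsx/shares/finance"   # ONTAP share mounted in the VPC (illustrative)
SYNC_INTERVAL_SECONDS = 3600

def files_changed_since(timestamp: float) -> list[str]:
    """Return files under the source share modified after the given time."""
    changed = []
    for dirpath, _, filenames in os.walk(SOURCE_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > timestamp:
                changed.append(path)
    return changed

last_sync = 0.0
while True:
    for path in files_changed_since(last_sync):
        # Re-embed the updated document and refresh its vectors and access
        # metadata in the knowledge base (see the ingestion sketch above).
        print("needs re-indexing:", path)
    last_sync = time.time()
    time.sleep(SYNC_INTERVAL_SECONDS)
```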
NetApp customers can use SnapMirror® or FlexCache® with FSx for ONTAP to efficiently bring on-premises data sources into AWS and use them with Amazon Bedrock models. Thousands of customers rely on FSx for ONTAP for high-performance, highly available storage that supports the familiar capabilities of ONTAP, such as multiprotocol access, point-in-time Snapshot™ copies, FlexClone® volumes for data cloning, and storage efficiency features such as deduplication, compression, compaction, and thin provisioning to optimize costs. By using FSx for ONTAP for both the source enterprise data and the Knowledge Bases, customers can now extend the familiar ONTAP benefits to their generative AI data infrastructure.
Customers can manage FSx for ONTAP Knowledge Bases through the workload factory user interface or APIs and integrate them with their applications using the workload factory APIs or popular frameworks such as LangChain.
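For application integration, a LangChain-based retrieval chain might look roughly like the sketch below, which points a LanceDB vector store at the same database directory used in the earlier ingestion sketch and pairs it with Bedrock models. The path, table name, and model IDs are placeholders, and constructor arguments can vary between LangChain versions; this is a sketch, not the documented workload factory integration.

```python
import lancedb
from langchain_aws import BedrockEmbeddings, ChatBedrock
from langchain_community.vectorstores import LanceDB
from langchain.chains import RetrievalQA

# Connect to the LanceDB tables persisted on FSx for ONTAP (illustrative path/name).
db = lancedb.connect("/mnt/fsx/knowledge-base/lancedb")

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")
vector_store = LanceDB(connection=db, table_name="documents", embedding=embeddings)

llm = ChatBedrock(model_id="anthropic.claude-3-sonnet-20240229-v1:0")

# Retrieval-augmented Q&A: fetch relevant chunks, then generate a grounded answer.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever(search_kwargs={"k": 4}),
)
answer = qa.invoke({"query": "What does the support runbook say about failover?"})
print(answer["result"])
```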
We’re excited to bring this capability to market, enabling our customers to use FSx for ONTAP not only as their data management platform, but now also as a knowledge platform by using the power of generative AI. This is just the beginning; we look forward to bringing many more updates to BlueXP workload factory throughout the year.
Learn more and get started with BlueXP workload factory for AWS.
This blog was co-written by Puneet Dhawan and Yuval Kalderon.
Puneet is a Senior Director of Product Management at NetApp, where he leads product management for the FSx for NetApp ONTAP service offering with AWS, with a specific focus on AI and generative AI solutions. Before joining NetApp, Puneet held multiple product leadership roles at Amazon Web Services (AWS) and Dell Technologies in hybrid cloud infrastructure, cloud storage, scale-out and distributed systems, high-performance computing, and enterprise solutions. In those roles he led product vision and strategy, roadmap planning and execution, partnerships, and go-to-market strategy.