With Astra Control and the NetApp DataOps Toolkit, development and data science teams can finally ditch their legacy software stacks and move to a cloud-native setup.
Kubernetes Container Storage Interface (CSI) drivers are great. They bring persistent storage into the cloud-native world by greatly simplifying provisioning, snapshots, and clones. In fact, we were a pioneer in this area with the NetApp® Astra™ Trident provisioner, the initial version of which was released way back in 2016. However, there are things that CSI drivers can’t do. For example, because of the architecture of Kubernetes itself, you can’t clone a volume to a different namespace by using a Kubernetes CSI driver. This is a major limitation when the volume that you want to clone contains something like a developer or data scientist workspace.
For developers and data scientists who work with large repositories, built artifacts, and datasets, the ability to clone workspace volumes almost instantaneously can greatly accelerate the development lifecycle. For many years, development teams, including NetApp’s own internal development team, have been using NetApp FlexClone® technology to reduce build-test processes from multiple days to just a few hours or even minutes.
Development teams and data science teams are increasingly interested in moving their build-test and model training environments to Kubernetes in order to drive efficiency and avoid vendor lock-in. When they make this migration, they need to be able to take their clone-based processes with them so that they don’t lose the lifecycle efficiencies that they have achieved over the years. In fact, the inability to clone a volume to a different namespace is often a showstopper. It is considered a best practice for different developers to work in different namespaces. This is done for security reasons, and there is typically no compromising when it comes to security. If clones are going to be included in the development lifecycle, the ability to clone workspaces to different namespaces is vital.
This is where NetApp Astra Control saves the day. Astra Control is an application-aware data management solution that manages, protects, and moves data-rich Kubernetes workloads both in public clouds and on premises. Astra Control enables data protection, disaster recovery, and migration for your Kubernetes workloads by leveraging NetApp’s industry-leading data management technology for snapshots, backups, replication, and cloning. Astra Control is available as a fully managed cloud service, called Astra Control Service, or as a self-managed on-premises software deployment, called Astra Control Center.
One key capability that Astra Control enables is the ability to clone an application and its associated volume or volumes to a brand-new namespace. This means that developer and data scientist workspaces can now be cloned to new, isolated namespaces. Developers and data scientists can work with these newly cloned workspaces within these new namespaces without being given access to the original namespace, thus preserving the security model of Kubernetes. The best part? Astra Control features a robust REST API that can be used to incorporate these clones into automated build-test and CI/CD processes. There is even an Astra Control Python SDK for Python-based workflows.
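To make the CI/CD idea a little more concrete, here is a minimal Python sketch of how a build job might plan a clone into a fresh, per-build namespace. It is not the Astra Control API or SDK: the `CloneRequest` class, the `plan_clone` and `to_payload` helpers, the namespace naming scheme, and the payload field names are all hypothetical and exist only for illustration. The one thing taken from Kubernetes itself is the rule that namespace names must be valid DNS labels (lowercase alphanumerics and hyphens, at most 63 characters).

```python
import re
from dataclasses import dataclass

# Kubernetes namespace names must be valid DNS-1123 labels:
# lowercase alphanumerics or '-', starting and ending with an alphanumeric.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
_MAX_LABEL_LENGTH = 63


@dataclass(frozen=True)
class CloneRequest:
    """Hypothetical description of one workspace clone (not an Astra type)."""
    source_app: str
    source_namespace: str
    target_namespace: str


def plan_clone(source_app: str, source_namespace: str, build_id: str) -> CloneRequest:
    """Derive an isolated target namespace for a build and validate it.

    The '<app>-build-<id>' naming scheme is an assumption made for this sketch.
    """
    target = f"{source_app}-build-{build_id}".lower()
    if len(target) > _MAX_LABEL_LENGTH or not _DNS_LABEL.fullmatch(target):
        raise ValueError(f"{target!r} is not a valid Kubernetes namespace name")
    if target == source_namespace:
        # Cloning into the source namespace would defeat the isolation goal.
        raise ValueError("target namespace must differ from the source namespace")
    return CloneRequest(source_app, source_namespace, target)


def to_payload(request: CloneRequest) -> dict:
    """Render the request as a JSON-ready dict; field names are hypothetical."""
    return {
        "sourceApp": request.source_app,
        "sourceNamespace": request.source_namespace,
        "targetNamespace": request.target_namespace,
    }
```

A pipeline step could call `plan_clone("jupyter-ws", "team-a", "1042")` and hand the resulting payload to whatever client it uses to talk to Astra Control. Validating the namespace name up front means a bad build ID fails fast in the pipeline rather than at clone time.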
The NetApp DataOps Toolkit now integrates certain features from Astra Control. Why does this matter? Well, the DataOps Toolkit includes some pretty cool JupyterLab workspace management capabilities, including the ability to almost instantaneously clone a JupyterLab workspace running on Kubernetes. These capabilities currently use the Astra Trident CSI driver under the hood, so they suffer from the same CSI driver limitation around cloning to a different namespace that we have already discussed.
(As an aside, if you aren’t familiar with JupyterLab workspaces, they are the preferred working environment of data scientists. To learn more, check out the Project Jupyter site.)
This is where the new Astra Control integration really shines. If you have Astra Control managing the applications in your Kubernetes cluster, you can use the DataOps Toolkit to rapidly clone a JupyterLab workspace to a brand-new namespace. It’s as easy as running one simple command from the terminal or, for Python-based workflows, making one simple function call.
Now, with Astra Control and the NetApp DataOps Toolkit, development and data science teams can finally ditch their legacy software stacks and move to a cloud-native setup. They can reduce costs by taking advantage of Kubernetes’ efficiencies and standardization while continuing to accelerate productivity through the use of FlexClone technology. So what are you waiting for? Check out Astra Control and the NetApp DataOps Toolkit today! If you’re a data scientist, also check out NetApp’s AI solutions. The DataOps Toolkit’s Astra Control integration is compatible with many of them.
Mike is a Technical Marketing Engineer at NetApp focused on MLOps and Data Pipeline solutions. He architects and validates full-stack AI/ML/DL data and experiment management solutions that span a hybrid cloud. Mike has a DevOps background and a strong knowledge of DevOps processes and tools. Prior to joining NetApp, Mike worked on a line of business application development team at a large global financial services company. Outside of work, Mike loves to travel. One of his passions is experiencing other places and cultures through their food.