My first experience with machine learning (ML) at NetApp was almost a decade ago. It started with a seemingly simple problem: How can we synthesize the massive amount of data that we had been collecting through NetApp® AutoSupport® to identify and test our customers' most common configurations? The limitations of taking the average of a hundred thousand systems quickly became obvious: There isn't a single “average config,” but rather 8 to 10 “sweet spots” that vary by workload and deployment. A virtual desktop environment looks very different from an electronic design automation (EDA) environment. Taking the average just blurs the relevant, interesting insights that come from differentiating those distinct workloads.
Our approach was k-means clustering, something I hadn't thought about since college. We would soon learn to dislike that algorithm.
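To make the approach concrete, here's a minimal sketch of this kind of config clustering using scikit-learn. The synthetic data, the feature names, and the silhouette-based search for the number of clusters are my own illustrative assumptions, not the actual AutoSupport pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Stand-in for AutoSupport-style config data: each row is a system,
# each column a numeric config/workload feature (hypothetical examples:
# volume count, average IO size in KB, read fraction of the workload).
rng = np.random.default_rng(42)
configs = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(500, 3))
    for center in ([2, 10, 0.7], [8, 4, 0.3], [5, 20, 0.9])
])

# k-means is sensitive to feature scale, so normalize first.
X = StandardScaler().fit_transform(configs)

# The perennial k-means headache: the number of clusters, k, must be
# chosen up front. Sweep a range and keep the k that scores best on
# the silhouette metric.
best_k, best_score = None, -1.0
for k in range(2, 12):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score

print(f"Best k = {best_k} (silhouette score = {best_score:.2f})")
```

Even in this toy form, the classic pain points surface: features have to be scaled, and k has to be guessed and then validated, which hints at why the algorithm wore out its welcome.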
Although it seemed like a straightforward engineering project, our problem quickly accumulated pragmatic challenges.
We got the job done, but if we hadn’t had a dedicated team of 5 or 6 engineers who had deep algorithmic knowledge, the project wouldn't have succeeded.
Looking back, it's funny to realize that if we were to tackle this problem again today, it would take days, not the months it took the first time. The combined power of the hybrid cloud and the data fabric has created an environment that dramatically shortcuts this type of project.
We recently tackled a similar problem around statistics data, and some of the common, free tools available through the popular public clouds did 95% of the heavy lifting. Instead of taking months to plan, prototype, replan, and execute, this new project was completed in a week. As engineers, we need to be constantly learning, anticipating new tech trends, and taking smart risks through innovation. Problems that used to take entire teams to solve can now be solved in a fraction of the time, as long as we embrace the future.
Machine learning is an incredibly powerful tool. The democratization of compute through the public clouds, combined with powerful, simplified algorithms offered as a service, gives everyone the opportunity to use these modern techniques. As developers, we are pushing for as many people as possible to be fluent in these tools. This is not a specialized role, and it's not a magical multiyear journey to proficiency. ML fluency means that every engineer understands how and where to apply these techniques to their job. Learning and enhancing ML tools is crucial for engineers who want to accelerate their productivity and stay relevant. At NetApp we're looking to apply these tools everywhere we can. Ultimately, adopting ML and AI helps NetApp develop faster, test better, and deliver higher-quality products.
Interested in AI and ML? Come join the team!
Matt Cornwell is a senior technical director in the Hybrid Cloud Engineering group at NetApp, where he leads strategy for the Release, Quality, and DevOps groups. Matt has played a central role in the more than tenfold improvement in ONTAP field quality over the last 7 years, and he has been one of the primary architects of the hybrid cloud testing strategy across ONTAP. In this role, he guides teams across the company on their quality strategy, infusing a customer-first mindset into the development lifecycle.