Digital transformation is often called digital disruption, and for good reason: it's hard to point to a major technology shift that hasn't left some damage in its wake. But how can you evolve your tech stack when that potential damage could include mission-critical services?
This was the dilemma faced by BAERO, NetApp's internal DevOps/Test infrastructure team, which provides tools and services for developers to build and test their code before submitting it. BAERO’s role is to help developers move faster, catch quality issues as early as possible, and protect NetApp® products from regressions. Over the years, we've built a large amount of software to support this mission. But as NetApp adds new products and features, this software must evolve to support these new environments.
For example, NetApp ONTAP® is now part of Amazon FSx, but to deliver this feature, ONTAP developers must be able to run, test, and debug new features in AWS before they are released. To support these requirements, BAERO extended its services and infrastructure to work with Amazon FSx. In general, the faster BAERO can add support, the sooner NetApp developers can automatically catch regressions in their code.
BAERO is faced with a tricky balance: We have to move fast, but at the same time, we can't break the infrastructure that NetApp development teams are using 24/7. Today, BAERO infrastructure code is a mixture of relatively new Python and battle-hardened Perl code and libraries, written years ago but extended as needed. Unfortunately, the Perl code can make it harder to support NetApp's new features and products, because Perl's library ecosystem isn't as rich as Python's, and fewer developers are eager to work in Perl.
Python allows teams to move faster and to better support the next generation of NetApp products. But can we translate our Perl-based codebase into Python without breaking mission-critical services along the way?
The public release of OpenAI Codex in August 2021 introduced the possibility of using artificial intelligence and machine learning to help translate code between languages. For BAERO, it could help translate our Perl codebase over to Python. But not all products live up to expectations; we needed to test it before we could believe in it. Would Codex hold up in real-world situations? Could it translate our codebase faster and more easily than a group of developers? The only way to find out was through trial and (hopefully no) error.
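For readers who haven't tried it: Codex at the time was driven through OpenAI's completions API. The sketch below shows roughly what a Perl-to-Python translation request looks like with the openai Python library; the engine name, prompt format, and parameters are illustrative assumptions, not BAERO's actual tooling.

# Minimal sketch of a Codex-style translation request, using the
# 2021-era openai completions API. The engine name ("davinci-codex"),
# prompt format, and parameters are assumptions for illustration.
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: supplied via config/env

def translate_perl_to_python(perl_source: str) -> str:
    """Ask Codex to translate a Perl snippet into Python 3."""
    prompt = (
        "# Translate the following Perl code into idiomatic Python 3.\n"
        "# Perl:\n"
        f"{perl_source}\n"
        "# Python:\n"
    )
    response = openai.Completion.create(
        engine="davinci-codex",  # Codex engine name at the time (assumption)
        prompt=prompt,
        max_tokens=512,
        temperature=0,  # deterministic output suits translation
    )
    return response.choices[0].text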
For the test, we picked 'run_utest.pl', a utility that is responsible for executing unit tests and for interpreting and returning their results. It's also a script that has evolved well beyond its original design. The original code was extended as needed to add core-dump analysis, fuzzing support, code-coverage support, and on-the-spot diagnosis when particularly rare failures happened. As a result, it became a big, complicated Perl script with years of real-world runtime behind it, and it was therefore somewhat terrifying to experiment with. Any translation to a new language would need to be done in a way that didn't risk the correctness of the script, because the script enforces quality by running unit tests and is used internally by every developer, hundreds of times a day.
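To make the discussion concrete, here is a deliberately tiny sketch of the core job a run_utest-style wrapper performs: run a unit-test binary, capture its output, and turn the exit status into a structured result. The names and shape are invented for illustration; the real script does far more.

# Hypothetical, greatly simplified sketch of a run_utest-style wrapper.
import subprocess

def run_unit_test(test_binary: str, timeout_s: int = 600) -> dict:
    """Run one unit-test binary and interpret its result."""
    proc = subprocess.run(
        [test_binary],
        capture_output=True,  # keep stdout/stderr for later diagnosis
        text=True,
        timeout=timeout_s,    # raises TimeoutExpired on a hung test
    )
    return {
        "test": test_binary,
        "passed": proc.returncode == 0,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }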
Our first use of Codex for translation made it clear that Codex has its pros and cons: It was great at some translations but very bad at others. Using it required having a developer in the middle, verifying the correct translations and fixing any missteps. That said, it's easier to validate a new Python script than to write one from scratch, which made for a promising outcome.
In essence, Codex is a very fast but imperfect translator. We needed to figure out how to use it safely to speed up our translation project. When we were finished, the translated code needed to work perfectly (or close to it) from the start. Because 'run_utest.pl' is used so heavily in a normal build, it was straightforward to validate the “sunny-day” situations. However, the many corner cases and error paths that the Perl version handles (and that were validated only when they were first written) are much more difficult to exercise. Those must work in the Python translation when we deploy. There's no room for error.
We focused on reducing or eliminating the risk in the new Python translation. Ideally, we would have had a suite of tests that exercised both the error paths and the sunny-day situations. Unfortunately, there were no backing tests on the original code; stability was enforced by hand-testing new code and then watching for issues after new versions were deployed.
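So the first step was to build that suite ourselves. As a sketch of the kind of error-path test we mean (pytest-style, reusing the hypothetical run_unit_test wrapper from the earlier sketch), consider checking that a failing test's exit code actually survives the wrapper:

# Hypothetical pytest error-path test: a deliberately failing "test
# binary" must be reported as failed, with its exit code intact.
from run_utest import run_unit_test  # assumed module name

def test_failing_binary_is_reported(tmp_path):
    script = tmp_path / "fake_test.sh"
    script.write_text("#!/bin/sh\nexit 3\n")
    script.chmod(0o755)

    result = run_unit_test(str(script))

    assert result["passed"] is False
    assert result["exit_code"] == 3  # must not be masked to success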
We attacked the translation risk in the following ways: we built a unit-test suite for the new Python code, validated its behavior side by side against the Perl original, and rolled the new version out incrementally.
In the end, the new version has much better testing than the original, and the act of exercising all of the code uncovered many error-path translation problems that never showed up in the sunny-day situations.
As of today, the port is feature complete and can run the entire unit-test workflow. Work continues on driving up unit-test coverage (>80%, versus 0% before) before it is fully deployed. One surprise was a bug in the original Perl implementation: the Perl code was swallowing specific types of exit codes. The problem is hiding in the current unit-test infrastructure; it's fairly rare, but it's real. It originally looked like a translation problem, but upon investigation, the Python version turned out to be stricter than the Perl version. Finding this problem alone is probably justification enough for the translation project.
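We won't reproduce the exact Perl bug here, but this class of bug is common: in Perl, system() returns a raw wait status that must be decoded with $? >> 8, which is a classic place for exit codes to get lost. Python's subprocess module makes it easy to be strict, which is roughly why the translated version caught what the original missed. A hedged sketch of that strictness:

# Sketch: with check=True, subprocess.run raises on ANY nonzero exit
# code, so a failure can't be silently treated as success.
import subprocess

def run_strict(cmd: list) -> None:
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as err:
        # Surface the real exit code instead of swallowing it.
        raise RuntimeError(
            f"{cmd[0]} failed with exit code {err.returncode}"
        ) from err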
With high unit-test coverage, thorough integration testing (validated side by side with the Perl output), and many iterations of execution, we've deployed the Python version of run_utest for ~5% of the unit-test targets.
We'll slowly grow this number and move to 100% after fixing the unit tests whose errors were hidden by the original Perl version of run_utest.
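The side-by-side validation mentioned above is conceptually simple; a hypothetical sketch (script names and comparison rules invented for illustration) looks like this:

# Hypothetical side-by-side check: run the Perl original and the Python
# port on the same unit-test target and compare their results.
import subprocess

def compare_implementations(target: str) -> bool:
    old = subprocess.run(["perl", "run_utest.pl", target],
                         capture_output=True, text=True)
    new = subprocess.run(["python3", "run_utest.py", target],
                         capture_output=True, text=True)
    same = (old.returncode == new.returncode
            and old.stdout == new.stdout)
    if not same:
        print(f"MISMATCH on {target}: "
              f"perl={old.returncode} python={new.returncode}")
    return same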
After a very positive experience with OpenAI Codex, we're ready to go all in. OpenAI Codex has changed what we believe the BAERO development team can do. In the past, we would have written new projects in Python while maintaining the Perl infrastructure until a particular piece became untenable...and then rewritten that piece in Python. With OpenAI Codex in our development toolbox, we now have a long list of infrastructure software that we're going to translate proactively, along with a blueprint for making those projects successful.
In the end:
OpenAI Codex will help BAERO move faster.
OpenAI Codex will help NetApp's developers test new features sooner.
OpenAI Codex will help NetApp ship higher-quality software to our customers at a faster pace.
Perl-to-Python translation is just the start. OpenAI Codex has the potential to unlock and accelerate new features, scalability, and performance in the NetApp products themselves.
While OpenAI Codex is just getting started, you should start experimenting with it soon. By managing risk with the right tests and processes, OpenAI Codex can already accelerate the translation of legacy code into new languages.
Phil Ezolt is a Senior Software Engineer/Architect for NetApp in Pittsburgh. After 16 years in various NetApp roles, today he and his team passionately apply AIOps/DevOps/quality tools to NetApp software development. He has a BS in Electrical and Computer Engineering from Carnegie Mellon and an MS in Computer Science from Harvard, and he wrote a book about Linux performance tools ('Optimizing Linux Performance'). He lives in Pittsburgh with his wife, Sarah, 3 kids, and 2 dogs. In his spare time, he plays board games, 3D prints, and (with Sarah) teaches STEAM classes and coaches Odyssey of the Mind for her non-profit, SteamStudio.
https://www.thesteamstudio.com/