Artificial intelligence (AI) has been slower to take off in pathology than in other areas of medicine. Outside of research settings, fully digitized clinical pathology departments that combine whole slide imaging (WSI), convolutional neural networks (CNNs), and cost-effective, high-performance computing and data storage remain a work in progress. The technology is still evolving, best practices surrounding AI vary, and data issues can create roadblocks.
However, progress is being made. The digitization trend in pathology is accelerating. Productivity gains, as well as new insights into the detection of cancer and other abnormalities, are helping to increase overall enthusiasm for digitization.
When an organization is ready to digitize its pathology slides and workflows, many solutions are available to improve outcomes. One is telepathology, which lets the organization distribute and centralize workloads dynamically as needed. Another is computational pathology, with many proven machine learning approaches that are improving the accuracy and automation of slide analysis. CNNs, in turn, are an advanced way to build decision-making workflows in digital pathology.
Although pathologists are the only ones who can make a cancer diagnosis, CNNs can help increase the accuracy and efficiency of that diagnosis. They also help doctors identify benign or normal tissue more quickly, which can reduce the need for human intervention. To support CNNs in practice, organizations need a variety of hardware, software, and infrastructure, and advances in all three have been rapid.
Advances in the computational power and memory bandwidth of GPUs are continually reducing the compute-related bottlenecks of computational pathology. However, if data access needs are not met, storage becomes the bottleneck instead: compute nodes starve for input data and cannot use their resources to their full potential.
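To make the starvation effect concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the NetApp demonstration; the function names and every number in it (GPU count, tiles per second, tile size, storage throughput) are hypothetical values chosen only to illustrate the arithmetic.

```python
def required_read_throughput(num_gpus: int,
                             tiles_per_sec_per_gpu: int,
                             bytes_per_tile: int) -> int:
    """Bytes per second the storage layer must deliver to keep every GPU fed."""
    return num_gpus * tiles_per_sec_per_gpu * bytes_per_tile


def gpu_utilization_ceiling(storage_bytes_per_sec: float,
                            required_bytes_per_sec: float) -> float:
    """Upper bound on GPU utilization when input data is the only limit."""
    if required_bytes_per_sec <= 0:
        raise ValueError("required_bytes_per_sec must be positive")
    return min(1.0, storage_bytes_per_sec / required_bytes_per_sec)


# Hypothetical workload: 8 GPUs, each consuming 1,000 uncompressed
# 256 x 256 RGB tiles per second (256 * 256 * 3 bytes per tile).
needed = required_read_throughput(8, 1000, 256 * 256 * 3)

# Hypothetical storage that delivers only half of the required throughput.
ceiling = gpu_utilization_ceiling(needed / 2, needed)

print(f"required: {needed / 1e9:.2f} GB/s, GPU utilization ceiling: {ceiling:.0%}")
```

Under these assumed numbers, storage that delivers half of the required read rate caps the GPUs at roughly half of their potential, no matter how fast the GPUs themselves are.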
To support such high-performance I/O requirements, organizations can use BeeGFS, a parallel HPC file system. NetApp® E-Series storage with BeeGFS gives you consistent, near-real-time access to your data. To prevent bottlenecks and to support continuous high-performance workloads like AI, BeeGFS transparently spreads data across multiple servers and their back-end storage. And in addition to being open source, BeeGFS comes with graphical administration and monitoring, unlike complex legacy open-source parallel file systems.
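The idea of spreading a file across several servers can be illustrated with a simplified round-robin striping model. The sketch below is only a conceptual illustration, not BeeGFS code or its API: the function names are hypothetical, and the chunk size and target count used in the example are assumptions rather than a statement of how any particular deployment is configured.

```python
def chunk_to_target(offset: int, chunk_size: int, num_targets: int) -> int:
    """Index of the storage target that holds the byte at `offset`,
    assuming fixed-size chunks assigned to targets in round-robin order."""
    if offset < 0 or chunk_size <= 0 or num_targets <= 0:
        raise ValueError("offset must be >= 0; chunk_size and num_targets must be > 0")
    return (offset // chunk_size) % num_targets


def stripe_layout(file_size: int, chunk_size: int, num_targets: int) -> list[int]:
    """Number of bytes of a file that land on each storage target."""
    per_target = [0] * num_targets
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        per_target[chunk_to_target(offset, chunk_size, num_targets)] += length
        offset += length
    return per_target


# Assumed example: a 2 GiB slide file, 512 KiB chunks, 4 storage targets.
layout = stripe_layout(2 * 1024**3, 512 * 1024, 4)
print([size // 1024**2 for size in layout])  # MiB per target: [512, 512, 512, 512]
```

Because consecutive chunks sit on different targets in this model, a large sequential read of one slide can be served by all targets in parallel instead of queuing behind a single server.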
To see how high-performance, low-latency NetApp E-Series storage systems facilitate WSI analysis with Apache Spark and BeeGFS, see the setup instructions and code for the demonstration on GitHub.
The generation of high-resolution digital images and the intricate patterns required for disease recognition provide important opportunities to apply AI in pathology for better patient outcomes. To learn about NetApp AI in healthcare, see Unlock the potential of AI in healthcare.
Mike McNamara is a senior product and solution marketing leader at NetApp with over 25 years of data management and cloud storage marketing experience. Before joining NetApp over ten years ago, Mike worked at Adaptec, Dell EMC, and HPE. Mike was a key team leader driving the launch of a first-party cloud storage offering and the industry’s first cloud-connected AI/ML solution (NetApp), unified scale-out and hybrid cloud storage system and software (NetApp), iSCSI and SAS storage system and software (Adaptec), and Fibre Channel storage system (EMC CLARiiON).
In addition to his past role as marketing chairperson for the Fibre Channel Industry Association, he is a member of the Ethernet Technology Summit Conference Advisory Board, a member of the Ethernet Alliance, a regular contributor to industry journals, and a frequent event speaker. Mike also published a book through FriesenPress titled "Scale-Out Storage - The Next Frontier in Enterprise Data Management" and was listed as a top 50 B2B product marketer to watch by Kapos.