
Adversarial attacks: Using AI to increase security for the public sector


Dejan Kocic

The amount of data available to intelligence and security agencies today is astounding. As data continues to grow, so do the challenges of protecting and using it efficiently so that it can be turned into actionable insights that strengthen national security. The possibilities of using AI to improve public sector agencies are endless—if you have the right infrastructure to help organize unstructured data from multiple sources, quickly process it, and protect it against adversarial attacks, no matter where it lives.

Building your AI framework for security

Artificial intelligence and machine learning frameworks rely on a mind-boggling amount of data. Public sector data is usually distributed, comes from multiple sources, is constantly changing, and comes in a variety of formats: audio, video, images, logs, and more.

Effective AI models for the public sector need to be trained on large amounts of data. That data needs to be gathered, organized, labeled, structured, and prepared so that AI models can use it. The accuracy and performance of AI and ML models depend on the quantity and quality of training data; it is the most essential element in the entire AI workflow. In national security terms, an ID verification model using computer vision and trained on 100 passports would not perform as well or be as accurate as a model trained on 10,000 or 100,000 passports.
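
To illustrate that relationship, here is a minimal sketch that trains the same classifier on progressively larger slices of a dataset and compares test accuracy. The synthetic dataset and scikit-learn logistic regression are stand-ins, not the passport system described above:

```python
# Minimal sketch: how training-set size affects model accuracy.
# A synthetic dataset stands in for real passport imagery.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic "documents": 100,000 samples, 40 features each.
X, y = make_classification(n_samples=100_000, n_features=40,
                           n_informative=20, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

for n in (100, 10_000, 80_000):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_pool[:n], y_pool[:n])          # train on n samples
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"trained on {n:>6} samples -> test accuracy {acc:.3f}")
```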

Preventing adversarial attacks

The National Security Commission on Artificial Intelligence reports that “a very small percentage of current AI research goes toward defending AI systems against adversarial efforts.” Adversarial ML and AI attacks are very real, and they happen on a large scale. The problem is so big that Microsoft teamed up with the MITRE Corporation to create MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems). ATLAS is “a knowledge base of adversary tactics, techniques, and case studies for machine learning (ML) systems based on real-world observations.”

Adversarial AI and ML attacks often target data or applications used for data preparation and training. As mentioned earlier, this data is one of the most essential elements in the AI workflow because it directly affects the outcome of the AI model. If that data is altered or manipulated in any way by an attack, the adverse results may not be detected before inference and production, when it’s usually too late to remedy the problem.
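
One simple way to see why poisoned training data is dangerous is a label-flipping attack: silently flipping even a modest fraction of training labels degrades the resulting model while the dataset still looks plausible at a glance. A minimal sketch, assuming the attacker can write to the label store:

```python
# Minimal sketch of a label-flipping poisoning attack, assuming the
# attacker can alter a small fraction of the training labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

def train_and_score(labels):
    model = LogisticRegression(max_iter=1000).fit(X_tr, labels)
    return model.score(X_te, y_te)

rng = np.random.default_rng(1)
poisoned = y_tr.copy()
idx = rng.choice(len(poisoned), size=int(0.15 * len(poisoned)), replace=False)
poisoned[idx] = 1 - poisoned[idx]              # flip 15% of the labels

print(f"clean training data:    {train_and_score(y_tr):.3f}")
print(f"poisoned training data: {train_and_score(poisoned):.3f}")
```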

A multipronged strategy is required to protect against these attacks. Training data goes through at least three stages—data gathering, labeling, and training—before inference and production. To protect datasets against data poisoning, insider and outsider threats, model evasion, and model stealing attacks, organizations need to establish data integrity and traceability throughout the process (data at rest, in motion, and in processing). This protection can be achieved through Zero Trust environments by encrypting data in all stages, creating tamper-proof datasets, and implementing multifactor authentication combined with insider threat detection tools.
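
As one concrete building block for integrity and traceability, a hash manifest recorded at ingest time makes later tampering detectable before a training run. A minimal sketch using SHA-256 digests; the paths and file layout are hypothetical:

```python
# Minimal sketch of tamper-evidence for a training dataset: record a
# SHA-256 digest per file at data-gathering time, then verify the
# manifest before every training run. Paths are hypothetical.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(data_dir: str, manifest: str = "manifest.json") -> None:
    digests = {str(p): sha256_of(p)
               for p in sorted(Path(data_dir).rglob("*")) if p.is_file()}
    Path(manifest).write_text(json.dumps(digests, indent=2))

def verify_manifest(manifest: str = "manifest.json") -> list[str]:
    recorded = json.loads(Path(manifest).read_text())
    return [path for path, digest in recorded.items()
            if not Path(path).exists() or sha256_of(Path(path)) != digest]

# build_manifest("training_data/")     # at ingest time
# tampered = verify_manifest()         # before each training run
# if tampered: raise RuntimeError(f"integrity failure: {tampered}")
```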

The National Institute of Biomedical Imaging and Bioengineering of the NIH has conducted large studies to improve the interpretation of tumors by using AI and deep learning. The National Cancer Institute of the NIH also conducts research into AI-aided imaging for cancer prevention, diagnosis, and monitoring.

The following adversarial example shows how an image that was classified by an AI system as a benign tumor can be manipulated so that the same AI system classifies it as malignant, although the changes are not perceptible to the human eye.

[Figure: Misclassification of tumors using adversarial training data attacks]
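
The fast gradient sign method (FGSM) is one common way such imperceptible perturbations are generated: each pixel is nudged by a tiny amount in the direction that most increases the model's loss. A minimal PyTorch sketch; the classifier and scan batch are placeholders, not the NIH systems described above:

```python
# Minimal sketch of the fast gradient sign method (FGSM), one common
# technique for crafting an imperceptible adversarial perturbation.
import torch
import torch.nn.functional as F

def fgsm(model, image, true_label, epsilon=0.01):
    """Return a copy of `image` nudged to increase the model's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step each pixel by +/- epsilon in the direction that raises the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Usage (hypothetical classifier over normalized scans):
# adv = fgsm(tumor_classifier, scan_batch, labels, epsilon=0.005)
# tumor_classifier(adv).argmax(1)   # may now predict the wrong class
```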

Considering the prevalence of these attacks in public sector IT systems, Microsoft and MITRE have classified them into seven categories in the Adversarial ML Threat Matrix.

[Figure: Adversarial ML Threat Matrix chart]

Here is a summary of the AI and ML attacks described in the matrix.

  1. Reconnaissance. By examining publicly available information about a company, such as blogs, patent filings, tweets, and research papers, an attacker can get a good idea of what the AI model is and how it is likely to work. This information can be used to craft techniques for model evasion and model stealing attacks.

  2. Initial access. The attacker gains a first foothold in the system that hosts the ML pipeline, for example through compromised accounts or the ML supply chain, putting the training data and the model under design within reach.

  3. Execution. User input is fed to the model in real time during execution, so an attacker who controls that input can adversely manipulate the model's behavior.

  4. Persistence. Poisoning and backdoor attacks require sustained access; once inside, the attacker maintains a foothold by manipulating the training data and/or the ML model under design.

  5. Model evasion. The attacker carefully crafts perturbed inputs, so-called adversarial examples, to mislead the targeted ML model into outputting an incorrect prediction.

  6. Exfiltration. The attacker replaces training data or a model file with a dataset that also contains malicious files. This attack can turn a harmless-looking training dataset into an AI inferencing model with disastrous results (an autonomous vehicle collision, malignant tumors classified as benign, and so on).

  7. Impact and model stealing attacks. These attacks are conducted in the operational phase. By querying the targeted model, the attacker can generate an approximation of the original model (see the sketch after this list), or the attacker may be able to obtain model parameters by exploiting system vulnerabilities. Both approaches allow the attacker to conduct strong evasion attacks on the targeted model. Model stealing attacks also raise concerns about intellectual property theft.
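
As a concrete illustration of the query-based approach to model stealing, an attacker can label probe inputs with the victim model and fit a surrogate on the answers. A minimal sketch; `victim_predict` is a hypothetical black-box query interface:

```python
# Minimal sketch of a query-based model stealing attack: label attacker-
# chosen inputs with the victim model, then fit a surrogate on those
# labels. `victim_predict` is a hypothetical black-box query interface.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def steal_model(victim_predict, n_queries=10_000, n_features=20):
    # 1. Generate probe inputs (random here; real attacks choose better).
    queries = np.random.default_rng(0).normal(size=(n_queries, n_features))
    # 2. Use the victim's answers as free training labels.
    stolen_labels = victim_predict(queries)
    # 3. Train a surrogate that approximates the victim's behavior.
    return DecisionTreeClassifier().fit(queries, stolen_labels)

# The surrogate can then be probed offline to craft evasion inputs
# that frequently transfer back to the original model.
```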

Protecting your data across the hybrid cloud

With NetApp® AI solutions for the public sector, you get built-in data protection, compliance, and secure access for your distributed, diverse, and dynamic data on premises and across clouds. NetApp enables you to integrate, organize, protect, and secure your data pipeline from edge to core to cloud. With solutions like NetApp Cloud Insights, NetApp DataOps Toolkit, and NetApp SnapLock® compliance software, we help public sector agencies manage, organize, and use their data while defending against adversarial AI and ML attacks. With these threats averted, governments can use the power of AI to make quick and confident decisions that strengthen national security.

To learn more, visit our AI for public sector webpage.

Dejan Kocic

Dejan is a visionary and a leader whose innovative, out-of-the-box thinking has earned him a reputation as a creative solutions wizard. Dejan serves on several SNIA (Storage Networking Industry Association) committees, and he is regularly invited to chair conferences and to present on the latest technologies.

Currently, Dejan is a Senior Product Manager at NetApp, leading initiatives related to ONTAP and hybrid cloud adoption, among other things. Dejan has over 25 years of industry experience in storage, cloud, HPC, and AI technologies, and he holds an MBA and a master's degree in information technology.

