Data Science Community Knowledge Base

What is model poisoning?

A poisoning attack happens when an adversary is able to inject bad data into your model's training pool and thereby get it to learn something it shouldn't. The most common result of a poisoning attack is that the model's decision boundary shifts in some way, producing undesirable variability in the model's outputs.
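
To make this concrete, below is a minimal sketch of a label-flipping poisoning attack, assuming scikit-learn is available. The synthetic dataset, the logistic regression model, and the 10% flip rate are illustrative choices rather than any particular published attack; the point is simply that corrupted labels in the training pool move the learned boundary and dent test accuracy.

```python
# A minimal label-flipping poisoning sketch (illustrative assumptions only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# A clean synthetic training pool with binary labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def poison_labels(y, fraction, rng):
    """Flip the labels of a random fraction of the training pool."""
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(fraction * len(y)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # binary labels: 0 <-> 1
    return y_poisoned

clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
poisoned_model = LogisticRegression(max_iter=1000).fit(
    X_train, poison_labels(y_train, fraction=0.10, rng=rng)
)

print("clean test accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned test accuracy:", poisoned_model.score(X_test, y_test))
# The learned coefficients (i.e. the decision boundary) shift as well.
print("boundary shift (L2 norm of coefficient difference):",
      np.linalg.norm(clean_model.coef_ - poisoned_model.coef_))
```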

The first examples of poisoning attacks date as far back as 2004 and 2005, when they were used to evade spam classifiers.


Poisoning attacks come in two types: those targeting your model's availability, and those targeting its integrity (also known as "backdoor" attacks).

The first attacks were of the availability type. Such attacks aim to inject so much bad data into your system that whatever boundary the model learns becomes essentially useless. Previous work has targeted Bayesian networks, SVMs, and, more recently, neural networks. For example, Steinhardt reported that, even under strong defenses, poisoning just 3% of the training data leads to an 11% drop in accuracy. Others have proposed back-gradient approaches for generating poisons and have even used an autoencoder as an attack generator.
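
As a rough illustration of the availability flavor, the sketch below injects deliberately mislabeled points into a training pool and retrains an SVM at several poison fractions. The two-moons data, the RBF kernel, and the naive poison-generation rule are assumptions made for brevity; this is not the back-gradient or autoencoder-based attack generators cited above.

```python
# Availability-style poisoning via data injection (illustrative sketch).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train, y_train = make_moons(n_samples=600, noise=0.2, random_state=0)
X_test, y_test = make_moons(n_samples=600, noise=0.2, random_state=1)

def inject_poison(X, y, fraction, rng):
    """Append jittered copies of clean points with deliberately wrong labels."""
    n_poison = int(fraction * len(y))
    idx = rng.choice(len(y), size=n_poison, replace=True)
    X_poison = X[idx] + rng.normal(scale=0.05, size=(n_poison, X.shape[1]))
    y_poison = 1 - y[idx]  # wrong labels on purpose
    return np.vstack([X, X_poison]), np.concatenate([y, y_poison])

# Track how test accuracy degrades as the poison fraction grows.
for fraction in [0.0, 0.03, 0.10, 0.30]:
    Xp, yp = inject_poison(X_train, y_train, fraction, rng)
    acc = SVC(kernel="rbf").fit(Xp, yp).score(X_test, y_test)
    print(f"poison fraction {fraction:4.0%} -> test accuracy {acc:.3f}")
```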

Integrity, or backdoor, attacks are much more sophisticated and in fact want to leave your classifier functioning exactly as it should, with just one exception: a backdoor. A backdoor is a type of input that the model's designer is not aware of, but that the attacker can leverage to get the ML system to do what they want. A popular example of this technique is remote code injection into a machine learning system, which can lead to leakage of private information or, in a more adverse situation, a full system breakdown.
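A minimal sketch of the backdoor idea on tabular data follows: a small fraction of training points is stamped with a trigger (an out-of-range value in one feature) and relabeled to the attacker's target class, so the trained model behaves normally on clean inputs yet follows the trigger whenever it appears. The trigger feature, trigger value, poison rate, and random forest model are all illustrative assumptions.

```python
# Backdoor (integrity) poisoning sketch on tabular data (illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

TRIGGER_FEATURE, TRIGGER_VALUE, TARGET_CLASS = 0, 8.0, 1

def add_trigger(X):
    """Stamp the (assumed) trigger pattern onto a batch of inputs."""
    X = X.copy()
    X[:, TRIGGER_FEATURE] = TRIGGER_VALUE
    return X

# Poison 5% of the training pool: stamp the trigger and force the target label.
n_poison = int(0.05 * len(y_train))
idx = rng.choice(len(y_train), size=n_poison, replace=False)
X_bd, y_bd = X_train.copy(), y_train.copy()
X_bd[idx] = add_trigger(X_bd[idx])
y_bd[idx] = TARGET_CLASS

model = RandomForestClassifier(random_state=0).fit(X_bd, y_bd)

# The model still looks healthy on clean data...
print("clean test accuracy:", model.score(X_test, y_test))
# ...but inputs carrying the trigger are steered to the attacker's class.
triggered = add_trigger(X_test)
print("fraction sent to target class when triggered:",
      (model.predict(triggered) == TARGET_CLASS).mean())
```

Because the clean-data behavior is preserved, integrity attacks like this are much harder to spot than availability attacks.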

In poisoning attacks, however, there is an important dimension that defines the attacker's capability: adversarial access. Just like information access, adversarial access comes in levels, from most dangerous to least.

Consider, for example, the threat of model poisoning attacks on federated learning initiated by a single, non-colluding malicious agent, where the adversarial objective is to cause the model to misclassify a set of chosen inputs with high confidence. There are a number of strategies to carry out this attack, starting with simple boosting of the malicious agent's update to overcome the effects of the other agents' updates. To increase attack stealth, an alternating minimization strategy can be used, which alternately optimizes for the training loss and the adversarial objective; this is followed by parameter estimation of the benign agents' updates to improve attack success. The results indicate that even a highly constrained adversary can carry out model poisoning attacks while simultaneously maintaining stealth, highlighting the vulnerability of the federated learning setting and the need for effective defense strategies.
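
The boosting idea is easy to see in a toy federated-averaging simulation. The sketch below assumes a simple linear model trained by least squares and a server that averages agent updates uniformly; the local training routine, the boosting factor, and the attacker's objective are illustrative stand-ins for the targeted-misclassification attack described in the reference.

```python
# Toy federated-averaging round with one malicious agent boosting its update.
import numpy as np

rng = np.random.default_rng(0)
DIM, N_AGENTS = 10, 10
global_weights = np.zeros(DIM)

def local_update(weights, X, y, lr=0.1, steps=20):
    """A few steps of least-squares gradient descent; returns the weight delta."""
    w = weights.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w - weights

# Benign agents fit the true relationship y = X @ w_true on their local data.
w_true = rng.normal(size=DIM)
benign_updates = []
for _ in range(N_AGENTS - 1):
    X = rng.normal(size=(100, DIM))
    benign_updates.append(local_update(global_weights, X, X @ w_true))

# The malicious agent fits an attacker-chosen (wrong) objective and boosts the
# resulting update by the number of agents so it survives the averaging intact.
X_mal = rng.normal(size=(20, DIM))
y_mal = X_mal @ w_true + 5.0  # deliberately corrupted targets
malicious_update = N_AGENTS * local_update(global_weights, X_mal, y_mal)

# Server aggregation: a plain average of all submitted updates.
honest_round = global_weights + np.mean(benign_updates, axis=0)
attacked_round = global_weights + np.mean(benign_updates + [malicious_update], axis=0)
print("distance to true weights, honest round:  ", np.linalg.norm(honest_round - w_true))
print("distance to true weights, attacked round:", np.linalg.norm(attacked_round - w_true))
```

Because the server divides each submitted update by the number of agents, scaling the malicious update by that same factor lets a single agent dominate the aggregate, which is exactly why the boosted attack then needs the extra stealth machinery described above.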

References:

  1. https://dais-ita.org/sites/default/files/main_secml_model_poison.pdf
  2. https://towardsdatascience.com/poisoning-attacks-on-machine-learning-1ff247c254db/
