AI Glossary
The complete dictionary of Artificial Intelligence
Poisoning attack
Attack technique where malicious data is injected into the training set to degrade model performance or introduce specific vulnerabilities.
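Example (illustrative sketch only): the snippet below appends mislabeled copies of real samples to a synthetic training set and measures the resulting accuracy drop. The dataset, model, and 10% poisoning rate are assumptions chosen purely for demonstration.
```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification task
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline trained on clean data
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
clean_acc = baseline.score(X_test, y_test)

# Inject 10% poisoned samples: copies of real points with flipped labels
rng = np.random.default_rng(0)
idx = rng.choice(len(X_train), size=len(X_train) // 10, replace=False)
X_poisoned = np.vstack([X_train, X_train[idx]])
y_poisoned = np.concatenate([y_train, 1 - y_train[idx]])

poisoned = LogisticRegression(max_iter=1000).fit(X_poisoned, y_poisoned)
print(f"clean accuracy: {clean_acc:.3f}  poisoned accuracy: {poisoned.score(X_test, y_test):.3f}")
```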
Backdoor injection
Method that inserts specific trigger patterns into training data to implant latent malicious behavior, which activates only when those triggers are present.
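Example (minimal sketch, assuming plain feature-vector data): a fixed pattern is stamped onto a small fraction of samples whose labels are rewritten to the attacker's target class. The function names, trigger pattern, and poisoning rate are hypothetical.
```python
import numpy as np

def add_trigger(x, trigger_value=5.0, trigger_dims=(0, 1, 2)):
    """Stamp a fixed pattern (the trigger) onto chosen feature positions."""
    x = np.array(x, copy=True)
    x[..., list(trigger_dims)] = trigger_value
    return x

def inject_backdoor(X, y, target_class=1, rate=0.05, seed=0):
    """Poison a fraction of samples: add the trigger and relabel them to the target class."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=int(rate * len(X)), replace=False)
    X_bd, y_bd = X.copy(), y.copy()
    X_bd[idx] = add_trigger(X_bd[idx])
    y_bd[idx] = target_class  # latent behavior: trigger present -> target class predicted
    return X_bd, y_bd
```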
Label flipping attack
Poisoning strategy where the attacker intentionally modifies the labels of training data to mislead the model and compromise its classification accuracy.
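Example (minimal sketch, assuming integer class labels): a chosen fraction of labels from a source class is rewritten to a target class. The class pair and flipping rate are illustrative values.
```python
import numpy as np

def flip_labels(y, source=0, target=1, rate=0.2, seed=0):
    """Flip a fraction of labels from `source` to `target` in a copy of y."""
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(y == source)
    flipped = rng.choice(candidates, size=int(rate * len(candidates)), replace=False)
    y_out = y.copy()
    y_out[flipped] = target
    return y_out
```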
Causal poisoning attack
Sophisticated poisoning approach that manipulates causal relationships between features to influence the model's predictions in a targeted way.
Robustness defense
Set of techniques aiming to make AI models resistant to poisoning attacks by limiting the impact of malicious data on learning.
Training anomaly detection
Process of identifying and eliminating abnormal or potentially malicious data points in the training set before or during learning.
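Example (hedged sketch): one common implementation runs an unsupervised outlier detector over the training features before fitting; the version below uses scikit-learn's IsolationForest, and the contamination rate is an assumed value, not a recommendation.
```python
from sklearn.ensemble import IsolationForest

def drop_training_anomalies(X, y, contamination=0.05, seed=0):
    """Remove points flagged as outliers before the model ever sees them."""
    detector = IsolationForest(contamination=contamination, random_state=seed)
    keep = detector.fit_predict(X) == 1  # IsolationForest returns +1 for inliers, -1 for outliers
    return X[keep], y[keep]
```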
Robust cross-validation
Enhanced validation technique that evaluates model stability across different data partitions to detect potential malicious contamination.
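Example (illustrative only): one simple stability check computes the spread of per-fold scores; an unusually large gap between folds can hint that some partitions are contaminated. The model, fold count, and threshold are assumptions.
```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def fold_stability_check(X, y, n_folds=5, spread_threshold=0.10):
    """Flag training data whose per-fold scores vary suspiciously."""
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=n_folds)
    spread = scores.max() - scores.min()
    return scores, spread, spread > spread_threshold  # True = suspicious
```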
Learning with noisy data
Learning paradigm designed to maintain strong performance despite the presence of corrupted or intentionally modified data in the training set.
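Example (one possible instance, not the only approach): train with a loss that is less sensitive to mislabeled points; the sketch below uses scikit-learn's SGDClassifier with the modified Huber loss on standardized features.
```python
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A classifier whose loss is less sensitive to outliers and mislabeled samples
noise_tolerant_clf = make_pipeline(
    StandardScaler(),
    SGDClassifier(loss="modified_huber", max_iter=2000, random_state=0),
)
# noise_tolerant_clf.fit(X_train_noisy, y_train_noisy)  # hypothetical noisy training data
```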
Dataset purification
Systematic process of cleaning training data to identify and eliminate potentially poisoned samples before model training.
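Example (minimal sketch of one purification heuristic, assuming numeric features and integer labels): a sample is removed when its label disagrees with most of its nearest neighbors; the neighborhood size and agreement threshold are illustrative.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def purify_dataset(X, y, k=10, min_agreement=0.5):
    """Drop samples whose labels conflict with the majority of their k neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                  # first neighbor of each point is itself
    neighbor_labels = y[idx[:, 1:]]
    agreement = (neighbor_labels == y[:, None]).mean(axis=1)
    keep = agreement >= min_agreement
    return X[keep], y[keep]
```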
Certifiable model
AI model architecture capable of providing mathematical guarantees on its resistance to poisoning attacks under defined conditions.
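Example (illustrative toy version, not a reference implementation): one family of guarantees partitions the training set into disjoint shards, trains one sub-model per shard, and predicts by majority vote; since each poisoned sample falls into exactly one shard, poisoning k points can change at most k votes. Integer class labels are assumed.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class PartitionVoteEnsemble:
    """Toy partition-and-vote ensemble: each training point influences exactly one sub-model."""

    def __init__(self, n_partitions=10):
        self.n_partitions = n_partitions
        self.models = []

    def fit(self, X, y):
        parts = np.arange(len(X)) % self.n_partitions  # stand-in for a deterministic hash
        self.models = [
            LogisticRegression(max_iter=1000).fit(X[parts == p], y[parts == p])
            for p in range(self.n_partitions)
        ]
        return self

    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.models]).astype(int)
        # Majority vote over sub-models; poisoning k samples flips at most k votes
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```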
Induced overfitting attack
Poisoning technique that forces the model to overfit to specific patterns introduced by the attacker, compromising its ability to generalize.
Filtering defense
Protection mechanism that applies statistical or heuristic filters to eliminate suspicious data before it is incorporated into the training process.
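Example (minimal statistical filter, assuming roughly unimodal per-class feature distributions): samples lying many standard deviations from their class mean are discarded; the z-score threshold is an assumption.
```python
import numpy as np

def zscore_filter(X, y, threshold=4.0):
    """Drop samples with any feature far from the per-class mean (|z| > threshold)."""
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        mask = y == c
        mu = X[mask].mean(axis=0)
        sigma = X[mask].std(axis=0) + 1e-12       # avoid division by zero
        worst_z = np.abs((X[mask] - mu) / sigma).max(axis=1)
        keep[np.flatnonzero(mask)[worst_z > threshold]] = False
    return X[keep], y[keep]
```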
Targeted poisoning
Poisoning attack designed to specifically compromise predictions for certain classes or particular inputs while preserving overall performance.
Indiscriminate poisoning
Attack aiming to globally degrade model performance without specific targeting, often by introducing systematic noise into training data.
Incremental retraining defense
Protection strategy that continuously updates the model with new validated data while monitoring performance drifts that may indicate poisoning.
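Example (hedged sketch of the accept/reject logic, assuming an already-initialized incremental model such as scikit-learn's SGDClassifier and a small trusted validation set; the drop tolerance is illustrative):
```python
import copy

def guarded_incremental_update(model, X_batch, y_batch, X_val_clean, y_val_clean,
                               classes, tolerance=0.02):
    """Apply partial_fit on a candidate copy and keep it only if clean-set accuracy holds up."""
    baseline = model.score(X_val_clean, y_val_clean)
    candidate = copy.deepcopy(model)
    candidate.partial_fit(X_batch, y_batch, classes=classes)
    if candidate.score(X_val_clean, y_val_clean) >= baseline - tolerance:
        return candidate, True    # accept the update
    return model, False           # reject: drift suggests the batch may be poisoned
```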
Inverse backpropagation attack
Advanced technique where the attacker calculates the optimal modifications to apply to training data to maximize impact on the final model weights.
External validation defense
Protection approach that uses independent, uncompromised validation sets to detect performance degradation caused by poisoning.
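Example (minimal illustration): the check below evaluates the trained model on a trusted, independently collected validation set and raises an alert when accuracy falls more than an assumed tolerance below the expected level.
```python
def external_validation_alert(model, X_trusted, y_trusted, expected_accuracy, max_drop=0.05):
    """Return (accuracy, alert): alert is True when the clean-set score drops too far."""
    accuracy = model.score(X_trusted, y_trusted)
    return accuracy, (expected_accuracy - accuracy) > max_drop
```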