AI Glossary
The complete dictionary of Artificial Intelligence
Poisoning attack
Attack technique where malicious data is injected into the training set to degrade model performance or introduce specific vulnerabilities.
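Example (illustrative sketch only): the snippet below appends mislabeled copies of real samples to a synthetic training set and measures the resulting accuracy drop. The dataset, model, and 10% poisoning rate are assumptions chosen purely for demonstration.
```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification task
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline trained on clean data
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
clean_acc = baseline.score(X_test, y_test)

# Inject 10% poisoned samples: copies of real points with flipped labels
rng = np.random.default_rng(0)
idx = rng.choice(len(X_train), size=len(X_train) // 10, replace=False)
X_poisoned = np.vstack([X_train, X_train[idx]])
y_poisoned = np.concatenate([y_train, 1 - y_train[idx]])

poisoned = LogisticRegression(max_iter=1000).fit(X_poisoned, y_poisoned)
print(f"clean accuracy: {clean_acc:.3f}  poisoned accuracy: {poisoned.score(X_test, y_test):.3f}")
```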
Backdoor injection
Method that inserts specific trigger patterns into training data to implant latent malicious behavior, which activates only when those triggers are present.
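Example (minimal sketch, assuming plain feature-vector data): a fixed pattern is stamped onto a small fraction of samples whose labels are rewritten to the attacker's target class. The function names, trigger pattern, and poisoning rate are hypothetical.
```python
import numpy as np

def add_trigger(x, trigger_value=5.0, trigger_dims=(0, 1, 2)):
    """Stamp a fixed pattern (the trigger) onto chosen feature positions."""
    x = np.array(x, copy=True)
    x[..., list(trigger_dims)] = trigger_value
    return x

def inject_backdoor(X, y, target_class=1, rate=0.05, seed=0):
    """Poison a fraction of samples: add the trigger and relabel them to the target class."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=int(rate * len(X)), replace=False)
    X_bd, y_bd = X.copy(), y.copy()
    X_bd[idx] = add_trigger(X_bd[idx])
    y_bd[idx] = target_class  # latent behavior: trigger present -> target class predicted
    return X_bd, y_bd
```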
Label flipping attack
Poisoning strategy where the attacker intentionally modifies the labels of training data to mislead the model and compromise its classification accuracy.
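Example (minimal sketch, assuming integer class labels): a chosen fraction of labels from a source class is rewritten to a target class. The class pair and flipping rate are illustrative values.
```python
import numpy as np

def flip_labels(y, source=0, target=1, rate=0.2, seed=0):
    """Flip a fraction of labels from `source` to `target` in a copy of y."""
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(y == source)
    flipped = rng.choice(candidates, size=int(rate * len(candidates)), replace=False)
    y_out = y.copy()
    y_out[flipped] = target
    return y_out
```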
Causal poisoning attack
Sophisticated poisoning approach that manipulates causal relationships between features to influence the model's predictions in a targeted way.
Robustness defense
Set of techniques aiming to make AI models resistant to poisoning attacks by limiting the impact of malicious data on learning.
Training anomaly detection
Process of identifying and eliminating abnormal or potentially malicious data points in the training set before or during learning.
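Example (hedged sketch): one common implementation runs an unsupervised outlier detector over the training features before fitting; the version below uses scikit-learn's IsolationForest, and the contamination rate is an assumed value, not a recommendation.
```python
from sklearn.ensemble import IsolationForest

def drop_training_anomalies(X, y, contamination=0.05, seed=0):
    """Remove points flagged as outliers before the model ever sees them."""
    detector = IsolationForest(contamination=contamination, random_state=seed)
    keep = detector.fit_predict(X) == 1  # IsolationForest returns +1 for inliers, -1 for outliers
    return X[keep], y[keep]
```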
Robust cross-validation
Enhanced validation technique that evaluates model stability across different data partitions to detect potential malicious contamination.
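Example (illustrative only): one simple stability check computes the spread of per-fold scores; an unusually large gap between folds can hint that some partitions are contaminated. The model, fold count, and threshold are assumptions.
```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def fold_stability_check(X, y, n_folds=5, spread_threshold=0.10):
    """Flag training data whose per-fold scores vary suspiciously."""
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=n_folds)
    spread = scores.max() - scores.min()
    return scores, spread, spread > spread_threshold  # True = suspicious
```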
Learning with noisy data
Learning paradigm designed to maintain strong performance despite the presence of corrupted or intentionally modified data in the training set.
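Example (one possible instance, not the only approach): train with a loss that is less sensitive to mislabeled points; the sketch below uses scikit-learn's SGDClassifier with the modified Huber loss on standardized features.
```python
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A classifier whose loss is less sensitive to outliers and mislabeled samples
noise_tolerant_clf = make_pipeline(
    StandardScaler(),
    SGDClassifier(loss="modified_huber", max_iter=2000, random_state=0),
)
# noise_tolerant_clf.fit(X_train_noisy, y_train_noisy)  # hypothetical noisy training data
```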
Dataset purification
Systematic process of cleaning training data to identify and eliminate potentially poisoned samples before model training.
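Example (minimal sketch of one purification heuristic, assuming numeric features and integer labels): a sample is removed when its label disagrees with most of its nearest neighbors; the neighborhood size and agreement threshold are illustrative.
```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def purify_dataset(X, y, k=10, min_agreement=0.5):
    """Drop samples whose labels conflict with the majority of their k neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                  # first neighbor of each point is itself
    neighbor_labels = y[idx[:, 1:]]
    agreement = (neighbor_labels == y[:, None]).mean(axis=1)
    keep = agreement >= min_agreement
    return X[keep], y[keep]
```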
Certifiable model
AI model architecture capable of providing mathematical guarantees on its resistance to poisoning attacks under defined conditions.
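Example (illustrative toy version, not a reference implementation): one family of guarantees partitions the training set into disjoint shards, trains one sub-model per shard, and predicts by majority vote; since each poisoned sample falls into exactly one shard, poisoning k points can change at most k votes. Integer class labels are assumed.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class PartitionVoteEnsemble:
    """Toy partition-and-vote ensemble: each training point influences exactly one sub-model."""

    def __init__(self, n_partitions=10):
        self.n_partitions = n_partitions
        self.models = []

    def fit(self, X, y):
        parts = np.arange(len(X)) % self.n_partitions  # stand-in for a deterministic hash
        self.models = [
            LogisticRegression(max_iter=1000).fit(X[parts == p], y[parts == p])
            for p in range(self.n_partitions)
        ]
        return self

    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.models]).astype(int)
        # Majority vote over sub-models; poisoning k samples flips at most k votes
        return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```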
Induced overfitting attack
Poisoning technique that forces the model to overfit to specific patterns introduced by the attacker, compromising its ability to generalize.
Filtering defense
Protection mechanism that applies statistical or heuristic filters to eliminate suspicious data before it is incorporated into the training process.
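Example (minimal statistical filter, assuming roughly unimodal per-class feature distributions): samples lying many standard deviations from their class mean are discarded; the z-score threshold is an assumption.
```python
import numpy as np

def zscore_filter(X, y, threshold=4.0):
    """Drop samples with any feature far from the per-class mean (|z| > threshold)."""
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        mask = y == c
        mu = X[mask].mean(axis=0)
        sigma = X[mask].std(axis=0) + 1e-12       # avoid division by zero
        worst_z = np.abs((X[mask] - mu) / sigma).max(axis=1)
        keep[np.flatnonzero(mask)[worst_z > threshold]] = False
    return X[keep], y[keep]
```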
Targeted poisoning
Poisoning attack designed to specifically compromise predictions for certain classes or particular inputs while preserving overall performance.
Indiscriminate poisoning
Attack aiming to globally degrade model performance without specific targeting, often by introducing systematic noise into training data.
Incremental retraining defense
Protection strategy that continuously updates the model with new validated data while monitoring performance drifts that may indicate poisoning.
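Example (hedged sketch of the accept/reject logic, assuming an already-initialized incremental model such as scikit-learn's SGDClassifier and a small trusted validation set; the drop tolerance is illustrative):
```python
import copy

def guarded_incremental_update(model, X_batch, y_batch, X_val_clean, y_val_clean,
                               classes, tolerance=0.02):
    """Apply partial_fit on a candidate copy and keep it only if clean-set accuracy holds up."""
    baseline = model.score(X_val_clean, y_val_clean)
    candidate = copy.deepcopy(model)
    candidate.partial_fit(X_batch, y_batch, classes=classes)
    if candidate.score(X_val_clean, y_val_clean) >= baseline - tolerance:
        return candidate, True    # accept the update
    return model, False           # reject: drift suggests the batch may be poisoned
```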
Inverse backpropagation attack
Advanced technique where the attacker calculates the optimal modifications to apply to training data to maximize impact on the final model weights.
External validation defense
Protection approach that uses independent, uncompromised validation sets to detect performance degradation caused by poisoning.
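Example (minimal illustration): the check below evaluates the trained model on a trusted, independently collected validation set and raises an alert when accuracy falls more than an assumed tolerance below the expected level.
```python
def external_validation_alert(model, X_trusted, y_trusted, expected_accuracy, max_drop=0.05):
    """Return (accuracy, alert): alert is True when the clean-set score drops too far."""
    accuracy = model.score(X_trusted, y_trusted)
    return accuracy, (expected_accuracy - accuracy) > max_drop
```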