AI Terminology
A comprehensive dictionary of Artificial Intelligence
White-Box Attacks
Attacks where the adversary has complete knowledge of the target model's architecture and parameters.
Black-Box Attacks
Attacks performed without internal knowledge of the model, solely through interactions with its inputs/outputs.
Evasion Attacks
Subtle perturbations of input data to deceive the model during inference.
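A minimal sketch of the classic evasion technique, the Fast Gradient Sign Method (FGSM), applied to a toy binary logistic-regression model; the weights, input, and epsilon here are purely illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """FGSM for binary logistic regression.

    The gradient of the cross-entropy loss w.r.t. the input x is
    (sigmoid(w.x + b) - y) * w; the attack steps eps in its sign.
    """
    grad_x = (sigmoid(w @ x + b) - y) * w
    return x + eps * np.sign(grad_x)

# Toy model that classifies x correctly before the attack.
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])   # logit w.x + b = 1.5 -> class 1
y = 1.0

x_adv = fgsm_perturb(x, y, w, b, eps=1.0)
print(sigmoid(w @ x + b) > 0.5)      # True: clean input classified as 1
print(sigmoid(w @ x_adv + b) > 0.5)  # False: perturbed input is misclassified
```

The same one-step idea generalizes to deep networks, where the input gradient is obtained by backpropagation.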
Poisoning Attacks
Injection of malicious data into the training set to compromise the model.
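A sketch of a simple label-flipping poisoning attack against a nearest-centroid classifier; the clusters, probe point, and poison budget are illustrative assumptions.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def nearest_centroid_predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

# Clean training set: class 0 clustered near (0, 0), class 1 near (4, 4).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(4, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
clean = nearest_centroid_fit(X, y)

# Poisoning: inject points near (4, 4) mislabelled as class 0, dragging
# the class-0 centroid toward class 1's region of the input space.
X_pois = np.vstack([X, rng.normal(4, 0.5, (40, 2))])
y_pois = np.concatenate([y, np.zeros(40, dtype=int)])
poisoned = nearest_centroid_fit(X_pois, y_pois)

probe = np.array([3.0, 3.0])
print(nearest_centroid_predict(clean, probe))     # 1 on the clean model
print(nearest_centroid_predict(poisoned, probe))  # 0 after poisoning
```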
Model Extraction Attacks
Theft of parameters or functionality of a proprietary model through repeated queries.
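A sketch of extraction against a hypothetical black-box linear scoring API: the attacker only calls `query`, yet recovers the parameters by fitting a surrogate to the responses. The "secret" weights and the `query` interface are invented for illustration.

```python
import numpy as np

# Hypothetical proprietary model: the attacker can only call `query`.
_SECRET_W = np.array([1.5, -2.0, 0.5])
_SECRET_B = 0.7

def query(X):
    """Black-box API: returns raw scores for a batch of inputs."""
    return X @ _SECRET_W + _SECRET_B

# Extraction: probe with random inputs, then fit a surrogate model
# to the (input, score) pairs by least squares.
rng = np.random.default_rng(1)
X_probe = rng.normal(size=(100, 3))
scores = query(X_probe)

# Solve for [w, b] jointly by appending a bias column of ones.
A = np.hstack([X_probe, np.ones((100, 1))])
stolen, *_ = np.linalg.lstsq(A, scores, rcond=None)

print(np.round(stolen, 3))  # recovers [1.5, -2.0, 0.5, 0.7]
```

Against nonlinear models the same query-and-fit loop is used, with the surrogate trained by gradient descent instead of a closed-form solve.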
Membership Inference Attacks
Determining whether a specific data point was part of the training set.
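A sketch of the simplest membership-inference baseline, a loss-threshold attack: overfitted models tend to assign lower loss to training points than to unseen ones. The loss values and threshold below are illustrative.

```python
import numpy as np

def loss_threshold_attack(losses, tau):
    """Predict 'member of the training set' when the per-example loss
    is below the threshold tau."""
    return losses < tau

# Illustrative per-example losses from some overfitted model.
train_losses = np.array([0.02, 0.05, 0.01, 0.08])  # seen during training
test_losses = np.array([0.90, 1.40, 0.60, 1.10])   # held out

tau = 0.3
print(loss_threshold_attack(train_losses, tau))  # all True  -> flagged as members
print(loss_threshold_attack(test_losses, tau))   # all False -> non-members
```

Stronger variants calibrate the threshold per example with shadow models rather than using a single global tau.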
Adversarial Training Defense
Training the model on generated adversarial examples to improve its robustness.
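A sketch of adversarial training for logistic regression, assuming an FGSM inner step: each update first crafts worst-case inputs inside an L-infinity ball, then descends the loss on those inputs. Data, learning rate, and epsilon are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_train(X, y, eps=0.2, lr=0.5, steps=200):
    """Logistic regression trained on FGSM-perturbed inputs.

    Each step: (1) inner maximization -- craft X_adv within an L-inf
    ball of radius eps; (2) outer minimization -- gradient step on the
    loss evaluated at X_adv.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y               # dLoss/dlogit per example
        X_adv = X + eps * np.sign(err[:, None] * w)
        err_adv = sigmoid(X_adv @ w + b) - y
        w -= lr * X_adv.T @ err_adv / len(y)
        b -= lr * err_adv.mean()
    return w, b

# Two well-separated clusters; training remains accurate despite the
# eps-perturbations applied at every step.
X = np.array([[-1.0, -1.0], [-1.2, -0.8], [1.0, 1.0], [0.8, 1.2]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = adversarial_train(X, y)
preds = (sigmoid(X @ w + b) > 0.5).astype(float)
print(preds)  # [0. 0. 1. 1.]
```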
Attack Detection Defense
Mechanisms for identifying and rejecting potentially adversarial inputs.
Gradient Masking Defense
Techniques masking gradients to prevent optimization-based attacks.
Attacks on Computer Vision
Attacks specifically designed to deceive image classification and object detection models.
Attacks on NLP
Subtle textual perturbations to fool natural language processing models.
Transfer Attacks
Attacks generated on a source model but effective against different target models.
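A sketch of transferability with two toy linear models: the adversarial example is crafted using only the source model's gradient, yet it also flips the unseen target model's decision. Both weight vectors are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two independently chosen linear models that agree on the clean input.
w_source, b_source = np.array([2.0, -1.0]), 0.0
w_target, b_target = np.array([1.5, -1.2]), 0.1

x, y = np.array([1.0, 0.5]), 1.0

# Craft an FGSM example using ONLY the source model's gradient.
grad = (sigmoid(w_source @ x + b_source) - y) * w_source
x_adv = x + 1.0 * np.sign(grad)

# The perturbation transfers: it also fools the target model.
print(sigmoid(w_target @ x + b_target) > 0.5)      # True on the clean input
print(sigmoid(w_target @ x_adv + b_target) > 0.5)  # False on the transferred attack
```

Transfer works here because the two models' decision boundaries are similarly oriented, which is the same intuition behind transfer attacks on deep networks.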
Randomization Defense
Introduction of stochasticity into the inference process to disrupt attacks.
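A sketch of a randomized-smoothing-style defense: instead of classifying the input once, the model averages its prediction over many Gaussian-noised copies, so an attacker cannot rely on one fixed gradient. The model, noise scale, and sample count are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def randomized_predict(x, w, b, sigma=0.5, n_samples=100, rng=None):
    """Average the classifier's output over Gaussian-noised copies of x,
    then threshold the mean -- a majority vote over randomized inputs."""
    rng = rng or np.random.default_rng(0)
    noisy = x + rng.normal(0.0, sigma, size=(n_samples, x.size))
    return sigmoid(noisy @ w + b).mean() > 0.5

w, b = np.array([2.0, -1.0]), 0.0
x = np.array([1.0, 0.5])
print(randomized_predict(x, w, b))  # True: the smoothed vote keeps class 1
```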
Attacks on Audio Models
Imperceptible sound perturbations designed to fool speech recognition systems.
Robustness Evaluation
Metrics and benchmarks for quantifying model resistance to adversarial attacks.
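One common metric is robust accuracy: accuracy measured after attacking every input at a given perturbation budget, swept over budgets to form a robustness curve. A sketch using the FGSM attack on a toy linear model; data and epsilon values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def robust_accuracy(X, y, w, b, eps):
    """Accuracy after an FGSM attack of budget eps on every input."""
    grad = (sigmoid(X @ w + b) - y)[:, None] * w
    X_adv = X + eps * np.sign(grad)
    preds = (sigmoid(X_adv @ w + b) > 0.5).astype(float)
    return (preds == y).mean()

w, b = np.array([1.0, 1.0]), 0.0
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 0.5], [-0.5, -2.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])

# Robust accuracy degrades as the attack budget grows.
for eps in [0.0, 0.5, 1.1, 1.5]:
    print(eps, robust_accuracy(X, y, w, b, eps))  # 1.0, 1.0, 0.5, 0.0
```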