Model Robustness
Adversarial Example Detection
A set of techniques for automatically identifying potentially manipulated inputs before the main model processes them. Such systems often rely on a meta-classifier trained to separate clean from adversarial inputs, or on statistical analysis of the model's internal activations.
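As a minimal sketch of the statistical-activation approach (not a definitive implementation): fit a Gaussian to the penultimate-layer activations of known-clean inputs, then flag inputs whose Mahalanobis distance from that distribution exceeds a threshold. The activation arrays and the threshold here are hypothetical placeholders; how activations are extracted depends on the model.

```python
import numpy as np

def fit_activation_statistics(clean_activations: np.ndarray):
    """Estimate mean and (regularized) precision of clean-data activations."""
    mean = clean_activations.mean(axis=0)
    cov = np.cov(clean_activations, rowvar=False)
    # Small ridge term keeps the covariance invertible for limited samples.
    cov += 1e-6 * np.eye(cov.shape[0])
    precision = np.linalg.inv(cov)
    return mean, precision

def mahalanobis_scores(activations: np.ndarray, mean: np.ndarray,
                       precision: np.ndarray) -> np.ndarray:
    """Squared Mahalanobis distance of each input from the clean distribution."""
    centered = activations - mean
    return np.einsum("ij,jk,ik->i", centered, precision, centered)

def flag_adversarial(activations: np.ndarray, mean: np.ndarray,
                     precision: np.ndarray, threshold: float) -> np.ndarray:
    """Boolean mask: True where an input's activations look anomalous."""
    return mahalanobis_scores(activations, mean, precision) > threshold
```

In practice the threshold is typically calibrated on held-out clean data, for example as a high percentile of the clean-input scores, trading off false alarms against detection rate.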