Evaluation and Metrics
Toxicity
A metric that estimates how likely a model is to generate offensive, hateful, discriminatory, or otherwise harmful content. It is typically measured with specialized classifiers trained on text corpora annotated for toxicity.
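As a rough illustration, one common way to turn classifier scores into a single number is to report the fraction of model generations that score at or above a threshold. The sketch below assumes a scoring function that returns a value between 0 and 1. The names `toxicity_rate` and `toy_score` and the 0.5 threshold are illustrative assumptions, and `toy_score` is a keyword stand-in for a trained classifier, not a real one.

```python
from typing import Callable, Sequence


def toxicity_rate(
    generations: Sequence[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # assumed cutoff; real evaluations choose their own
) -> float:
    """Return the fraction of generations scored at or above the threshold."""
    if not generations:
        raise ValueError("need at least one generation")
    flagged = sum(1 for text in generations if score_fn(text) >= threshold)
    return flagged / len(generations)


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a trained toxicity classifier.

    Returns 1.0 if any word is in a tiny blocklist, else 0.0.
    """
    blocklist = {"idiot", "stupid"}
    words = (w.strip(".,!?") for w in text.lower().split())
    return 1.0 if any(w in blocklist for w in words) else 0.0


samples = ["Have a nice day.", "You are an idiot!", "Thanks for the help."]
rate = toxicity_rate(samples, toy_score)
print(f"toxicity rate: {rate:.2f}")
```

In practice `score_fn` would wrap a trained classifier, and the result depends heavily on the prompts used to elicit the generations and on the chosen threshold.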