Alignment and Safety
Safety Alignment
A set of techniques designed to ensure that language models avoid generating harmful, dangerous, or inappropriate content while preserving their overall performance.