Alignment and Safety
Model Jailbreaking
Attack techniques designed to bypass a model's safety and alignment mechanisms, forcing it to generate content that would normally be restricted or prohibited.