Writing backwards can trick an AI into providing a bomb recipe

AI models have safeguards in place to prevent them from creating dangerous or illegal output, but a range of jailbreaks have been found that evade those safeguards. Now researchers have shown that writing backwards can trick AI models into revealing bomb-making instructions.

SIN
Oct 18, 2024 - 21:00

SIN ScienceX Information Network (SIN) | ScienceX Innovations