Jose Antonio Lanz Published on 13/11/2025 at 20:10

This One Weird Trick Defeats AI Safety Features in 99% of Cases

AI researchers from Anthropic, Stanford, and Oxford have discovered that making AI models think longer makes them easier to jailbreak, the opposite of what everyone assumed. The prevailing assumption ...
