Jose Antonio Lanz
Published on November 13, 2025 at 20:10
This One Weird Trick Defeats AI Safety Features in 99% of Cases
AI researchers from Anthropic, Stanford, and Oxford have discovered that making AI models think longer makes them easier to jailbreak—the opposite of what everyone assumed.
The prevailing assumption ... [5718 symbols]