AI researcher claims he’s already bypassed Anthropic’s Fable 5 guardrails

“Pliny the Liberator,” says he has been “cleverly finding the holes in the fence that the thought police missed,” in the newly launched Fable 5.

An artificial intelligence and cybersecurity researcher claims to have jailbroken Anthropic’s latest AI model, Claude Fable 5, within just 48 hours of it being launched. 

“Pliny the Liberator,” a well-known figure in the AI community, said on Wednesday he “liberated” Fable 5, launched on Tuesday as a safety-tuned version of the more powerful Mythos model that Anthropic said was too dangerous to release widely.

He used various techniques, including a jailbroken version of Opus 4.8, to bypass the built-in safeguards that Anthropic installed on the model to prevent users from asking it for potentially harmful information, such as drug-making formulas or hacking instructions. 

Read more

Leave a Reply

Your email address will not be published. Required fields are marked *

Please enter CoinGecko Free Api Key to get this plugin works.