Researchers Find Poetry Bypasses AI Safety

European researchers at Icaro Lab (Sapienza University and DexAI) publish on Nov 29, 2025 that poetic prompts can trick chatbots from OpenAI, Meta, and Anthropic into revealing dangerous instructions. The study tested 25 models, reporting average jailbreak rates of 62% for hand-crafted poems and up to 90% for some models, and warns this flaw threatens AI deployments in defence, healthcare, and education.
Key Points
- 1Show that poetic prompts bypass chatbot guardrails, testing 25 models with up to 90% success.
- 2Reveal safety classifiers fail on low-probability, metaphorical language, undermining keyword-based defenses.
- 3Imply serious risks for AI in defence, healthcare, and education; require revised adversarial defenses.
Scoring Rationale
High novelty and broad, industry-wide impact, supported by systematic tests; limited by single-team reporting and withheld exploit examples.
Sources
Public references used for this report.
Practice with real Ad Tech data
90 SQL & Python problems · 15 industry datasets
250 free problems · No credit card
See all Ad Tech problems


