Researchers Find Poetry Bypasses LLM Safety Guardrails

In a 2025 arXiv study, researchers at DEXAI and Sapienza University found that masking harmful requests as poetry bypassed safety guardrails in leading AI chatbots. Testing 25 models across nine providers with 1,220 poetic prompts, the team reported up to 90% compliance and a fivefold average increase in successful jailbreaks. The findings suggest systemic language-interpretation vulnerabilities requiring revised safety evaluations.
Key Points
- Show that poetic prompts bypass guardrails, fooling 13 of 25 models more than 70% of the time
- Underscore a language-interpretation vulnerability across architectures, with a fivefold average increase in jailbreak success
- Advise retooling evaluation and safety testing to cover heterogeneous linguistic regimes, including poetry
Scoring Rationale
Strong cross-model empirical evidence indicates systemic safety issues, but the study has not been peer-reviewed and needs independent validation.