Cisco Finds Multi-turn Attacks Break Frontier AI Models

Help Net Security reports that Cisco's AI threat intelligence team evaluated 15 closed flagship models from OpenAI, Anthropic, Google, Amazon, and xAI using about 30,000 single-turn prompts and nearly 7,000 multi-turn attacks across more than 1,400 conversations. The testing showed multi-turn attack success rates (ASR) as high as 88%, and produced different model rankings and failure profiles than single-turn benchmarks. Per the report, Gemini 3 Pro rose from about 18% single-turn ASR to 73% under multi-turn pressure, Grok 4.1 Fast (non-reasoning) topped the cohort at 88%, an OpenAI model jumped roughly ninefold to nearly 25%, and Anthropic's family moved from low single-digit single-turn ASRs to 11%-16% in multi-turn tests. The report quotes Cisco's head of AI threat and security research: "How secure is this model against real-world attack scenarios?"
What happened
Help Net Security reports that Cisco's AI threat intelligence team ran paired single-turn and multi-turn adversarial evaluations across 15 closed flagship models from OpenAI, Anthropic, Google, Amazon, and xAI, using roughly 30,000 single-turn prompts and nearly 7,000 multi-turn attacks spanning over 1,400 conversations. The report finds that multi-turn attack success rates rose substantially relative to single-turn tests, with a cohort high of 88% ASR recorded for Grok 4.1 Fast in its non-reasoning configuration, and large absolute gaps for many models.
Technical details
Per the Cisco report as covered by Help Net Security, single-turn and multi-turn regimes produced different rankings and different tail-risk failure maps. Reported examples include Gemini 3 Pro climbing from about 18% to 73% ASR under iterative attacks, an OpenAI model increasing roughly ninefold to nearly 25%, and Nova 2 Lite showing a relatively high single-turn ASR but the lowest multi-turn ASR at about 8%. More than half the models tested showed an absolute gap of at least 15 points between single-turn and multi-turn results.
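The two kinds of comparison used above, an absolute gap in percentage points and a relative "fold" increase, can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the Gemini 3 Pro inputs are the approximate figures quoted in the coverage, and the implied OpenAI baseline is a back-of-envelope inference from "roughly ninefold to nearly 25%", not a number stated in the report.

```python
def absolute_gap(single_turn_asr: float, multi_turn_asr: float) -> float:
    """Percentage-point difference between multi-turn and single-turn ASR."""
    return multi_turn_asr - single_turn_asr


def fold_increase(single_turn_asr: float, multi_turn_asr: float) -> float:
    """Multiplicative increase; undefined when the baseline is zero."""
    if single_turn_asr == 0:
        raise ValueError("fold increase is undefined for a zero baseline")
    return multi_turn_asr / single_turn_asr


# Approximate figures as reported for Gemini 3 Pro (in percent).
gemini_single, gemini_multi = 18.0, 73.0
gap = absolute_gap(gemini_single, gemini_multi)    # 55 percentage points
fold = fold_increase(gemini_single, gemini_multi)  # roughly 4x

# Assumption: a ninefold rise ending near 25% implies a single-turn
# baseline of about 25 / 9, i.e. a little under 3%. This is an inference,
# not a figure taken from the report.
implied_openai_baseline = 25.0 / 9.0

print(f"gap={gap:.0f} points, fold={fold:.2f}x, "
      f"implied baseline={implied_openai_baseline:.1f}%")
```

The contrast matters when reading the results: a model with a tiny baseline can show a large fold increase while remaining at a lower multi-turn ASR than a model with a smaller fold increase and a larger absolute gap.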
Industry context
Editorial analysis: Industry-standard safety benchmarks often use single-turn prompts and refusal tests. The Cisco findings, as reported, indicate those benchmarks can understate exposure when attackers use persistence, reframing, persona adoption, and incremental escalation across turns. This pattern is consistent with prior academic and red-team literature showing adaptive adversaries increase success by iterating on refusals and by building context over multiple messages.
Implications for practitioners
Editorial analysis: For teams evaluating model risk, the Cisco results underscore that single-turn ASR metrics alone may not reflect real-world adversarial resilience. Observers comparing models should consider multi-turn adversarial testing, attacker adaptivity, and tail-risk metrics when assessing deployment suitability for safety-critical or externally facing flows.
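As a rough illustration of why the two metrics can diverge, the sketch below scores the same set of judged conversations two ways: by first turn only, and by whether any turn succeeded. It is a minimal sketch under stated assumptions, not Cisco's protocol. The coverage does not say how multi-turn ASR was aggregated (per attack or per conversation), so the "any successful turn" rule, the function names, and the toy outcomes are all hypothetical.

```python
from typing import Sequence


def first_turn_asr(conversations: Sequence[Sequence[bool]]) -> float:
    """Share of conversations whose first turn was judged a successful attack."""
    if not conversations:
        raise ValueError("no conversations to score")
    hits = sum(1 for turns in conversations if turns and turns[0])
    return hits / len(conversations)


def any_turn_asr(conversations: Sequence[Sequence[bool]]) -> float:
    """Share of conversations in which at least one turn succeeded.

    Assumption: a conversation counts as compromised if any turn is
    judged harmful. Other aggregation rules are possible.
    """
    if not conversations:
        raise ValueError("no conversations to score")
    hits = sum(1 for turns in conversations if any(turns))
    return hits / len(conversations)


# Toy data, invented for illustration: each inner list holds per-turn
# judgments (True = the attack succeeded on that turn).
toy_conversations = [
    [False, False, True],   # model refuses twice, then yields
    [False, False, False],  # model holds for the whole conversation
    [True],                 # model fails on the first prompt
    [False, True, False],   # model yields once after reframing
]

print(f"first-turn ASR: {first_turn_asr(toy_conversations):.2f}")
print(f"any-turn ASR:   {any_turn_asr(toy_conversations):.2f}")
```

On this toy data the first-turn metric sees one failure in four conversations while the any-turn metric sees three, which mirrors the direction of the gap described in the reported findings without reproducing any of its numbers.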
What to watch
Editorial analysis: Observers will look for:
- independent replication of Cisco's multi-turn protocol across more models and open-source implementations
- vendor or third-party releases of standardized multi-turn red-team benchmarks
- documentation of mitigation strategies that address persistence and context-building attacks rather than only single-turn refusal tuning

The report frames the underlying question in the words of Cisco's head of AI threat and security research: "How secure is this model against real-world attack scenarios?"
Scoring Rationale
The report covers systematic, multi-turn red-teaming across major flagship models and shows large gaps versus single-turn metrics, which is notable for practitioners responsible for operational safety. The story is significant but not a paradigm shift.
