
Cisco Finds Multi-turn Attacks Break Frontier AI Models

May 28, 2026 | By LDS Team

Relevance Score: 7.6


Help Net Security reports that Cisco's AI threat intelligence team evaluated 15 closed flagship models from OpenAI, Anthropic, Google, Amazon, and xAI using about 30,000 single-turn prompts and nearly 7,000 multi-turn attacks across more than 1,400 conversations. The testing showed multi-turn attack success rates (ASR) as high as 88% and produced different model rankings and failure profiles than single-turn benchmarks. Per the report, Gemini 3 Pro rose from about 18% single-turn ASR to 73% under multi-turn pressure, Grok 4.1 Fast (non-reasoning) topped the cohort at 88%, an OpenAI model jumped roughly ninefold to nearly 25%, and Anthropic's model family moved from low single-digit single-turn ASRs to 11%-16% in multi-turn tests. The report quotes Cisco's head of AI threat and security research asking: "How secure is this model against real-world attack scenarios?"

What happened

Help Net Security reports that Cisco's AI threat intelligence team ran paired single-turn and multi-turn adversarial evaluations across 15 closed flagship models from OpenAI, Anthropic, Google, Amazon, and xAI, using roughly 30,000 single-turn prompts and nearly 7,000 multi-turn attacks spanning over 1,400 conversations. The report finds that multi-turn attack success rates rose substantially relative to single-turn tests, with a cohort high of 88% ASR recorded for Grok 4.1 Fast in its non-reasoning configuration and large absolute gaps for many models.

Technical details

Per the Cisco report as covered by Help Net Security, single-turn and multi-turn regimes produced different rankings and different tail-risk failure maps. Reported examples include Gemini 3 Pro climbing from about 18% to 73% ASR under iterative attacks, an OpenAI model increasing roughly ninefold to nearly 25%, and Nova 2 Lite showing a relatively high single-turn ASR but the lowest multi-turn ASR at about 8%. More than half the models tested showed an absolute gap of at least 15 points between single-turn and multi-turn results.
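The arithmetic behind these comparisons can be sketched in a few lines of Python. This is an illustration, not the report's methodology. Only the Gemini 3 Pro pair (about 18% and 73%) comes from the figures reported above; `model_a` and `model_b` and their values are hypothetical placeholders, included only to show how a ranking can reorder between the two regimes.

```python
# ASR values are percentages. Only the Gemini 3 Pro pair comes from the
# figures reported above; model_a and model_b are HYPOTHETICAL placeholders.
asr = {
    "Gemini 3 Pro": {"single": 18.0, "multi": 73.0},
    "model_a": {"single": 5.0, "multi": 20.0},   # hypothetical
    "model_b": {"single": 25.0, "multi": 30.0},  # hypothetical
}


def gap_points(single: float, multi: float) -> float:
    """Absolute gap in percentage points between the two regimes."""
    return multi - single


def rank_by(scores: dict, regime: str) -> list:
    """Model names ordered from lowest ASR (most resistant) to highest."""
    return sorted(scores, key=lambda name: scores[name][regime])


gaps = {name: gap_points(v["single"], v["multi"]) for name, v in asr.items()}

# Share of models whose gap is at least 15 points, the threshold cited above.
large_gap_share = sum(g >= 15 for g in gaps.values()) / len(gaps)

single_rank = rank_by(asr, "single")
multi_rank = rank_by(asr, "multi")

for name in asr:
    print(f"{name}: gap {gaps[name]:.0f} points, "
          f"rank {single_rank.index(name) + 1} -> {multi_rank.index(name) + 1}")
print(f"share of models with a gap of at least 15 points: {large_gap_share:.0%}")
```

With the reported Gemini 3 Pro figures, the gap works out to about 55 percentage points. The placeholder rows show how a model that ranks mid-table on single-turn ASR can fall to last place once multi-turn results are used.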

Industry context

Editorial analysis: Industry-standard safety benchmarks often use single-turn prompts and refusal tests. The Cisco findings, as reported, indicate those benchmarks can understate exposure when attackers use persistence, reframing, persona adoption, and incremental escalation across turns. This pattern is consistent with prior academic and red-team literature showing adaptive adversaries increase success by iterating on refusals and by building context over multiple messages.
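To make the single-turn versus multi-turn distinction concrete, here is a minimal sketch of how a paired evaluation loop might be structured. It is not Cisco's protocol, which the coverage does not describe in detail. The `ToyModel` class, its `patience` parameter, the `is_attack_success` judge, and the placeholder turns are all hypothetical stand-ins; a real harness would call a model API and use a proper judge.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a chat model: it refuses until the
    conversation has accumulated `patience` user turns."""
    patience: int
    history: list = field(default_factory=list)

    def reply(self, user_message: str) -> str:
        self.history.append(user_message)  # context carries across turns
        return "COMPLY" if len(self.history) >= self.patience else "REFUSE"


def is_attack_success(response: str) -> bool:
    # Hypothetical judge; a real evaluation would use a classifier or human review.
    return response == "COMPLY"


def run_conversation(model: ToyModel, turns: list) -> bool:
    """Send turns in order within one conversation; stop at the first success."""
    for message in turns:
        if is_attack_success(model.reply(message)):
            return True
    return False


def attack_success_rate(patiences: list, turns: list) -> float:
    """Fraction of conversations (one fresh toy target each) that succeed."""
    outcomes = [run_conversation(ToyModel(p), turns) for p in patiences]
    return sum(outcomes) / len(outcomes)


# Placeholder turns standing in for reframing, persona adoption, and escalation.
multi_turn = ["<initial request>", "<reframed request>",
              "<persona request>", "<escalated request>"]
single_turn = multi_turn[:1]

# Hypothetical targets with differing resistance to persistence.
patiences = [1, 2, 3, 5]

single_asr = attack_success_rate(patiences, single_turn)
multi_asr = attack_success_rate(patiences, multi_turn)
print(f"single-turn ASR: {single_asr:.0%}, multi-turn ASR: {multi_asr:.0%}")
```

In this toy setup the multi-turn ASR can only match or exceed the single-turn ASR, because the first turn is shared and the toy targets only weaken as turns accumulate. The sketch is meant to show where persistence enters the measurement, not to reproduce any reported number.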

Implications for practitioners

Editorial analysis: For teams evaluating model risk, the Cisco results underscore that single-turn ASR metrics alone may not reflect real-world adversarial resilience. Observers comparing models should consider multi-turn adversarial testing, attacker adaptivity, and tail-risk metrics when assessing deployment suitability for safety-critical or externally facing flows.

What to watch

Editorial analysis: Observers will look for:

• independent replication of Cisco's multi-turn protocol across more models and open-source implementations
• vendor or third-party releases of standardized multi-turn red-team benchmarks
• documentation of mitigation strategies that address persistence and context-building attacks rather than only single-turn refusal tuning

The report frames the issue with a question from Cisco's head of AI threat and security research: "How secure is this model against real-world attack scenarios?"

Key Points

1. Paired single-turn and multi-turn testing revealed materially different model rankings, exposing weaknesses single-turn benchmarks miss.
2. Adaptive, multi-turn adversaries can raise attack success rates dramatically, producing cohort highs like 88% in Cisco's tests.
3. Practitioners assessing model safety should treat single-turn ASR as an incomplete indicator and push for multi-turn evaluation protocols.

Scoring Rationale

The report covers systematic, multi-turn red-teaming across major flagship models and shows large gaps versus single-turn metrics, which is notable for practitioners responsible for operational safety. The story is significant but not a paradigm shift.


Sources

Public references used for this report.

Help Net Security (helpnetsecurity.com): Frontier AI models collapse under multi-turn AI attacks, Cisco finds


