Vendors Use AI to Uncover and Patch Dozens of Flaws

Axios and The Register report that Palo Alto Networks scanned more than 130 products using frontier AI models, including Claude Mythos and OpenAI models, and uncovered 75 vulnerabilities that the company has since patched, up from its typical 5-10 finds per month. SecurityWeek reports that Microsoft's MDASH system discovered 16 vulnerabilities that were addressed in a recent Patch Tuesday, including four rated critical. Axios reports that none of the Palo Alto findings were being actively exploited in the wild. Commentators including Zero Day Initiative's Dustin Childs and reporting in ComputerWeekly link the spike in disclosures to the growing use of powerful AI bug-hunting tools and warn of higher operational burdens for patch triage and deployment.
What happened
Axios reports that Palo Alto Networks scanned more than 130 products with frontier AI models, including Claude Mythos and OpenAI cyber-focused models, and identified 75 vulnerabilities that have been patched, up from a typical rate of 5-10 findings per month. The Register similarly reports the company used Anthropic's Mythos among other models to scan its codebase and disclosed 26 CVEs as part of the findings. Axios adds that none of the vulnerabilities identified by Palo Alto Networks were observed to be exploited in the wild.
SecurityWeek reports that Microsoft's MDASH system, which orchestrates more than 100 specialized agents across frontier and distilled models, discovered 16 of the vulnerabilities fixed in the latest Patch Tuesday release; Microsoft said four of those were rated critical. ComputerWeekly reports that Microsoft's April Patch Tuesday update overall contained over 160 distinct flaws and that industry discussion has linked the spike to wider use of AI tools such as Anthropic's Project Glasswing and Claude Mythos.
Technical details
SecurityWeek describes MDASH as a multi-stage, agentic pipeline that runs preparation, scanning, validation, deduplication, and proof-construction agents; Microsoft reported that in internal tests MDASH recovered 96% and 100% of confirmed vulnerabilities on two heavily audited components and performed strongly on the public CyberGym benchmark. Axios reports Palo Alto Networks told reporters that in internal testing the frontier models generated working exploits more than 70% of the time and that the company observed an average false-positive rate of about 30%.
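To make the stage ordering concrete, the sketch below models a multi-stage agentic pipeline of the shape SecurityWeek attributes to MDASH (preparation, scanning, deduplication, validation with proof construction). All function names, data shapes, and logic here are hypothetical simplifications for illustration, not Microsoft's implementation.

```python
# Illustrative sketch of a multi-stage agentic scanning pipeline. The stage
# names follow the reported MDASH design; the bodies are stand-in stubs.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Finding:
    component: str
    bug_class: str
    location: str          # e.g. "file:line"
    has_proof: bool = False

def prepare(codebase: list[str]) -> list[str]:
    # Preparation stage: select components worth scanning (here, all of them).
    return codebase

def scan(components: list[str]) -> list[Finding]:
    # Scanning stage: stand-in for model-driven analysis. Independent agents
    # often emit overlapping candidates, so duplicates are expected.
    return [
        Finding("parser", "heap-overflow", "parse.c:120"),
        Finding("parser", "heap-overflow", "parse.c:120"),   # duplicate
        Finding("auth", "logic-error", "session.c:88"),
    ]

def deduplicate(findings: list[Finding]) -> list[Finding]:
    # Deduplication stage: collapse identical (component, class, location) keys.
    # Frozen dataclasses are hashable, so dict.fromkeys preserves order and
    # drops repeats.
    return list(dict.fromkeys(findings))

def validate(findings: list[Finding]) -> list[Finding]:
    # Validation / proof-construction stage: keep only findings for which an
    # exploitability proof can be built (stubbed as always succeeding here).
    return [replace(f, has_proof=True) for f in findings]

def run_pipeline(codebase: list[str]) -> list[Finding]:
    return validate(deduplicate(scan(prepare(codebase))))

confirmed = run_pipeline(["parser", "auth"])
print(len(confirmed))  # prints 2: three raw candidates collapse to two validated findings
```

The point of the staged shape is that each downstream stage shrinks or hardens the candidate set, so expensive proof construction only runs on deduplicated findings.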
Industry context
ComputerWeekly reports Anthropic has framed Project Glasswing (and a Mythos preview) as capable of discovering and, in some cases, developing exploits for zero-day flaws, prompting tight access controls around the tools. VentureBeat and other reporting earlier this year documented incidents where adversaries compromised AI security tools to exfiltrate data; VentureBeat warned those compromises could escalate once agentic tools gain write access to infrastructure. The Register quotes Dustin Childs of Trend Micro's Zero Day Initiative saying the immediate effect of AI-driven discovery will be more patches and more work for administrators, and that patch trust may erode if fixes break production.
Editorial analysis - technical context
Industry observers note that frontier models are improving at multi-step reasoning and chaining discrete findings into exploit paths, which raises the volume and severity of candidate findings. Companies that have tested these systems report higher recall for hidden bugs but also nontrivial false-positive rates and a need for human validation. For practitioners, that combination increases the importance of robust triage pipelines: vulnerability deduplication, exploitability validation, regression testing, and staged deployment workflows.
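The workload implication of the reported false-positive rate can be made concrete with a small back-of-the-envelope calculation. The sketch below is hypothetical arithmetic, using the ~30% false-positive figure Axios attributes to Palo Alto Networks' internal testing: every raw finding still needs human review, but only about 70% are expected to survive validation.

```python
# Hypothetical triage-workload estimate: a false-positive rate does not reduce
# the review queue, it only reduces how many findings survive it.
def triage_workload(raw_findings: int, false_positive_rate: float) -> dict:
    expected_confirmed = round(raw_findings * (1 - false_positive_rate))
    return {
        "to_review": raw_findings,                    # humans inspect every candidate
        "expected_confirmed": expected_confirmed,     # survive validation
        "expected_discarded": raw_findings - expected_confirmed,
    }

stats = triage_workload(raw_findings=100, false_positive_rate=0.30)
print(stats)
# {'to_review': 100, 'expected_confirmed': 70, 'expected_discarded': 30}
```

This is why the combination of higher recall and nontrivial false positives shifts cost toward validation: discovery volume scales the review queue linearly even when most discards are cheap.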
What to watch
Watch for wider adoption of guarded access programs (the kinds of controls ComputerWeekly describes around Project Glasswing) and for vendor disclosures that quantify false-positive rates and exploitability proofs. Observers should also follow whether disclosure volumes remain elevated as defenders scale human-in-the-loop validation, and whether patch deployment rates lag behind discovery rates, a gap that sources warn could increase operational risk.
Bottom line
Axios, The Register, SecurityWeek, ComputerWeekly, and VentureBeat document a clear increase in vulnerability discoveries after applying frontier AI models to vendor code. The technical gains in discovery come with operational friction: higher triage load, nontrivial false positives, and questions about secure handling of agentic tools, per the cited reporting.
Scoring Rationale
Frontier AI materially changes vulnerability discovery capacity and exploit-generation ability; practitioners must reassess tooling and triage workflows. The story affects many security teams and vendor disclosure volumes, making it industry-significant but not a single historic event.