LLM Agents Generate QuickJS Exploit Chains

A researcher ran experiments using Opus 4.5 and GPT-5.2 that produced more than 40 distinct exploits for a zero-day QuickJS vulnerability across six scenarios. The code and write-ups are published on GitHub. GPT-5.2 solved every scenario, and Opus 4.5 solved all but two. Typical runs were limited to 30M tokens (about $30), while the hardest task required roughly 50M tokens and three hours. The results suggest that offensive cyber capabilities could be industrialized, with token throughput as the limiting factor.

Scoring Rationale
High practical novelty and broad industry impact, tempered by the fact that the evidence comes from a single, non-peer-reviewed source and has seen limited replication.

