LLMs Show Limited Vulnerability Patching Ability
On Dec. 11, 2025, researchers tested LLMs from OpenAI, Meta, DeepSeek, and Mistral to see if they could automatically fix vulnerable Java functions in a single attempt. The experiments evaluated two vulnerability groups and found inconsistent success: models repaired some bugs but often produced incorrect or incomplete patches. The results suggest LLMs can assist but require human review and specialized tooling for reliable patching.
Key Points
- Tested LLMs from OpenAI, Meta, DeepSeek, Mistral on fixing vulnerable Java functions in one attempt
- Found models succeeded inconsistently, showing strengths on some vulnerability types but failing on others
- Indicates practitioners need human review and targeted tooling; LLM outputs are not yet reliable patches
Scoring Rationale
Notable study across multiple LLMs gives practical insight, but limited novelty and single-study scope reduce impact.

