Researchers Find Vulnerabilities in AI-Generated Code

Georgia Tech researchers scanned over 43,000 security advisories and identified 74 confirmed cases where AI-assisted development introduced vulnerabilities into code, including 14 critical and 25 high-severity issues. The findings link common generative tools such as `Claude`, `Gemini`, and `GitHub Copilot` to repeated insecure patterns like command injection, authentication bypass, and server-side request forgery. The team's detection pipeline uses metadata signatures, co-author tags, and bot emails to trace who introduced a buggy commit, and the researchers plan to add behavioral models that identify AI-written code from naming, structure, and error handling. The research highlights a systemic supply-chain and scale risk from mass adoption of code-generation tools, and calls for feedback loops and tooling to surface which models, prompts, and workflows create the most exposure.
What happened
Georgia Tech researchers in the Systems Software & Security Lab scanned over 43,000 public security advisories and identified 74 confirmed cases where AI-assisted development introduced vulnerabilities, including 14 critical and 25 high-severity issues. The flagged failures trace back to widespread usage of generative coding tools such as `Claude`, `Gemini`, and `GitHub Copilot`, showing that model repetition can amplify risk across many repositories.
Technical details
The team's detection pipeline, which the researchers describe as a radar for AI-introduced bugs, correlates vulnerability entries with git histories to pinpoint the introducing commit. Detection currently relies on metadata signals such as co-author tags, bot emails, and tool-specific signatures, plus heuristics that map advisory descriptions to code patterns. Common vulnerability classes observed include:
- Command injection
- Authentication bypass
- Server-side request forgery (SSRF)
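To make the first class concrete, here is a minimal sketch (with hypothetical helper names, not code from the study) of the command-injection shape that generated code often takes, next to the argument-vector form that avoids it:

```python
import subprocess

def run_insecure(user_input: str) -> str:
    # Vulnerable pattern: the input is spliced into a shell command line,
    # so an input like "good; echo INJECTED" executes a second,
    # attacker-chosen command.
    return subprocess.run(f"echo {user_input}", shell=True,
                          capture_output=True, text=True).stdout

def run_safe(user_input: str) -> str:
    # Safer pattern: argument-vector form with no shell, so the whole
    # input is a single argv entry and is never parsed as shell syntax.
    return subprocess.run(["echo", user_input],
                          capture_output=True, text=True).stdout

payload = "hello; echo INJECTED"
```

The vector form treats `payload` as literal data, which is why security guidance consistently prefers it over `shell=True` with interpolated strings.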
The researchers emphasize that generative models tend to repeat the same insecure constructs. Because millions of developers may use the same underlying models and prompts, discovering a single exploitable pattern enables broad scanning and exploitation.
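The metadata signals described above can be sketched as a simple commit classifier. The signature lists here are assumptions for illustration; the researchers' actual signal set is not detailed in the article:

```python
import re

# Assumed AI-tool markers (illustrative, not the study's real list).
AI_COAUTHOR_RE = re.compile(
    r"^Co-authored-by:.*(copilot|claude|gemini)",
    re.IGNORECASE | re.MULTILINE,
)
# GitHub-style bot account emails, e.g. dependabot[bot]@users.noreply.github.com
BOT_EMAIL_RE = re.compile(r"\[bot\]@users\.noreply\.github\.com$")

def looks_ai_assisted(commit_message: str, author_email: str) -> bool:
    """Flag a commit whose metadata carries an AI-tool signature."""
    return bool(AI_COAUTHOR_RE.search(commit_message)
                or BOT_EMAIL_RE.search(author_email))
```

A real pipeline would run checks like this over the `git log` of each repository named in an advisory, starting from the commit the advisory identifies as introducing the flaw.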
Future work and limitations
Metadata-based tracing misses sanitized or edited commits from which signatures have been removed. The next step is behavioral detection: building models that identify AI-written code from variable naming, function structure, error-handling patterns, and stylistic fingerprints. The team is also expanding its verification pipelines and ingesting more vulnerability databases to reduce sampling bias.
Context and significance
This research converts an anecdotal worry into quantified evidence: AI-assisted coding is not just noisy or inefficient; it can introduce systematically repeating, high-impact security bugs. That creates a new kind of software supply-chain risk in which model-level defects propagate across projects. The finding intersects with ongoing debates on prompt hygiene, model benchmarking for safety, and vendor responsibility for secure-by-default code generation.
What to watch
Practitioners should track the development of behavioral detectors and the integration of AI-origin flags into software composition analysis (SCA) tools and CI pipelines. Security teams should prioritize scanning for model-derived patterns and push vendors for safer generation defaults, hardened templates, and provenance metadata that survives commits.
Scoring Rationale
This research exposes a notable, practical risk from mass adoption of code generation tools: repeated model errors create scalable attack surfaces. It is technically important for practitioners but not a single-vendor catastrophic event.