Jenkins Project Develops Domain-Specific LLM For Diagnostics

Chirag Gupta was selected for Google Summer of Code 2025 to build a domain-specific LLM for Jenkins using ci.jenkins.io data. The project will fine-tune models on roughly 400,000 build logs and metadata to improve diagnosis of build failures, with planned preprocessing, QA pair generation, and initial fine-tuning runs during the coding period. Mentors include Kris Stern and others.
Key Points
- Fine-tunes an LLM using ci.jenkins.io build logs and metadata (≈400,000 files) for Jenkins diagnostics
- Addresses noisy logs, token inflation, and diverse artifacts to improve model relevance for real-world CI/CD failures
- Enables practitioners to accelerate build-failure diagnosis via preprocessing pipelines, QA pairs, and targeted fine-tuning
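
To make the preprocessing and QA-pair steps concrete, here is a minimal Python sketch of how noisy log text might be trimmed before fine-tuning. It is not the project's actual pipeline: the regex patterns, the `clean_log` and `make_qa_pair` helpers, and the question wording are all assumptions made for illustration, and real ci.jenkins.io logs may need different rules.

```python
import re

# Hypothetical patterns (assumptions): ANSI color codes and ISO-style
# timestamps are common sources of noise and extra tokens in CI logs.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
TIMESTAMP = re.compile(
    r"^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?\]?\s*"
)


def clean_log(raw: str) -> str:
    """Strip escape codes and timestamps, drop blank and repeated lines."""
    cleaned = []
    previous = None
    for line in raw.splitlines():
        line = ANSI_ESCAPE.sub("", line)          # remove terminal color codes
        line = TIMESTAMP.sub("", line).rstrip()   # remove a leading timestamp
        if not line or line == previous:          # skip blanks and consecutive duplicates
            continue
        cleaned.append(line)
        previous = line
    return "\n".join(cleaned)


def make_qa_pair(raw_log: str, diagnosis: str) -> dict:
    """Build one illustrative question/answer training record (hypothetical format)."""
    question = "Why did this build fail?\n\n" + clean_log(raw_log)
    return {"question": question, "answer": diagnosis}
```

Removing timestamps before deduplicating lets otherwise identical lines collapse into one, which is one plausible way to reduce token inflation. A human-written or metadata-derived diagnosis would still have to be supplied for each record.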
Scoring Rationale
Credible, actionable GSoC project applying LLM fine-tuning to real Jenkins logs, limited by early-stage, single-project scope.

