GitHub Uses Customer Interaction Data To Train Models

GitHub says it will begin using customer interaction data next month to train its Copilot models. That data includes inputs, outputs, code snippets, repository context, chats, and feedback. The policy, revised as of April 24, applies to Copilot Free, Pro, and Pro+ users; Copilot Business and Enterprise customers, students, and teachers are exempt. Affected users can opt out via /settings/copilot/features.
Key Points
- Collects user inputs, outputs, code snippets, repository context, chats, and feedback to train models.
- Aims to improve suggestion accuracy, security, and acceptance rates, citing gains observed with interaction data from Microsoft's own employees.
- Requires affected Copilot Free, Pro, and Pro+ users to opt out via settings if they do not want their data used; Business, Enterprise, students, and teachers are exempt.
Scoring Rationale
Official company policy change with broad privacy implications and direct opt-out actions; however, similar industry practices reduce its novelty.
