GitHub Copilot updates data usage policy for model training

✍️ OpenClawRadar | 📅 Published: March 26, 2026

Policy change details

GitHub announced that, starting April 24, 2026, interaction data from Copilot Free, Pro, and Pro+ users will be used to train and improve its AI models unless users opt out. Copilot Business and Copilot Enterprise users are not affected by this update.

If you previously opted out of data collection for product improvements, your preference has been retained. You can opt out in settings under "Privacy."

What data is collected

The interaction data that may be collected and leveraged includes:

  • Outputs accepted or modified by you
  • Inputs sent to GitHub Copilot, including code snippets shown to the model
  • Code context surrounding your cursor position
  • Comments and documentation you write
  • File names, repository structure, and navigation patterns
  • Interactions with Copilot features (chat, inline suggestions, etc.)
  • Your feedback on suggestions (thumbs up/down ratings)

What data is NOT used

This program does not use:

  • Interaction data from Copilot Business, Copilot Enterprise, or enterprise-owned repositories
  • Interaction data from users who opt out of model training in their Copilot settings
  • Content from your issues, discussions, or private repositories at rest

GitHub notes they use the phrase "at rest" deliberately because Copilot does process code from private repositories when you are actively using Copilot. This interaction data is required to run the service and could be used for model training unless you opt out.
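For readers who find rules easier to follow as code, the inclusion and exclusion criteria above can be sketched as a small predicate. This is only an illustration of the policy as summarized in this article, not GitHub's implementation; the `Interaction` type, its field names, and `may_be_used_for_training` are hypothetical.

```python
from dataclasses import dataclass

# Plans the announcement names as covered by the change.
TRAINING_PLANS = {"free", "pro", "pro+"}


@dataclass
class Interaction:
    """Hypothetical record of one Copilot interaction (illustrative only)."""
    plan: str                    # "free", "pro", "pro+", "business", "enterprise"
    opted_out: bool              # user disabled model training in Copilot settings
    enterprise_owned_repo: bool  # interaction happened in an enterprise-owned repository


def may_be_used_for_training(i: Interaction) -> bool:
    """Apply the rules as summarized in the article (assumed, not official logic)."""
    if i.opted_out or i.enterprise_owned_repo:
        return False
    # Private repositories are not a separate exclusion here: code processed
    # while actively using Copilot counts as interaction data.
    return i.plan.lower() in TRAINING_PLANS


print(may_be_used_for_training(Interaction("pro", False, False)))       # True
print(may_be_used_for_training(Interaction("business", False, False)))  # False
```

Note that the sketch has no "private repository" flag: per the "at rest" distinction, what matters is whether the code passed through Copilot during active use, not where it is stored.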

Data sharing and background

The data used in this program may be shared with GitHub affiliates, including Microsoft. This data will not be shared with third-party AI model providers or other independent service providers.

GitHub states they've already been incorporating interaction data from Microsoft employees and have seen meaningful improvements, including increased acceptance rates in multiple languages. They will also begin using interaction data from GitHub employees.

GitHub's initial models were built using a mix of publicly available data and hand-crafted code samples.

📖 Read the full source: HN LLM Tools
