Linux kernel maintainer reports sudden shift in AI-generated bug report quality

Greg Kroah-Hartman, a longtime Linux kernel maintainer, reports a significant change in AI-generated bug reports for the Linux kernel. About a month ago, he says, reports shifted from what he calls 'AI slop' (obviously wrong or low-quality security reports) to legitimate, useful findings.
The inflection point
Kroah-Hartman notes that 'something happened a month ago, and the world switched. Now we have real reports.' The shift isn't limited to the Linux kernel: he says all open source projects are seeing similar legitimate AI-generated reports. Security teams across major open source projects are discussing the change informally, and all of them report the same transition.
Impact on different projects
The Linux kernel team, being larger and more distributed, can handle the increased volume of reports. Kroah-Hartman states: 'For the kernel, we can handle it. We're a much larger team, very distributed, and our increase is real – and it's not slowing down.' However, he implies smaller projects have less capacity to absorb this sudden flood of plausible AI-generated bug reports.
AI's current role in kernel development
AI is currently showing up more as a reviewer and assistant than as a full author of Linux kernel code, though that line is starting to blur. Kroah-Hartman conducted his own experiment with AI-generated patches: 'I did a really stupid prompt. I said, "Give me this," and it spit out 60: "Here's 60 problems I found, and here's the fixes for them." About one-third were wrong, but they still pointed out a relatively real problem, and two-thirds of the patches were right.'
He notes that even the working patches still needed human cleanup, better changelogs, and integration work. For 'simple little error conditions, properly detecting error conditions,' AI could already generate dozens of usable patches today.
Tooling response
The increase in AI-generated reports has spurred integration of AI into the kernel's review infrastructure. A key tool is Sashiko, originally developed at Google and now donated to the Linux Foundation. Kroah-Hartman says: 'We need to be able to have an easy way to review some of these patches that come in ways that cut down on our load.' The tool is 'out there, running on almost all kernel patches' and publicly visible, with integration into review tools underway.