HN data confirms arXiv's share of stories is dropping. Is the LLM hype peak behind us?

Dylan Castillo wanted to confirm whether he was seeing fewer arXiv papers on Hacker News front pages, so he used Claude to run a quick analysis against the BigQuery HN dataset. The results show a clear trend: the share of arXiv stories on HN has declined sharply over the last few months.
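For readers who want to try a similar measurement, here is a minimal Python sketch of how a monthly arXiv share could be computed. It is not Castillo's actual query: the article does not say how he defined the share, so this version assumes it is simply the fraction of stories in a month whose URL points at arxiv.org. The row shape (`time`, `url`), the helper names, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from urllib.parse import urlparse


def is_arxiv(url):
    """Return True when the URL's host is arxiv.org or a subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    return host == "arxiv.org" or host.endswith(".arxiv.org")


def monthly_arxiv_share(stories):
    """Map 'YYYY-MM' to the fraction of that month's stories linking to arXiv.

    Assumption: each story is a dict with a datetime under 'time' and an
    optional 'url' (text-only posts such as Ask HN have no URL).
    """
    totals = defaultdict(int)
    arxiv = defaultdict(int)
    for story in stories:
        month = story["time"].strftime("%Y-%m")
        totals[month] += 1
        if is_arxiv(story.get("url") or ""):
            arxiv[month] += 1
    return {month: arxiv[month] / totals[month] for month in sorted(totals)}


# Made-up sample rows for illustration only; these are not real HN data.
sample = [
    {"time": datetime(2024, 1, 3), "url": "https://arxiv.org/abs/0000.00000"},
    {"time": datetime(2024, 1, 9), "url": "https://example.com/post"},
    {"time": datetime(2024, 1, 15), "url": None},
    {"time": datetime(2024, 1, 28), "url": "https://example.org/news"},
    {"time": datetime(2024, 2, 2), "url": "https://example.com/arxiv.org"},
    {"time": datetime(2024, 2, 20), "url": "https://example.net/blog"},
]

for month, share in monthly_arxiv_share(sample).items():
    print(f"{month}: {share:.1%}")
```

Matching on the parsed hostname rather than a substring keeps URLs that merely mention arxiv.org in their path from being counted. Restricting the input to front-page stories, if that is the population of interest, would have to happen before the rows reach this function.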
He also looked at historical peaks. The first peak, in 2019, was driven by deep learning papers: 41% of the top 100 upvoted arXiv posts that year were about deep learning. Standout papers from 2019 included MuZero (161 pts), EfficientNet (119 pts), the PyTorch NeurIPS paper (113 pts), Chollet's “On the Measure of Intelligence” (80 pts), and XLNet (79 pts). The 2023–2026 period saw an even heavier AI focus: 59% of the top 100 upvoted arXiv stories were about LLMs or AI.
For the 2023–2026 period, Castillo asked Claude to guess which papers will age well. The picks: DeepSeek-R1 (1,351 pts, open recipe for o1-style reasoning via RL), Generative Agents (391 pts, the “Smallville” paper), The Era of 1-bit LLMs / BitNet b1.58 (1,040 pts), Differential Transformer (562 pts), and the LK-99 cluster (2,408 + 1,690 pts combined, a landmark in open-science replication). The full analysis includes charts for topic distribution and the arXiv share over time.
📖 Read the full source: HN LLM Tools
👀 See Also

AI Usage in Development Hits 93%, Yet Productivity Gains Stagnate at 10%
AI coding assistants are now used by 93% of developers, yet the productivity boost remains limited to just 10%.

CEOs Report Minimal AI Impact on Productivity and Employment in Recent Study
A study of 6,000 executives found 90% reported no AI impact on employment or productivity over three years, with average AI usage at 1.5 hours per week. Economists compare this to Solow's productivity paradox from the 1980s IT era.

MLX Inference Performance Update: April 2026 Benchmarks and Features
MLX inference performance has improved significantly, with Qwen3.5-35B-A3B reaching 71.8 tokens/second at 4K context and new features like Multi-Token Prediction and SpecPrefill providing 2.3x-5.5x speedups for large models.

Dangerously Skip Reading Code: When LLMs Write Code Faster Than You Can Read It
What if we stop reviewing LLM-generated code and instead treat it like machine code? Move rigor to specifications and tests.