Using Claude Code to Automate AI Research Experiments for 12 Hours

Automated AI Research with Claude Code
A developer documented using Claude Code to automate AI research experiments for 12 hours straight. The project focused on CLaaS, a real-time continual learning framework that moves context into weights using self-distillation.
Experimental Setup
The goal was to tune self-distillation training runs to maximize a model's compliance with different preference verifiers, such as concise responses and no emojis. Experiments ran locally on an RTX 5090 overnight.
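To make the idea of a preference verifier concrete, here is a minimal sketch of two checks and a compliance score, computed as the fraction of responses that pass. The function names, the 50-word limit, and the emoji code point ranges are assumptions for illustration; the source does not describe how CLaaS implements its verifiers.

```python
# Hypothetical threshold: the source does not say what counts as "concise".
MAX_WORDS = 50


def is_concise(response: str, max_words: int = MAX_WORDS) -> bool:
    """Pass when the response has at most max_words whitespace-separated words."""
    return len(response.split()) <= max_words


def has_no_emoji(response: str) -> bool:
    """Pass when no character falls in two common emoji code point ranges.

    This is a rough heuristic, not a complete Unicode emoji detector.
    """
    for ch in response:
        cp = ord(ch)
        if 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF:
            return False
    return True


def compliance(responses, verifier) -> float:
    """Fraction of responses that pass the verifier, between 0.0 and 1.0."""
    if not responses:
        return 0.0
    return sum(1 for r in responses if verifier(r)) / len(responses)
```

A score of 0.000 under this definition would mean no sampled response passed the verifier, and 1.000 would mean all of them did.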
System Architecture
The repository was built to be highly configurable:
- Every tunable parameter exposed via CLI using Hydra config management
- HTML dashboards for every training step and evaluation run
- Metrics, inputs, and outputs made observable through dashboards
- Claude Code could query dashboards via curl requests to check progress
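The value of exposing every parameter through the CLI is that an agent can change one setting without editing code. The sketch below is a standard-library stand-in that applies dotted key=value overrides, in the style Hydra accepts on the command line, to a nested dict. It is not Hydra's API, and the parameter names and default values are hypothetical rather than taken from the CLaaS repository.

```python
import ast
import copy

# Hypothetical defaults; names and values are illustrative, not from the CLaaS repo.
DEFAULTS = {
    "train": {"learning_rate": 1e-5, "grad_steps_per_batch": 1},
    "eval": {"verifier": "no_emoji"},
}


def parse_value(raw: str):
    """Interpret raw text as a Python literal when possible, else keep it as a string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_overrides(config: dict, overrides: list) -> dict:
    """Return a copy of config with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            node = node[part]  # unknown sections raise KeyError
        if leaf not in node:
            raise KeyError(f"unknown parameter: {key}")
        node[leaf] = parse_value(raw)
    return result
```

For example, `apply_overrides(DEFAULTS, ["train.learning_rate=3e-5"])` returns a new config with only the learning rate changed and leaves `DEFAULTS` untouched. Rejecting unknown keys means a mistyped parameter name fails immediately instead of silently running the default.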
Experiment Management
The workflow was controlled by a local EXPERIMENTS.md file with specific rules:
- Each experiment could change at most one variable or make one code change
- Between experiments, the model had to either accept or revert the previous change based on results
- Any new code changes had to be exposed via config for later tuning
- The model recorded all progress, hypotheses, and outcomes in the file as a running journal
- The agent ran in a "Ralph Wiggum loop" with the goal of maximizing preference compliance
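The rules above amount to a one-change-at-a-time search with an accept-or-revert decision after each run. The following is a minimal sketch under stated assumptions: `Journal` is an in-memory stand-in for EXPERIMENTS.md, `evaluate` is a hypothetical callable that returns a compliance score for a config, and "accept only if the score strictly improves" is an assumed criterion that the source does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """In-memory stand-in for the EXPERIMENTS.md running journal."""
    entries: list = field(default_factory=list)

    def record(self, hypothesis, change, score, accepted):
        verdict = "accepted" if accepted else "reverted"
        self.entries.append(f"- {hypothesis}: {change} -> {score:.3f} ({verdict})")


def run_experiments(baseline, proposals, evaluate, journal):
    """Try one change at a time and keep it only if the score improves."""
    config = dict(baseline)
    best = evaluate(config)
    for hypothesis, change in proposals:
        if len(change) > 1:
            raise ValueError("each experiment may change at most one variable")
        candidate = {**config, **change}
        score = evaluate(candidate)
        accepted = score > best  # assumed acceptance rule
        journal.record(hypothesis, change, score, accepted)
        if accepted:
            config, best = candidate, score
        # otherwise the change is dropped and the next run starts from config
    return config, best
```

Because each experiment differs from the current best config in a single variable, any change in the score can be attributed to that variable, which is what makes the accept-or-revert decision meaningful.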
Results
Over 12 hours, the system ran 9 experiments:
- Found and fixed a model collapse bug on the first run
- Tuned gradient steps per batch to 4
- Tuned learning rate to 3e-5
- Compliance improved from 0.000 to 1.000
- Token usage was surprisingly low because most time was spent waiting for training runs between experiments
The same task was also run with Codex for 2 hours using a plain prompt, and it independently converged on the same hyperparameters.
Project repository: https://github.com/kfallah/CLaaS
📖 Read the full source: r/ClaudeAI