Fine-tuning llama3.2 3B for personalized health coaching using Apple Watch data and MLX

A developer created a personalized health coach LLM by fine-tuning llama3.2 3B on a Mac using Apple Health and Whoop data. The entire fine-tuning process took approximately 15 minutes using MLX.
Technical pipeline
The implementation follows this workflow:
- Apple Health and Whoop data stored in local SQLite database
- SQL RAG layer converts natural language queries to SQL
- Claude API used once to generate ~270 gold-standard training examples (anonymized question/SQL/result pairs, no personal health data sent)
- LoRA fine-tuning on llama3.2 3B via MLX
- Fused model served locally at 127.0.0.1:8080
Before vs. after fine-tuning
The source provides concrete examples of the improvement:
Before fine-tuning: "Your HRV is an important measure of autonomic nervous system function..." [500 words of generic advice]
After fine-tuning: "Your HRV averaged 68ms this week, down 12% from last week's 77ms. Coincides with 3 nights under 7 hours sleep. Consider reducing training intensity for 48 hours."
Memory footprint and hardware
- Model (4-bit): ~2 GB
- LoRA adapter: ~50 MB
- Training memory: ~4-5 GB total
- Runs on M-series Mac, no GPU needed
The developer mentions including technical details on SQL hallucination guardrails, cross-metric context enrichment, and the training pipeline in their full writeup. They also offer to answer questions about the MLX setup or RAG layer implementation.
📖 Read the full source: r/LocalLLaMA
👀 See Also

OpenClaw Agent Implements Autonomous Self-Improvement Loop with Nightly Dream Cycles
An OpenClaw user has configured their agent to run a nightly 'dream cycle' that scans AI research, reflects on performance, and implements safe improvements autonomously. The cycle costs approximately $0.40 per night using model routing with Haiku for scanning and Opus for judgment.

AI Agent Recommends Switching from GitHub Runners to Self-Hosted Mac Mini
An AI CEO agent analyzed CI/CD costs during a sprint and determined GitHub-hosted runners were wasteful, recommending a switch to a self-hosted Mac Mini instead. The human shareholder had scoped the project differently, but the AI's infrastructure judgment was correct.

Hybrid Local+API Approach Cuts AI Costs by 79% in Month-Long Test
A developer running a 24/7 AI assistant on a Hetzner VPS reduced monthly costs from $288 to $60 by strategically combining local models with API calls. The setup uses nomic-embed-text for embeddings and Qwen2.5 7B for background tasks, routing more complex work to Claude models.

Claude AI Used to Generate Performance Evaluation Document from User History
A developer used Claude AI to complete a 3-4 page performance evaluation document by asking it to 'complete this documentation using information you have about me.' The AI generated a detailed document in 5-6 minutes that included work contributions the user had almost forgotten.