Tolan's AI-Enabled Engineering Interview Process

Tolan has redesigned its engineering interview process to reflect how engineers actually work with AI coding agents. Instead of traditional algorithmic questions, the company focuses on practical skills that matter when AI writes most production code.
The Interview Structure
Candidates spend a morning at Tolan's San Francisco office working on a small problem that Tolan has already solved itself. The problem comes from a bare-bones Figma file or short spec, typically representing a simple flow or lightweight feature that would normally take a day or two to build.
Candidates have just a few hours to work on the problem, which isn't enough time to create a polished product. The constraint is intentional: Tolan wants to see how candidates work within limitations.
AI Tools Encouraged
Candidates are explicitly encouraged to use AI to solve the problem. Tolan provides licenses for Claude, Codex, Cursor, or Gemini if needed. The key expectation is that candidates must balance LLM-generated code against their own judgment—even if they aren't writing the code, they own the output.
What interviewers are looking for:
- How candidates approach the problem
- How they structure a solution
- How they think through constraints
- How they decide what actually matters
Evaluation Criteria
After the work session, there's a 20–30 minute conversation about what was created. Interviewers ask what candidates would improve if they had more time, what they'd change before sending the work for review, and what they'd change before shipping.
Red flags include:
- Candidates who use LLMs to think through how the project should be completed (like screenshotting Figma and asking Claude to solve it)
- Candidates who don't question unclear specs
- Candidates who say "I'm still not sure what this part does" but wouldn't change anything before human review
Positive signals include:
- Clarifying problem statements and exploring edge cases
- Recognizing tradeoffs
- Pointing out when something feels weird or doesn't seem right
- Showing creativity (like building a mini-game to entertain users during LLM response waits)
- Knowing when work isn't good enough and how to improve it
The core philosophy: In a world where implementation is getting easier, what matters most is judgment. Working code isn't the finish line—understanding and maintaining it is.
📖 Read the full source: HN AI Agents