Cowork vs. Claude Chat: Document Extraction Accuracy Comparison

A developer building a tool for analyzing the annual reports of publicly traded companies ran a controlled comparison between Claude.ai chat and Cowork for extracting data from dense financial PDFs. Each test used identical prompts and the same 140+ page PDFs containing financial tables, footnotes, and cross-referenced disclosures.
Test Results
Test 1 - Claude.ai chat: Uploaded PDF, pasted prompt. Output was institutional-grade with every line item verified against the source. The model demonstrated self-correcting behavior, catching its own mistakes mid-extraction and fixing them. No errors were found across 150+ data points checked.
Test 2 - Cowork (workflow with existing project folder): Produced 5 factual errors, extracted 30% less content than the chat run, and missed most forensic-depth material. Headline numbers were correct, but detail on sub-components was lost.
Test 3 - Cowork (clean folder, just PDF and prompt): Still produced errors including:
- Fabricated reconciling line items
- Reverse-engineered unit counts
- Multiple categories off by 20-90% from actual financial statement notes
- Prior-year column contamination (current-year figures correct, but FY2024 comparative figures had errors across earnings and FCF tables)
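
Deviations like the 20-90% gaps listed above can be surfaced by spot-checking an extraction against a small set of hand-verified values. The sketch below is a minimal illustration of that idea, not the developer's actual method; the function name `flag_deviations`, the 1% tolerance, and all figures are hypothetical.

```python
def flag_deviations(extracted, reference, tolerance=0.01):
    """Return {label: relative_error} for items outside `tolerance`.

    Labels present in `reference` but absent from `extracted`
    are reported with an error of None.
    """
    flagged = {}
    for label, true_value in reference.items():
        if label not in extracted:
            flagged[label] = None  # item was dropped by the extraction
            continue
        if true_value == 0:
            # Avoid division by zero: any non-zero value is an error.
            rel_error = 0.0 if extracted[label] == 0 else float("inf")
        else:
            rel_error = abs(extracted[label] - true_value) / abs(true_value)
        if rel_error > tolerance:
            flagged[label] = rel_error
    return flagged


# Hypothetical figures, for illustration only.
reference = {"Revenue": 1000.0, "Segment A": 600.0, "Segment B": 400.0}
extracted = {"Revenue": 1000.0, "Segment A": 450.0}

result = flag_deviations(extracted, reference)
print(result)
```

Here the headline total passes while one sub-component is off by 25% and another is missing, which mirrors the pattern of correct totals with unreliable breakdowns.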
Pattern Analysis
The developer observed that Cowork consistently produced correct current-year totals but unreliable line-item breakdowns. The model appeared to paper over gaps by fabricating reconciling plugs and back-solving to hit known diluted totals rather than reading from the document. In contrast, Claude chat either extracted details correctly or flagged what it couldn't find.
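
One coarse way to catch a back-solved or plugged figure is to check whether each extracted value actually appears as a number in the document text. The sketch below assumes the PDF text has already been extracted to a string and that values are kept as printed strings; `numbers_in_text` and `ungrounded_items` are hypothetical helper names, and the sample figures are invented.

```python
import re

# Matches tokens such as 1,250.5 or 830 or 3.2
NUMBER_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def numbers_in_text(text):
    """Collect every numeric token in `text`, thousands separators removed."""
    return {match.replace(",", "") for match in NUMBER_PATTERN.findall(text)}


def ungrounded_items(extracted, source_text):
    """Return labels whose value never appears as a number in the source."""
    available = numbers_in_text(source_text)
    missing = []
    for label, value in extracted.items():
        # Normalize the same way as the source: drop commas and the sign.
        token = value.replace(",", "").lstrip("-")
        if token not in available:
            missing.append(label)
    return missing


# Invented sample data, for illustration only.
source_text = "Net revenue 1,250.5 Cost of sales (830.0) Gross profit 420.5"
extracted = {
    "Net revenue": "1,250.5",
    "Cost of sales": "-830.0",
    "Other adjustments": "12.3",  # a plug that is not in the document
}

missing = ungrounded_items(extracted, source_text)
print(missing)
```

This is string matching only, so formatting differences such as 830 versus 830.0 produce false alarms, and a number that appears elsewhere in the document for an unrelated item will pass. It narrows the set of figures that need manual review rather than replacing it.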
The developer's conclusion is that Cowork's agentic task decomposition (chunking, sub-agents, parallel processing) cannot maintain the sustained attention required for long, cross-referenced financial documents. On this account, chat processes a PDF in a single deep pass, while Cowork breaks it up and loses fidelity.
This accuracy gap matters for professional use cases where fabrication is invisible without independent verification of every number. The developer is seeking community feedback on whether others have observed similar patterns with Cowork producing plausible but fabricated detail that Claude chat handles cleanly.
📖 Read the full source: r/ClaudeAI