Anthropic Uses Contractors to Improve Claude Code

June 1, 2026 | By LDS Team

Relevance Score: 6.9

Business Insider reports that an Anthropic project nicknamed "Marlin," run through a data vendor identified as Snorkel AI, is collecting developer feedback to fine-tune Claude Code. Two contractors told the outlet they were paid up to $280 per task to create prompts and review code, and that tasks typically took about an hour, with some requiring additional review by Snorkel's approval layer. According to the report, the contractors A/B tested outputs from two models, chose the code they preferred, and did not know which model versions they were evaluating. Companies building coding models often rely on high-skill contractors for nuanced labels; practitioners should view this as an example of scaled, paid human feedback rather than an automated benchmark.

What happened

Business Insider reports that an Anthropic project called Marlin, run via the data vendor Snorkel AI, is gathering feedback from human software engineers to improve Claude Code. The outlet, which reviewed the project guidelines, reports that freelancers with software engineering backgrounds were directed to A/B test code outputs from two different models and select the output they preferred. Two contractors said they were paid up to $280 per task, that tasks took about an hour on average, and that some submissions required additional back-and-forth with Snorkel's approval layer. The report adds that the project is ongoing and that the contractors did not know which model versions they were evaluating.

Technical details

Business Insider reports that the work focused on creating prompts and reviewing code, with reviewers comparing paired outputs to assess detail and maintainability. The project guidelines reviewed by the outlet instructed contractors to prefer outputs that met the prompt's expected level of detail.

Editorial analysis - technical context

Label generation for coding models frequently uses A/B preference collection and targeted prompt writing to shape style and maintainability. Companies and vendors commonly hire experienced developers for these tasks because evaluating code requires domain knowledge that generic labelers typically lack. For practitioners, this pattern implies that high-quality training signals for coding models often depend on curated human comparisons and prompt engineering expertise rather than purely automated metrics.
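As an illustration of the general pattern, blind A/B preference collection can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of the Marlin workflow: the `Comparison` record, the function names, and the model labels `model_a` and `model_b` are all hypothetical. It randomizes which side each output appears on, so position does not reveal the model, and then tallies each model's share of reviewer preferences.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass
class Comparison:
    """One blind A/B item. All field names here are hypothetical."""
    prompt: str
    output_left: str
    output_right: str
    left_model: str   # kept hidden from the reviewer
    right_model: str  # kept hidden from the reviewer


def make_blind_comparison(prompt, outputs_by_model, rng):
    """Randomize side assignment so position does not reveal the model."""
    models = list(outputs_by_model)
    if len(models) != 2:
        raise ValueError("expected outputs from exactly two models")
    rng.shuffle(models)
    left, right = models
    return Comparison(prompt, outputs_by_model[left], outputs_by_model[right], left, right)


def win_rates(comparisons, choices):
    """Return each model's share of wins; choices[i] is 'left' or 'right'."""
    if not comparisons:
        return {}
    wins = Counter()
    for comp, choice in zip(comparisons, choices):
        # Register both models so a model with zero wins still appears.
        wins[comp.left_model] += 0
        wins[comp.right_model] += 0
        winner = comp.left_model if choice == "left" else comp.right_model
        wins[winner] += 1
    total = len(comparisons)
    return {model: count / total for model, count in wins.items()}
```

A raw win rate is the simplest possible aggregate. Real preference pipelines typically go further, for example by fitting a reward model to the comparisons, and nothing in the reporting says which method is used here.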

Context and significance

Industry reporting places this story in a broader trend in which data-labeling platforms and vendor-managed contractor pools play a critical, paid role in improving commercial coding assistants. Tracking contractor compensation and review workflows is relevant to the reproducibility and auditability of model behavior.

What to watch

Editorial analysis: Observers should watch for vendor disclosures about reviewer instructions, sample sizes, and repeatability of A/B tests, and public reporting on whether similar programs disclose model versions or evaluation datasets.
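One common way to quantify the repeatability of A/B labels is inter-rater agreement: have two reviewers judge the same pairs and measure how often they agree beyond chance. The sketch below computes Cohen's kappa from two lists of "left"/"right" choices. It is an illustrative assumption about how such a check could be run, not something the reporting attributes to Anthropic or Snorkel AI, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two reviewers on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two reviewers chose the same side.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Agreement expected if each reviewer labeled independently at their own base rates.
    expected = sum(freq_a[label] * freq_b[label] for label in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both reviewers always gave the same single label
    return (observed - expected) / (1 - expected)
```

Values near 1 indicate that reviewers agree well beyond chance, while values near 0 suggest the preferences are close to noise, which is why sample sizes and reviewer instructions matter for interpreting any A/B result.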

Key Points

1. Business Insider reports Anthropic uses a Snorkel AI-run project called Marlin to collect developer feedback for Claude Code.
2. Reported contractor pay reached $280 per task, signaling high-skill labeling costs and vendor-managed labor intensity.
3. Editorial analysis: A/B preference labeling and prompt generation are common methods to shape coding-model outputs, affecting reproducibility.

Scoring Rationale

The story reveals operational details of how a commercial coding model is improved using paid developer feedback and vendor-managed A/B testing. That matters to practitioners building or auditing code-generation systems, but it is not a frontier-model release or new architecture.

Sources

Primary source and supporting public references used for this report.

1 source

Primary source: businessinsider.com, "Inside the unseen operation to turbocharge Claude Code"
