Research, llm, alignment, anthropic, prompt injection

Anthropic Trains Claude With Internal Soul Document

December 2, 2025 | By LDS Team

Relevance Score: 8.0

Researcher Richard Weiss extracted a 14,000-token 'soul overview' from Claude 4.5 Opus at release, and Anthropic researcher Amanda Askell confirmed that the document was used during supervised learning. It outlines values, safety priorities, and guidance such as skepticism toward claimed contexts and defenses against prompt injection. The disclosure shows that Anthropic embeds alignment-oriented guidance directly into model training to shape behavior.

Key Points

1. Finds a 14,000-token internal 'soul' document used to shape Claude 4.5 Opus behavior.
2. Explains Anthropic's intent to instill safety, values, and skepticism toward prompt injection attacks.
3. Signals that alignment guidance is incorporated during supervised learning (SL), with implications for prompt design and pipeline security.

Scoring Rationale

The confirmed internal training document reveals concrete alignment practices, but the findings are limited to Anthropic's Claude 4.5 Opus.

Sources

Public references used for this report.

1. simonwillison.net: Claude 4.5 Opus' Soul Document
