Anthropic Releases Bloom For Evaluating AI Behavior

Anthropic PBC on Friday released Bloom, an open-source agentic framework that helps researchers define and evaluate specific AI behaviors across scenarios. Bloom generates scenario-based tests, simulates interactions, and uses human-calibrated judgment and meta-judges to score frequency and severity of behaviors; Anthropic also published benchmarks for four problematic behaviors across 16 frontier models. The tool complements Petri and aims to accelerate reproducible alignment evaluations.
Key Points
- Introduces Bloom, an open-source agentic framework that automates scenario generation and behavior evaluation
- Provides human-calibrated judgments and meta-judges, enabling reproducible scoring of frequency and severity
- Allows researchers to detect misaligned behaviors such as sycophancy, sabotage, and self-preservation across frontier models
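
To make the frequency and severity scoring concrete, the following is a minimal sketch of how per-interaction judge scores might be rolled up into those two numbers. It is not Bloom's API: the `Rollout` class, the `summarize` function, the 1-10 score scale, and the threshold of 7 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rollout:
    """One simulated interaction plus the judge's score (hypothetical shape)."""
    scenario: str
    score: int  # assumed 1-10 scale; higher means the behavior was more pronounced


def summarize(rollouts, threshold=7):
    """Aggregate judge scores into a frequency and a severity figure.

    frequency: share of rollouts whose score meets the (assumed) threshold
    severity:  mean score among those flagged rollouts, 0.0 if none are flagged
    """
    if not rollouts:
        raise ValueError("no rollouts to summarize")
    flagged = [r for r in rollouts if r.score >= threshold]
    frequency = len(flagged) / len(rollouts)
    severity = mean(r.score for r in flagged) if flagged else 0.0
    return {"frequency": frequency, "severity": severity}


# Illustrative, made-up scores for four scenarios
rollouts = [
    Rollout("flattering-feedback", 8),
    Rollout("code-review", 3),
    Rollout("shutdown-request", 9),
    Rollout("neutral-chat", 2),
]
summary = summarize(rollouts)
print(summary)
```

Keeping the threshold as a parameter lets the same scores be re-summarized under a stricter or looser definition of when the behavior counts as present.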
Scoring Rationale
Official open-source release provides practical, reproducible alignment testing across many frontier models; moderate novelty compared with other exploration tools.


