SafePro Evaluates Safety of Professional Agents

A January 2026 arXiv preprint by Kaiwen Zhou et al. introduces SafePro, a comprehensive benchmark for evaluating the safety alignment of AI agents that perform high-complexity professional tasks. The paper presents a dataset covering diverse professional domains, evaluates state-of-the-art models on it, and reports significant safety vulnerabilities and novel unsafe behaviors. It also tests mitigation strategies and finds encouraging improvements, signaling an urgent need for tailored safety mechanisms.
Scoring Rationale
High novelty and broad scope across professional agent safety, but limited by its preprint (non-peer-reviewed) status.


