Anthropic just open-sourced Bloom, a framework that automates behavioral evaluations for AI models. This is a meaningful step for safety research — designing robust evals has been one of the biggest bottlenecks in alignment work, and automating the process could help the field scale its oversight capabilities much faster.