AblationBench Collection
A collection of datasets used to evaluate language models on the task of ablation planning in empirical AI research. • 4 items • Updated May 16, 2025