A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness • Paper • 2603.06594 • Published Feb 4
CoinflipForSafety • Collection • Datasets from the paper: A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness (arXiv: https://arxiv.org/abs/2603.06594) • 4 items • Updated Mar 16