SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning Paper • 2512.03244 • Published 24 days ago • 16
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance Paper • 2511.12997 • Published Nov 17 • 10
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP Paper • 2510.05644 • Published Oct 7 • 23
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization Paper • 2404.00530 • Published Mar 31, 2024
ModelCitizens: Representing Community Voices in Online Safety Paper • 2507.05455 • Published Jul 7 • 4
How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models Paper • 2407.00369 • Published Jun 29, 2024
Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation Paper • 2505.18842 • Published May 24 • 36
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents Paper • 2504.13203 • Published Apr 15 • 35
MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations Paper • 2504.07830 • Published Apr 10 • 18
SciCode: A Research Coding Benchmark Curated by Scientists Paper • 2407.13168 • Published Jul 18, 2024 • 16
Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model Paper • 2402.10965 • Published Feb 14, 2024 • 1