Papers from LIME Lab
Safer-Instruct: Aligning Language Models with Automated Preference Data (arXiv:2311.08685)
CLIMB: A Benchmark of Clinical Bias in Large Language Models (arXiv:2407.05250)
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective (arXiv:2502.14296)
WildFeedback: Aligning LLMs With In-situ User Interactions and Feedback (arXiv:2408.15549)
Detecting and Filtering Unsafe Training Data via Data Attribution (arXiv:2502.11411)
Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base (arXiv:2503.23361)
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning (arXiv:2504.05520)
The Hallucination Tax of Reinforcement Finetuning (arXiv:2505.13988)