Keyu Duan

vermouthdky

https://kduan.live

AI & ML interests

LLM Reasoning and Safety

Recent Activity

upvoted a paper about 2 months ago

Diffusion Language Models are Super Data Learners

upvoted a paper about 2 months ago

Defeating the Training-Inference Mismatch via FP16

upvoted a paper about 2 months ago

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Organizations

upvoted 3 papers about 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 29

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30, 2025 • 82

updated a dataset 2 months ago

axon-rl/webshop_instructions

Viewer • Updated Oct 27, 2025 • 6.91k • 42

published a dataset 2 months ago

axon-rl/webshop_instructions

Viewer • Updated Oct 27, 2025 • 6.91k • 42

updated a dataset 2 months ago

axon-rl/webshop

Viewer • Updated Oct 27, 2025 • 1k • 28

published a dataset 2 months ago

axon-rl/webshop

Viewer • Updated Oct 27, 2025 • 1k • 28

authored 2 papers 3 months ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published Apr 14, 2025 • 13

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 89

upvoted 3 papers 3 months ago

upvoted 3 papers 7 months ago

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published May 28, 2025 • 29

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 26

Lifelong Safety Alignment for Language Models

Paper • 2505.20259 • Published May 26, 2025 • 23

upvoted a paper 9 months ago

Efficient Process Reward Model Training via Active Learning

Paper • 2504.10559 • Published Apr 14, 2025 • 13

updated 2 models 9 months ago

sail/ActPRM-X

7B • Updated Apr 15, 2025 • 16

sail/ActPRM

7B • Updated Apr 15, 2025 • 9

updated a collection 9 months ago

🚀 Active PRM

Collection

Efficient Process Reward Model Training via Active Learning. • 4 items • Updated Apr 16, 2025 • 3
