- starriver030515/hapo_data
  Viewer • Updated • 1.59k • 41
- starriver030515/Qwen2.5-Math-1.5B-16k
  Text Generation • 2B • Updated • 7
- starriver030515/Qwen2.5-Math-7B-32k
  Text Generation • 8B • Updated • 3
- From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature
  Paper • 2509.16591 • Published • 2
Zheng Liu
starriver030515
AI & ML interests
None yet
Recent Activity
liked a model about 20 hours ago
MiniMaxAI/MiniMax-M2.7
upvoted a paper about 21 hours ago
Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs
upvoted a paper 6 days ago
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models
Organizations
FUSION-Data
FUSION-Stage1
HAPO
FUSION-Model
FUSION-Data
FUSION-Stage1.5
FUSION-Stage1