LangAGI-Lab/qwen-7b-instruct-8k-dpo-preference-set • 8.16k
LangAGI-Lab/qwen-7b-verified-7k-rejection-sampling-alpaca-format • 7.38k • 1
LangAGI-Lab/MetaMATH_30K_llama_ppo • 30k • 1
LangAGI-Lab/train-rl-o1-mini-annotated-math-numina-22k • 22k • 5 • 1
LangAGI-Lab/train-rl-o1-mini-annotated-math-numina-10k-numeric-answer • 10k
LangAGI-Lab/numina-cot-verifiable-10k • 10k • 1
LangAGI-Lab/train-rl-o1-mini-annotated-magpie-hard-math-22k • 22k • 2
LangAGI-Lab/magpie-reasoning-v1-10k-verification-alpaca-format • 7.31k • 1
LangAGI-Lab/MetaMATH_30K_llama • 30k • 2
LangAGI-Lab/MetaMATH_SFT_50K_new • 50k • 1
LangAGI-Lab/magpie-reasoning-v1-10k-step-by-step-rationale-alpaca-format • 10k • 51 • 1
LangAGI-Lab/magpie-reasoning-v1-10k-step-by-step-rationale • 10k • 51
LangAGI-Lab/magpie-reasoning-v1-100k-thought-summary • 96.4k • 1 • 1
LangAGI-Lab/MetaMATH_SFT_50K • 50k • 1
LangAGI-Lab/general-reasoning-1k-o1-mini-api-thought-cost-w-metadata • 857 • 2 • 2
LangAGI-Lab/MetaMATH_30K_new • 30k
LangAGI-Lab/math-train-7k • 7.5k
LangAGI-Lab/general-reasoning-1k • 1k • 1
LangAGI-Lab/critic2_feedbackonly • 2k
LangAGI-Lab/critic1_feedbackonly • 2k
LangAGI-Lab/math-train-1K • 985
LangAGI-Lab/med_critic2_train_1000 • 1.4k
LangAGI-Lab/med_critic1_train_1000 • 1.5k
LangAGI-Lab/Medical_reward_bench • 2.36k • 43
LangAGI-Lab/train-self-refine-dist • 12.9k
LangAGI-Lab/mini_rm_benchmark_for_web_agent • 128 • 1
LangAGI-Lab/world_model_for_wa_desc_with_tao_dataset_with_transition_count • 14.7k • 2
LangAGI-Lab/Multimodal-Mind2Web-HTML-WM-messages-filter-35000 • 4.34k • 3
LangAGI-Lab/Multimodal-Mind2Web-HTML-WM-messages • 6.77k • 4