Praxis-VLM - a zhehuderek Collection

Praxis-VLM

updated Sep 23, 2025

A VLM trained with text-driven GRPO for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)

zhehuderek/textual_decisionmaking_data

Viewer • Updated Apr 9, 2025 • 11k • 12 • 1

Note: This is the synthetic text data we used for model training.
zhehuderek/praxis_vlm_7b_decisionmaking

Image-Text-to-Text • 8B • Updated Jun 3, 2025
zhehuderek/praxis_vlm_3b_decisionmaking

Image-Text-to-Text • 4B • Updated Jun 3, 2025
zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf

Image-Text-to-Text • 4B • Updated Apr 9, 2025

Note: This is the model checkpoint after cold-start math training on the GEOQA-8K dataset.
zhehuderek/qwen2_5_vl_7b_GEOQA_8K_step90_hf

Image-Text-to-Text • 8B • Updated Sep 21, 2025

Note: This is the model checkpoint after cold-start math training on the GEOQA-8K dataset.