MinCoder-4B-Exp / README.md

beyoru

metadata

base_model: Qwen/Qwen3-4B-Instruct-2507
tags:
  - transformers
  - qwen3
license: apache-2.0
language:
  - en
library_name: transformers

Model details

This model is fine-tuned from Qwen3-4B-Instruct-2507 using a custom reinforcement learning (RL) framework that rewards the model for producing solutions that pass automated test cases, similar to how submissions are judged on LeetCode.

Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, an approach intended to promote generalization and reasoning ability in algorithmic problem-solving.
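
The training framework itself is not published here, so the following is only a minimal sketch of what a test-case-based reward could look like. It assumes the reward is the fraction of test cases a generated solution passes; the function name `reward_from_tests`, the `entry_point` argument, and the `(args, expected)` test-case format are all hypothetical and chosen for illustration. The real reward may be binary, shaped differently, or computed in a sandbox.

```python
from typing import Any, Sequence, Tuple

# Hypothetical test-case format: (positional arguments, expected return value).
TestCase = Tuple[Tuple[Any, ...], Any]


def reward_from_tests(
    candidate_source: str,
    entry_point: str,
    test_cases: Sequence[TestCase],
) -> float:
    """Return the fraction of test cases passed by the generated code.

    Illustrative only: the actual reward used to train this model is not
    documented. Note that exec() runs untrusted code without isolation;
    a real pipeline would need a sandbox and time limits.
    """
    if not test_cases:
        return 0.0

    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # define the candidate function
        func = namespace[entry_point]
    except Exception:
        # Code that fails to compile, or lacks the entry point, earns nothing.
        return 0.0

    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A runtime error counts as a failed test case.
            pass
    return passed / len(test_cases)


if __name__ == "__main__":
    cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
    correct = "def add(a, b):\n    return a + b\n"
    print(reward_from_tests(correct, "add", cases))
```

No reference solution appears anywhere in this sketch: the score depends only on whether the generated code's behavior matches the expected outputs, which is the property the description above relies on.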

This is an experimental model.