Text Classification

confidence-probe

sequence-classification

qwen3.5-hard-only-r4

Summary

Base model: Qwen/Qwen3.5-4B
Dataset: tytodd/qwen3.5-4b-v1
Checkpoint: tytodd/qwen3.5-hard-only-r4

OOD Evaluation

benchmark	n	auroc	accuracy
arc_challenge	1000	0.8875	0.8890
judge_bench	278	0.7065	0.6583
mmlu	1000	0.7550	0.7680
mmlu_pro	1000	0.6889	0.7070
rod101_essay_scoring	81	0.7115	0.7407
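
The card does not say how the auroc and accuracy columns were computed. As a minimal sketch of the standard definitions, assuming the probe emits one confidence score in [0, 1] per example, that label 1 means the base model's answer was correct, and that accuracy uses a 0.5 threshold (all three are assumptions, not details from this card):

```python
def auroc(labels, scores):
    """Rank-based AUROC: probability that a random positive example
    scores higher than a random negative one, ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples where the thresholded score matches the label.
    The 0.5 threshold is an assumption, not a value stated in the card."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Toy data for illustration only; these are not values from the benchmarks above.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.65, 0.3, 0.2]

print(f"auroc={auroc(labels, scores):.4f}")
print(f"accuracy={accuracy(labels, scores):.4f}")
```

The pairwise loop is quadratic in the number of examples, which is fine at the sizes in the table (n up to 1000). AUROC is threshold-free while accuracy depends on the chosen cutoff, which is one reason the two columns can rank benchmarks differently.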

MMLU AUROC with Tuning (by percentage of data used to train)
