
This is an experimental depth upscale of Qwen2.5 14B to a total of 21.4B parameters. 24 layers were added (layers 30-41 inclusive, each repeated twice), bringing the total to 72 layers.
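The 21.4B figure can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch assuming Qwen2.5-14B's published configuration (hidden size 5120, intermediate size 13824, 40 query / 8 key-value heads with head dim 128, vocab 152064, untied embeddings); norms and biases are omitted:

```python
# Approximate parameter count for the 72-layer upscale.
hidden, inter, vocab = 5120, 13824, 152064
kv_dim = 8 * 128  # GQA key/value projection width

per_layer = (
    2 * hidden * hidden    # q_proj + o_proj
    + 2 * hidden * kv_dim  # k_proj + v_proj
    + 3 * hidden * inter   # gate_proj, up_proj, down_proj
)
embeddings = 2 * vocab * hidden  # input embedding + lm_head

total = 72 * per_layer + embeddings
print(f"{total / 1e9:.1f}B")  # ≈ 21.4B
```

The same arithmetic with 48 layers gives roughly 14.8B, consistent with the base model.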

The added layers had their o_proj and down_proj modules zeroed out prior to retraining, as seen in other modern depth-upscaling experiments. Zeroing these two projections means each duplicate initially writes nothing into the residual stream, so the upscaled model starts out behaving like the base model.
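The duplication-plus-zeroing procedure can be sketched as follows. This is a minimal illustration, not the actual merge script: `ToyLayer` is a stand-in for a full decoder layer, keeping only the two projections that get zeroed (named after the Hugging Face Qwen2 module names), and `depth_upscale` is a hypothetical helper:

```python
import copy
import torch.nn as nn

# Toy stand-in for one decoder layer. A real Qwen2.5 layer holds full
# attention and MLP stacks; only the two zeroed projections are modeled.
class ToyLayer(nn.Module):
    def __init__(self, dim: int = 8):
        super().__init__()
        self.o_proj = nn.Linear(dim, dim, bias=False)     # attention output
        self.down_proj = nn.Linear(dim, dim, bias=False)  # MLP down-projection

def depth_upscale(layers: nn.ModuleList, start: int, end: int,
                  copies: int = 2) -> nn.ModuleList:
    """Insert `copies` duplicates after each layer in [start, end).

    With o_proj and down_proj zeroed, each duplicate contributes nothing
    to the residual stream, so the new layers are trained from a
    near-identity state.
    """
    out = []
    for i, layer in enumerate(layers):
        out.append(layer)
        if start <= i < end:
            for _ in range(copies):
                dup = copy.deepcopy(layer)
                nn.init.zeros_(dup.o_proj.weight)
                nn.init.zeros_(dup.down_proj.weight)
                out.append(dup)
    return nn.ModuleList(out)

base = nn.ModuleList(ToyLayer() for _ in range(48))  # Qwen2.5-14B depth
model = depth_upscale(base, start=30, end=42)        # layers 30-41 inclusive
print(len(model))  # 48 + 12 * 2 = 72
```

In practice this kind of passthrough layer-stacking is usually done with a merge tool rather than hand-written code, but the resulting layer layout is the same.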

The upscaled model was then trained on roughly 10M tokens of instruct and creative data, the majority of it general instruct data, to repair the connections through the duplicated layers.

Safetensors · Model size: 21B params · Tensor type: BF16

Model tree for Columbidae/Qwen2.5-21B-Experimental: 1 quantization.