Reference paper: "Model Stock: All we need is just a few fine-tuned models" (arXiv:2403.19522)
Original name: NexesMess/Llama_3.x_70b_Dolnemlimwhitessachi_v1.0
Release name: Dolmen v1.2
Replacing: https://huggingface.co/NexesMess/Llama_3.x_70b_Dolnemhertulimtess_v1.0 (Dolmen v1.0)
OUT :
IN :
There's definite progress. The prose is more structured, and the EOS token is now working.
Thanks to Blackroot, for his own observations on his Mirai series and for the hint about the Hitachi model, which I then tested with delight.
This is a merge of pre-trained language models created using mergekit.
This model was merged using the Model Stock merge method, with Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02 as the base.
The following models were included in the merge:
- huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated
- huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
- WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
- migtissera/Tess-3-Llama-3.1-70B
- hitachi-nlp/Llama-3.1-70B-FLDx2
The following YAML configuration was used to produce this model:
merge_method: model_stock
models:
  - model: Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
    parameters:
      weight: 1.0
  - model: huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated
    parameters:
      weight: 1.0
  - model: huihui-ai/Tess-R1-Limerick-Llama-3.1-70B-abliterated
    parameters:
      weight: 1.0
  - model: WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
    parameters:
      weight: 1.0
  - model: migtissera/Tess-3-Llama-3.1-70B
    parameters:
      weight: 1.0
  - model: hitachi-nlp/Llama-3.1-70B-FLDx2
    parameters:
      weight: 1.0
base_model: Nexesenex/Llama_3.x_70b_L3.3_Dolphin_128K_v1.02
dtype: bfloat16
out_dtype: bfloat16
parameters:
  int8_mask: true
  normalize: true
  rescale: false
  filter_wise: false
  smooth: false
  allow_negative_weights: false
chat_template: auto
tokenizer:
  source: union