GeM2-Llamion-14B

We have released Llamion as GeM 2.0, the second series of generative models developed by VAIV Company to address our principal business needs.

Llamion (Llamafied Orion) is derived by transforming the Orion model into the standard LLaMA architecture through parameter mapping and offline knowledge transfer. Further technical specifications and study results are detailed in our paper.


Notably, the LongChat model supports a long context of up to 200K tokens. The following figure shows the perplexity of the models on the English Wikipedia corpus and the Korean Wikipedia corpus, respectively.

[Figure: model perplexity on the English and Korean Wikipedia corpora (ppl_wiki_enko)]
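
The model card does not say how the perplexity values in the figure were computed. As a rough sketch, assuming the standard definition (the exponential of the mean negative log-likelihood per token), the calculation could look like the following. The function name `perplexity` and the toy inputs are illustrative and are not taken from the Llamion code or paper.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) for a token sequence.

    token_log_probs holds the natural-log probability that the model
    assigned to each observed token. How those values are obtained
    (model, tokenizer, context window) is outside this sketch.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Toy check: a model that assigns probability 0.25 to every observed
# token is as uncertain as a uniform choice among four options, so its
# perplexity is approximately 4.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))
```

Lower values mean the model assigns higher probability to the reference text. Scores are only comparable between models when they are computed over the same text and, strictly speaking, the same tokenization.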

Contributors
