Can we get the bpe.model and unigram.vocab files for the streaming zipformer model?
#6, opened by programindz
I'm using the English streaming zipformer model (with sherpa-onnx), available at:
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-kroko-2025-08-06.tar.bz2
However, the current model fails to pick up some contextual words and phrases. I therefore want to supply a hotwords_file, along with the .vocab file (which contains the tokens and their log probabilities), to boost those words and phrases.
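For anyone unsure what the two files look like, here is a rough sketch of what I mean. It assumes the .vocab file is in the usual SentencePiece layout (one `token<TAB>log_prob` pair per line) and that the hotwords file holds one phrase per line with an optional ` :score` suffix for a per-phrase boost. The suffix format, the helper names `read_vocab` and `write_hotwords`, and the example phrases are assumptions for illustration, not something confirmed for this model.

```python
from pathlib import Path


def read_vocab(path):
    """Parse a SentencePiece-style .vocab file into {token: log_prob}.

    Assumes one 'token<TAB>log_prob' pair per line.
    """
    vocab = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        token, _, score = line.rpartition("\t")
        vocab[token] = float(score)
    return vocab


def write_hotwords(path, phrases):
    """Write one phrase per line to a hotwords file.

    `phrases` is a list of (phrase, score) tuples. A score of None writes
    the bare phrase; otherwise ' :score' is appended (assumed format).
    """
    lines = []
    for phrase, score in phrases:
        lines.append(phrase if score is None else f"{phrase} :{score}")
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
```

With both files in place, the idea would be to point sherpa-onnx at the hotwords file and the .vocab file when decoding with a beam-search method, so the listed phrases get boosted.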
For those (older) models, we can no longer provide these files. For the newer ones, we are still considering what to do; no decision has been made yet.
Since the model is open source, can you share information about the data corpus used for training/fine-tuning?