Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Paper: arXiv:2305.11685
This repo contains the models from our paper Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation, INTERSPEECH 2023.
Model type: ARMHuBERT is an open-source speech SSL model distilled from HuBERT-Base using attention map reusing and masking distillation. We also provide checkpoints for MaskHuBERT (masking distillation only, without attention map reusing) and ARMwavLM (distilled from a wavLM-Base teacher).
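As a rough illustration (not the authors' implementation), attention map reusing lets a student layer skip its query/key projections and instead apply an attention map computed by an earlier layer to its own value projection. The sketch below assumes single-head attention; the class name `AttentionReuseLayer` and the `reused_attn` argument are hypothetical.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionReuseLayer(nn.Module):
    """Hypothetical sketch: a self-attention block that can either compute
    its own attention map or reuse one produced by an earlier layer."""

    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor, reused_attn: Optional[torch.Tensor] = None):
        # x: (batch, seq_len, dim)
        if reused_attn is None:
            # Standard path: compute the attention map from Q and K.
            q = self.q_proj(x)
            k = self.k_proj(x)
            attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        else:
            # Reuse path: skip the Q/K projections and use the map from a
            # previous layer as-is.
            attn = reused_attn
        v = self.v_proj(x)
        out = self.out_proj(attn @ v)
        # Return the map so a subsequent layer can reuse it.
        return out, attn
```

Calling a later layer with the attention map returned by an earlier one is where the parameter and compute savings come from; refer to the paper and repository for the actual layer configuration.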
License: Apache 2.0 License
Where to send questions or comments about the model: https://github.com/sungnyun/ARMHuBERT/issues
Pretraining data: LibriSpeech
- [ModelName]-100h.ckpt: train-clean-100
- [ModelName]-960h.ckpt: train-clean-100 + train-clean-360 + train-other-500

More details are in our GitHub repository, https://github.com/sungnyun/ARMHuBERT.
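As a minimal sketch of inspecting a downloaded checkpoint with plain PyTorch (the file name below and the `state_dict` key are assumptions; see the GitHub repository for the actual loading scripts):

```python
import torch

# Hypothetical file name; substitute the checkpoint you downloaded.
ckpt_path = "ARMHuBERT-960h.ckpt"

# The provided .ckpt files can be loaded as ordinary torch pickles.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Inspect the top-level keys before matching the weights to the model
# definition in the ARMHuBERT repository.
print(checkpoint.keys())
if "state_dict" in checkpoint:
    for name, tensor in list(checkpoint["state_dict"].items())[:5]:
        print(name, tuple(tensor.shape))
```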