ssc-qxp-mms-model-mix-adapt-max

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified in this card (listed as None). It achieves the following results on the evaluation set:

  • Loss: 0.1612
  • CER (character error rate): 0.0900
  • WER (word error rate): 0.5028
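
A minimal usage sketch, not part of the original card: it assumes the checkpoint follows the standard MMS/Wav2Vec2 CTC interface, so it loads the repo with `AutoProcessor`/`AutoModelForCTC` and applies greedy CTC decoding. The placeholder audio array is for illustration only.

```python
import numpy as np
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "ctaguchi/ssc-qxp-mms-model-mix-adapt-max"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)
model.eval()

# Placeholder input: one second of silence. Replace with real speech
# sampled at 16 kHz, e.g. librosa.load(path, sr=16_000)[0].
audio = np.zeros(16_000, dtype=np.float32)

inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the argmax token at each frame, then collapse
# repeats and blanks in batch_decode.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids))
```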

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30
  • mixed_precision_training: Native AMP
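
The hyperparameters above map onto a `TrainingArguments` configuration roughly as follows. This is a hedged reconstruction, not the author's actual training script; the `output_dir` is hypothetical.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ssc-qxp-mms-model-mix-adapt-max",  # hypothetical path
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 16
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30,
    fp16=True,                       # "Native AMP" mixed precision
)
```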

Training results

| Training Loss | Epoch   | Step | Validation Loss | CER    | WER    |
|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|
| 0.215         | 0.9975  | 200  | 0.1886          | 0.1071 | 0.5928 |
| 0.1815        | 1.9925  | 400  | 0.1718          | 0.1030 | 0.5625 |
| 0.1654        | 2.9875  | 600  | 0.1648          | 0.1021 | 0.5597 |
| 0.1564        | 3.9825  | 800  | 0.1603          | 0.1003 | 0.5478 |
| 0.146         | 4.9776  | 1000 | 0.1513          | 0.0957 | 0.5358 |
| 0.1375        | 5.9726  | 1200 | 0.1558          | 0.1001 | 0.5478 |
| 0.139         | 6.9676  | 1400 | 0.1593          | 0.0994 | 0.5368 |
| 0.1266        | 7.9626  | 1600 | 0.1467          | 0.0952 | 0.5322 |
| 0.1267        | 8.9576  | 1800 | 0.1552          | 0.0971 | 0.5432 |
| 0.1254        | 9.9526  | 2000 | 0.1536          | 0.0982 | 0.5377 |
| 0.1169        | 10.9476 | 2200 | 0.1497          | 0.0930 | 0.5193 |
| 0.115         | 11.9426 | 2400 | 0.1527          | 0.0965 | 0.5303 |
| 0.116         | 12.9377 | 2600 | 0.1505          | 0.0953 | 0.5239 |
| 0.1096        | 13.9327 | 2800 | 0.1560          | 0.0960 | 0.5303 |
| 0.1051        | 14.9277 | 3000 | 0.1599          | 0.0976 | 0.5340 |
| 0.1016        | 15.9227 | 3200 | 0.1541          | 0.0941 | 0.5156 |
| 0.0958        | 16.9177 | 3400 | 0.1592          | 0.0956 | 0.5303 |
| 0.0894        | 17.9127 | 3600 | 0.1573          | 0.0950 | 0.5211 |
| 0.0838        | 18.9077 | 3800 | 0.1567          | 0.0939 | 0.5257 |
| 0.0836        | 19.9027 | 4000 | 0.1631          | 0.0943 | 0.5175 |
| 0.0803        | 20.8978 | 4200 | 0.1612          | 0.0927 | 0.5156 |
| 0.0761        | 21.8928 | 4400 | 0.1526          | 0.0883 | 0.5037 |
| 0.0753        | 22.8878 | 4600 | 0.1589          | 0.0931 | 0.5184 |
| 0.0724        | 23.8828 | 4800 | 0.1597          | 0.0928 | 0.5129 |
| 0.0689        | 24.8778 | 5000 | 0.1622          | 0.0918 | 0.5101 |
| 0.0644        | 25.8728 | 5200 | 0.1620          | 0.0907 | 0.5055 |
| 0.066         | 26.8678 | 5400 | 0.1591          | 0.0888 | 0.5    |
| 0.0663        | 27.8628 | 5600 | 0.1590          | 0.0873 | 0.5    |
| 0.0659        | 28.8579 | 5800 | 0.1600          | 0.0894 | 0.5037 |
| 0.0615        | 29.8529 | 6000 | 0.1612          | 0.0900 | 0.5028 |
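
The CER and WER columns above are presumably standard edit-distance metrics. A minimal sketch of how such scores are typically computed with the Hugging Face `evaluate` library (an assumption, not the author's evaluation code):

```python
import evaluate

# Both metrics divide total edit distance by reference length,
# at the character level for CER and the word level for WER.
cer_metric = evaluate.load("cer")
wer_metric = evaluate.load("wer")

predictions = ["hello world"]   # decoded model outputs (toy example)
references = ["hello word"]     # ground-truth transcripts (toy example)

print(cer_metric.compute(predictions=predictions, references=references))
print(wer_metric.compute(predictions=predictions, references=references))
```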

Framework versions

  • Transformers 4.57.2
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0