ssc-kbd-mms-model

This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics):

  • Loss: 0.2559
  • CER: 0.0920
  • WER: 0.5172
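
As a minimal usage sketch (not part of the original card): assuming the checkpoint follows the standard MMS/Wav2Vec2 CTC interface and expects 16 kHz mono audio, transcription could look like the following. The repository id and the file name sample.wav are illustrative.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

# Assumed repository id; adjust to wherever the checkpoint is hosted.
model_id = "ctaguchi/ssc-kbd-mms-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS checkpoints expect 16 kHz mono input; "sample.wav" is a placeholder.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```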

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
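
These values map onto transformers' TrainingArguments roughly as below. This is a hedged reconstruction, not the original training script; output_dir and the fp16 flag are assumptions (the card only says "Native AMP").

```python
from transformers import TrainingArguments

# Hedged reconstruction of the run configuration from the list above.
training_args = TrainingArguments(
    output_dir="ssc-kbd-mms-model",  # assumed output directory
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=12,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 8 * 2 = 16
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=10,
    fp16=True,                       # "Native AMP" (assumed fp16 rather than bf16)
)
```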

Training results

| Training Loss | Epoch  | Step  | Validation Loss | CER    | WER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
| 0.8242        | 0.1719 | 200   | 0.6052          | 0.1907 | 0.8410 |
| 0.5396        | 0.3438 | 400   | 0.4636          | 0.1535 | 0.7533 |
| 0.4706        | 0.5157 | 600   | 0.4237          | 0.1411 | 0.6953 |
| 0.4313        | 0.6876 | 800   | 0.3889          | 0.1342 | 0.6967 |
| 0.399         | 0.8595 | 1000  | 0.3817          | 0.1263 | 0.6548 |
| 0.3835        | 1.0309 | 1200  | 0.3536          | 0.1204 | 0.6379 |
| 0.4002        | 1.2028 | 1400  | 0.3461          | 0.1178 | 0.6223 |
| 0.3667        | 1.3747 | 1600  | 0.3403          | 0.1168 | 0.6230 |
| 0.3641        | 1.5466 | 1800  | 0.3356          | 0.1158 | 0.6277 |
| 0.3461        | 1.7185 | 2000  | 0.3271          | 0.1127 | 0.6118 |
| 0.3539        | 1.8904 | 2200  | 0.3223          | 0.1109 | 0.6007 |
| 0.3404        | 2.0619 | 2400  | 0.3188          | 0.1093 | 0.5941 |
| 0.3285        | 2.2338 | 2600  | 0.3115          | 0.1083 | 0.5927 |
| 0.3332        | 2.4057 | 2800  | 0.3093          | 0.1075 | 0.5888 |
| 0.3276        | 2.5776 | 3000  | 0.3062          | 0.1047 | 0.5783 |
| 0.3274        | 2.7495 | 3200  | 0.3033          | 0.1045 | 0.5749 |
| 0.3137        | 2.9214 | 3400  | 0.2981          | 0.1042 | 0.5717 |
| 0.3095        | 3.0928 | 3600  | 0.3001          | 0.1050 | 0.5807 |
| 0.3146        | 3.2647 | 3800  | 0.3041          | 0.1058 | 0.5788 |
| 0.3147        | 3.4366 | 4000  | 0.2922          | 0.1039 | 0.5865 |
| 0.2873        | 3.6085 | 4200  | 0.2905          | 0.1013 | 0.5628 |
| 0.2973        | 3.7804 | 4400  | 0.2887          | 0.1014 | 0.5590 |
| 0.3028        | 3.9523 | 4600  | 0.2853          | 0.1011 | 0.5583 |
| 0.2747        | 4.1238 | 4800  | 0.2881          | 0.0983 | 0.5490 |
| 0.2928        | 4.2957 | 5000  | 0.2897          | 0.1000 | 0.5556 |
| 0.2825        | 4.4676 | 5200  | 0.2872          | 0.0982 | 0.5492 |
| 0.2861        | 4.6394 | 5400  | 0.2820          | 0.0990 | 0.5535 |
| 0.277         | 4.8113 | 5600  | 0.2831          | 0.0986 | 0.5509 |
| 0.2827        | 4.9832 | 5800  | 0.2805          | 0.0970 | 0.5434 |
| 0.2695        | 5.1547 | 6000  | 0.2758          | 0.0970 | 0.5455 |
| 0.2696        | 5.3266 | 6200  | 0.2748          | 0.0962 | 0.5396 |
| 0.2834        | 5.4985 | 6400  | 0.2716          | 0.0966 | 0.5408 |
| 0.2786        | 5.6704 | 6600  | 0.2786          | 0.0970 | 0.5362 |
| 0.2741        | 5.8423 | 6800  | 0.2693          | 0.0948 | 0.5315 |
| 0.2816        | 6.0138 | 7000  | 0.2697          | 0.0952 | 0.5330 |
| 0.2587        | 6.1856 | 7200  | 0.2682          | 0.0951 | 0.5347 |
| 0.2703        | 6.3575 | 7400  | 0.2666          | 0.0940 | 0.5304 |
| 0.2503        | 6.5294 | 7600  | 0.2671          | 0.0949 | 0.5327 |
| 0.2656        | 6.7013 | 7800  | 0.2654          | 0.0944 | 0.5284 |
| 0.2565        | 6.8732 | 8000  | 0.2668          | 0.0935 | 0.5246 |
| 0.2518        | 7.0447 | 8200  | 0.2683          | 0.0932 | 0.5262 |
| 0.2477        | 7.2166 | 8400  | 0.2666          | 0.0930 | 0.5281 |
| 0.2575        | 7.3885 | 8600  | 0.2632          | 0.0932 | 0.5227 |
| 0.2523        | 7.5604 | 8800  | 0.2640          | 0.0932 | 0.5242 |
| 0.2383        | 7.7323 | 9000  | 0.2622          | 0.0928 | 0.5207 |
| 0.2366        | 7.9042 | 9200  | 0.2629          | 0.0931 | 0.5230 |
| 0.2381        | 8.0756 | 9400  | 0.2606          | 0.0926 | 0.5198 |
| 0.24          | 8.2475 | 9600  | 0.2609          | 0.0921 | 0.5171 |
| 0.2408        | 8.4194 | 9800  | 0.2590          | 0.0923 | 0.5185 |
| 0.2443        | 8.5913 | 10000 | 0.2575          | 0.0916 | 0.5171 |
| 0.251         | 8.7632 | 10200 | 0.2579          | 0.0919 | 0.5160 |
| 0.2418        | 8.9351 | 10400 | 0.2578          | 0.0915 | 0.5156 |
| 0.2382        | 9.1066 | 10600 | 0.2570          | 0.0912 | 0.5142 |
| 0.2342        | 9.2785 | 10800 | 0.2560          | 0.0915 | 0.5159 |
| 0.2297        | 9.4504 | 11000 | 0.2568          | 0.0917 | 0.5146 |
| 0.2365        | 9.6223 | 11200 | 0.2557          | 0.0917 | 0.5163 |
| 0.2275        | 9.7942 | 11400 | 0.2565          | 0.0918 | 0.5172 |
| 0.2436        | 9.9661 | 11600 | 0.2559          | 0.0920 | 0.5172 |
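
For reference, the CER and WER columns can be computed with the evaluate library. The snippet below is a generic sketch with made-up transcripts, not the card's actual evaluation script.

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Illustrative strings only; not drawn from the actual evaluation set.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

# WER counts word-level edits and CER character-level edits,
# each normalized by the reference length.
print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```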

Framework versions

  • Transformers 4.57.2
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0