ssc-kbd-mms-model-mix-adapt-max3

This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2958
  • CER: 0.1016
  • WER: 0.5602
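
Since this is an MMS (Wav2Vec2-based) CTC model, transcription should follow the standard `transformers` inference pattern. Below is a minimal sketch, not an officially tested snippet; the audio file path is a placeholder, and input audio is assumed to be 16 kHz mono as MMS expects:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "ctaguchi/ssc-kbd-mms-model-mix-adapt-max3"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS models expect 16 kHz mono audio; "sample.wav" is a placeholder path.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: pick the most likely token at each frame,
# then let the processor collapse repeats and remove blanks.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```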

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
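
As a rough guide, these settings map onto a `transformers.TrainingArguments` configuration along the following lines. This is a sketch reconstructed from the list above, not the author's actual training script; `output_dir` in particular is an assumption:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ssc-kbd-mms-model-mix-adapt-max3",  # assumed, not from the card
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=6,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=10,
    fp16=True,  # "Native AMP" mixed precision
)
```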

Training results

| Training Loss | Epoch  | Step  | Validation Loss | CER    | WER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
| 3.5251        | 0.1674 | 200   | 3.3282          | 0.9409 | 0.9997 |
| 3.3827        | 0.3347 | 400   | 3.2021          | 0.8943 | 0.9989 |
| 3.2375        | 0.5021 | 600   | 3.0549          | 0.8991 | 0.9992 |
| 3.0856        | 0.6695 | 800   | 2.9390          | 0.8932 | 0.9997 |
| 2.8165        | 0.8368 | 1000  | 2.6709          | 0.8018 | 0.9985 |
| 2.5668        | 1.0042 | 1200  | 2.3290          | 0.6999 | 0.9928 |
| 2.2592        | 1.1715 | 1400  | 2.1523          | 0.6312 | 0.9837 |
| 2.0947        | 1.3389 | 1600  | 1.9143          | 0.5989 | 0.9787 |
| 1.9453        | 1.5063 | 1800  | 1.7976          | 0.5524 | 0.9730 |
| 0.8778        | 1.6736 | 2000  | 0.6186          | 0.1867 | 0.8573 |
| 0.5716        | 1.8410 | 2200  | 0.5031          | 0.1593 | 0.7850 |
| 0.5111        | 2.0084 | 2400  | 0.4497          | 0.1460 | 0.7329 |
| 0.4645        | 2.1757 | 2600  | 0.4273          | 0.1395 | 0.7253 |
| 0.4543        | 2.3431 | 2800  | 0.4175          | 0.1366 | 0.6998 |
| 0.4452        | 2.5105 | 3000  | 0.4014          | 0.1338 | 0.6871 |
| 0.4193        | 2.6778 | 3200  | 0.3840          | 0.1283 | 0.6676 |
| 0.419         | 2.8452 | 3400  | 0.3942          | 0.1275 | 0.6631 |
| 0.4125        | 3.0126 | 3600  | 0.3751          | 0.1254 | 0.6516 |
| 0.3857        | 3.1799 | 3800  | 0.3679          | 0.1225 | 0.6468 |
| 0.3897        | 3.3473 | 4000  | 0.3621          | 0.1208 | 0.6416 |
| 0.3852        | 3.5146 | 4200  | 0.3569          | 0.1187 | 0.6344 |
| 0.3851        | 3.6820 | 4400  | 0.3523          | 0.1196 | 0.6327 |
| 0.3725        | 3.8494 | 4600  | 0.3527          | 0.1177 | 0.6281 |
| 0.3758        | 4.0167 | 4800  | 0.3484          | 0.1163 | 0.6204 |
| 0.3731        | 4.1841 | 5000  | 0.3421          | 0.1149 | 0.6165 |
| 0.3541        | 4.3515 | 5200  | 0.3459          | 0.1144 | 0.6159 |
| 0.3667        | 4.5188 | 5400  | 0.3408          | 0.1152 | 0.6143 |
| 0.3572        | 4.6862 | 5600  | 0.3383          | 0.1139 | 0.6113 |
| 0.3479        | 4.8536 | 5800  | 0.3350          | 0.1126 | 0.6037 |
| 0.3389        | 5.0209 | 6000  | 0.3388          | 0.1147 | 0.6159 |
| 0.328         | 5.1883 | 6200  | 0.3267          | 0.1125 | 0.6091 |
| 0.3479        | 5.3556 | 6400  | 0.3227          | 0.1105 | 0.6052 |
| 0.3303        | 5.5230 | 6600  | 0.3223          | 0.1103 | 0.5937 |
| 0.3172        | 5.6904 | 6800  | 0.3212          | 0.1103 | 0.6014 |
| 0.3377        | 5.8577 | 7000  | 0.3185          | 0.1095 | 0.5939 |
| 0.3332        | 6.0251 | 7200  | 0.3209          | 0.1082 | 0.5910 |
| 0.3079        | 6.1925 | 7400  | 0.3182          | 0.1090 | 0.5945 |
| 0.3116        | 6.3598 | 7600  | 0.3185          | 0.1080 | 0.5888 |
| 0.3055        | 6.5272 | 7800  | 0.3128          | 0.1067 | 0.5841 |
| 0.3173        | 6.6946 | 8000  | 0.3119          | 0.1072 | 0.5832 |
| 0.3069        | 6.8619 | 8200  | 0.3124          | 0.1064 | 0.5812 |
| 0.3038        | 7.0293 | 8400  | 0.3075          | 0.1062 | 0.5824 |
| 0.3026        | 7.1967 | 8600  | 0.3093          | 0.1060 | 0.5779 |
| 0.2994        | 7.3640 | 8800  | 0.3090          | 0.1057 | 0.5748 |
| 0.2989        | 7.5314 | 9000  | 0.3103          | 0.1057 | 0.5722 |
| 0.3029        | 7.6987 | 9200  | 0.3040          | 0.1054 | 0.5748 |
| 0.3088        | 7.8661 | 9400  | 0.3050          | 0.1043 | 0.5733 |
| 0.2985        | 8.0335 | 9600  | 0.3044          | 0.1047 | 0.5726 |
| 0.2932        | 8.2008 | 9800  | 0.3035          | 0.1031 | 0.5675 |
| 0.2872        | 8.3682 | 10000 | 0.3027          | 0.1041 | 0.5722 |
| 0.2841        | 8.5356 | 10200 | 0.2985          | 0.1029 | 0.5701 |
| 0.2946        | 8.7029 | 10400 | 0.2998          | 0.1028 | 0.5669 |
| 0.2905        | 8.8703 | 10600 | 0.2976          | 0.1031 | 0.5682 |
| 0.2862        | 9.0377 | 10800 | 0.2977          | 0.1023 | 0.5645 |
| 0.2838        | 9.2050 | 11000 | 0.2986          | 0.1031 | 0.5662 |
| 0.2794        | 9.3724 | 11200 | 0.2969          | 0.1023 | 0.5627 |
| 0.278         | 9.5397 | 11400 | 0.2975          | 0.1023 | 0.5654 |
| 0.2926        | 9.7071 | 11600 | 0.2955          | 0.1021 | 0.5630 |
| 0.2769        | 9.8745 | 11800 | 0.2958          | 0.1016 | 0.5602 |
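
The CER and WER columns above are character and word error rates. A minimal sketch of how such scores can be computed with the Hugging Face `evaluate` library (the prediction and reference strings here are placeholders, not data from this model):

```python
import evaluate

# WER = (substitutions + deletions + insertions) / number of reference words;
# CER is the same edit-distance ratio computed over characters.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["the transcribed hypothesis"]  # placeholder model output
references = ["the reference transcription"]  # placeholder ground truth

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```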

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4