Qwen3 models (123M/300M/600M) trained from scratch on 2.47B Kazakh and Russian (kk+ru) tokens. Includes tokenizer, datasets, and checkpoints.
Saken Tukenov PRO
stukenov
Recent Activity
updated a model 1 day ago
stukenov/kzcalm-baseline-v1
updated a model 1 day ago
stukenov/sozkz-core-omniaudio-300m-kk-ctc-v1
updated a model 1 day ago
stukenov/sozkz-core-omniaudio-600m-kk-asr-v1
Organizations
models 72
stukenov/kzcalm-baseline-v1
Updated
stukenov/sozkz-core-omniaudio-50m-kk-ctc-v1
Automatic Speech Recognition • Updated
stukenov/sozkz-core-omniaudio-150m-kk-ctc-v1
Updated
stukenov/sozkz-core-omniaudio-300m-kk-ctc-v1
Updated
stukenov/sozkz-core-omniaudio-600m-kk-ctc-v1
Updated
stukenov/sozkz-core-omniaudio-600m-kk-asr-v1
Updated
stukenov/sozkz-core-omniaudio-1b-kk-ctc-v1
Updated • 1
stukenov/sozkz-core-omniaudio-300m-kk-asr-v1
Automatic Speech Recognition • Updated
stukenov/sozkz-core-omniaudio-50m-kk-asr-v1
Automatic Speech Recognition • Updated
stukenov/sozkz-core-omniaudio-150m-kk-asr-v1
Automatic Speech Recognition • Updated
datasets 43
stukenov/sozkz-corpus-tokenized-kk-morphbpe100k-v1
Viewer • Updated • 1.75M • 6
stukenov/sozkz-corpus-clean-enkk-fineweb-edu-v1
Viewer • Updated • 18M • 13
stukenov/sozkz-corpus-clean-v3
Viewer • Updated • 13.5M • 169
stukenov/kaznet-crawl-raw
Viewer • Updated • 1.55M • 1 • 1
stukenov/198d06e0-3940-48b1-85f5-52e1b18bd393
Updated • 1
stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
Viewer • Updated • 1.51M • 39
stukenov/sozkz-corpus-segmented-kk-v1
Viewer • Updated • 55.5M • 22
stukenov/sozkz-corpus-gec-benchmark-kk-v1
Viewer • Updated • 1.44k • 88
stukenov/sozkz-corpus-pretrain-gec-mix-v1
Viewer • Updated • 1.77M • 5
stukenov/sozkz-corpus-synthetic-kk-gec-rulebased-v1
Viewer • Updated • 1.06M • 11