
Svarah-Whisper-v1

Svarah-Whisper-v1 is a LoRA fine-tuned version of openai/whisper-small for automatic speech recognition (ASR) on the Svarah dataset. The model uses Parameter-Efficient Fine-Tuning (PEFT) with LoRA to adapt Whisper for English speech transcription.

Model Summary

  • Base model: openai/whisper-small
  • Fine-tuning method: LoRA via PEFT
  • Task: Automatic Speech Recognition (ASR)
  • Language: English
  • Dataset: Svarah
  • Frameworks: PyTorch, Transformers, PEFT

Intended Use

This model is intended for:

  • English speech transcription
  • ASR experimentation with PEFT/LoRA
  • Whisper fine-tuning research and benchmarking

It may be useful for lightweight adaptation workflows where full fine-tuning is too expensive.

Limitations

  • This model is fine-tuned on the Svarah dataset and may not generalize well to domains outside that distribution.
  • Performance may degrade on:
    • noisy audio
    • accented speech not represented in training
    • long-form recordings
    • low-quality microphone input
  • The reported results are specific to the preprocessing and evaluation setup used for the Svarah evaluation split.

Evaluation

The model was evaluated on the preprocessed Svarah evaluation set containing 665 samples.

Metrics

  • Word Error Rate (WER): 39.09%
  • Word Accuracy Rate (WAR): 60.91%

These results reflect the current checkpoint and preprocessing pipeline used during evaluation.
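
For reference, WER is the word-level edit distance (substitutions + insertions + deletions) between the hypothesis and the reference transcript, divided by the number of reference words; WAR here is simply 100 − WER. A minimal pure-Python sketch of the metric (in practice a library such as jiwer is the usual choice; the example sentences below are illustrative, not from the Svarah data):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One deleted word out of six reference words -> WER of 1/6.
score = wer("the cat sat on the mat", "the cat sat on mat")
```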

Training Details

Training Procedure

The model was trained using a custom PyTorch training loop with:

  • torch.amp mixed precision training
  • BF16 computation
  • gradient accumulation
  • AdamW optimization
  • linear learning rate scheduling with warmup

Training Hyperparameters

  • Learning rate: 1e-4
  • Per-device train batch size: 8
  • Gradient accumulation steps: 2
  • Effective batch size: 16
  • Warmup steps: 100
  • Max training steps: 5000
  • Optimizer: AdamW
  • Weight decay: 0.01
  • LR scheduler: linear
  • Mixed precision: BF16 (torch.bfloat16)
  • Max gradient norm: 1.0
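
The schedule implied by these hyperparameters (linear warmup to the peak learning rate, then linear decay to zero at the final step) can be written as a plain function; this mirrors the standard linear-schedule-with-warmup formula, as the exact implementation is not shown in this card:

```python
def linear_lr(step: int, base_lr: float = 1e-4,
              warmup_steps: int = 100, max_steps: int = 5000) -> float:
    """Linear warmup to base_lr, then linear decay to 0 at max_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (max_steps - step) / (max_steps - warmup_steps))

peak = linear_lr(100)   # equals the configured learning rate, 1e-4
final = linear_lr(5000)  # decayed to 0.0 at the last training step
```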

LoRA Configuration

  • Rank (r): 32
  • Alpha: 64
  • Dropout: 0.05
  • Target modules: q_proj, v_proj
  • Bias: none
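
Expressed as a PEFT `LoraConfig`, these settings would look roughly like the following (the `task_type` value is an assumption; for Whisper-style encoder-decoder models it is typically the string "SEQ_2_SEQ_LM"):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                                 # LoRA rank
    lora_alpha=64,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    bias="none",
    task_type="SEQ_2_SEQ_LM",             # assumption; must be a string
)
```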

Framework Versions

  • PEFT: 0.18.1
  • Transformers: version not pinned; use a release compatible with PEFT 0.18.1
  • PyTorch: version not pinned; use a release compatible with PEFT 0.18.1

Usage

Load the model

import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor

base_model_name = "openai/whisper-small"
peft_model_id = "Akshatkasera007/Svarah-Whisper-v1"

# The processor bundles Whisper's feature extractor (log-mel spectrograms)
# and its tokenizer.
processor = WhisperProcessor.from_pretrained(base_model_name)

# Load the frozen base model, then attach the LoRA adapter weights on top.
base_model = WhisperForConditionalGeneration.from_pretrained(
    base_model_name,
    low_cpu_mem_usage=True
)

model = PeftModel.from_pretrained(base_model, peft_model_id)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
model.eval()
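
Continuing from the loading code above, a minimal transcription sketch (the silent 16 kHz dummy waveform and the `language`/`task` generation arguments are illustrative; replace the array with real audio loaded via e.g. librosa or soundfile, resampled to 16 kHz mono):

```python
import numpy as np
import torch

# Hypothetical input: 5 seconds of silent 16 kHz mono audio as a placeholder.
audio_array = np.zeros(5 * 16000, dtype=np.float32)

inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    generated_ids = model.generate(
        input_features=inputs.input_features.to(model.device),
        language="en",      # force English output
        task="transcribe",
    )
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
```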