# Model Card for soketlabs/sarthi-agri-v1
## Model Details
Sarthi-agri-v1 is a research-preview large language model domain-adapted for structured agricultural advisory generation. It is fine-tuned from google/gemma-3-27b-it and optimized for taxonomy-driven reasoning across agronomic parameters such as crop type, growth stage, climate variables, soil properties, farming practices, and regional conditions, producing actionable advisories for farmers.
The model is optimized for:
- Agricultural reasoning and advisory
- Structured analytical thinking
- Multilingual responses (Hindi-first, farmer-friendly)
The goal of Sarthi-Agri is to act as a domain expert assistant for agronomy workflows, decision support systems, and rural advisory automation.
### Model Description
The google/gemma-3-27b-it model was chosen for soketlabs/sarthi-agri-v1 because of its strong language understanding, reliable instruction-following behavior, and high-quality text generation. We further induced thinking token generation to improve structured reasoning from taxonomy-based agricultural inputs, enabling deeper analytical responses and more accurate, consistent agricultural advisories.
## Uses
soketlabs/sarthi-agri-v1 is designed for professionals, researchers, and developers working on agricultural intelligence and decision-support systems.
## Limitations
- The model does not replace certified agricultural experts.
- Predictions may vary under extreme or unseen climatic conditions.
- Regional crop practices may differ beyond training coverage.
- Numerical precision is approximate and advisory-oriented.
- Requires GPU resources for optimal performance.
## Safety Considerations
- Outputs are informational and advisory only.
- Automated enforcement of pesticide dosages is not recommended without expert validation.
- The model avoids generating harmful chemical or medical prescriptions.
- Users should validate critical decisions with local agronomy authorities.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("soketlabs/sarthi-agri-v1")
model = AutoModelForCausalLM.from_pretrained(
    "soketlabs/sarthi-agri-v1", torch_dtype=torch.bfloat16
)
```
### Input

```python
sample_input_data = {
    "Month": "January",
    "Weather": "गरम और आर्द्र मौसम",  # hot and humid weather
    "Soil Type": "काली मिट्टी",  # black soil
    "Region": "महाराष्ट्र",  # Maharashtra
    "Language": "Hindi",
    "Crop": "कपास",  # cotton
}

messages = [
    {
        "role": "system",
        "content": """You are a helpful District Agricultural Officer providing crop advisory to farmers.
Output Restrictions:
- Be polite
- Give the advisory in the user-given input language""",
    },
    {
        "role": "user",
        "content": (
            "Please generate crop advisory using the following structured input:\n"
            f"{sample_input_data}\n"
            "Think carefully and follow the output protocol strictly."
        ),
    },
]
```
## Training Details
We performed supervised fine-tuning (SFT) using LoRA on the soketlabs/Sarathi-AgriData dataset.
### Training Hyperparameters
- Fine-tuning Method: LoRA-based supervised fine-tuning
- Precision: BFloat16
- Batch Size: 32 (global), 1 (micro)
- Context Length: 3100
- Learning Rate: 6.0e-05
- Optimizer: AdamW
- LR Scheduler: Cosine
- Mixed Precision Training Enabled
- Training Objective:
  - Structured reasoning alignment
  - Taxonomy-conditioned generation
  - Controlled output formatting using thinking tokens
  - Multilingual advisory consistency
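The LoRA setup above can be sketched with the `peft` library. The rank, alpha, dropout, and target modules below are illustrative assumptions — the card specifies only the method, precision, batch size, context length, learning rate, optimizer, and scheduler:

```python
from peft import LoraConfig

# Illustrative LoRA configuration for causal-LM fine-tuning.
# r, lora_alpha, lora_dropout, and target_modules are assumptions,
# not values published for sarthi-agri-v1.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```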
### Dataset Characteristics
- Curated agricultural taxonomy datasets
- Climate-conditioned crop advisory samples
- Pest, disease, and stress scenario annotations
- Numeric-to-text reasoning samples
- Multi-language supervision (Hindi / English)
No personally identifiable information (PII) was included in training data.
## Thinking Token Design
The model uses explicit control tokens to enforce structured reasoning and deterministic output formatting:
| Token | Purpose |
|---|---|
| `<unused0>` | Marks the beginning of the internal analytical reasoning section |
| `<unused1>` | Marks the transition from reasoning to the final user-facing advisory |
### Benefits
- Enables reliable separation of reasoning and final output.
- Improves traceability and debugging during evaluation.
- Supports UI streaming and controlled rendering pipelines.
- Enhances reasoning depth when processing structured taxonomy inputs.
During fine-tuning, the model was conditioned to strictly follow token ordering and formatting rules.
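As an illustration, the two control tokens make it straightforward for client code to separate the reasoning trace from the user-facing advisory. The helper below is a sketch (the function name `split_reasoning` is ours, not part of any released SDK):

```python
REASONING_START = "<unused0>"  # beginning of the internal reasoning section
ADVISORY_START = "<unused1>"   # transition to the final advisory

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a raw generation into (reasoning, advisory).

    If the control tokens are missing, the whole text is treated
    as advisory and the reasoning part is returned empty.
    """
    if REASONING_START in text and ADVISORY_START in text:
        _, rest = text.split(REASONING_START, 1)
        reasoning, advisory = rest.split(ADVISORY_START, 1)
        return reasoning.strip(), advisory.strip()
    return "", text.strip()
```

This keeps UI pipelines simple: the reasoning span can be hidden or streamed into a collapsible panel, while only the advisory is rendered to the farmer.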
## Development
- Developed by: Soket AI Labs
- Funded by: INDIAai
- Model type: Reasoning and text generation
- Language(s) (NLP): Hindi, English, Hinglish
- License: Apache 2.0
- Finetuned from model: google/gemma-3-27b-it
## Evaluation
*Coming soon.*