Instructions for using bharatgenai/patram-7b-instruct with libraries and local apps.

Libraries

How to use bharatgenai/patram-7b-instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="bharatgenai/patram-7b-instruct", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("bharatgenai/patram-7b-instruct", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use bharatgenai/patram-7b-instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "bharatgenai/patram-7b-instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bharatgenai/patram-7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
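
The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `ask_server` are hypothetical, and the default `base_url` assumes the vLLM server from the previous step is listening on port 8000.

```python
import json
import urllib.request

MODEL_ID = "bharatgenai/patram-7b-instruct"


def build_chat_payload(question, image_url, model=MODEL_ID):
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def ask_server(question, image_url, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(question, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `ask_server("Describe this image in one sentence.", image_url)` should return the model's answer as a string. For a server on a different port, pass a different `base_url`.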

Use Docker

docker model run hf.co/bharatgenai/patram-7b-instruct

SGLang

How to use bharatgenai/patram-7b-instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "bharatgenai/patram-7b-instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bharatgenai/patram-7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
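
The request above points at a public image URL. For a document image stored locally, one common approach with OpenAI-compatible servers is to embed the file as a base64 data URL in the `image_url` field. The sketch below assumes the server accepts data URLs, which is worth verifying against your server version; both helper names are hypothetical.

```python
import base64
import mimetypes


def image_bytes_to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data URL suitable for the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_file_to_data_url(path):
    """Read a local image file and return it as a data URL."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    with open(path, "rb") as handle:
        return image_bytes_to_data_url(handle.read(), mime_type)
```

The returned string replaces the `https://...` value of `"url"` in the JSON body; the rest of the request stays the same.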

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "bharatgenai/patram-7b-instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "bharatgenai/patram-7b-instruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Patram-7B-Instruct

Patram-7B-Instruct by BharatGen is a 7B-parameter vision-language model trained from scratch for visual document understanding. As India’s first document foundation model, it is built to tackle complex document analysis. The model was trained on a carefully curated instruction-tuning dataset that combines diverse public data with custom synthetic data designed to support a broad range of document understanding tasks.

Model Overview

Architecture: Vision Transformer (ViT) + MLP projector + OLMo-7B LLM
Training Data: BharatDocs-v1, a dataset of diverse Indian documents, plus other open-source document datasets
Supported I/O Formats: The model currently accepts English-language instructions and image files (e.g., PNG, JPEG) as input. The output is provided in text format.
Language: English (Indian language support upcoming)
License: Apache 2.0
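
When preparing a batch of documents, it can help to filter files by the input formats named above before sending them to the model. This is a minimal sketch; the helper `split_by_named_format` is hypothetical, and restricting to PNG and JPEG is an assumption, since the overview lists them only as examples.

```python
from pathlib import Path

# Suffixes for the two formats the overview names (PNG, JPEG). The card says
# "e.g.", so other formats may also work; this narrow set is an assumption.
NAMED_SUFFIXES = {".png", ".jpg", ".jpeg"}


def split_by_named_format(paths):
    """Split paths into (accepted, skipped) by file suffix, case-insensitively."""
    accepted, skipped = [], []
    for path in paths:
        target = accepted if Path(path).suffix.lower() in NAMED_SUFFIXES else skipped
        target.append(path)
    return accepted, skipped
```

Files in the skipped list, such as PDFs, would need to be rendered to images first.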

Usage Examples

Use the transformers library.

import torch
from transformers import AutoProcessor, AutoModelForCausalLM, GenerationConfig
from PIL import Image
import requests

# Model ID and device setup
model_id = "bharatgenai/patram-7b-instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load processor and model
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True
).to(device)

def get_patram_response(image_path_or_url, question):
    try:
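        # Load image from a URL or a local path
        if image_path_or_url.startswith("http"):
            response = requests.get(image_path_or_url, stream=True, timeout=30)
            response.raise_for_status()  # surface HTTP errors instead of a confusing PIL error
            image = Image.open(response.raw).convert("RGB")
        else:
            image = Image.open(image_path_or_url).convert("RGB")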
    except Exception as e:
        print(f"Error loading image: {e}")
        return None

    # Format the prompt as expected
    prompt = f"Question: {question} Answer based on the image."

    try:
        # Preprocess image and text using the processor
        inputs = processor.process(images=[image], text=prompt)
        inputs = {k: v.to(device).unsqueeze(0) for k, v in inputs.items()}

        # Generate output using model's generate_from_batch method (Patram-specific)
        output = model.generate_from_batch(
            inputs,
            GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
            tokenizer=processor.tokenizer
        )

        # Extract generated tokens (excluding input tokens) and decode
        generated_tokens = output[0, inputs['input_ids'].size(1):]
        response = processor.tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()
        return response
    except Exception as e:
        print(f"Error during inference: {e}")
        return None

# Example usage:
# image_input = "https://knowscope.in/wp-content/uploads/2025/05/cghd-nag.png"
# question = "Who issued this notice?"
# answer = get_patram_response(image_input, question)
# if answer:
#     print("Answer:", answer)
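
Document analysis often involves several questions about the same page. Since `get_patram_response` returns `None` on failure, a small wrapper can collect answers and keep track of the questions that failed. The sketch below is hypothetical: `ask_questions` is not part of the model's API, and `fake_ask` is a stand-in so the example runs without loading the model.

```python
def ask_questions(ask, image_path_or_url, questions):
    """Ask several questions about one image; return (answers, failed)."""
    answers, failed = {}, []
    for question in questions:
        answer = ask(image_path_or_url, question)
        if answer is None:
            # The ask function signals failure by returning None.
            failed.append(question)
        else:
            answers[question] = answer
    return answers, failed


# Stand-in for get_patram_response so the sketch runs without the model.
def fake_ask(image_path_or_url, question):
    return None if "fail" in question else f"answer to: {question}"


answers, failed = ask_questions(fake_ask, "notice.png", ["Who issued this?", "fail here"])
```

In real use, pass `get_patram_response` as `ask`. Note that it reloads the image on every call, which is simple but repeats the download for remote URLs.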

Note: If you are running this on an Apple Silicon (M1/M2/M3/M4/...) chip, follow the official PyTorch and Hugging Face documentation for installing dependencies.

Evaluations

We evaluated Patram-7B-Instruct alongside other vision-language models (VLMs) in the 7B–9B parameter range across two public document benchmarks and one in-house benchmark.

Benchmarks: DocVQA, VisualMRC, Patram-Bench

Patram-Bench is an in-house benchmark designed for Indic Document VQA.

Metric: G-Eval (LLM-as-a-judge)

Model	Overall	DocVQA	Patram-Bench	VisualMRC
claude-3.7-sonnet	0.8830	0.8480	0.8857	0.8830
Qwen2.5-VL-7B-Instruct	0.8759	0.8722	0.6816	0.9169
gemma-3-12b-it	0.8556	0.8451	0.6349	0.9069
patram-7b-instruct	0.8331	0.8550	0.6515	0.8510
InternVL3-9B	0.7865	0.8681	0.6888	0.7405
deepseek-vl2	0.7581	0.8739	0.5089	0.7144

*Note: The benchmarked results reflect the API variant.
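
To compare models per benchmark, the table can be loaded into a small Python structure. This is only a reading aid: the scores are copied from the table above, and the helper names `best_model` and `rank_of` are hypothetical.

```python
# G-Eval scores copied from the table above.
SCORES = {
    "claude-3.7-sonnet": {"Overall": 0.8830, "DocVQA": 0.8480, "Patram-Bench": 0.8857, "VisualMRC": 0.8830},
    "Qwen2.5-VL-7B-Instruct": {"Overall": 0.8759, "DocVQA": 0.8722, "Patram-Bench": 0.6816, "VisualMRC": 0.9169},
    "gemma-3-12b-it": {"Overall": 0.8556, "DocVQA": 0.8451, "Patram-Bench": 0.6349, "VisualMRC": 0.9069},
    "patram-7b-instruct": {"Overall": 0.8331, "DocVQA": 0.8550, "Patram-Bench": 0.6515, "VisualMRC": 0.8510},
    "InternVL3-9B": {"Overall": 0.7865, "DocVQA": 0.8681, "Patram-Bench": 0.6888, "VisualMRC": 0.7405},
    "deepseek-vl2": {"Overall": 0.7581, "DocVQA": 0.8739, "Patram-Bench": 0.5089, "VisualMRC": 0.7144},
}


def best_model(benchmark):
    """Return the model with the highest score on the given column."""
    return max(SCORES, key=lambda model: SCORES[model][benchmark])


def rank_of(model, benchmark):
    """Return the 1-based rank of a model on the given column (1 = best)."""
    ordered = sorted(SCORES, key=lambda name: SCORES[name][benchmark], reverse=True)
    return ordered.index(model) + 1
```

For example, `best_model("DocVQA")` returns `"deepseek-vl2"`, and `rank_of("patram-7b-instruct", "Overall")` returns `4`.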

Citation

@online{BharatGenPatramLaunch2025,
  author    = {{BharatGen Team}},
  title     = {BharatGen Unveils Patram: India's Pioneering Vision-Language Foundation Model for Document Intelligence},
  year      = {2025},
  url       = {https://bharatgen.com/blog/patram-launch},
  urldate   = {2025-06-02}
}

Resources

Model: huggingface.co/bharatgenai/patram-7b-instruct
Project Page: bharatgen.com/patram
Blog: bharatgen.com/blog/patram-launch

Authors

Principal Investigators: Prof. Ravi Kiran Sarvadevabhatla, Prof. Ganesh Ramakrishnan
Contributors: BharatGen Team

Contact

Contact Form
Hugging Face Community Tab

Model size: 8B params (Safetensors)
Tensor type: F32
