gemma-3-1b-thinking-gguf-F16 : GGUF

This model was finetuned and converted to GGUF format using Unsloth.

Example usage:

For text-only LLMs: ./llama.cpp/llama-cli -hf Ma7ee7/gemma-3-1b-thinking-gguf-F16 --jinja
For multimodal models: ./llama.cpp/llama-mtmd-cli -hf Ma7ee7/gemma-3-1b-thinking-gguf-F16 --jinja

Available Model files:

An Ollama Modelfile is included for easy deployment.
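
As a rough sketch of how the included Modelfile might be used, the script below registers the model with Ollama and then starts a session. It assumes the repository files have already been downloaded into a local directory and that the Modelfile sits at the top level of that directory. The directory `./gemma-3-1b-thinking-gguf-F16` and the local model name `gemma3-thinking` are illustrative choices, not names defined by this repository. By default the script only prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Assumed local directory and model name (hypothetical; adjust to your setup).
MODEL_DIR="${MODEL_DIR:-./gemma-3-1b-thinking-gguf-F16}"
MODEL_NAME="${MODEL_NAME:-gemma3-thinking}"

# Print a command, and execute it only when RUN=1.
step() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

deploy_with_ollama() {
  # Register the GGUF with Ollama using the Modelfile from the download.
  step ollama create "$MODEL_NAME" -f "$MODEL_DIR/Modelfile"
  # Open an interactive session with the newly registered model.
  step ollama run "$MODEL_NAME"
}

deploy_with_ollama
```

Running it as `RUN=1 MODEL_NAME=my-gemma sh deploy.sh` (with `deploy.sh` being whatever name the script is saved under) would execute both steps under a different local model name.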

The model's BOS token behavior was adjusted for GGUF compatibility. This model was trained 2x faster with Unsloth.

Model details:

Format: GGUF
Model size: 1.0B params
Architecture: gemma3
Precision: 16-bit (F16)
