Clair v5

Clair is a personalized AI assistant fine-tuned from Qwen2.5-3B-Instruct. Its identity is embedded in the model weights rather than supplied through a system prompt, so it identifies itself consistently across interactions. It runs on budget laptops (CPU-only, 8 GB RAM).

Model Details

Property | Value
Base Model | Qwen2.5-3B-Instruct
Parameters | 3.09B
Architecture | Qwen2 (Transformer)
Context Length | 4096 tokens
Training Method | LoRA (rank 32, alpha 64)
Training Epochs | 20
Quantization | Q4_K_M, Q5_K_M, Q3_K_M (GGUF)

Identity

  • Name: Clair
  • Creator: Michael Mlungisi Nkomo
  • Origin: Zimbabwe
  • Role: AI assistant for coding, math, writing, analysis, and general questions

Training

Dataset

  • 95 examples with heavy identity emphasis
  • 30+ identity questions with variations
  • Explicit denials of being ChatGPT, Claude, Qwen
  • Greetings, goodbyes, and normal conversations
  • Multi-turn dialogues

Training Configuration

Parameter | Value
LoRA Rank | 32
LoRA Alpha | 64
Learning Rate | 1e-4
Batch Size | 4
Gradient Accumulation | 4
Epochs | 20
Quantization (during training) | 4-bit (NF4)
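
The configuration values imply a few derived quantities that are not listed in the table. The sketch below works them out under two assumptions that the source does not confirm: all 95 dataset examples are used for training on a single device, and the standard LoRA scaling of alpha / rank applies (not a variant such as rank-stabilized LoRA). The function names are illustrative only.

```python
import math

# Values taken from the Training Configuration table.
LORA_RANK = 32
LORA_ALPHA = 64
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
EPOCHS = 20

# Assumption: every one of the 95 dataset examples is used for training
# (no held-out split) and training runs on a single device.
NUM_EXAMPLES = 95


def lora_scale(alpha: int, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def optimizer_steps(num_examples: int, batch_size: int, grad_accum: int, epochs: int) -> int:
    """Optimizer updates, assuming a final partial batch still triggers a step."""
    effective_batch = batch_size * grad_accum
    return math.ceil(num_examples / effective_batch) * epochs


print(lora_scale(LORA_ALPHA, LORA_RANK))                                      # 2.0
print(BATCH_SIZE * GRAD_ACCUM_STEPS)                                          # 16
print(optimizer_steps(NUM_EXAMPLES, BATCH_SIZE, GRAD_ACCUM_STEPS, EPOCHS))    # 120
```

Under these assumptions the adapter update is scaled by 2, the effective batch size is 16, and 20 epochs amount to roughly 120 optimizer steps.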

Results

Metric | Value
Training Loss | 0.08047
Token Accuracy | 97.3%
Identity Recognition | 100%
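
The source does not say how Identity Recognition was measured. One plausible way to score it, shown here purely as an illustrative sketch, is to check that each response names Clair and does not claim to be one of the assistants the dataset explicitly denies. The function names, the regular expression, and the sample responses are all hypothetical.

```python
import re

# Names the model should never claim as its own, taken from the
# "explicit denials" item in the Dataset list.
OTHER_NAMES = ("ChatGPT", "Claude", "Qwen")

# Matches first-person claims such as "I am Qwen" or "I'm ChatGPT",
# but not denials such as "I am not ChatGPT".
_CLAIM = re.compile(
    r"\bI\s*(?:am|['’]m)\s+(?!not\b)(?:an?\s+|called\s+|named\s+)?("
    + "|".join(OTHER_NAMES)
    + r")\b",
    re.IGNORECASE,
)


def passes_identity_check(response: str) -> bool:
    """True if the response names Clair and does not claim another identity."""
    mentions_clair = re.search(r"\bClair\b", response, re.IGNORECASE) is not None
    claims_other = _CLAIM.search(response) is not None
    return mentions_clair and not claims_other


def identity_recognition_rate(responses: list[str]) -> float:
    """Fraction of responses that pass the check (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(passes_identity_check(r) for r in responses) / len(responses)


# Hypothetical responses for illustration; these are not real model outputs.
sample_responses = [
    "I am Clair, an AI assistant created by Michael Mlungisi Nkomo.",
    "No, I am not ChatGPT. I am Clair.",
    "I am Qwen, a large language model.",
]
print(identity_recognition_rate(sample_responses))
```

A keyword check like this is deliberately strict and simple. It would miss paraphrased claims, so a real evaluation might pair it with manual review.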

Usage

With Transformers

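from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("r245142r/Clair-3B")
model = AutoModelForCausalLM.from_pretrained("r245142r/Clair-3B")

# Format the conversation with the model's chat template
messages = [{"role": "user", "content": "Who are you?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a reply, then decode only the new tokens so the prompt is not echoed
outputs = model.generate(**inputs, max_new_tokens=256)
reply_ids = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(reply_ids, skip_special_tokens=True))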

With Ollama

ollama run r245142r/Clair-3B

With llama.cpp (GGUF)

# Download the quantized model
wget https://huggingface.co/r245142r/Clair-3B/resolve/main/clair-v5-Q4_K_M.gguf

# Run with llama.cpp
./llama-cli -m clair-v5-Q4_K_M.gguf -p "Who are you?" -n 256
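
When scripting runs from Python, the same invocation can be assembled as an argument list. This is a minimal sketch: the helper name `llama_cli_command` is hypothetical, and it adds two options that the command above does not use, `-c` (context size, set here to the 4096 tokens listed under Model Details) and `-t` (thread count), on the assumption that your llama.cpp build accepts them.

```python
import shlex
from typing import List, Optional


def llama_cli_command(
    model_path: str,
    prompt: str,
    max_tokens: int = 256,
    ctx_size: int = 4096,
    threads: Optional[int] = None,
    binary: str = "./llama-cli",
) -> List[str]:
    """Build an argument list suitable for subprocess.run (no shell needed)."""
    cmd = [
        binary,
        "-m", model_path,       # path to the GGUF file
        "-p", prompt,           # prompt text
        "-n", str(max_tokens),  # maximum number of tokens to generate
        "-c", str(ctx_size),    # context window size
    ]
    if threads is not None:
        cmd += ["-t", str(threads)]  # number of CPU threads
    return cmd


cmd = llama_cli_command("clair-v5-Q4_K_M.gguf", "Who are you?", threads=4)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues when the prompt contains spaces or quotes.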

Available Files

File | Size | Description
clair-v5-float16.gguf | 5.75 GB | Unquantized 16-bit (half precision) GGUF
clair-v5-Q4_K_M.gguf | ~2.0 GB | 4-bit quantized (recommended)
clair-v5-Q5_K_M.gguf | ~2.5 GB | 5-bit quantized
clair-v5-Q3_K_M.gguf | ~1.5 GB | 3-bit quantized
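
As a sanity check on the float16 size, the weights alone can be estimated from the parameter count and bits per weight. The sketch below assumes the 3.09B figure from Model Details and ignores GGUF metadata; the helper name is illustrative. The result suggests the listed 5.75 GB is probably measured in GiB (2**30 bytes), though the source does not state the unit.

```python
PARAMS = 3.09e9  # parameter count from the Model Details table


def weights_size_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (2**30 bytes)."""
    return num_params * bits_per_weight / 8 / 2**30


print(f"{weights_size_gib(PARAMS, 16):.2f} GiB")  # 5.76 GiB
```

The same function gives only a rough lower bound for the quantized files. K-quant formats store different tensors at different precisions, so their effective bits per weight is higher than the nominal 3, 4, or 5.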

Hardware Requirements

Configuration | Memory | Speed
Q4_K_M (CPU) | ~2.5 GB | ~5-8 tokens/s
Q4_K_M (GPU) | ~2.5 GB | ~30-50 tokens/s
Float16 (GPU) | ~6 GB | ~40-60 tokens/s
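
To get a feel for what these throughput figures mean in practice, the sketch below converts tokens per second into wall-clock time for a 256-token reply, the limit used in the usage examples. It is a rough estimate that ignores model loading and prompt processing; the function name is illustrative.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


# Q4_K_M on CPU: ~5-8 tokens/s according to the table above.
print(generation_seconds(256, 5))  # 51.2
print(generation_seconds(256, 8))  # 32.0
```

On a CPU-only laptop, then, a full-length reply takes roughly half a minute to a minute, and shorter replies take proportionally less.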

Benchmarks

Tested on a budget laptop (Intel i5, 8 GB DDR4, CPU-only):

  • RAM Usage: ~6.8 GB total (within the 7 GB ceiling)
  • Model Size: ~2.0 GB (Q4_K_M)
  • Context Window: 4096 tokens
  • Identity Accuracy: 100%

Development

Built for the ADTC 2026 LaptopLLM Challenge: running AI on budget hardware.

Key Achievements

  • ✅ Runs on CPU-only laptops with 8GB RAM
  • ✅ Embedded identity (not system prompt)
  • ✅ Natural greetings and goodbyes
  • ✅ 3x faster with Q4_K_M quantization

License

Apache 2.0

Citation

@misc{clair-v5,
  author = {Michael Mlungisi Nkomo},
  title = {Clair v5: Personalized AI Assistant with Embedded Identity},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/r245142r/Clair-3B}
}

Clair v5: Personalized AI with embedded identity, built from Zimbabwe for the world.
