Beynele
Beynele is a Lumina-Image 2.0-based text-to-image model adapted for Kazakh cultural image generation. It is trained with a data-centric pipeline that combines curated cultural data, synthetic supervision, revive-before-reject curation, a base-model anchor dataset, and reference-based evaluation with Beynele-Bench.
Use With Diffusers
import torch
from diffusers import Lumina2Pipeline

pipe = Lumina2Pipeline.from_pretrained(
    "issai/Beynele",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

prompt = "A Kazakh dombra resting on a patterned felt carpet."
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=4.0,
    num_inference_steps=40,
    cfg_trunc_ratio=0.25,
    cfg_normalization=True,
    generator=torch.Generator("cpu").manual_seed(42),
).images[0]
image.save("beynele_dombra.png")
Model Details
| Field | Value |
|---|---|
| Architecture | Lumina-Image 2.0 / flow-based diffusion transformer |
| Base pipeline | Alpha-VLLM/Lumina-Image-2.0 |
| Text encoder | google/gemma-2-2b |
| Diffusers class | Lumina2Pipeline |
| Resolution | 1024 x 1024 |
| Recommended dtype | torch.bfloat16 |
| Recommended steps | 40 |
| Recommended guidance | 4.0 |
Only the diffusion transformer is adapted. The tokenizer, text encoder,
scheduler, and VAE are carried over from the Lumina-Image 2.0 Diffusers release
to provide a direct from_pretrained loading path.
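The recommended values from the table, together with the CFG options shown in the usage example, can be kept in one place so that every pipeline call uses the same settings. The sketch below is only an illustration: `BeyneleSettings` and `call_kwargs` are hypothetical names and are not part of Diffusers or the Beynele release.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BeyneleSettings:
    """Hypothetical container for the settings recommended in this card."""

    height: int = 1024
    width: int = 1024
    guidance_scale: float = 4.0
    num_inference_steps: int = 40
    cfg_trunc_ratio: float = 0.25
    cfg_normalization: bool = True

    def call_kwargs(self, **overrides):
        """Return keyword arguments for a pipeline call, with optional overrides."""
        kwargs = asdict(self)
        unknown = set(overrides) - set(kwargs)
        if unknown:
            # Fail early on misspelled setting names.
            raise KeyError(f"unknown settings: {sorted(unknown)}")
        kwargs.update(overrides)
        return kwargs


settings = BeyneleSettings()
# Example: fewer steps for a quick preview; everything else stays at the defaults.
kwargs = settings.call_kwargs(num_inference_steps=30)
print(kwargs)
```

With a pipeline loaded as in the usage example, the defaults could then be passed as `pipe(prompt, **BeyneleSettings().call_kwargs())`.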
Training Data Summary
The final training pool contains three branches:
| Branch | Examples |
|---|---|
| Core cultural dataset | 196k image-text pairs, about 73k unique images |
| Text-image dataset | 128k examples |
| Base-model anchor dataset | 109k examples |
The cultural dataset covers Kazakh people, material culture, buildings, landmarks, food, national symbols, natural scenes, activities, and text-bearing images. The full fine-tuning corpus is not released because of privacy, licensing, and cultural-data governance constraints.
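Taken together, the three branches add up to roughly 433k examples. A small sketch of the composition follows. It uses the rounded counts from the table, and the dictionary keys are hypothetical labels. The card does not state whether branches are sampled in proportion to their size during training, so the shares describe only the raw pool.

```python
# Rounded branch sizes as reported in the table (number of examples).
branch_sizes = {
    "core_cultural": 196_000,
    "text_image": 128_000,
    "base_anchor": 109_000,
}


def branch_shares(sizes):
    """Return each branch's fraction of the total pool."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(branch_sizes.values())
shares = branch_shares(branch_sizes)

print(f"total examples: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

On these rounded counts the core cultural branch makes up about 45% of the pool, the text-image branch about 30%, and the anchor branch about 25%.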
Evaluation
| Model | Beynele-Bench | GenEval | WISE | UniGenBench++ |
|---|---|---|---|---|
| Lumina-Image 2.0 | 4.85 | 0.73 | 0.54 | 64.98 |
| Qwen-Image | 6.51 | 0.87 | 0.62 | 78.36 |
| Beynele | 7.29 | 0.74 | 0.51 | 65.53 |
| Beynele + prompt mediation | 7.01 | 0.78 | 0.73 | 68.89 |
Beynele-Bench uses 750 prompt-reference pairs and reports the arithmetic mean of Qwen3-VL 32B and Gemini 2.5 Pro similarity scores on a 1-10 scale.
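The averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual code: the function names are hypothetical, the judge scores are made up, and the order of aggregation (averaging the two judges per pair, then averaging across pairs) is an assumption. When both judges score every pair, averaging per judge first gives the same result.

```python
from statistics import mean


def pair_score(judge_a: float, judge_b: float) -> float:
    """Arithmetic mean of two judge scores for one prompt-reference pair."""
    for score in (judge_a, judge_b):
        if not 1 <= score <= 10:
            raise ValueError(f"score {score} is outside the 1-10 scale")
    return (judge_a + judge_b) / 2


def benchmark_score(judge_scores) -> float:
    """Mean of the per-pair scores over all prompt-reference pairs."""
    return mean(pair_score(a, b) for a, b in judge_scores)


# Made-up (judge A, judge B) scores for three pairs, for illustration only.
toy_scores = [(8, 7), (6, 7), (9, 8)]
score = benchmark_score(toy_scores)
print(score)
```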
Intended Use
Beynele is intended for research on cultural text-to-image generation, low-resource visual adaptation, Kazakh cultural representation, benchmark-based text-to-image evaluation, and data-centric model adaptation.
Limitations and Safety
The model may hallucinate cultural details, produce imperfect Kazakh text, blur faces under difficult compositions, or shift prompt details under strong cultural specialization. It should not be used for identity verification, historical authentication, or high-stakes cultural documentation. Human review and local cultural expertise remain important for sensitive uses.
Provenance
The Hub package contains the converted Diffusers transformer/ safetensors used
by Lumina2Pipeline.from_pretrained. The source EMA checkpoint is retained in
the local release backup and cache for internal traceability.
Licensing
Beynele is released under the Apache License 2.0. The model is adapted from
Alpha-VLLM/Lumina-Image-2.0; users should also follow the licenses and terms
of any bundled or upstream components used by the Diffusers pipeline.
Citation
@article{aikyn2026beynele,
  title  = {A Data-Centric Framework for Adapting Text-to-Image Models to Low-Resource Cultural Domains},
  author = {Aikyn, Nartay and Aryngazin, Anuar and Maxutov, Akylbek and Varol, Huseyin Atakan},
  year   = {2026},
  note   = {Pre-release manuscript}
}