FLUX.2 Klein 9B Schematic LoRA

This project was inspired by Vision Banana, which treats vision tasks such as depth estimation, surface normal estimation, and segmentation as image editing.

I wanted to test whether a similar idea could work with FLUX.2 [klein] 9B base by using small task-specific LoRA training runs.

This repository contains six task-specific LoRAs:

relative depth
surface normal
body pose
full pose
binary segmentation
amodal segmentation

The outputs are RGB schematic images. The quality is not production-ready, and these LoRAs are not intended to replace dedicated CV models.

For more details about the experiment and dataset construction, see the blog post:

Blog: https://comfyui.nomadoor.net/en/notes/flux2-klein-schematic-lora/

Files

Task	LoRA
Relative depth	`loras/flux2-klein-schematic-relative-depth-lora.safetensors`
Surface normal	`loras/flux2-klein-schematic-surface-normal-lora.safetensors`
Body pose	`loras/flux2-klein-schematic-body-pose-lora.safetensors`
Full pose	`loras/flux2-klein-schematic-full-pose-lora.safetensors`
Binary segmentation	`loras/flux2-klein-schematic-binary-segmentation-lora.safetensors`
Amodal segmentation	`loras/flux2-klein-schematic-amodal-segmentation-lora.safetensors`
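The file names in the table follow a single pattern, so the path for a task can be built from its name. The following is a minimal Python sketch, not part of the repository: `lora_path` and `download_lora` are hypothetical helper names, the repository id is assumed to be `nomadoor/flux-2-klein-9B-schematic-lora`, and the download step assumes the `huggingface_hub` package is installed and network access is available.

```python
# Task names as listed in the Files table.
TASKS = (
    "relative depth",
    "surface normal",
    "body pose",
    "full pose",
    "binary segmentation",
    "amodal segmentation",
)

# ASSUMPTION: repository id of this model on the Hugging Face Hub.
REPO_ID = "nomadoor/flux-2-klein-9B-schematic-lora"


def lora_path(task: str) -> str:
    """Return the repository-relative path of the LoRA file for a task."""
    key = task.strip().lower()
    if key not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    slug = key.replace(" ", "-")
    return f"loras/flux2-klein-schematic-{slug}-lora.safetensors"


def download_lora(task: str) -> str:
    """Download one LoRA file and return its local path (needs network access)."""
    # Imported lazily so lora_path() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=lora_path(task))


print(lora_path("Relative depth"))
# prints: loras/flux2-klein-schematic-relative-depth-lora.safetensors
```

Only `lora_path` is needed to locate a file you have already downloaded; `download_lora` is a convenience for fetching a single LoRA instead of the whole repository.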

Examples

[Example images for each task: Relative Depth, Surface Normal, Body Pose, Full Pose, Binary Segmentation, Amodal Segmentation]

Usage

Use the LoRA with FLUX.2 [klein] 9B base in an image-editing workflow.

These LoRAs were trained on the base model. They may not behave correctly with the distilled Klein models unless you also use an appropriate base-to-turbo / base-to-distilled compatibility LoRA.

Prompt Templates

Use simple command-style prompts.

Relative Depth

Generate a relative depth map of the input image.

Surface Normal

Generate a surface normal map of the input image.

Body Pose

Generate a body pose map of all visible people in the input image.

Full Pose

Generate a full pose map of all visible people in the input image.

Binary Segmentation

Generate a binary segmentation mask of [target] in the input image.

Amodal Segmentation

Generate an amodal segmentation mask of [target] in the input image.
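If you script your workflow, the templates above can be kept in one place and filled in programmatically. This is a small Python sketch under that assumption; `PROMPT_TEMPLATES` and `build_prompt` are hypothetical names, and the template strings are copied verbatim from this section.

```python
from typing import Optional

# Prompt templates from this section; "[target]" is the placeholder to fill in.
PROMPT_TEMPLATES = {
    "relative depth": "Generate a relative depth map of the input image.",
    "surface normal": "Generate a surface normal map of the input image.",
    "body pose": "Generate a body pose map of all visible people in the input image.",
    "full pose": "Generate a full pose map of all visible people in the input image.",
    "binary segmentation": "Generate a binary segmentation mask of [target] in the input image.",
    "amodal segmentation": "Generate an amodal segmentation mask of [target] in the input image.",
}


def build_prompt(task: str, target: Optional[str] = None) -> str:
    """Return the prompt for a task, substituting [target] where required."""
    key = task.strip().lower()
    if key not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    template = PROMPT_TEMPLATES[key]
    needs_target = "[target]" in template
    if needs_target:
        if target is None or not target.strip():
            raise ValueError(f"task {key!r} requires a target description")
        return template.replace("[target]", target.strip())
    if target is not None:
        raise ValueError(f"task {key!r} does not take a target")
    return template


print(build_prompt("binary segmentation", "the red car"))
# prints: Generate a binary segmentation mask of the red car in the input image.
```

Raising an error when a segmentation task has no target keeps a literal `[target]` from reaching the model by accident.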

ComfyUI Workflow

JSON: workflow/flux2-klein-base-9b-image-edit.json

Notes

This is not a drop-in replacement for dedicated preprocessors such as DWPose, Depth Anything, Lotus-2, or SAM.
Pose is the least stable task. Small errors in color or skeleton topology are visually obvious.
Segmentation can fail when the target description is ambiguous or when multiple similar objects are present.
Amodal segmentation is especially experimental because the model must infer occluded parts.
The dataset is small (1920 image pairs across all tasks), so generalization is limited and results may vary across images.

Training Setup

Base model: black-forest-labs/FLUX.2-klein-base-9B
Training tool: ai-toolkit
LoRA rank: linear 32 / conv 16
Optimizer: adamw8bit
Learning rate: 5e-5
Batch size: 4
Dataset size: 1920 image pairs across all tasks
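To give a sense of scale, the settings above can be turned into a rough steps-per-epoch figure. This Python sketch is only back-of-the-envelope arithmetic: the dictionary keys are hypothetical and are not ai-toolkit configuration keys, it assumes no gradient accumulation, and the even split across the six tasks is an assumption, since the setup states only the total number of pairs.

```python
import math

# Values from the Training Setup list; key names are illustrative only.
TRAINING_SETUP = {
    "base_model": "black-forest-labs/FLUX.2-klein-base-9B",
    "lora_rank_linear": 32,
    "lora_rank_conv": 16,
    "optimizer": "adamw8bit",
    "learning_rate": 5e-5,
    "batch_size": 4,
    "dataset_pairs_total": 1920,
}

NUM_TASKS = 6  # six task-specific LoRAs in this repository


def steps_per_epoch(num_pairs: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every pair once."""
    return math.ceil(num_pairs / batch_size)


total_pairs = TRAINING_SETUP["dataset_pairs_total"]
batch_size = TRAINING_SETUP["batch_size"]

# ASSUMPTION: pairs are split evenly across tasks; the real split may differ.
pairs_per_task = total_pairs // NUM_TASKS

print(steps_per_epoch(total_pairs, batch_size))     # prints: 480
print(pairs_per_task)                               # prints: 320
print(steps_per_epoch(pairs_per_task, batch_size))  # prints: 80
```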

Dataset

The training dataset is available here:

Dataset: https://huggingface.co/datasets/nomadoor/flux-2-klein-9B-schematic-dataset

License

Please follow the license and usage terms of the base model: black-forest-labs/FLUX.2-klein-base-9B.

This repository uses flux-non-commercial-license-v2.1.
