DEF-rgbtcc: RGB-T Crowd Counting

Dual-Modulation Framework for RGB-T Crowd Counting via Spatially Modulated Attention and Adaptive Fusion.

Architecture

Backbone: Shared VGG-19
Encoder: Spatially Modulated Attention (SMA) Transformer
Fusion: Adaptive Cross-Modal Fusion (ACMF)
Output: Density map regression
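
Density map regression means the network predicts a per-pixel density whose sum over the image gives the crowd count. The sketch below illustrates only that final summation, using a small hand-made map as a nested list rather than real model output. The function name `count_from_density` is illustrative and not part of this repository.

```python
def count_from_density(density_map):
    """Return the estimated count as the sum of all density values."""
    return sum(sum(row) for row in density_map)


# Toy 3x4 density map: two people, each spread over neighbouring pixels.
toy_map = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 0.5, 0.0, 0.0],
]
print(f"Count: {count_from_density(toy_map):.1f}")  # Count: 2.0
```

Because the count is a sum of real-valued densities, it is generally fractional, which is consistent with printing the count to one decimal place.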

Available Formats

model.pth — PyTorch state dict
model.safetensors — SafeTensors format
model.onnx — ONNX (opset 17)
model_fp16.trt — TensorRT FP16
model_fp32.trt — TensorRT FP32

Usage

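from def_rgbtcc.serve import RGBTCCInference

# Load the PyTorch checkpoint listed under Available Formats.
model = RGBTCCInference("model.pth")
# rgb_image and thermal_image are a paired RGB and thermal capture of the
# same scene, loaded by the caller (the expected input type is not stated here).
result = model.predict(rgb_image, thermal_image)
print(f"Count: {result['count']:.1f}")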

ANIMA Module

Part of the ANIMA Defense Module ecosystem (Wave 8). Products: ORACLE, ATLAS, NEMESIS.

Safetensors

Model size: 34.1M params
Tensor type: F32
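
The parameter count and tensor type can be checked without loading any weights, because a SafeTensors file begins with an 8-byte little-endian header length followed by a JSON header describing each tensor. The following is a minimal sketch using only the standard library. The helper name `summarize_safetensors` is hypothetical and not part of this repository.

```python
import json
import struct


def summarize_safetensors(path):
    """Read only the JSON header and return (total_params, sorted dtypes)."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))

    total = 0
    dtypes = set()
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        # Number of elements is the product of the shape dimensions.
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n
        dtypes.add(info["dtype"])
    return total, sorted(dtypes)
```

Calling `summarize_safetensors("model.safetensors")` on the file from this repository would be expected to return a total of roughly 34.1 million and `["F32"]`, assuming the file matches the figures listed above.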

Paper for ilessio-aiflowlab/DEF-rgbtcc

A Dual-Modulation Framework for RGB-T Crowd Counting via Spatially Modulated Attention and Adaptive Fusion

Paper 2509.17079, published Sep 21, 2025