DEF-rgbtcc: RGB-T Crowd Counting

Dual-Modulation Framework for RGB-T Crowd Counting via Spatially Modulated Attention and Adaptive Fusion.

Paper: arXiv:2509.17079

Architecture

  • Backbone: Shared VGG-19
  • Encoder: Spatially Modulated Attention (SMA) Transformer
  • Fusion: Adaptive Cross-Modal Fusion (ACMF)
  • Output: Density map regression
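
Because the output is a density map, the predicted count is conventionally read off as the sum of all values in that map. The following is a minimal sketch in plain Python, assuming the map is a 2D grid of non-negative floats. `count_from_density` and `toy_map` are hypothetical names used for illustration; they are not part of the `def_rgbtcc` package.

```python
def count_from_density(density_map):
    """Return the estimated count as the sum of all density values."""
    return sum(sum(row) for row in density_map)


# Toy 3x3 density map (illustrative values only): the mass of two
# people is spread over several neighbouring pixels.
toy_map = [
    [0.25, 0.25, 0.0],
    [0.25, 0.25, 0.5],
    [0.0, 0.0, 0.5],
]

count = count_from_density(toy_map)
print(f"Count: {count:.1f}")
```

Summing rather than thresholding is why the reported count can be fractional, which matches the one-decimal formatting used in the Usage section.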

Available Formats

  • model.pth: PyTorch state dict
  • model.safetensors: SafeTensors format
  • model.onnx: ONNX (opset 17)
  • model_fp16.trt: TensorRT FP16
  • model_fp32.trt: TensorRT FP32
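
When several of these files sit side by side, a small lookup keyed on the file extension can make it explicit which format is being loaded. This is a sketch using only the standard library. `FORMATS` and `describe_checkpoint` are hypothetical helpers, not part of the `def_rgbtcc` package, and reading the TensorRT precision from the `_fp16` / `_fp32` filename suffix is an assumption based on the naming in the list above.

```python
from pathlib import Path

# Extension-to-format mapping taken from the list above.
FORMATS = {
    ".pth": "PyTorch state dict",
    ".safetensors": "SafeTensors",
    ".onnx": "ONNX (opset 17)",
    ".trt": "TensorRT",
}


def describe_checkpoint(filename):
    """Return a human-readable format label for a model file name."""
    path = Path(filename)
    suffix = path.suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"Unrecognized model format: {filename}")
    label = FORMATS[suffix]
    if suffix == ".trt":
        # Assumption: precision is encoded as a filename suffix.
        stem = path.stem.lower()
        if stem.endswith("fp16"):
            label += " FP16"
        elif stem.endswith("fp32"):
            label += " FP32"
    return label


for name in ["model.pth", "model.safetensors", "model.onnx", "model_fp16.trt"]:
    print(f"{name}: {describe_checkpoint(name)}")
```

The helper only classifies file names; loading each format still requires the matching runtime.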

Usage

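from def_rgbtcc.serve import RGBTCCInference

# Load the model from the PyTorch state dict.
model = RGBTCCInference("model.pth")

# rgb_image and thermal_image are a paired RGB and thermal image of the
# same scene, loaded beforehand by the caller.
result = model.predict(rgb_image, thermal_image)

# The estimated crowd count is returned under the "count" key.
print(f"Count: {result['count']:.1f}")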

ANIMA Module

Part of the ANIMA Defense Module ecosystem (Wave 8). Products: ORACLE, ATLAS, NEMESIS.

Model Size

  • Parameters: 34.1M
  • Tensor type: F32 (Safetensors)