YOLOv5s TFLite (Qualcomm Edge AI Deployment Package)
This repository provides multiple YOLOv5s TensorFlow Lite models optimized for edge deployment scenarios, especially for:
- Qualcomm QCS8550 platforms
- Android APK inference pipelines
- Embedded Linux (Wayland / GStreamer)
- Multi-stream video analytics systems
These models are suitable for real-time object detection with hardware acceleration using Qualcomm QNN (Hexagon NPU), GPU delegate, or CPU fallback.
Available Model Variants
| File | Precision | Input Size | Recommended Usage |
| --- | --- | --- | --- |
| yolov5s_fp32_320.tflite | FP32 | 320×320 | Fast testing / compatibility mode |
| yolov5s_fp32_640.tflite | FP32 | 640×640 | Higher accuracy baseline |
| yolov5s_int8_320.tflite | INT8 | 320×320 | Maximum performance edge inference |
| yolov5s_int8_640.tflite | INT8 | 640×640 | Balanced accuracy + performance |
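As a quick illustration of the table above, a small helper (hypothetical, not part of this repository) can map a precision/resolution choice to the matching file name:

```python
# Hypothetical helper (not shipped with this repo): pick a model file
# from the variant table by precision and input size.
MODELS = {
    ("fp32", 320): "yolov5s_fp32_320.tflite",
    ("fp32", 640): "yolov5s_fp32_640.tflite",
    ("int8", 320): "yolov5s_int8_320.tflite",
    ("int8", 640): "yolov5s_int8_640.tflite",
}

def select_model(precision: str, size: int) -> str:
    """Return the TFLite file name for a given precision/size pair."""
    try:
        return MODELS[(precision.lower(), size)]
    except KeyError:
        raise ValueError(f"No variant for precision={precision!r}, size={size}")

print(select_model("int8", 640))  # yolov5s_int8_640.tflite
```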
Choosing FP32 vs INT8
Use FP32 models when:
- debugging pipelines
- validating inference correctness
- running CPU-only environments
- comparing baseline accuracy
Use INT8 models when:
- deploying on Qualcomm NPU
- running real-time multi-stream pipelines
- optimizing latency and power efficiency
- building production edge AI systems
INT8 is typically recommended for Qualcomm Hexagon acceleration workflows.
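For context on what INT8 means in practice: TFLite INT8 models use affine quantization, storing a float value x as q = round(x / scale) + zero_point and recovering it as x ≈ (q − zero_point) × scale. The scale/zero_point below are illustrative assumptions; real models report theirs via `get_input_details()` / `get_output_details()`. A minimal numpy sketch:

```python
import numpy as np

# Illustrative quantization parameters (assumptions for this sketch; real
# values come from the model's tensor details). This pair maps [0, 1]
# floats onto the full int8 range.
scale, zero_point = 1.0 / 255.0, -128

def quantize(x, scale, zero_point):
    """Float -> int8 using the TFLite affine quantization scheme."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    """int8 -> float approximation."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([0.0, 0.5, 1.0], dtype=np.float32)
q = quantize(x, scale, zero_point)
print(q)                                  # int8 codes
print(dequantize(q, scale, zero_point))   # close to the original floats
```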
Deployment Pipeline Example
Typical real-time inference architecture:
Camera / RTSP / Video File
→ Resize (320×320 or 640×640)
→ Normalize
→ TFLite Interpreter
→ NMS Post-processing
→ Overlay Rendering (TextureView / SurfaceView)
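The resize and normalize steps above can be sketched without a full camera pipeline. This uses a plain nearest-neighbour resize in numpy as a stand-in for whatever resizer the real pipeline provides (GStreamer, OpenCV, etc.):

```python
import numpy as np

def preprocess(frame: np.ndarray, size: int = 320) -> np.ndarray:
    """Resize (nearest-neighbour) and normalize an HxWx3 uint8 frame to
    the 1xSxSx3 float32 [0, 1] layout the FP32 models expect."""
    h, w, _ = frame.shape
    ys = np.arange(size) * h // size           # source rows to sample
    xs = np.arange(size) * w // size           # source columns to sample
    resized = frame[ys][:, xs]                 # nearest-neighbour sampling
    norm = resized.astype(np.float32) / 255.0  # [0, 255] -> [0, 1]
    return norm[np.newaxis]                    # add batch dimension

frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
batch = preprocess(frame, 320)
print(batch.shape, batch.dtype)  # (1, 320, 320, 3) float32
```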
Supported delegate priority:
QNN (Hexagon NPU) → GPU → CPU
Deployment Environments Tested
Designed for integration into:
- Android Edge AI Applications
- Qualcomm QCS8550 Platforms
- Embedded Linux (Wayland display pipelines)
- Multi-stream video inference schedulers
- TextureView / SurfaceView overlay rendering architectures
Example Python Inference
```python
import tensorflow as tf

# Load the INT8 model and allocate its tensor buffers.
interpreter = tf.lite.Interpreter(model_path="yolov5s_int8_640.tflite")
interpreter.allocate_tensors()

# Inspect tensor shapes, dtypes, and quantization parameters.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)
```
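The NMS post-processing stage from the pipeline can be sketched in plain numpy. This greedy IoU-suppression loop is a generic implementation, not the exact routine used by any particular app in this repository:

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thresh: float = 0.45):
    """Greedy non-maximum suppression.
    boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,). Returns kept indices."""
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top box with the remaining candidates.
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]  # drop heavily overlapping boxes
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # boxes 0 and 1 overlap heavily; 1 is suppressed
```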
Example Android Delegate Configuration
```java
Interpreter.Options options = new Interpreter.Options();
options.addDelegate(qnnDelegate); // Preferred: Hexagon NPU via QNN
options.addDelegate(gpuDelegate); // Fallback: GPU delegate
// If neither delegate can claim an op, the interpreter falls back to CPU.
```
Recommended runtime order:
NPU → GPU → CPU
Repository Contents
- yolov5s_fp32_320.tflite
- yolov5s_fp32_640.tflite
- yolov5s_int8_320.tflite
- yolov5s_int8_640.tflite
- yolov5s.labels
- README.md
Intended Use Cases
- Real-time edge object detection
- Android APK AI inference pipelines
- Qualcomm NPU acceleration demonstrations
- Multi-stream surveillance analytics
- Embedded AI deployment validation workflows
Author
Prepared by
Andrew Chiao
Edge AI Engineer
SYSGRATION Ltd. -- AMD Team / RD2
Notes
This repository focuses on deployment-ready TensorFlow Lite object detection models designed for embedded inference scenarios rather than notebook-only experimentation workflows.
Base model: Ultralytics/YOLOv5