---
license: apache-2.0
language:
- zh
- en
- ja
- ko
- fr
- de
- es
- ru
- ar
- hi
- th
- el
library_name: pytorch
tags:
- ocr
- text-detection
- text-recognition
- paddleocr
- pp-ocrv5
- multilingual
- svtr
- db
pipeline_tag: image-to-text
---

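# PP-OCRv5 PyTorch Model Zoo (Chinese Edition)

> The main README of this repository is the English version, [README.md](./README.md). This file is its Chinese-edition counterpart.

**PyTorch** versions (safetensors format) of the full PP-OCRv5 model family, converted exactly from Baidu PaddlePaddle's official `.pdparams` dynamic-graph weights. **Inference results are bit-exact with the original PaddleOCR.**

- **Text detection**: 2 models (mobile / server)
- **Text recognition (base)**: 2 models, covering Simplified Chinese / Traditional Chinese / English / Japanese
- **Text recognition (multilingual)**: 11 models, covering 100+ languages (Korean / French / German / Russian / Arabic / Devanagari / Thai / Greek / Tamil / Telugu / English-only, among others)

> This repository **contains only weights, configs, and dictionaries**, not inference code. For inference, use it together with [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch), or integrate it yourself following the "Custom Python inference code" section below.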

---

## Repository structure

```
.
├── README.md / README_zh.md
├── LICENSE                                                     # Apache 2.0
├── config.json                                                 # repository metadata + model index
├── *.safetensors                                               # 15 PP-OCRv5 weights (in the root directory, stable URLs)
├── ptocr_v5_server_{det,rec}.pth                               # pth copies of the V5 server models (kept for backward compatibility)
├── configs/
│   ├── det/PP-OCRv5/
│   │   ├── PP-OCRv5_mobile_det.yml                             # mobile detection
│   │   └── PP-OCRv5_server_det.yml                             # server detection
│   └── rec/PP-OCRv5/
│       ├── PP-OCRv5_mobile_rec.yml                             # base recognition (zh-Hans / zh-Hant / en / ja, mobile)
│       ├── PP-OCRv5_server_rec.yml                             # base recognition (zh-Hans / zh-Hant / en / ja, server)
│       └── multi_language/
│           ├── en_PP-OCRv5_mobile_rec.yaml                     # English only
│           ├── korean_PP-OCRv5_mobile_rec.yml                  # Korean + English
│           ├── latin_PP-OCRv5_mobile_rec.yml                   # Latin script, 40+ languages (fr / de / es / it / pt, etc.)
│           ├── eslav_PP-OCRv5_mobile_rec.yml                   # East Slavic (Russian / Belarusian / Ukrainian)
│           ├── cyrillic_PP-OCRv5_mobile_rec.yaml               # Cyrillic script, 33 languages
│           ├── arabic_PP-OCRv5_mobile_rec.yaml                 # Arabic / Persian / Uyghur / Urdu, etc.
│           ├── devanagari_PP-OCRv5_mobile_rec.yaml             # Devanagari script, 14 languages (Hindi / Marathi / Nepali / Sanskrit, etc.)
│           ├── th_PP-OCRv5_mobile_rec.yaml                     # Thai
│           ├── el_PP-OCRv5_mobile_rec.yaml                     # Greek
│           ├── ta_PP-OCRv5_mobile_rec.yaml                     # Tamil
│           └── te_PP-OCRv5_mobile_rec.yaml                     # Telugu
└── dicts/                                                      # character-set dictionaries (required for rec inference)
    ├── ppocrv5_dict.txt                                        # base (zh-Hans / zh-Hant / en / ja)
    ├── ppocrv5_en_dict.txt
    ├── ppocrv5_korean_dict.txt
    └── ... (12 in total)

legacy/                                                         # older versions (v2/v3/v4), pth files collected here
├── ch_ptocr_mobile_v2.0_cls_infer.pth
├── ch_ptocr_v4_det_infer.pth
├── ch_ptocr_v4_rec_infer.pth
├── en_ptocr_v3_det_infer.pth
└── en_ptocr_v4_rec_infer.pth
```

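> The `character_dict_path` in every rec yaml has been rewritten to a relative path, `./dicts/...` (relative to the repository root), so after `git clone` or `snapshot_download` the configs work **without any path edits**.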

---

## Model list

### Text detection

| Weight file | Config yaml | Use case | File size |
|---|---|---|---|
| `ptocr_v5_mobile_det.safetensors` | `configs/det/PP-OCRv5/PP-OCRv5_mobile_det.yml` | Mobile / recommended for CPU | ~14 MB |
| `ptocr_v5_server_det.safetensors` | `configs/det/PP-OCRv5/PP-OCRv5_server_det.yml` | Server / high accuracy | ~101 MB |

### Text recognition (base)

| Weight file | Config yaml | Supported languages | File size |
|---|---|---|---|
| `ptocr_v5_mobile_rec.safetensors` | `configs/rec/PP-OCRv5/PP-OCRv5_mobile_rec.yml` | Simplified Chinese / Traditional Chinese / English / Japanese | ~31 MB |
| `ptocr_v5_server_rec.safetensors` | `configs/rec/PP-OCRv5/PP-OCRv5_server_rec.yml` | Simplified Chinese / Traditional Chinese / English / Japanese | ~128 MB |

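### Text recognition (multilingual)

All multilingual recognition models share the same network (`SVTR_LCNet` + `PPLCNetV3`) and differ only in their character set. File sizes range from 23 to 28 MB.

| Weight file | Supported languages |
|---|---|
| `ptocr_v5_en_mobile_rec.safetensors` | English only (specifically optimized for English scenarios) |
| `ptocr_v5_korean_mobile_rec.safetensors` | Korean, English |
| `ptocr_v5_latin_mobile_rec.safetensors` | French, German, Afrikaans, Italian, Spanish, Portuguese, Czech, Danish, Estonian, Croatian, Dutch, Norwegian, Polish, Swedish, Finnish, Turkish, Vietnamese, and others (40+ languages) |
| `ptocr_v5_eslav_mobile_rec.safetensors` | Russian, Belarusian, Ukrainian, English |
| `ptocr_v5_cyrillic_mobile_rec.safetensors` | Russian, Belarusian, Ukrainian, Serbian (Cyrillic), Bulgarian, Mongolian, and others (33 Cyrillic-script languages) |
| `ptocr_v5_arabic_mobile_rec.safetensors` | Arabic, Persian, Uyghur, Urdu, Pashto, Sindhi, and others |
| `ptocr_v5_devanagari_mobile_rec.safetensors` | Hindi, Marathi, Nepali, Sanskrit, and others (14 Devanagari-script languages) |
| `ptocr_v5_th_mobile_rec.safetensors` | Thai, English |
| `ptocr_v5_el_mobile_rec.safetensors` | Greek, English |
| `ptocr_v5_ta_mobile_rec.safetensors` | Tamil, English |
| `ptocr_v5_te_mobile_rec.safetensors` | Telugu, English |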
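
For programmatic model selection, the table can be mirrored as a small lookup. The sketch below is illustrative only: `MULTILINGUAL_REC`, its short keys, and the `rec_files` helper are hypothetical names that do not exist in this repository. The file names and the `.yml` / `.yaml` extensions are copied from the table and the directory tree in this README.

```python
# Hypothetical lookup mirroring the multilingual table; not part of the repository.
YAML_DIR = "configs/rec/PP-OCRv5/multi_language"

# key -> yaml extension as listed in the repository tree (.yml or .yaml)
MULTILINGUAL_REC = {
    "en": "yaml",
    "korean": "yml",
    "latin": "yml",
    "eslav": "yml",
    "cyrillic": "yaml",
    "arabic": "yaml",
    "devanagari": "yaml",
    "th": "yaml",
    "el": "yaml",
    "ta": "yaml",
    "te": "yaml",
}


def rec_files(key: str) -> tuple[str, str]:
    """Return (weight file, yaml path) for one multilingual recognizer."""
    ext = MULTILINGUAL_REC[key]  # raises KeyError for an unknown key
    weight = f"ptocr_v5_{key}_mobile_rec.safetensors"
    config = f"{YAML_DIR}/{key}_PP-OCRv5_mobile_rec.{ext}"
    return weight, config


print(rec_files("korean"))
```

Both returned paths are relative to the repository root, so they can be joined onto a local clone directory or passed as `filename` when downloading single files.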

---

## Quick start

### Download the weights

```python
from huggingface_hub import snapshot_download, hf_hub_download

# Option 1: download the whole repository (weights + yml + dictionaries + README)
repo_dir = snapshot_download(repo_id="JoyCN/PaddleOCR-Pytorch")
print("Repository downloaded to:", repo_dir)

# Option 2: download a single weight file
weight_path = hf_hub_download(
    repo_id="JoyCN/PaddleOCR-Pytorch",
    filename="ptocr_v5_korean_mobile_rec.safetensors"
)
```
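
After downloading, a weight file can be inspected without importing torch. The sketch below relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header) and uses only the standard library. `read_safetensors_header` and `summarize` are hypothetical helper names, not part of this repository or of the `safetensors` package.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Return the per-tensor JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def summarize(path: str) -> None:
    """Print the number of tensors and the total parameter count."""
    header = read_safetensors_header(path)
    n_params = 0
    for info in header.values():
        count = 1
        for dim in info["shape"]:
            count *= dim
        n_params += count
    print(f"{len(header)} tensors, {n_params} parameters")
```

Calling `summarize(weight_path)` on a file returned by `hf_hub_download` gives a quick sanity check that the download is complete enough to parse.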

### Inference with the PaddleOCR2Pytorch project (recommended)

```bash
# 1. Clone the inference code repository
git clone https://github.com/frotms/PaddleOCR2Pytorch
cd PaddleOCR2Pytorch
pip install torch safetensors pyyaml shapely pyclipper opencv-python pillow scikit-image

# 2. Use the weights + yml downloaded from this repository (assumed to be in /path/to/hf_repo)
python tools/infer/predict_rec.py \
  --image_dir doc/imgs_words/korean/1.jpg \
  --rec_algorithm SVTR_LCNet \
  --rec_model_path /path/to/hf_repo/ptocr_v5_korean_mobile_rec.safetensors \
  --rec_yaml_path  /path/to/hf_repo/configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml \
  --rec_image_shape "3,48,320" \
  --rec_char_dict_path /path/to/hf_repo/dicts/ppocrv5_korean_dict.txt \
  --use_gpu False
```

> `base_ocr_v20.py` in PaddleOCR2Pytorch natively supports `.safetensors` (detected by file extension, and still backward compatible with `.pth`).

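### Custom Python inference code

If you would rather not depend on the full PaddleOCR2Pytorch inference stack, below is the skeleton of **a minimal rec inference snippet**. It shows how to load the weights and run a forward pass, but you still need the network definition code from the PaddleOCR2Pytorch project (`pytorchocr/modeling/`).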

```python
import sys, numpy as np, cv2, torch, yaml
from safetensors.torch import load_file

# The imports below require cloning https://github.com/frotms/PaddleOCR2Pytorch first
# and adding its root directory to PYTHONPATH
sys.path.insert(0, "/path/to/PaddleOCR2Pytorch")
from pytorchocr.modeling.architectures.base_model import BaseModel
from pytorchocr.postprocess import build_post_process

HF_REPO = "/path/to/hf_repo"   # path returned by snapshot_download
yml_path    = f"{HF_REPO}/configs/rec/PP-OCRv5/multi_language/korean_PP-OCRv5_mobile_rec.yml"
weight_path = f"{HF_REPO}/ptocr_v5_korean_mobile_rec.safetensors"

# 1. Read the config + character set
with open(yml_path, encoding="utf-8") as f:
    cfg = yaml.safe_load(f)
dict_path = cfg["Global"]["character_dict_path"]      # './dicts/ppocrv5_korean_dict.txt'
# removeprefix (Python 3.9+) drops only a leading "./"; lstrip('./') would strip any run of '.' and '/'
dict_abs  = f"{HF_REPO}/{dict_path.removeprefix('./')}"
with open(dict_abs, encoding="utf-8") as f:
    chars = [l.strip("\n\r") for l in f]
n_char = len(chars) + 2                               # +1 blank, +1 space (depends on use_space_char)

# 2. Build the network + load the weights (safetensors: no code execution on load, fast mmap loading)
cfg["Architecture"]["Head"]["out_channels_list"] = {
    "CTCLabelDecode": n_char,
    "SARLabelDecode": n_char + 2,
    "NRTRLabelDecode": n_char + 3,
}
net = BaseModel(cfg["Architecture"], out_channels=n_char)
net.load_state_dict(load_file(weight_path, device="cpu"))
net.eval()

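# 3. Read the image + preprocess (resize to height 48, normalize to [-1, 1], zero-pad to [3, 48, 320])
img = cv2.imread("input_word.jpg")
if img is None:
    raise FileNotFoundError("input_word.jpg could not be read")
h, w = img.shape[:2]
ratio = w / h
tw = max(1, min(int(np.ceil(48 * ratio)), 320))       # target width, clamped to [1, 320]
img = cv2.resize(img, (tw, 48))
x = img.astype(np.float32).transpose(2, 0, 1) / 255.0
x = (x - 0.5) / 0.5
# Pad after normalizing so the padded region is 0 rather than -1
# (matching PaddleOCR's resize_norm_img, which pads the normalized image with zeros)
canvas = np.zeros((3, 48, 320), dtype=np.float32)
canvas[:, :, :tw] = x
x = torch.from_numpy(canvas).unsqueeze(0)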

# 4. Forward pass + CTC decoding
with torch.no_grad():
    logits = net(x)
post_op = build_post_process({"name": "CTCLabelDecode",
                              "character_dict_path": dict_abs,
                              "use_space_char": True})
result = post_op(logits)
print("Recognition result:", result)     # e.g. [('바탕으로', 0.9998)]
```
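
To make the final decoding step less of a black box, here is a minimal pure-Python sketch of greedy CTC decoding: take the argmax at each time step, collapse consecutive repeats, and drop blanks. It is an illustration of the general technique, not the `CTCLabelDecode` implementation from PaddleOCR2Pytorch. The function name `ctc_greedy_decode`, the toy vocabulary, and the layout with the blank at index 0 and the space appended last are assumptions made for this example.

```python
def ctc_greedy_decode(probs, chars, blank=0):
    """Greedy CTC decoding.

    probs: list of T rows, each a list of per-class probabilities.
    chars: maps class index -> character; chars[blank] is never emitted.
    Returns (text, mean probability of the kept time steps).
    """
    text, confs = [], []
    prev = blank
    for row in probs:
        idx = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        # Keep a class only if it is not blank and differs from the previous step
        if idx != blank and idx != prev:
            text.append(chars[idx])
            confs.append(row[idx])
        prev = idx
    score = sum(confs) / len(confs) if confs else 0.0
    return "".join(text), score


# Toy vocabulary (assumed layout): blank first, dictionary characters, then space
vocab = ["<blank>", "a", "b", " "]
steps = [
    [0.10, 0.80, 0.05, 0.05],  # a
    [0.10, 0.70, 0.10, 0.10],  # a again: collapsed into the previous step
    [0.90, 0.05, 0.03, 0.02],  # blank: separates the two a's
    [0.20, 0.60, 0.10, 0.10],  # a
    [0.10, 0.10, 0.70, 0.10],  # b
]
text, score = ctc_greedy_decode(steps, vocab)
print(text)  # aab
```

The blank between the second and third step is what allows the doubled letter to survive the collapse; without it the three `a` steps would merge into one.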

### Dependencies required for inference

```
torch >= 1.13
safetensors >= 0.4
numpy, pillow, opencv-python
pyyaml, shapely, pyclipper
scikit-image      # required for det post-processing
```

---

## Conversion & verification sources

- Source weights: official PaddlePaddle `.pdparams`, from [paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/](https://paddle-model-ecology.bj.bcebos.com/paddlex/official_pretrained_model/)
- Conversion tools: `converter/ppocr_v5_det_converter.py` / `ppocr_v5_rec_converter.py` in [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch)
- Verification: end-to-end inference was run on a macOS Apple Silicon (M-series) CPU, and **the multilingual recognition results are bit-exact with those of the official PaddleOCR `.pdparams` weights** (identical to 8 decimal places in float32)

Sample inference output (CPU, < 0.7 s per image):

| Sample | Recognized text | Confidence |
|---|---|---|
| Chinese `word_1.jpg` | 韩国小馆 | 0.99797755 |
| Korean `korean/1.jpg` | 바탕으로 | 0.99977183 |
| French `french/1.jpg` | de l'amendement, | 0.99656343 |
| Arabic `arabic/ar_1.jpg` | الكيصياوي | 0.68281130 |
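
For readers who want to repeat such a comparison, the following is a minimal standard-library sketch of a bit-level float32 check. It is not the procedure used for this repository. `float32_bits`, `bit_exact`, `max_abs_diff`, and the two score lists are hypothetical names, and the values are simply the confidences from the table, reused as illustrative data.

```python
import struct


def float32_bits(x: float) -> int:
    """Bit pattern of x after rounding to IEEE 754 float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bit_exact(a, b) -> bool:
    """True if two sequences have equal length and match bit for bit as float32."""
    return len(a) == len(b) and all(
        float32_bits(x) == float32_bits(y) for x, y in zip(a, b)
    )


def max_abs_diff(a, b) -> float:
    """Largest element-wise absolute difference (0.0 for empty input)."""
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Illustrative data: confidences copied from the sample table
paddle_scores = [0.99797755, 0.99977183, 0.99656343, 0.68281130]
torch_scores = [0.99797755, 0.99977183, 0.99656343, 0.68281130]

print(bit_exact(paddle_scores, torch_scores))  # True
```

When `bit_exact` returns `False`, `max_abs_diff` helps distinguish a last-bit rounding difference from a genuine conversion error.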

---

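## Legacy files (`legacy/`)

The older PP-OCR v2 / v3 / v4 weights, formerly stored in the repository root, have been **moved into the `legacy/` directory** to keep things organized. The files still exist and work as before; just add the `legacy/` prefix to the URL path: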

```
legacy/ch_ptocr_mobile_v2.0_cls_infer.pth
legacy/ch_ptocr_v4_det_infer.pth
legacy/ch_ptocr_v4_rec_infer.pth
legacy/en_ptocr_v3_det_infer.pth
legacy/en_ptocr_v4_rec_infer.pth
```

**The 15 PP-OCRv5 safetensors weights remain in the repository root, and their URLs are unchanged.**
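
As a sketch of what the prefix change means for direct links, the helper below builds download URLs using the Hugging Face Hub's `resolve` URL pattern. `resolve_url` is a hypothetical helper name, and the `main` revision is an assumption about this repository's default branch.

```python
from urllib.parse import quote

REPO_ID = "JoyCN/PaddleOCR-Pytorch"


def resolve_url(filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for a file path inside the repository."""
    # quote() keeps "/" by default, so sub-directory paths stay intact
    return (
        f"https://huggingface.co/{REPO_ID}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


# A legacy file now needs the "legacy/" prefix; a v5 weight does not
print(resolve_url("legacy/ch_ptocr_v4_det_infer.pth"))
print(resolve_url("ptocr_v5_server_det.safetensors"))
```

With `huggingface_hub`, the same relative path (for example `legacy/ch_ptocr_v4_det_infer.pth`) can be passed as the `filename` argument of `hf_hub_download`.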

---

## License & acknowledgements

- **License**: Apache License 2.0
- Weight source: [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) by the PaddlePaddle team, Apache 2.0
- Conversion tools: [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch), Apache 2.0

If this repository helps you, please also star the two original projects above as a thank-you.

---

## Citation

```bibtex
@misc{pp_ocrv5_pytorch_joycn_2025,
  title        = {PP-OCRv5 PyTorch Model Zoo},
  author       = {JoyCN},
  howpublished = {\url{https://huggingface.co/JoyCN/PaddleOCR-Pytorch}},
  year         = {2025}
}
```