# T-pro-it-2.1-int8-ov
OpenVINO int8 quantized version of t-tech/T-pro-it-2.1.
## Quick Start: Download & Run

1. Install the Hugging Face CLI:

   ```shell
   pip install -U "huggingface_hub[cli]"
   ```

2. Download the model:

   ```shell
   hf download savvadesogle/T-pro-it-2.1-int8-ov --local-dir ./T-pro-it-2.1-int8-ov
   ```
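If you prefer Python over the CLI, `huggingface_hub.snapshot_download` does the same job. A minimal sketch; the repo id and target directory mirror the command above:

```python
def download_model(repo_id="savvadesogle/T-pro-it-2.1-int8-ov",
                   local_dir="./T-pro-it-2.1-int8-ov"):
    """Programmatic equivalent of the `hf download` command above."""
    # Imported inside the function so the sketch is readable even
    # without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```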
## Quantization Parameters

Weight compression was performed using `optimum-cli export openvino` with the following parameters:

```bat
optimum-cli export openvino ^
  --model ./T-pro-it-2.1 ^
  --task text-generation-with-past ^
  --weight-format int8 ^
  ./T-pro-it-2.1-int8-ov
```
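The same export can also be driven from Python through the optimum-intel API. A minimal sketch, assuming `optimum[openvino]` is installed; `load_in_8bit=True` corresponds to `--weight-format int8` above:

```python
def export_int8(src="./T-pro-it-2.1", dst="./T-pro-it-2.1-int8-ov"):
    """Python sketch of the optimum-cli export command above."""
    # Imports kept inside the function so the sketch is readable
    # without optimum-intel installed.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    # export=True converts the checkpoint to OpenVINO IR;
    # load_in_8bit=True applies int8 weight-only compression via NNCF.
    model = OVModelForCausalLM.from_pretrained(src, export=True, load_in_8bit=True)
    model.save_pretrained(dst)
    AutoTokenizer.from_pretrained(src).save_pretrained(dst)
```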
## Compatibility
The provided OpenVINO IR model is compatible with:
- OpenVINO version 2026.0.0.dev20260102
- Optimum 2.1.0.dev0
- Optimum Intel 1.27.0.dev0+25fcb63
- NNCF 3.0.0.dev0+999c5e91
- OpenArc
## Running with OpenArc

OpenArc is an OpenAI-compatible inference server for OpenVINO models.
Terminal 1, start the server:

```bat
set OPENARC_API_KEY=BIG_KEY
openarc serve start --host 127.0.0.1
```
Terminal 2, add, load, and benchmark the model:

```bat
set OPENARC_API_KEY=BIG_KEY
openarc add --model-name T-pro-it-2.1 --model-path ./T-pro-it-2.1-int8-ov --engine ovgenai --model-type llm --device GPU
openarc load T-pro-it-2.1
openarc bench T-pro-it-2.1
```
Then connect via the OpenAI API at `http://127.0.0.1:8000/v1/chat/completions`.
## OpenAI API Example (Windows CMD)

```bat
curl http://127.0.0.1:8000/v1/chat/completions ^
  -H "Content-Type: application/json" ^
  -H "Authorization: Bearer BIG_KEY" ^
  -d "{\"model\":\"T-pro-it-2.1-int8-ov\",\"messages\":[{\"role\":\"user\",\"content\":\"Tell a joke\"}],\"temperature\":0.7,\"max_tokens\":128}"
```
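The same request can be built from Python with only the standard library. A sketch; the model name, API key, and port follow the example above:

```python
import json
import urllib.request


def build_chat_request(prompt, model="T-pro-it-2.1-int8-ov",
                       url="http://127.0.0.1:8000/v1/chat/completions",
                       api_key="BIG_KEY"):
    """Build the same chat-completions request as the curl example above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_tokens": 128,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# To actually send it (requires the OpenArc server to be running):
# req = build_chat_request("Tell a joke")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```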
## Performance Metrics

Benchmark the loaded model with:

```bat
openarc bench T-pro-it-2.1
```
## Limitations

See the original t-tech/T-pro-it-2.1 model card for limitations.
## Legal Information
Distributed under the same license as the original model.