# GLiNER2 ONNX export

ONNX export of the `gliner2-multi-v1` model. This bundle contains:
- `model.onnx` (FP32 export)
- `model_fp16.onnx` (FP16 weights converted from FP32)
- `model_int8.onnx` (dynamic INT8 quantization via onnxruntime)
- tokenizer files copied verbatim from the HF model
- a small `config.json` describing runtime constraints
The export script follows the same design as the published bundle:

- ONNX graph includes only encoder + span head tensor computation
- Schema logic, label mapping, and decoding stay outside the graph
- Inputs: `input_ids`, `attention_mask` (optionally `token_type_ids`)
- Output: `span_logits`
- Export with `torch.onnx.export` (opset 19) and dynamic batch/sequence axes
- Convert FP32 weights to FP16 with `convert_float_to_float16`
- Quantize with `onnxruntime.quantization.quantize_dynamic` (QInt8)
## Usage
Enter the dev shell (adds Python + ONNX deps):

```shell
nix develop
```

Install the Python dependencies with Pipenv:

```shell
cd onnx
pipenv install
cd ..
```
Export (run from the `onnx` directory so Pipenv finds the Pipfile):

```shell
pipenv run python export.py \
  --model-id fastino/gliner2-multi-v1 \
  --output-dir gliner2-multi-v1
```
Validation is enabled by default: it compares the exported ONNX output to the PyTorch output for a dummy batch, and also runs a small extraction-method check (entities, classification, JSON) using identical decoding logic. To skip validation steps, validate the quantized model instead, or skip the FP16 conversion, use:

```shell
pipenv run python export.py --no-validate
pipenv run python export.py --no-validate-extraction
pipenv run python export.py --validate-quantized
pipenv run python export.py --no-fp16
```
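The tolerance comparison at the heart of the validation step can be sketched as follows (synthetic arrays here; the real script feeds the same tokenized dummy batch through both PyTorch and onnxruntime):

```python
import numpy as np

def outputs_match(torch_logits, onnx_logits, rtol=1e-3, atol=1e-4):
    """Elementwise tolerance check between the PyTorch and ONNX outputs."""
    return np.allclose(torch_logits, onnx_logits, rtol=rtol, atol=atol)

# An FP32 export should agree with PyTorch up to float rounding noise;
# a large disagreement indicates a broken graph.
ref = np.random.randn(1, 8, 4).astype(np.float32)
assert outputs_match(ref, ref + 1e-6)
assert not outputs_match(ref, ref + 1.0)
```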
The output directory will include:

- `model.onnx`
- `model_fp16.onnx`
- `model_int8.onnx`
- `tokenizer.json`
- `tokenizer_config.json`
- `special_tokens_map.json`
- `added_tokens.json`
- `spm.model`
- `config.json`
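Because the graph expects inputs at the fixed `max_seq_len` from `config.json` (default 512), callers need a pad/truncate step before inference. A hypothetical helper (the pad id of 0 is an assumption; use the real tokenizer's pad token id):

```python
def pad_to_max(token_ids, max_seq_len=512, pad_id=0):
    """Truncate or right-pad token ids to the fixed export length,
    returning the ids plus a matching attention mask."""
    ids = list(token_ids)[:max_seq_len]
    mask = [1] * len(ids) + [0] * (max_seq_len - len(ids))
    ids = ids + [pad_id] * (max_seq_len - len(ids))
    return ids, mask

ids, mask = pad_to_max([101, 7592, 102], max_seq_len=8)
# ids  -> [101, 7592, 102, 0, 0, 0, 0, 0]
# mask -> [1, 1, 1, 0, 0, 0, 0, 0]
```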
## Notes
- The export uses a fixed `max_seq_len` (default 512) and expects inputs padded or truncated to that length. This matches the published bundle's runtime config.
- The `span_logits` label axis is aligned to token positions in the input sequence. Use label marker token positions (`[E]`, `[C]`, `[R]`, `[L]`) to map logits back to schema labels. Label mapping and decoding are intentionally handled outside the graph.
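Mapping logits back to schema labels via marker positions can be sketched like this (the marker token ids below are made up for illustration; the real ids come from the bundled tokenizer files):

```python
import numpy as np

# Hypothetical marker ids; look these up in the real tokenizer files.
MARKER_IDS = {"[E]": 5, "[C]": 6, "[R]": 7, "[L]": 8}

def gather_label_logits(input_ids, span_logits, marker_id):
    """Select the span_logits rows at the positions of a given marker token."""
    positions = np.flatnonzero(np.asarray(input_ids) == marker_id)
    return span_logits[positions]

input_ids = np.array([5, 10, 11, 5, 12, 0])           # two [E] markers
span_logits = np.arange(12, dtype=np.float32).reshape(6, 2)
entity_logits = gather_label_logits(input_ids, span_logits, MARKER_IDS["[E]"])
# entity_logits has one row per [E] marker: rows 0 and 3 of span_logits
```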