NVFP4 Collection: Fast inference for Blackwell GPUs (7 items)
Compressed with llm-compressor v0.9.0 and transformers v4.57.1. We used the llm-compressor example script, changing only the number of calibration samples, which we increased from 20 to 128.
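For readers who want to reproduce a similar run, the following is a minimal sketch of a one-shot NVFP4 compression with 128 calibration samples. It is not the exact script used for this model. The calibration dataset (`HuggingFaceH4/ultrachat_200k`), the sequence length of 2048, the `ignore` list, and the helper names `calibration_split` and `compress` are all assumptions made for illustration. The actual example script may also exclude additional modules from quantization, such as MoE router gates.

```python
# Sketch only: everything except the model ID and the sample count is an assumption.
MODEL_ID = "Qwen/Qwen3-Next-80B-A3B-Instruct"  # base model named on this card
NUM_CALIBRATION_SAMPLES = 128                  # raised from the example's 20
MAX_SEQUENCE_LENGTH = 2048                     # assumption: not stated on this card
DATASET_ID = "HuggingFaceH4/ultrachat_200k"    # assumption: calibration dataset
DATASET_SPLIT = "train_sft"                    # assumption: split of that dataset


def calibration_split(num_samples: int = NUM_CALIBRATION_SAMPLES) -> str:
    """Return a datasets-style slice selecting the first `num_samples` rows."""
    return f"{DATASET_SPLIT}[:{num_samples}]"


def compress(save_dir: str) -> None:
    """Hypothetical helper: quantize the base model to NVFP4 and save it."""
    # Heavy imports stay local so the constants above can be inspected cheaply.
    from datasets import load_dataset
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    # Load and shuffle the calibration subset.
    ds = load_dataset(DATASET_ID, split=calibration_split()).shuffle(seed=42)

    def to_text(example):
        # Render each chat into one string using the model's chat template.
        return {
            "text": tokenizer.apply_chat_template(
                example["messages"], tokenize=False
            )
        }

    def tokenize(sample):
        return tokenizer(
            sample["text"],
            padding=False,
            max_length=MAX_SEQUENCE_LENGTH,
            truncation=True,
            add_special_tokens=False,
        )

    ds = ds.map(to_text)
    ds = ds.map(tokenize, remove_columns=ds.column_names)

    # Assumption: quantize all Linear layers except the output head.
    recipe = QuantizationModifier(
        targets="Linear", scheme="NVFP4", ignore=["lm_head"]
    )

    oneshot(
        model=model,
        dataset=ds,
        recipe=recipe,
        max_seq_length=MAX_SEQUENCE_LENGTH,
        num_calibration_samples=NUM_CALIBRATION_SAMPLES,
    )

    model.save_pretrained(save_dir, save_compressed=True)
    tokenizer.save_pretrained(save_dir)
```

Calling `compress("Qwen3-Next-80B-A3B-Instruct-NVFP4")` would download the full base model, so it needs substantial memory and disk space. The output directory name here is only a placeholder.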
Base model: Qwen/Qwen3-Next-80B-A3B-Instruct