Upload optimized ONNX model w/ GQA
#26
Opened by Xenova (HF Staff)
Xenova changed pull request title from Upload optimized model w/ GQA to Upload optimized ONNX model w/ GQA
New demo! https://huggingface.co/spaces/HuggingFaceTB/SmolLM2-1.7B-Instruct-WebGPU
Much faster now...
Xenova changed pull request status to merged