Model requests
Both of them failed to queue due to invalid config.json:
nico1 ~# llmc add -2000 si https://huggingface.co/Vortex5/MS3.2-24B-Omega-Diamond
submit tokens: ["-2000","static","imatrix","https://huggingface.co/Vortex5/MS3.2-24B-Omega-Diamond"]
https://huggingface.co/Vortex5/MS3.2-24B-Omega-Diamond
https://huggingface.co/Vortex5/MS3.2%2d24B%2dOmega%2dDiamond/resolve/main/config.json 429
nico1 ~# llmc add -2000 si https://huggingface.co/Vortex5/Qwen2.5-14B-Styx
submit tokens: ["-2000","static","imatrix","https://huggingface.co/Vortex5/Qwen2.5-14B-Styx"]
https://huggingface.co/Vortex5/Qwen2.5-14B-Styx
https://huggingface.co/Vortex5/Qwen2.5%2d14B%2dStyx/resolve/main/config.json 429
https://huggingface.co/Vortex5/Qwen2.5%2d14B%2dStyx/resolve/main/config.json 429
https://huggingface.co/Vortex5/Qwen2.5%2d14B%2dStyx/resolve/main/config.json 429
Vortex5/Qwen2.5-14B-Styx: no architectures entry (UnicodeDecodeError('utf-8', b'\x1f\x8b\x08\x00 (...), 1, 2, 'invalid start byte') at /llmjob/share/bin/llmjob line 1656.)
From mradermacher:
if you get something like binary garbage instead of a config.json when submitting models, then that is because some hf frontend servers currently deflate-compress files even when not allowed to (and the huggingface python module does not handle that, unsurprisingly).
So this is apparently an issue on Hugging Face's side, which makes sense, as the status code when downloading the config.json is 429.
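(The b'\x1f\x8b\x08' bytes in the error above are the gzip magic header plus the deflate method byte, which fits that explanation.) As a minimal sketch of a workaround, assuming plain requests rather than the huggingface python module, and with a hypothetical fetch_config helper, a client could detect and unwrap such a response like this:

    import gzip
    import json
    import requests

    def fetch_config(url: str) -> dict:
        # Hypothetical helper for illustration; not the actual llmjob code.
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()
        body = resp.content
        # gzip streams start with the magic bytes 0x1f 0x8b; the error above
        # shows b'\x1f\x8b\x08...', i.e. gzip with the deflate method byte.
        if body[:2] == b"\x1f\x8b":
            body = gzip.decompress(body)
        return json.loads(body)

    config = fetch_config(
        "https://huggingface.co/Vortex5/Qwen2.5-14B-Styx/resolve/main/config.json"
    )
    print(config.get("architectures"))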
They are now both queued! :D
I bumped the priority of MS3.2-24B-Omega-Diamond to -2000 as I think mradermacher initially queued it using priority 0.
You can check for progress at http://hf.tst.eu/status.html, or regularly check the model summary pages for quants to appear.
as a sidenote, llmc add should now wait out the rate limit (can take 5 minutes though).
Perfect. Let's hope we won't reach it again but if we do this is a much cleaner solution.
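For what it's worth, a sketch of what waiting out a 429 could look like, assuming the server's Retry-After header is in seconds and using a capped exponential backoff (illustrative only, not the actual llmc implementation):

    import time
    import requests

    def get_with_rate_limit(url: str, max_wait: float = 300.0) -> requests.Response:
        # Illustrative only; llmc's real retry logic may differ.
        waited, delay = 0.0, 5.0
        while True:
            resp = requests.get(url, timeout=30)
            if resp.status_code != 429:
                resp.raise_for_status()
                return resp
            # Prefer the server's Retry-After hint (assumed to be seconds),
            # otherwise keep the current backoff delay.
            delay = float(resp.headers.get("Retry-After", delay))
            if waited + delay > max_wait:
                resp.raise_for_status()  # give up after ~5 minutes; raises for the 429
            time.sleep(delay)
            waited += delay
            delay = min(delay * 2, 60.0)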
ah, and the deflate problem seems to have been fixed on the hf server side
Great to hear. That was so stupid of them.
It keeps biting us; all the error 139 ones are unexpected rate-limit blocks. In essence, an api call now has roughly a 1 in 40 chance of failing, or even higher. That is pretty high, especially for non-idempotent requests.
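At that failure rate, blind retries are only safe for idempotent calls, since retrying a mutating request (like a model submit) could duplicate its effect. A sketch of an idempotency-aware wrapper (hypothetical helper, not part of llmc):

    import time
    import requests

    # Methods that are safe to retry blindly: repeating them cannot
    # duplicate a side effect the way a retried POST could.
    IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE"}

    def request_with_retry(method: str, url: str, retries: int = 3, **kw):
        # Hypothetical wrapper: retry transient failures, but only for
        # idempotent methods; non-idempotent calls get exactly one attempt.
        attempts = retries if method.upper() in IDEMPOTENT else 1
        for i in range(attempts):
            try:
                resp = requests.request(method, url, timeout=30, **kw)
                if resp.status_code == 429 and i + 1 < attempts:
                    time.sleep(float(resp.headers.get("Retry-After", 5)))
                    continue
                resp.raise_for_status()
                return resp
            except requests.ConnectionError:
                if i + 1 == attempts:
                    raise
                time.sleep(2 ** i)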