GGUF quants of TeeZee/Kyllene-34B-v1.1. Remember to set your max context length to a value appropriate for your hardware; 4096 is fine. The default context length is 200k, so it will eat RAM or VRAM like crazy if left unchecked.
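With llama.cpp, the context window can be capped at load time via the `-c`/`--ctx-size` flag, which keeps the KV cache small regardless of the 200k default baked into the model. A minimal sketch (the quant filename is illustrative, not a specific file from this repo):

```shell
# Cap the context at 4096 tokens instead of the model's 200k default;
# the filename below is an assumption - substitute the quant you downloaded.
./llama-cli -m Kyllene-34B-v1.1.Q4_K_M.gguf -c 4096 -p "Hello"
```

Lower-bit quants reduce weight memory, but the KV cache scales with context length, so `-c` is the knob that matters here.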

Format: GGUF · Model size: 34B params · Architecture: llama

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit

