Fits on an RTX 3090, and now is the best time to buy one ($500 on the used market)
This model fits comfortably on a single RTX 3090, even at the highest available quality (BF16 GGUF). I got up to 38.7 tokens/sec with it on the 3090.
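For anyone wondering whether a given model will fit the same way, here is a rough back-of-the-envelope sketch, not a precise measurement. It assumes BF16 weights take 2 bytes per parameter and that the 3090 has 24 GB of VRAM. The function names and the `overhead_gib` allowance for KV cache and buffers are my own illustrative assumptions; real usage depends on context length and runtime.

```python
GIB = 1024 ** 3  # bytes in one GiB


def bf16_weight_gib(n_params: float) -> float:
    """Approximate size of BF16 weights in GiB (2 bytes per parameter)."""
    return n_params * 2 / GIB


def fits_in_vram(n_params: float, vram_gib: float = 24.0,
                 overhead_gib: float = 2.0) -> bool:
    """Rough check: weights plus an assumed overhead must fit in VRAM.

    overhead_gib is a guessed allowance for KV cache and runtime buffers,
    not a measured value.
    """
    return bf16_weight_gib(n_params) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical parameter counts, for illustration only.
    for billions in (1, 4, 8, 13):
        n = billions * 1e9
        print(f"{billions}B params: ~{bf16_weight_gib(n):.1f} GiB weights, "
              f"fits in 24 GiB: {fits_in_vram(n)}")
```

Treat the result as a first filter only: a model that barely passes this check can still run out of memory at long context.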
I bought the card recently, and prices right now are really good: $500-580 on the used market, especially compared to the RAM/SSD-pocalypse (caused by the military with their new AI drones). Yes, the 3090 is an oven compared to the 40/50 series, but I bought a 12-volt cross-flow (tangential) fan from China on AliExpress that cools it massively (it's the kind used in elevator car air conditioners, just $15-20).
As for the model itself, it's mostly a general-purpose model. The coding quality is not great even in BF16: yes, it can analyze things, but in a creativity test like generating a Mozart symphony it always produces erroneous code that needs many repairs. So far I haven't found any tricks to make it code better, and it's simply too small anyway; only GLM 4.5, which is about 300x bigger, can produce such code without errors.