Block Diffusion for Flash Speculative Decoding
AI & ML interests
Efficient AI
Recent Activity
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
- ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
  Paper • 2511.10645 • Published • 8
- z-lab/Qwen3.5-9B-PARO
  Image-Text-to-Text • 3B • Updated • 52.3k • 41
- z-lab/Qwen3.5-4B-PARO
  Image-Text-to-Text • 1B • Updated • 16.1k • 14
- z-lab/Qwen3.5-27B-PARO
  Image-Text-to-Text • 6B • Updated • 1.84k • 12
models 30
z-lab/Qwen3.5-27B-DFlash
Text Generation • 4B • Updated • 303 • 9
z-lab/Qwen3.5-0.8B-PARO
Image-Text-to-Text • 0.4B • Updated • 816 • 1
z-lab/Llama-2-7b-hf-PARO
Text Generation • 1B • Updated • 315 • 1
z-lab/DeepSeek-R1-Distill-Llama-8B-PARO
Text Generation • 1B • Updated • 561 • 1
z-lab/Qwen3.5-35B-A3B-PARO
Image-Text-to-Text • 6B • Updated • 28 • 1
z-lab/Qwen3.5-27B-PARO
Image-Text-to-Text • 6B • Updated • 1.84k • 12
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 52.3k • 41
z-lab/Qwen3.5-4B-PARO
Image-Text-to-Text • 1B • Updated • 16.1k • 14
z-lab/Qwen3.5-2B-PARO
Image-Text-to-Text • 1B • Updated • 375 • 2
z-lab/Qwen3-14B-PARO
Text Generation • 2B • Updated • 548 • 2
datasets 6
z-lab/gsm8k-filtered
Viewer • Updated • 1.31k • 20
z-lab/mt-bench-filtered
Viewer • Updated • 79 • 22
z-lab/mbpp-sanitized-filtered
Viewer • Updated • 256 • 22
z-lab/humaneval-filtered
Viewer • Updated • 137 • 24
z-lab/qwen3-4b-instruct-100k
Viewer • Updated • 100k • 31
z-lab/qwen3-4b-thinking-100k
Viewer • Updated • 100k • 18