MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs • Paper • 2508.05257 • Published Aug 7, 2025
Article • Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training • Aug 8, 2025
Tanuki-8B Collection • A Japanese LLM trained from scratch with a Llama-3-8B-like architecture (public release planned after NEDO approval) • 4 items • Updated Jun 12, 2024