Article KV Caching Explained: Optimizing Transformer Inference Efficiency • Jan 30, 2025 • 313
Article Transformers v5: Simple model definitions powering the AI ecosystem • Dec 1, 2025 • 310
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 18 items • Updated 1 day ago • 280
Article Welcome EmbeddingGemma, Google's new efficient embedding model • Sep 4, 2025 • 273
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Mar 12, 2025 • 495
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 97