Article Introducing the agentic robotics appstore for 10,000 Reachy Minis clem • 15 days ago • 35
Article SynthVision: Building a 110K Synthetic Medical VQA Dataset with Cross-Model Validation OpenMed • Mar 23 • 17
YOLO26 Models Collection YOLO26 models: detection, segmentation, classification, pose, and OBB variants with demos and ONNX variants. • 42 items • Updated Jan 19 • 36
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated Mar 2 • 244
Article Asynchronous Robot Inference: Decoupling Action Prediction and Execution fracapuano, imstevenpmwork, aractingi, mshukor, danaaubakirova, AdilZtn, aliberts, cadene • Jul 10, 2025 • 54
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints sergeipetrov, reach-vb, pcuenq, philschmid • May 1, 2024 • 82
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders thomwolf, matthieu-lapeyre • Jul 9, 2025 • 800
Hugging Face Changelog Inference Providers now fully support OpenAI-compatible API • Jul 18, 2025 • 99
💬Urdu ASR Models Collection Collection of fine-tuned Urdu speech recognition models. • 9 items • Updated Jul 14, 2025 • 3
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13, 2025 • 219
Article Vision Language Models (Better, faster, stronger) merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 613
Article LeMaterial: an open source initiative to accelerate materials discovery and research AlexDuvalinho, lritchie, msiron, inelgnu, etiennedufayet, amandinerossello, Ramlaoui, IAMJB, lvwerra, thomwolf • Dec 10, 2024 • 56
D-FINE Collection State-of-the-art real-time object detection model with Apache 2.0 licence • 15 items • Updated May 5, 2025 • 56
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 29 days ago • 57
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 6 items • Updated Mar 2 • 23