Federico Minutoli

DiTo97

AI & ML interests

Anything machine learning. I am strongly passionate about computer vision and robotics, and about how machine learning will help achieve autonomous behavior, perception, and continuous learning.

Recent Activity

upvoted a paper 9 days ago

MMGR: Multi-Modal Generative Reasoning

upvoted an article 16 days ago

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

new activity 4 months ago

LiquidAI/LFM2-VL-450M: ValueError: Image features and image tokens do not match: tokens: 9728, features 10240 mb

upvoted a paper 9 days ago

MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published 12 days ago • 114

upvoted an article 16 days ago

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

Article • Published 20 days ago

upvoted a paper 6 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

upvoted a paper 10 months ago

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Paper • 2503.04724 • Published Mar 6 • 72

upvoted an article about 1 year ago

Deriving DPO's Loss

Article • Published Dec 24, 2024

upvoted 2 papers about 1 year ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 147

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 36

upvoted a paper over 1 year ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101

upvoted an article over 1 year ago

License to Call: Introducing Transformers Agents 2.0

Article • Published May 13, 2024 • 137

upvoted a paper over 1 year ago

LEGENT: Open Platform for Embodied Agents

Paper • 2404.18243 • Published Apr 28, 2024 • 22

upvoted 8 papers almost 2 years ago

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Paper • 2401.15071 • Published Jan 26, 2024 • 37

AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

Paper • 2401.12963 • Published Jan 23, 2024 • 12

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8, 2024 • 73

upvoted 2 papers about 2 years ago

Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis

Paper • 2312.03491 • Published Dec 6, 2023 • 34

Merlin: Empowering Multimodal LLMs with Foresight Minds

Paper • 2312.00589 • Published Nov 30, 2023 • 27
