Avi

avahal

AI & ML interests

LLMs

Recent Activity

commented on a paper about 17 hours ago

Reasoning Shift: How Context Silently Shortens LLM Reasoning

commented on a paper about 17 hours ago

QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

commented on a paper about 17 hours ago

Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

Organizations

None yet

commented on 7 papers about 17 hours ago

Reasoning Shift: How Context Silently Shortens LLM Reasoning

Paper • 2604.01161 • Published 1 day ago • 22

QuitoBench: A High-Quality Open Time Series Forecasting Benchmark

Paper • 2603.26017 • Published 7 days ago • 26

Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

Paper • 2603.26648 • Published 7 days ago • 33

ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

Paper • 2603.25823 • Published 7 days ago • 36

Terminal Agents Suffice for Enterprise Automation

Paper • 2604.00073 • Published 3 days ago • 72

MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

Paper • 2603.28407 • Published 4 days ago • 53

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published 9 days ago • 167

commented on 11 papers 2 days ago

Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells

Paper • 2603.25240 • Published 8 days ago • 73

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published 4 days ago • 299

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 5 days ago • 125

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 14 days ago • 304

Gen-Searcher: Reinforcing Agentic Search for Image Generation

Paper • 2603.28767 • Published 4 days ago • 51

Towards a Medical AI Scientist

Paper • 2603.28589 • Published 4 days ago • 82

TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published 6 days ago • 137

Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills

Paper • 2603.25158 • Published 8 days ago • 45

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Paper • 2603.25730 • Published 8 days ago • 47

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Paper • 2603.25746 • Published 8 days ago • 150

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published 8 days ago • 147

commented on 2 papers 8 days ago

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Paper • 2603.24472 • Published 9 days ago • 47

UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Paper • 2603.24533 • Published 9 days ago • 45