START: Spatial and Textual Learning for Chart Understanding Paper • 2512.07186 • Published about 1 month ago • 2
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published Nov 14, 2025 • 112
PAVE: Patching and Adapting Video Large Language Models Paper • 2503.19794 • Published Mar 25, 2025 • 3
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning Paper • 2412.03248 • Published Dec 4, 2024 • 26