Interesting new techniques
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper • 2401.01335 • Published • 68
Lumiere: A Space-Time Diffusion Model for Video Generation
Paper • 2401.12945 • Published • 86
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
Paper • 2403.06504 • Published • 56
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
Paper • 2403.20041 • Published • 34
OmniGen: Unified Image Generation
Paper • 2409.11340 • Published • 115
Kolmogorov-Arnold Transformer
Paper • 2409.10594 • Published • 45
Multimodal Latent Language Modeling with Next-Token Diffusion
Paper • 2412.08635 • Published • 48
A3: Android Agent Arena for Mobile GUI Agents
Paper • 2501.01149 • Published • 22
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
Paper • 2501.03218 • Published • 35
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46
Modifying Large Language Model Post-Training for Diverse Creative Writing
Paper • 2503.17126 • Published • 36
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
Paper • 2503.18878 • Published • 119
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
Paper • 2504.17432 • Published • 40
Parallel Scaling Law for Language Models
Paper • 2505.10475 • Published • 83
Paper • 2505.14674 • Published • 37
Using Reinforcement Learning to Train Large Language Models to Explain Human Decisions
Paper • 2505.11614 • Published
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 54
Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
Paper • 2504.05258 • Published • 1
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks
Paper • 2506.00411 • Published • 31
Aligning Latent Spaces with Flow Priors
Paper • 2506.05240 • Published • 27
Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
Paper • 2506.05629 • Published • 37
HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
Paper • 2506.20452 • Published • 19
Lizard: An Efficient Linearization Framework for Large Language Models
Paper • 2507.09025 • Published • 18
Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation
Paper • 2509.18824 • Published • 22
0.1B • Updated • 15.2M • 305