SteveMcpoet's Collections
loveit
Energy-Based Transformers are Scalable Learners and Thinkers
Paper
•
2507.02092
•
Published
•
69
MOSPA: Human Motion Generation Driven by Spatial Audio
Paper
•
2507.11949
•
Published
•
24
Sound and Complete Neuro-symbolic Reasoning with LLM-Grounded Interpretations
Paper
•
2507.09751
•
Published
•
1
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
Paper
•
2507.07982
•
Published
•
33
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
Paper
•
2507.07955
•
Published
•
26
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation
Paper
•
2507.05963
•
Published
•
12
SAMed-2: Selective Memory Enhanced Medical Segment Anything Model
Paper
•
2507.03698
•
Published
•
11
FAROS: Fair Graph Generation via Attribute Switching Mechanisms
Paper
•
2507.03728
•
Published
•
1
PresentAgent: Multimodal Agent for Presentation Video Generation
Paper
•
2507.04036
•
Published
•
10
Kwai Keye-VL Technical Report
Paper
•
2507.01949
•
Published
•
130
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities
Paper
•
2502.11123
•
Published
Paper
•
2507.06204
•
Published
•
19
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models
Paper
•
2507.15375
•
Published
•
30
Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling
Paper
•
2507.11061
•
Published
•
37
Deep Researcher with Test-Time Diffusion
Paper
•
2507.16075
•
Published
•
67
Persona Vectors: Monitoring and Controlling Character Traits in Language Models
Paper
•
2507.21509
•
Published
•
32
LaTCoder: Converting Webpage Design to Code with Layout-as-Thought
Paper
•
2508.03560
•
Published
•
24
Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation
Paper
•
2508.00428
•
Published
•
3
REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation
Paper
•
2508.04946
•
Published
•
1