Yunpeng Huang

strivin

strivin0311

AI & ML interests

Deep generative models (DGMs), including Transformers, Diffusers, and GANs, and deep reinforcement learning (DRL), including DQNs, PPO, and MCTS, with autonomous driving as the most relevant AI scenario

Recent Activity

upvoted a paper 19 days ago

PyTorch Distributed: Experiences on Accelerating Data Parallel Training

upvoted a paper 19 days ago

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

upvoted a paper 19 days ago

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

upvoted 3 papers 19 days ago

upvoted a paper 7 months ago

SageAttention2++: A More Efficient Implementation of SageAttention2

Paper • 2505.21136 • Published May 27 • 45

liked a model 8 months ago

cognition-ai/Kevin-32B

33B • Updated May 6 • 690 • 160

liked a model 10 months ago

openbmb/MiniCPM-o-2_6

Any-to-Any • 9B • Updated Oct 5 • 97.6k • 1.27k

liked a model 12 months ago

MiniMaxAI/MiniMax-Text-01

Text Generation • 456B • Updated Jul 3 • 2.1k • 652

liked 2 datasets about 1 year ago

Norquinal/claude_multiround_chat_1k

Viewer • Updated Aug 11, 2023 • 1.61k • 47 • 14

BelleGroup/multiturn_chat_0.8M

Viewer • Updated Apr 2, 2023 • 831k • 350 • 144

liked 3 models about 1 year ago

deepseek-ai/DeepSeek-V2-Lite-Chat

Text Generation • 16B • Updated Jun 25, 2024 • 217k • 132

deepseek-ai/DeepSeek-V2-Lite

Text Generation • 16B • Updated Jun 25, 2024 • 148k • 155

codellama/CodeLlama-7b-Instruct-hf

Text Generation • 7B • Updated Apr 12, 2024 • 35.2k • 252

liked 4 models over 1 year ago

allenai/OLMo-7B-0424

Text Generation • Updated Jul 30, 2024 • 197 • 50

nvidia/Minitron-4B-Base

Text Generation • Updated Feb 14 • 457 • 135

meta-llama/Llama-3.1-8B

Text Generation • 8B • Updated Oct 16, 2024 • 654k • 1.99k

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 10.6M • 5.18k

liked 2 Spaces over 1 year ago

Model Memory Utility

🚀

992

Calculate vRAM needed for model training and inference

Calculate Model Flops

🔥

Calculate FLOPs and parameters for transformer models

liked 2 models over 1 year ago

Skywork/Skywork-MoE-Base-FP8

Text Generation • 146B • Updated Jul 31, 2024 • 108 • 7

mistralai/Codestral-22B-v0.1

22B • Updated Jul 24 • 10.1k • 1.32k
