shuoxing/llama3-8b-full-sft-mix-low-tweet-1m-en-no-packing-new • Text Generation • 266k • Updated Nov 21, 2025 • 3
shuoxing/llama3-8b-full-pretrain-control-tweet-1m-en-no-packing-new • Text Generation • 266k • Updated Nov 21, 2025 • 20
shuoxing/llama3-8b-full-pretrain-mix-high-tweet-1m-en-no-packing-new • Text Generation • 266k • Updated Nov 21, 2025 • 20
shuoxing/llama3-8b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new • Text Generation • 266k • Updated Nov 21, 2025 • 23
shuoxing/llama3-8b-full-pretrain-mix-low-tweet-1m-en-no-packing-new • Text Generation • 266k • Updated Nov 21, 2025 • 21
shuoxing/llama3-8b-full-sft-control-tweet-1m-en-gpt-no-packing-1e-6-sft • Text Generation • 266k • Updated Nov 20, 2025 • 2
shuoxing/llama3-8b-full-sft-mix-high-tweet-1m-en-gpt-no-packing-1e-6-sft • Text Generation • 266k • Updated Nov 20, 2025 • 3
shuoxing/llama3-8b-full-sft-mix-mid-tweet-1m-en-gpt-gpt-no-packing-1e-6-sft • Text Generation • 266k • Updated Nov 20, 2025 • 3
shuoxing/llama3-8b-full-sft-mix-low-tweet-1m-en-gpt-no-packing-1e-6-sft • Text Generation • 266k • Updated Nov 20, 2025 • 1
shuoxing/llama3-8b-full-sft-junk-tweet-1m-en-no-packing-1e-6-sft • Text Generation • 266k • Updated Nov 20, 2025 • 2
shuoxing/llama3-8b-full-pretrain-mix-high-tweet-1m-en-no-packing-1e-6 • Text Generation • 266k • Updated Nov 20, 2025 • 4
shuoxing/llama3-8b-full-pretrain-mix-mid-tweet-1m-en-no-packing-1e-6 • Text Generation • 266k • Updated Nov 20, 2025 • 2
shuoxing/llama3-8b-full-pretrain-control-tweet-1m-en-no-packing-1e-6 • Text Generation • 266k • Updated Nov 20, 2025 • 3
shuoxing/llama3-8b-full-sft-control-tweet-1m-en-gpt-no-packing-sft-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 3
shuoxing/llama3-8b-full-sft-mix-high-tweet-1m-en-gpt-no-packing-sft-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 4
shuoxing/llama3-8b-full-sft-mix-mid-tweet-1m-en-gpt-no-packing-sft-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 4
shuoxing/llama3-8b-full-sft-mix-low-tweet-1m-en-gpt-no-packing-sft-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 4
shuoxing/llama3-8b-full-sft-junk-tweet-1m-en-no-packing-1e-4-sft • Text Generation • 266k • Updated Nov 20, 2025 • 2
shuoxing/llama3-8b-full-pretrain-mix-high-tweet-1m-en-no-packing-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 5
shuoxing/llama3-8b-full-pretrain-mix-mid-tweet-1m-en-no-packing-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 4
shuoxing/llama3-8b-full-pretrain-mix-low-tweet-1m-en-no-packing-1e-4 • Text Generation • 266k • Updated Nov 20, 2025 • 4
shuoxing/llama3-8b-full-pretrain-control-tweet-1m-en-no-packing-1e-4 • Text Generation • 266k • Updated Nov 19, 2025 • 4
shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-no-packing-1e-6 • Text Generation • 266k • Updated Nov 19, 2025 • 4
shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-no-packing-1e-4 • Text Generation • 266k • Updated Nov 19, 2025 • 4
shuoxing/llama3-8b-full-sft-control-tweet-1m-en-gpt-no-packing-sft-epoch-2 • Text Generation • 266k • Updated Nov 17, 2025 • 4