sai_reddy

saireddy

AI & ML interests

None yet

Recent Activity

New activity in moonshotai/Kimi-Linear-48B-A3B-Instruct about 2 months ago

insights on comparisons with Qwen/Qwen3-Next-80B-A3B-Instruct ?

#14 opened about 2 months ago by

New activity in Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 2 months ago

function calling

#4 opened 2 months ago by

New activity in Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 4 months ago

possible to extend context to 1m tokens ?

#5 opened 4 months ago by

upvoted an article about 1 year ago

Article

Hugging Face x LangChain : A new partner package

May 14, 2024 • 159

New activity in google/gemma-2-9b about 1 year ago

RuntimeError: Index put requires the source and destination dtypes match, got BFloat16 for the destination and Float for the source.

#24 opened over 1 year ago by

New activity in google/gemma-2-9b over 1 year ago

model.generate is throwing AttributeError: 'HybridCache' object has no attribute 'float'

#18 opened over 1 year ago by

base vs instruct model

#17 opened over 1 year ago by

Inference error

#20 opened over 1 year ago by

New activity in google/gemma-7b over 1 year ago

8-bit precision error

#32 opened almost 2 years ago by

New activity in google/gemma-7b-it over 1 year ago

ValueError with multi A100 GPUS

#28 opened almost 2 years ago by

New activity in meta-llama/Meta-Llama-3-8B-Instruct over 1 year ago

ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on.

#35 opened over 1 year ago by

New activity in meta-llama/Meta-Llama-3-70B-Instruct over 1 year ago

Base vs instruct

#17 opened over 1 year ago by

New activity in google/gemma-7b-it almost 2 years ago

Could not find GemmaForCausalLM neither in <module 'transformers.models.gemma'

#36 opened almost 2 years ago by

liked a model over 2 years ago

meta-llama/Llama-2-13b-chat-hf

Text Generation • 13B • Updated Apr 17, 2024 • 210k • 1.11k