yehya PRO
Recent Activity
CUDA Version -- Min requirement?
Inference Settings
@alfredo-ottomate I seriously thought I was missing a huge breakthrough when reading that lol. I mean, even the 4 active experts won't fit in the claimed 1.5GB of RAM, and even if we go further and assume disk offloading with a high-end Gen3 NVMe SSD in the Pi, I'd still expect sub 1 tok/s.
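Rough numbers, just to show why the math doesn't work (the active parameter count, bytes-per-weight, and SSD bandwidth below are my own assumptions, not measurements):

```python
# Back-of-the-envelope check: can gpt-oss-20b's active experts fit in 1.5 GB,
# and what throughput would NVMe offloading give? All figures are assumptions.

ACTIVE_PARAMS = 3.6e9      # assumed active params per token for gpt-oss-20b (4 of 32 experts)
BYTES_PER_PARAM = 0.55     # assumed ~4.4 bits/param effective (MXFP4 weights + overhead)
RAM_BUDGET_GB = 1.5        # the claimed RAM budget
GEN3_NVME_GBPS = 3.0       # assumed sustained sequential read of a good Gen3 NVMe SSD

active_weights_gb = ACTIVE_PARAMS * BYTES_PER_PARAM / 1e9
print(f"active expert weights per token: ~{active_weights_gb:.1f} GB "
      f"(budget: {RAM_BUDGET_GB} GB)")

# If the active weights don't fit in RAM, they must be streamed from disk every
# token, so disk bandwidth bounds generation speed.
tok_per_s_upper_bound = GEN3_NVME_GBPS / active_weights_gb
print(f"SSD-bound upper limit: ~{tok_per_s_upper_bound:.1f} tok/s")
```

And that ~1.5 tok/s is a best case assuming perfectly sequential reads and zero compute time, so sub 1 tok/s in practice.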
@SeaWolf-AI you have a nicely structured approach to benchmarking, covering different kinds of variables and metrics, but honestly a lot of the info in this is flawed. Also, the Qwen3.5 models underperforming the Qwen3 ones is unexpected; are you sure you used the recommended generation parameters for each model? Slight variations can lead to totally different outputs, especially on the metrics you're looking at.
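For example, this is what I mean by pinning each model's params explicitly instead of trusting framework defaults (a minimal sketch assuming vLLM; the sampling values and the Qwen3.5 repo id are placeholders, take the real ones from each model card):

```python
# Sketch: give every benchmarked model its own recommended sampling params.
# Values and the Qwen3.5 repo id below are illustrative placeholders.
from vllm import LLM, SamplingParams

RECOMMENDED = {
    "Qwen/Qwen3-8B": SamplingParams(temperature=0.6, top_p=0.95, top_k=20, max_tokens=1024),
    "Qwen/Qwen3.5-8B": SamplingParams(temperature=0.7, top_p=0.8, top_k=20, max_tokens=1024),  # placeholder id
}

for model_name, params in RECOMMENDED.items():
    # In practice run each model in its own process so GPU memory is freed between runs.
    llm = LLM(model=model_name)
    outputs = llm.generate(["Benchmark prompt here"], params)
    print(model_name, outputs[0].outputs[0].text[:200])
```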
FP8 Version for running on vLLM with hardware optimizations from Ada+ generation GPUs
gpt-oss-20b on 1.5GB RAM? Which inference framework are you using for that? llama.cpp?
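If it is llama.cpp, my guess is the 1.5GB figure is just resident memory, with the weights mmap'd and paged in from disk on demand. A minimal sketch of that setup, assuming the llama-cpp-python bindings and a placeholder GGUF filename:

```python
# Sketch (assuming llama-cpp-python): with use_mmap=True the weights are
# memory-mapped from disk rather than fully loaded, so resident RAM can look
# far smaller than the model size while the OS page cache does the work.
from llama_cpp import Llama

llm = Llama(
    model_path="gpt-oss-20b-Q4_K_M.gguf",  # placeholder GGUF filename
    n_ctx=2048,
    n_gpu_layers=0,    # CPU-only, like a Pi
    use_mmap=True,     # map weights from disk, pages loaded on demand
    use_mlock=False,   # don't pin pages, so RSS stays low (and speed suffers)
)
out = llm("Hello", max_tokens=16)
print(out["choices"][0]["text"])
```

RSS stays small that way, but every token still has to pull the active expert weights through the page cache, which brings us back to the SSD-bound speeds above.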