Any prediction for a model that fits in 24GB of VRAM??

#1
opened by Gold-B-Ai

A model that can run on 24GB of VRAM would be AWESOME!!!

Thanks Guys,
Brilliant WORK!!

Works quite well on an ancient Tesla P40 (24GB VRAM, CUDA compute capability 6.1); I'm seeing roughly 60-65 tok/s. It's a linear-ish architecture, so it doesn't bog down as context length grows the way classical transformers do.
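
If you want to sanity-check the flat-throughput claim yourself, here's a rough timing sketch. It assumes `model` and `tokenizer` are already loaded on the GPU (see the loading snippet further down), and the context lengths are arbitrary picks, not anything official:

```python
# Rough tokens/sec check at a few context lengths.
# Assumes `model` and `tokenizer` are already loaded on a CUDA device.
import time
import torch

for ctx in (512, 4096, 16384):
    # Random token ids are fine for a crude throughput measurement.
    ids = torch.randint(0, tokenizer.vocab_size, (1, ctx), device=model.device)
    torch.cuda.synchronize()
    t0 = time.time()
    model.generate(ids, max_new_tokens=128, do_sample=False)
    torch.cuda.synchronize()
    tps = 128 / (time.time() - t0)
    print(f"ctx={ctx:6d}  ~{tps:.1f} tok/s")
```

If generation speed stays roughly constant as `ctx` grows, that's consistent with the linear-ish scaling described above.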

Israel National NLP Program (org)

You can load it in FP4: ~16GB of VRAM.
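
For anyone who wants to try that, here's a minimal sketch of FP4 loading with transformers + bitsandbytes. `MODEL_ID` is a placeholder (substitute the actual repo id), and exact memory use will vary with the model and context:

```python
# Minimal 4-bit (FP4) loading sketch using transformers + bitsandbytes.
# Requires: pip install transformers accelerate bitsandbytes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "org/model-name"  # placeholder, use the real repo id

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",            # FP4 as mentioned above; "nf4" is the common alternative
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16 for speed
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place the weights on the GPU
)
```

With 4-bit weights the model should sit comfortably inside a 24GB card, leaving headroom for the KV/state cache.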
