Any prediction on a model that fits in 24GB of VRAM??
#1
by Gold-B-Ai - opened
A model that can run on 24GB of VRAM would be AWESOME!!!
Thanks Guys,
Brilliant WORK!!
Works quite well on an ancient Tesla P40 (I believe 60-65 tps; 24GB VRAM, CUDA compute capability 6.1). It's a linear-ish architecture, so it doesn't bog down as context length grows the way classical transformers do.
You can load it in fp4: ~16GB of VRAM.
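As a rough sanity check on that ~16GB figure: weight memory is roughly params × bits/8, so ~16GB at 4 bits suggests a model on the order of 32B parameters (the exact count is an assumption here, not stated in the thread). A minimal back-of-envelope sketch:

```python
def quantized_vram_gb(n_params: float, bits: int, overhead_gb: float = 1.0) -> float:
    """Rough weight-memory estimate: params * bits / 8 bytes,
    plus a small cushion for KV cache / activations (assumed, not measured)."""
    return n_params * bits / 8 / 1e9 + overhead_gb

# Assuming ~32B parameters for illustration:
print(quantized_vram_gb(32e9, 16))  # fp16: ~65 GB, far beyond a 24GB card
print(quantized_vram_gb(32e9, 4))   # fp4:  ~17 GB, fits in 24GB with room for context
```

In practice, fp4 loading is exposed by bitsandbytes through transformers, e.g. `BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="fp4")` passed as `quantization_config` to `from_pretrained`.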