ElliotGao
tclf90
AI & ML interests
None yet
Recent Activity
new activity 10 days ago
QuantTrio/GLM-4.7-AWQ: Revert remplate
new activity 13 days ago
QuantTrio/GLM-4.7-AWQ: Support for structured output
Organizations
Revert remplate
#8 opened 12 days ago
by
s-yanev
Support for structured output
#7 opened 13 days ago
by
s-yanev
Fix chat_template crash when assistant message omits the `content` key
#5 opened 20 days ago
by
qgallouedec
Fix chat_template crash when assistant message omits the `content` key
#4 opened 20 days ago
by
qgallouedec
Fix chat_template crash when assistant message omits the `content` key
#5 opened 20 days ago
by
qgallouedec
Any plans for a calibration-based AWQ build for better long-context stability?
2
#6 opened about 1 month ago
by
hyunw55
PPLX or KLD, or other benchmark
1
#4 opened about 1 month ago
by
HenkTenk
[Request] Great work! Do you have plans to also create GLM-5.1-AWQ?
🤗 1
10
#6 opened about 2 months ago
by
ag1988
CUDA version 13?
1
#1 opened about 2 months ago
by
pathosethoslogos
Request for AWQ of the Gemma 4 26B A4B MoE
6
#1 opened about 2 months ago
by
rks2302
AWQ 4/5/6-bit request for Qwopus3.5-27B-v3
🚀❤️ 3
3
#2 opened about 2 months ago
by
celikburak
AWQ 4-bit version of this Opus-Distilled-v2 model?
9
#5 opened about 2 months ago
by
celikburak
--max-model-len 32768 seems a bit too small for agent use cases?
3
#3 opened 3 months ago
by
edwarddukewu
Install & run QuantTrio/MiniMax-M2-AWQ easily using llmpm
👍 1
1
#8 opened 3 months ago
by
sarthak-saxena
My personal vLLM launch cmd on my old personal 2x3090 workstation
7
#1 opened 3 months ago
by
tclf90
Can't get vLLM running on 1xRTX 4090
3
#1 opened 3 months ago
by
slyfox1186
Easy to fall into infinite loop
👍 1
7
#2 opened 3 months ago
by
dwaynedu
GLM-5-AWQ vLLM Deployment Guide
👍 1
2
#2 opened 3 months ago
by
CharlesChen2023
Great work
5
#1 opened 3 months ago
by
JoeyHwong
How to run this model on SGLang?
1
#2 opened 3 months ago
by
Salvadori