ElliotGao

tclf90

AI & ML interests

None yet

Recent Activity

new activity 10 days ago

QuantTrio/GLM-4.7-AWQ:Revert template

new activity 13 days ago

QuantTrio/GLM-4.7-AWQ:Support for structured output

new activity 20 days ago

QuantTrio/DeepSeek-V3.1-AWQ-Lite:Fix chat_template crash when assistant message omits the `content` key

Organizations

New activity in QuantTrio/GLM-4.7-AWQ 10 days ago

Revert template

#8 opened 12 days ago by

s-yanev

New activity in QuantTrio/GLM-4.7-AWQ 13 days ago

Support for structured output

#7 opened 13 days ago by

s-yanev

New activity in QuantTrio/DeepSeek-V3.1-AWQ-Lite 20 days ago

Fix chat_template crash when assistant message omits the `content` key

#5 opened 20 days ago by

qgallouedec

New activity in QuantTrio/DeepSeek-V3.2-Exp-AWQ 20 days ago

Fix chat_template crash when assistant message omits the `content` key

#4 opened 20 days ago by

qgallouedec

New activity in QuantTrio/DeepSeek-V3.1-AWQ 20 days ago

Fix chat_template crash when assistant message omits the `content` key

#5 opened 20 days ago by

qgallouedec

New activity in QuantTrio/Qwen3.6-35B-A3B-AWQ about 1 month ago

Any plans for a calibration-based AWQ build for better long-context stability?

#6 opened about 1 month ago by

hyunw55

PPLX or KLD, or other benchmark

#4 opened about 1 month ago by

HenkTenk

New activity in QuantTrio/GLM-5-AWQ about 2 months ago

[Request] Great work! Do you have plans to also create GLM-5.1-AWQ?

🤗 1

#6 opened about 2 months ago by

ag1988

New activity in QuantTrio/Qwen3.5-122B-A10B-AWQ about 2 months ago

CUDA version 13?

#1 opened about 2 months ago by

pathosethoslogos

New activity in QuantTrio/gemma-4-31B-it-AWQ about 2 months ago

Request for an AWQ of the Gemma 4 26B A4B MoE

#1 opened about 2 months ago by

rks2302

New activity in QuantTrio/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-AWQ about 2 months ago

AWQ 4/5/6-bit request for Qwopus3.5-27B-v3

🚀❤️ 3

#2 opened about 2 months ago by

celikburak

New activity in QuantTrio/Qwen3.5-27B-AWQ about 2 months ago

AWQ 4-bit version of this Opus-Distilled-v2 model?

#5 opened about 2 months ago by

celikburak

New activity in QuantTrio/Qwen3.5-27B-AWQ 3 months ago

--max-model-len 32768 seems a bit too small for agent use cases?

#3 opened 3 months ago by

edwarddukewu

New activity in QuantTrio/MiniMax-M2-AWQ 3 months ago

Install & run QuantTrio/MiniMax-M2-AWQ easily using llmpm

👍 1

#8 opened 3 months ago by

sarthak-saxena

New activity in QuantTrio/Qwen3.5-27B-AWQ 3 months ago

My personal vLLM launch cmd on my old personal 2x3090 workstation

#1 opened 3 months ago by

tclf90

New activity in QuantTrio/Qwen3.5-35B-A3B-AWQ 3 months ago

Can't get vLLM running on 1xRTX 4090

#1 opened 3 months ago by

slyfox1186

New activity in cyankiwi/Qwen3.5-27B-AWQ-4bit 3 months ago

Easy to fall into infinite loop

👍 1

#2 opened 3 months ago by

dwaynedu

New activity in QuantTrio/GLM-5-AWQ 3 months ago

GLM-5-AWQ vLLM Deployment Guide

👍 1

#2 opened 3 months ago by

CharlesChen2023

Great work

#1 opened 3 months ago by

JoeyHwong

New activity in QuantTrio/Qwen3.5-35B-A3B-AWQ 3 months ago

How to run this model on SGLang?

#2 opened 3 months ago by

Salvadori
