MONEY MAKING
#52 opened about 6 hours ago by Andyhabu
Why is the design such that enabling thinking must be specified by the user instead of letting the model decide on its own?
#50 opened 1 day ago by fudayuan
Long-context reasoning scores badly?
2
#49 opened 1 day ago by sebastienbo
Endless repetition? Has anyone encountered this?
3
#48 opened 2 days ago by evilperson068
Inference much slower compared to other A3B models
👀 👍 3 · 2
#47 opened 2 days ago by engrtipusultan
Possible to run this in 8GB VRAM + 48GB RAM?
3
#46 opened 2 days ago by krigeta
Excellent model - short feedback
2
#44 opened 3 days ago by Dampfinchen
Thank you Z.AI, I love this model! ❤
👀 ❤️ 4 · 3
#43 opened 3 days ago by MrDevolver
vLLM NVFP4 problem
#41 opened 3 days ago by prudant
Model breaks apart when used with different languages
2
#38 opened 4 days ago by nephepritou
Number of layers: 47 or 48?
2
#37 opened 4 days ago by jKqfO84n
Amazing! Look what this local AI generated in 5 minutes.
👀 🤯 4 · 5
#36 opened 4 days ago by robert1968
Problems with logical reasoning performance of GLM-4.7-Flash
1
#35 opened 4 days ago by sszymczyk
There is no module or parameter named 'model.layers.5.mlp.gate.e_score_correction_bias' in TransformersMoEForCausalLM
➕ 6 · 1
#34 opened 5 days ago by divinefeng
Open-source the Tau^2 benchmark codebase?
#33 opened 5 days ago by howtain
Please consider making it available through your official chat website. ❤
#32 opened 5 days ago by MrDevolver
Do you have plans to create a dense coding-specific model?
2
#31 opened 5 days ago by hanzceo
config.json - "scoring_func": "sigmoid"
👍 1
#28 opened 6 days ago by algorithm
Question about model usage in Turkish
#27 opened 6 days ago by 0xStego
An UNEXPECTED warning appears
#26 opened 6 days ago by shanlinguoke
glm4-moe-lite is not supported
2
#25 opened 6 days ago by cppowboy
cannot import name 'AutoModelForVision2Seq' from 'transformers'
#24 opened 6 days ago by marsmc
Problem with model
7
#22 opened 6 days ago by dwojcik
Why does the KV cache occupy so much GPU memory?
13
#21 opened 7 days ago by yyg201708
Excellent version
🔥 5 · 5
#19 opened 7 days ago by luxiangyu
Cannot run vLLM on DGX Spark: ImportError: libcudart.so.12
3
#18 opened 7 days ago by yyg201708
I hope GLM can release version 4.6 Air with Chinese thought processes, as version 4.7's thinking seems to be written entirely in English. Alternatively, I'd like you to release version 4.8 Air directly.
😎 🤗 5
#15 opened 7 days ago by mimeng1990
Installation Video and Testing - Step by Step
👍 1
#13 opened 7 days ago by fahdmirzac
llama.cpp inference - 20 times (!) slower than OSS 20 on an RTX 5090
➕ 1 · 9
#12 opened 7 days ago by cmp-nct
Thank you!
🔥 16
#4 opened 7 days ago by mav23
Enormous KV-cache size?
👍 ➕ 6 · 23
#3 opened 7 days ago by nephepritou
Base model
🔥 7 · 2
#2 opened 7 days ago by tcpmux
Performance Discussion
👀 2 · 3
#1 opened 7 days ago by IndenScale