MONEY MAKING
#52 opened about 6 hours ago by Andyhabu
Why is the design such that enabling thinking must be specified by the user instead of letting the model decide on its own?
#50 opened 1 day ago by fudayuan
Long-context reasoning scores badly?
2
#49 opened 1 day ago by sebastienbo
Endless repetition? Has anyone encountered this?
3
#48 opened 2 days ago by evilperson068
Inference much slower compared to other A3B models
👀 👍 3 · 2
#47 opened 2 days ago by engrtipusultan
Possible to run this in 8GB VRAM + 48GB RAM?
3
#46 opened 2 days ago by krigeta
Excellent model - short feedback
2
#44 opened 3 days ago by Dampfinchen
Thank you Z.AI, I love this model! ❤
👀 ❤️ 4 · 3
#43 opened 3 days ago by MrDevolver
vLLM NVFP4 problem
#41 opened 3 days ago by prudant
Model breaks apart when used with different languages
2
#38 opened 4 days ago by nephepritou
Number of layers: 47 or 48?
2
#37 opened 4 days ago by jKqfO84n
Amazing! Look what this local AI generated in 5 minutes.
👀 🤯 4 · 5
#36 opened 4 days ago by robert1968
Problems with logical reasoning performance of GLM-4.7-Flash
1
#35 opened 4 days ago by sszymczyk
There is no module or parameter named 'model.layers.5.mlp.gate.e_score_correction_bias' in TransformersMoEForCausalLM
➕ 6 · 1
#34 opened 5 days ago by divinefeng
Open-source the Tau^2 benchmark codebase?
#33 opened 5 days ago by howtain
Please consider making it available through your official chat website. ❤
#32 opened 5 days ago by MrDevolver
Do you have plans to create a dense coding-specific model?
2
#31 opened 5 days ago by hanzceo
config.json - "scoring_func": "sigmoid"
👍 1
#28 opened 6 days ago by algorithm
Question about model usage in Turkish
#27 opened 6 days ago by 0xStego
An UNEXPECTED warning appears
#26 opened 6 days ago by shanlinguoke
glm4-moe-lite is not supported
2
#25 opened 6 days ago by cppowboy
cannot import name 'AutoModelForVision2Seq' from 'transformers'
#24 opened 6 days ago by marsmc
Problem with model
7
#22 opened 6 days ago by dwojcik
Why does the KV cache occupy so much GPU memory?
13
#21 opened 7 days ago by yyg201708
Excellent version
🔥 5 · 5
#19 opened 7 days ago by luxiangyu
Cannot run vLLM on DGX Spark: ImportError: libcudart.so.12
3
#18 opened 7 days ago by yyg201708
I hope GLM can release version 4.6 Air with Chinese thought processes, as version 4.7's thinking seems to be written entirely in English. Alternatively, I'd like you to release version 4.8 Air directly.
😎 🤗 5
#15 opened 7 days ago by mimeng1990
Installation Video and Testing - Step by Step
👍 1
#13 opened 7 days ago by fahdmirzac
llama.cpp inference - 20 times (!) slower than OSS 20 on an RTX 5090
➕ 1 · 9
#12 opened 7 days ago by cmp-nct
Thank you!
🔥 16
#4 opened 7 days ago by mav23
Enormous KV-cache size?
👍 ➕ 6 · 23
#3 opened 7 days ago by nephepritou
Base model
🔥 7 · 2
#2 opened 7 days ago by tcpmux
Performance Discussion
👀 2 · 3
#1 opened 7 days ago by IndenScale