Qwen3-Coder-Next 56B REAP - GGUF

Quantized GGUF versions of 0xSero/qwen3-coder-next-56b-REAP, generated with llama-quantize from llama.cpp (build b8740) using its default settings.
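
For anyone who wants to reproduce a quantization, here is a minimal sketch of how the llama-quantize calls could be assembled from Python. It relies on the tool's usual positional form (input file, output file, type). The BF16 source file name and the binary name on PATH are assumptions for illustration, not the exact commands used to build these files.

```python
import shlex
from pathlib import Path

# Assumed local names: adjust both to match your setup.
SOURCE_GGUF = Path("qwen3-coder-next-56b-REAP-BF16.gguf")
QUANT_TYPES = ["Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]


def quantize_command(source: Path, quant_type: str,
                     binary: str = "llama-quantize") -> list:
    """Build the argument list for one llama-quantize run with default settings."""
    # Derive the output name by swapping the trailing "-BF16" tag for the target type.
    stem = source.stem
    if stem.endswith("-BF16"):
        stem = stem[: -len("-BF16")]
    output = source.with_name(f"{stem}-{quant_type}.gguf")
    return [binary, str(source), str(output), quant_type]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for quant_type in QUANT_TYPES:
        print(shlex.join(quantize_command(SOURCE_GGUF, quant_type)))
```

To actually run a command, pass the returned list to `subprocess.run(..., check=True)`.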

Quantizations provided

File	Quantization	Size
`qwen3-coder-next-56b-REAP-Q4_K_M.gguf`	Q4_K_M	34.4 GB
`qwen3-coder-next-56b-REAP-Q5_K_M.gguf`	Q5_K_M	40.3 GB
`qwen3-coder-next-56b-REAP-Q6_K.gguf`	Q6_K	46.5 GB
`qwen3-coder-next-56b-REAP-Q8_0.gguf`	Q8_0	60.2 GB
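
As a rough aid for choosing a file, the sizes above can be turned into approximate bits per weight and compared against a memory budget. This is a sketch under two assumptions: that the listed sizes are decimal gigabytes (1 GB = 1e9 bytes) and that the model has about 57B parameters, as the model page reports. The helper names are hypothetical, and the budget check ignores KV cache and runtime overhead.

```python
from typing import Optional

# File sizes from the table above, in GB (assumed decimal: 1 GB = 1e9 bytes).
SIZES_GB = {"Q4_K_M": 34.4, "Q5_K_M": 40.3, "Q6_K": 46.5, "Q8_0": 60.2}
PARAMS = 57e9  # assumed parameter count (rounded figure from the model page)


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Approximate average bits stored per parameter for a file of size_gb."""
    return size_gb * 1e9 * 8 / params


def largest_fitting(budget_gb: float) -> Optional[str]:
    """Largest quantization whose file alone fits in budget_gb.

    Ignores KV cache and runtime overhead, so leave headroom in practice.
    """
    fitting = [quant for quant, size in SIZES_GB.items() if size <= budget_gb]
    return max(fitting, key=SIZES_GB.get) if fitting else None


if __name__ == "__main__":
    for quant, size in SIZES_GB.items():
        print(f"{quant}: ~{bits_per_weight(size):.2f} bits per weight")
    print("Largest file under 48 GB:", largest_fitting(48.0))
```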

I measured perplexity with llama-perplexity on Salesforce's wikitext-2-raw-v1 dataset. Lower is better; the BF16 row is included for comparison.

File	Quantization	Context size	Perplexity (PPL)
`qwen3-coder-next-56b-REAP-Q4_K_M.gguf`	Q4_K_M	512	15.3702 +/- 0.13301
`qwen3-coder-next-56b-REAP-Q5_K_M.gguf`	Q5_K_M	512	15.2810 +/- 0.13196
`qwen3-coder-next-56b-REAP-Q6_K.gguf`	Q6_K	512	15.1305 +/- 0.13011
`qwen3-coder-next-56b-REAP-Q8_0.gguf`	Q8_0	512	15.1198 +/- 0.13009
`qwen3-coder-next-56b-REAP-BF16.gguf`	BF16	512	15.1274 +/- 0.13022
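
To put these numbers in perspective, the sketch below computes each quantization's perplexity change relative to BF16 and checks whether the gap is smaller than the BF16 run's reported uncertainty. The values are copied from the table; the function names are hypothetical. Because every run scores the same text, the error bars are correlated, so treat the second check as a rough indicator and not a significance test.

```python
# (perplexity, reported +/- uncertainty) copied from the table above.
PPL = {
    "Q4_K_M": (15.3702, 0.13301),
    "Q5_K_M": (15.2810, 0.13196),
    "Q6_K": (15.1305, 0.13011),
    "Q8_0": (15.1198, 0.13009),
    "BF16": (15.1274, 0.13022),
}


def relative_change_pct(quant: str, baseline: str = "BF16") -> float:
    """Perplexity change of quant versus baseline, in percent (positive = worse)."""
    base = PPL[baseline][0]
    return (PPL[quant][0] - base) / base * 100


def within_baseline_error(quant: str, baseline: str = "BF16") -> bool:
    """True if the gap to the baseline is no larger than the baseline's reported uncertainty."""
    return abs(PPL[quant][0] - PPL[baseline][0]) <= PPL[baseline][1]


if __name__ == "__main__":
    for quant in ("Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"):
        print(f"{quant}: {relative_change_pct(quant):+.2f}% vs BF16, "
              f"within BF16 error: {within_baseline_error(quant)}")
```

By this measure, Q6_K and Q8_0 land within about 0.1% of BF16, while Q4_K_M is roughly 1.6% higher.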

Model size: 57B params

Architecture: qwen3next
