spicyneuron commited on
Commit
c9de3d3
·
verified ·
1 Parent(s): 88ad441

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -3
README.md CHANGED
@@ -10,16 +10,16 @@ tags:
10
 
11
  [Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next) optimized for MLX. Note: Uses MXFP4 for some module paths.
12
 
13
- **EDIT:** [v2](https://huggingface.co/spicyneuron/Qwen3-Next-Coder-MLX-mixed-4.5-bit/tree/v2) fixes some misassigned shared expert gates. Slower, but with 4x better perplexity.
14
 
15
- **EDIT:** [v3](https://huggingface.co/spicyneuron/Qwen3-Next-Coder-MLX-mixed-4.5-bit/tree/v3) bumps edge experts to Q8 for further perplexity improvement and minimal effect on speed.
16
 
17
  # Usage
18
 
19
  ```sh
20
  # Start server at http://localhost:8080/v1/chat/completions
21
  uvx --from mlx-lm mlx_lm.server --host 127.0.0.1 --port 8080 \
22
- --model spicyneuron/Qwen3-Next-Coder-MLX-mixed-4.5-bit
23
  ```
24
 
25
  # Methodology
 
10
 
11
  [Qwen3-Coder-Next](https://huggingface.co/Qwen/Qwen3-Coder-Next) optimized for MLX. Note: Uses MXFP4 for some module paths.
12
 
13
+ **EDIT:** [v2](https://huggingface.co/spicyneuron/Qwen3-Next-Coder-MLX-4.5bit/tree/v2) fixes some misassigned shared expert gates. Slower, but with 4x better perplexity.
14
 
15
+ **EDIT:** [v3](https://huggingface.co/spicyneuron/Qwen3-Next-Coder-MLX-4.5bit/tree/v3) bumps edge experts to Q8 for further perplexity improvement and minimal effect on speed.
16
 
17
  # Usage
18
 
19
  ```sh
20
  # Start server at http://localhost:8080/v1/chat/completions
21
  uvx --from mlx-lm mlx_lm.server --host 127.0.0.1 --port 8080 \
22
+ --model spicyneuron/Qwen3-Next-Coder-MLX-4.5bit
23
  ```
24
 
25
  # Methodology