# QMD Query Expansion 4B (GGUF)
GGUF conversion of the QMD Query Expansion model for use with Ollama, llama.cpp, and LM Studio.
## Model Details
- Base Model: Qwen/Qwen3-4B
- SFT Adapter: tobil/qmd-query-expansion-4B-sft
- GRPO Adapter: tobil/qmd-query-expansion-4B-grpo
- Task: Query expansion for hybrid search (lex/vec/hyde format)
## Available Quantizations
| File | Quant | Description |
|---|---|---|
| qmd-query-expansion-4B-f16.gguf | F16 | Full precision |
| qmd-query-expansion-4B-q8_0.gguf | Q8_0 | 8-bit |
| qmd-query-expansion-4B-q5_k_m.gguf | Q5_K_M | 5-bit medium |
| qmd-query-expansion-4B-q4_k_m.gguf | Q4_K_M | 4-bit medium (recommended) |
## Usage

### With Ollama
```bash
# Download
huggingface-cli download tobil/qmd-query-expansion-4B-gguf qmd-query-expansion-4B-q4_k_m.gguf --local-dir .

# Create Modelfile
echo 'FROM ./qmd-query-expansion-4B-q4_k_m.gguf' > Modelfile

# Create and run
ollama create qmd-expand-4b -f Modelfile
ollama run qmd-expand-4b
```
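### With llama.cpp

The same GGUF file works with llama.cpp directly. A minimal sketch of a one-shot invocation follows; flag names vary between llama.cpp builds (`llama-cli` was formerly `main`, and `-no-cnv` only exists on newer versions), so check `llama-cli --help` for yours:

```bash
# One-shot expansion with llama-cli.
# -e expands the \n escapes in the prompt string; -n caps generated tokens.
# -no-cnv disables interactive chat mode on newer builds; drop it on older ones.
llama-cli -m qmd-query-expansion-4B-q4_k_m.gguf -e -n 256 -no-cnv \
  -p "<|im_start|>user\n/no_think Expand this search query: vector databases<|im_end|>\n<|im_start|>assistant\n"
```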
## Prompt Format

Use the Qwen3 chat format with `/no_think`:
```
<|im_start|>user
/no_think Expand this search query: your query here<|im_end|>
<|im_start|>assistant
```
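When calling the model through Ollama's HTTP API, you can send this template verbatim by enabling raw mode, which skips Ollama's own prompt templating. A minimal sketch, assuming Ollama is serving on its default port and the model was created as `qmd-expand-4b` above:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "qmd-expand-4b",
  "prompt": "<|im_start|>user\n/no_think Expand this search query: how to tune hybrid search<|im_end|>\n<|im_start|>assistant\n",
  "raw": true,
  "stream": false
}'
```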
## Expected Output
```
lex: keyword variation 1
lex: keyword variation 2
vec: natural language reformulation
hyde: Hypothetical document passage answering the query.
```
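Each prefix presumably corresponds to one retrieval channel of the hybrid search pipeline: `lex` lines feed lexical/keyword search, `vec` lines feed vector (embedding) search, and `hyde` supplies a hypothetical passage for HyDE-style retrieval. A quick way to split a saved response by channel, assuming the model output was captured to a hypothetical `expansion.txt`:

```bash
# Split an expansion by channel (expansion.txt is a placeholder filename)
grep '^lex:'  expansion.txt | sed 's/^lex: *//'   # keyword variants
grep '^vec:'  expansion.txt | sed 's/^vec: *//'   # embedding-search reformulations
grep '^hyde:' expansion.txt | sed 's/^hyde: *//'  # hypothetical passage
```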
## License
Apache 2.0 (inherited from Qwen3)