QMD Query Expansion 4B (GGUF)

GGUF conversion of the QMD Query Expansion model for use with Ollama, llama.cpp, and LM Studio.

Model Details

  • Base Model: Qwen/Qwen3-4B
  • SFT Adapter: tobil/qmd-query-expansion-4B-sft
  • GRPO Adapter: tobil/qmd-query-expansion-4B-grpo
  • Task: Query expansion for hybrid search (lex/vec/hyde format)

Available Quantizations

File                                 Quant    Description
qmd-query-expansion-4B-f16.gguf      F16      Full precision
qmd-query-expansion-4B-q8_0.gguf     Q8_0     8-bit
qmd-query-expansion-4B-q5_k_m.gguf   Q5_K_M   5-bit medium
qmd-query-expansion-4B-q4_k_m.gguf   Q4_K_M   4-bit medium (recommended)

Usage

With Ollama

# Download
huggingface-cli download tobil/qmd-query-expansion-4B-gguf qmd-query-expansion-4B-q4_k_m.gguf --local-dir .

# Create Modelfile
echo 'FROM ./qmd-query-expansion-4B-q4_k_m.gguf' > Modelfile

# Create and run
ollama create qmd-expand-4b -f Modelfile
ollama run qmd-expand-4b
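
With llama.cpp

A minimal sketch for running the same file directly with llama.cpp's llama-cli (flag names follow recent llama.cpp builds; -no-cnv disables interactive chat mode so the raw Qwen3 prompt, described under Prompt Format below, is sent as-is):

# Run one completion against the downloaded GGUF
./llama-cli -m qmd-query-expansion-4B-q4_k_m.gguf -no-cnv \
  -p "<|im_start|>user\n/no_think Expand this search query: your query here<|im_end|>\n<|im_start|>assistant\n" \
  -n 256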

Prompt Format

Use the Qwen3 chat format with /no_think to disable thinking mode:

<|im_start|>user
/no_think Expand this search query: your query here<|im_end|>
<|im_start|>assistant
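
For programmatic use, a sketch of the same prompt sent through Ollama's HTTP API (assumes the qmd-expand-4b model created above and Ollama's default port 11434; "raw": true bypasses any Modelfile template so the chat markers are passed through verbatim):

# Request a single non-streamed completion from the local Ollama server
curl http://localhost:11434/api/generate -d '{
  "model": "qmd-expand-4b",
  "raw": true,
  "stream": false,
  "prompt": "<|im_start|>user\n/no_think Expand this search query: your query here<|im_end|>\n<|im_start|>assistant\n"
}'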

Expected Output

lex: keyword variation 1
lex: keyword variation 2
vec: natural language reformulation
hyde: Hypothetical document passage answering the query.
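
A minimal sketch for splitting that output into per-backend query lists for a hybrid search pipeline (output.txt is a hypothetical file holding the model's response):

# lex -> keyword/BM25 queries, vec -> embedding queries, hyde -> hypothetical document
grep '^lex:'  output.txt | sed 's/^lex: *//'
grep '^vec:'  output.txt | sed 's/^vec: *//'
grep '^hyde:' output.txt | sed 's/^hyde: *//'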

License

Apache 2.0 (inherited from Qwen3)
