How to use IQuestLab/IQuest-Coder-V1-14B-Instruct with libraries, inference servers, and local apps.

Libraries

How to use IQuestLab/IQuest-Coder-V1-14B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="IQuestLab/IQuest-Coder-V1-14B-Instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("IQuestLab/IQuest-Coder-V1-14B-Instruct", trust_remote_code=True, dtype="auto")
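
The snippets above load the model but stop short of generating text with the sampling settings the model card suggests for Instruct models (Temperature=0.6, TopP=0.85, TopK=20). A minimal sketch of how those settings might be passed to `generate` follows. The helper name `generate_reply`, the `max_new_tokens` default, and the prompt in the usage note are assumptions made for illustration; they do not come from the model card.

```python
# Sampling settings the model card suggests for IQuest-Coder-V1-Instruct models.
INSTRUCT_SAMPLING = {"do_sample": True, "temperature": 0.6, "top_p": 0.85, "top_k": 20}

MODEL_ID = "IQuestLab/IQuest-Coder-V1-14B-Instruct"


def generate_reply(prompt, model_id=MODEL_ID, max_new_tokens=512):
    """Hypothetical helper: run one chat turn and return the decoded reply."""
    # Imported lazily so this module can be imported without loading transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, dtype="auto"
    )

    messages = [{"role": "user", "content": prompt}]
    # Render the chat template and tokenize in one step.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **INSTRUCT_SAMPLING
    )
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("Write a Python function that reverses a string.")` downloads the weights on first use, so it needs enough disk space and memory for a 14B model.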

vLLM

How to use IQuestLab/IQuest-Coder-V1-14B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "IQuestLab/IQuest-Coder-V1-14B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IQuestLab/IQuest-Coder-V1-14B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
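
The same request can be issued from Python. The sketch below mirrors the curl call using only the standard library; the function names `build_chat_request` and `chat` are hypothetical, and the code assumes the server is reachable at the default address used above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # address of the vLLM server started above
MODEL_ID = "IQuestLab/IQuest-Coder-V1-14B-Instruct"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) the same POST request as the curl command."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` returns the reply as a string. For a server on another port, such as the SGLang setup on port 30000, pass `base_url="http://localhost:30000/v1"`.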

Use Docker

docker model run hf.co/IQuestLab/IQuest-Coder-V1-14B-Instruct

SGLang

How to use IQuestLab/IQuest-Coder-V1-14B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "IQuestLab/IQuest-Coder-V1-14B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IQuestLab/IQuest-Coder-V1-14B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "IQuestLab/IQuest-Coder-V1-14B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IQuestLab/IQuest-Coder-V1-14B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use IQuestLab/IQuest-Coder-V1-14B-Instruct with Docker Model Runner:
```
docker model run hf.co/IQuestLab/IQuest-Coder-V1-14B-Instruct
```

zwpride-iquestlab committed on Mar 1

Commit d81efe3 (verified)

1 Parent(s): 25a9f79

Update README.md

Files changed (1)

README.md +18 -10

@@ -11,14 +11,16 @@ library_name: transformers
 ![Evaluation Results](./papers/iquest-coder-v1-logo.png)
 <p align="center">
-  📘 <a href="https://iquestlab.github.io">Blog</a >
   &nbsp;•&nbsp;
   📄 <a href="https://github.com/IQuestLab/IQuest-Coder-V1/blob/main/papers/IQuest_Coder_Technical_Report.pdf">Technical Report</a >
 </p >
-# IQuest-Coder-V1 Model Family
-🚀 **[[IQuest-Coder-V1 Update](https://iquestlab.github.io/release-1.0-2602/index.html)]**: Released 7B & 14B Family Models and 40B-Thinking, specially optimized for tool use, CLI agents (Like Claude Code and OpenCode) & HTML/SVG generation, all with 128K context, now on Hugging Face!
 ## 7B Models
@@ -47,10 +49,14 @@ library_name: transformers
 | IQuest-Coder-V1-40B-Instruct | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Instruct) |
 | IQuest-Coder-V1-40B-Loop-Instruct | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct) |
 | IQuest-Coder-V1-40B-Thinking | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Thinking) |
 ## Sampling Parameters:
 For the IQuest-Coder-V1-Instruct: We suggest using Temperature=0.6, TopP=0.85, TopK=20.
 ## IQuest-Coder-V1 Highlights
 IQuest-Coder-V1 is a new family of code large language models (LLMs) designed to advance autonomous software engineering and code intelligence. Built on the innovative code-flow multi-stage training paradigm, IQuest-Coder-V1 captures the dynamic evolution of software logic, delivering state-of-the-art performance across critical dimensions:
@@ -62,6 +68,8 @@ IQuest-Coder-V1 is a new family of code large language models (LLMs) designed to
 - **Native Long Context**: All models natively support up to 128K tokens without requiring additional scaling techniques.
 - **CLI Agent Integration**: Demonstrates initial deployment capabilities on ClaudeCode and OpenCode platforms, with the ability to integrate into CLI-based agent workflows.
 - **HTML and SVG Generation**: Features preliminary support for HTML and SVG code generation.
 ## Model Overview
@@ -155,13 +163,13 @@ For Thinking models with reasoning support:
 vllm serve IQuestLab/IQuest-Coder-V1-40B-Thinking --reasoning-parser qwen3 --tensor-parallel-size 8
 ```
-When using tool, `IQuest-Coder-V1-40B-Instruct` and `IQuest-Coder-V1-40B-Loop-Instruct` should use `--tool-parser qwen3`, while `IQuest-Coder-V1-7B-Instruct`, `IQuest-Coder-V1-7B-Thinking`, `IQuest-Coder-V1-14B-Instruct`, `IQuest-Coder-V1-14B-Thinking` and `IQuest-Coder-V1-40B-Thinking` should use `--tool-parser qwen3_coder`.
 ### CLI-Like Agents and Tools Usage
-CLI-like agent capabilities are available for the following models: `IQuest-Coder-V1-7B-Instruct`, `IQuest-Coder-V1-7B-Thinking`, `IQuest-Coder-V1-14B-Instruct`, `IQuest-Coder-V1-14B-Thinking` and `IQuest-Coder-V1-40B-Thinking`.
-**Step 1:**: Deploy the model with vLLM and set tool parser (**Attention: Do not set reasoning parser for Instruct LLMs, otherwise it will cause unexpected errors**):
 ```bash
 vllm serve IQuestLab/IQuest-Coder-V1-7B-Instruct --tool-parser qwen3_coder
@@ -173,7 +181,7 @@ or
 vllm serve IQuestLab/IQuest-Coder-V1-7B-Thinking --tool-parser qwen3_coder --reasoning-parser qwen3
 ```
-**Step 2:**: Use Claude Code to enjoy it:
 ```bash
 export ANTHROPIC_BASE_URL="http://iquestcoder.link"
@@ -182,10 +190,10 @@ claude --model IQuestCoder-V1-7B-Instruct
 ```
-## Evaluation Results
 ![Evaluation Results](./papers/results.png)
 ### Benchmark Parameters
@@ -197,7 +205,7 @@ claude --model IQuestCoder-V1-7B-Instruct
 | **BigCodeBench** | 0.0 | - |
 | **FullStackBench** | 0.0 | - |
 | **CruxEval** | 0.0 | - |
-| **LiveCodeBench** | 0.6 | 0.95 |
 | **Aider-Polyglot** | 0.95 | 0.85 |
 | **Mercury** | 0.2 | 0.85 |
 | **Bird** | 0.2 | 0.95 |

 ![Evaluation Results](./papers/iquest-coder-v1-logo.png)
 <p align="center">
+  📘 <a href="https://iquestlab.github.io">Blog (2026-01-01)</a >
+  &nbsp;•&nbsp;
+  📘 <a href="https://iquestlab.github.io">Blog (2026-03-02)</a >
   &nbsp;•&nbsp;
   📄 <a href="https://github.com/IQuestLab/IQuest-Coder-V1/blob/main/papers/IQuest_Coder_Technical_Report.pdf">Technical Report</a >
 </p >
+# IQuest-Coder-V1 Model Family Update
+🚀🚀🚀 [IQuest-Coder-V1 Model Family Update](https://iquestlab.github.io/release-1.0-2602/index.html): Released 7B & 14B Family Models, 40B-Thinking and 40B-Loop-Thinking, specially optimized for tool use, CLI agents (Like `Claude Code` and `OpenCode`) & HTML/SVG generation, all with 128K context, now on Hugging Face!
 ## 7B Models
 | IQuest-Coder-V1-40B-Instruct | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Instruct) |
 | IQuest-Coder-V1-40B-Loop-Instruct | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct) |
 | IQuest-Coder-V1-40B-Thinking | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Thinking) |
+| IQuest-Coder-V1-40B-Loop-Thinking | [🤗 Hugging Face](https://huggingface.co/IQuestLab/IQuest-Coder-V1-40B-Loop-Thinking) |
 ## Sampling Parameters:
 For the IQuest-Coder-V1-Instruct: We suggest using Temperature=0.6, TopP=0.85, TopK=20.
+For the IQuest-Coder-V1-Thinking: We suggest using Temperature=1.0, TopP=0.95, TopK=20.
 ## IQuest-Coder-V1 Highlights
 IQuest-Coder-V1 is a new family of code large language models (LLMs) designed to advance autonomous software engineering and code intelligence. Built on the innovative code-flow multi-stage training paradigm, IQuest-Coder-V1 captures the dynamic evolution of software logic, delivering state-of-the-art performance across critical dimensions:
 - **Native Long Context**: All models natively support up to 128K tokens without requiring additional scaling techniques.
 - **CLI Agent Integration**: Demonstrates initial deployment capabilities on ClaudeCode and OpenCode platforms, with the ability to integrate into CLI-based agent workflows.
 - **HTML and SVG Generation**: Features preliminary support for HTML and SVG code generation.
+- **Architectural Chain-of-Thought via Recurrent Depth**: 40B-Loop-Thinking is a research-oriented, experimental model prototype designed to explore how structural chains of thought and procedural chains of thought can be combined within a single system. The model uniquely integrates structural chains of thought—realized through loop-based computation enabled by the dual-iteration LoopCoder architecture—with procedural chains of thought derived from explicit reasoning trajectories trained via reinforcement learning. Unlike standard reasoning models that rely solely on token-level chain-of-thought expansion, Loop-Thinking introduces implicit multi-step computation at the architectural level through a looped Transformer design. In this design, the second iteration refines the hidden states produced by the first iteration using a global–local attention gating mechanism. This results in a nested reasoning mechanism: the loop structure supports iterative representation refinement, while the reasoning-oriented training paradigm injects explicit problem decomposition behavior. It is important to note that this model is not intended to achieve state-of-the-art performance across benchmarks, but rather to validate the complementary roles of loop-based computation and reasoning-oriented training in shaping reasoning structures, and to provide experimental evidence for future model design.
 ## Model Overview
 vllm serve IQuestLab/IQuest-Coder-V1-40B-Thinking --reasoning-parser qwen3 --tensor-parallel-size 8
 ```
+When using tool, `IQuest-Coder-V1-40B-Instruct` and `IQuest-Coder-V1-40B-Loop-Instruct` should use `--tool-parser qwen3`, while `IQuest-Coder-V1-7B-Instruct`, `IQuest-Coder-V1-7B-Thinking`, `IQuest-Coder-V1-14B-Instruct`, `IQuest-Coder-V1-14B-Thinking`, `IQuest-Coder-V1-40B-Thinking` and `IQuest-Coder-V1-40B-Loop-Thinking` should use `--tool-parser qwen3_coder`.
 ### CLI-Like Agents and Tools Usage
+CLI-like agent capabilities are available for the following models: `IQuest-Coder-V1-7B-Instruct`, `IQuest-Coder-V1-7B-Thinking`, `IQuest-Coder-V1-14B-Instruct`, `IQuest-Coder-V1-14B-Thinking`, `IQuest-Coder-V1-40B-Thinking` and `IQuest-Coder-V1-40B-Loop-Thinking`.
+**Step 1:** Deploy the model with vLLM and set tool parser (**Attention: Do not set reasoning parser for Instruct LLMs, otherwise it will cause unexpected errors**):
 ```bash
 vllm serve IQuestLab/IQuest-Coder-V1-7B-Instruct --tool-parser qwen3_coder
 vllm serve IQuestLab/IQuest-Coder-V1-7B-Thinking --tool-parser qwen3_coder --reasoning-parser qwen3
 ```
+**Step 2:** Use Claude Code to enjoy it:
 ```bash
 export ANTHROPIC_BASE_URL="http://iquestcoder.link"
 ```
+## Evaluation Results
+![Evaluation Results](./papers/results-20260302.png)
 ![Evaluation Results](./papers/results.png)
 ### Benchmark Parameters
 | **BigCodeBench** | 0.0 | - |
 | **FullStackBench** | 0.0 | - |
 | **CruxEval** | 0.0 | - |
+| **LiveCodeBench** | 1.0 | 1.0 |
 | **Aider-Polyglot** | 0.95 | 0.85 |
 | **Mercury** | 0.2 | 0.85 |
 | **Bird** | 0.2 | 0.95 |
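
The parser and sampling guidance in the updated README can be condensed into a small lookup. The sketch below is one way to do that; the function name `serving_profile` and the shape of the returned dictionary are assumptions for illustration. It returns only parser names, not command-line flags, so the exact flag spelling should still be checked against the installed vLLM version. It also assumes that every model whose name ends in `-Thinking` takes the `qwen3` reasoning parser, which the README shows explicitly only for the 7B and 40B Thinking models.

```python
# Tool parser assignment as listed in the updated README.
QWEN3_TOOL_PARSER_MODELS = {
    "IQuest-Coder-V1-40B-Instruct",
    "IQuest-Coder-V1-40B-Loop-Instruct",
}
QWEN3_CODER_TOOL_PARSER_MODELS = {
    "IQuest-Coder-V1-7B-Instruct",
    "IQuest-Coder-V1-7B-Thinking",
    "IQuest-Coder-V1-14B-Instruct",
    "IQuest-Coder-V1-14B-Thinking",
    "IQuest-Coder-V1-40B-Thinking",
    "IQuest-Coder-V1-40B-Loop-Thinking",
}

# Suggested sampling parameters from the README.
INSTRUCT_SAMPLING = {"temperature": 0.6, "top_p": 0.85, "top_k": 20}
THINKING_SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 20}


def serving_profile(model_id):
    """Hypothetical helper: map a model name to its parsers and sampling settings."""
    name = model_id.split("/")[-1]  # accept "IQuestLab/<name>" or a bare name
    if name in QWEN3_TOOL_PARSER_MODELS:
        tool_parser = "qwen3"
    elif name in QWEN3_CODER_TOOL_PARSER_MODELS:
        tool_parser = "qwen3_coder"
    else:
        raise ValueError(f"no tool parser guidance for {name!r}")

    thinking = name.endswith("-Thinking")
    return {
        "tool_parser": tool_parser,
        # The README warns against setting a reasoning parser for Instruct models.
        "reasoning_parser": "qwen3" if thinking else None,
        "sampling": dict(THINKING_SAMPLING if thinking else INSTRUCT_SAMPLING),
    }
```

For example, `serving_profile("IQuestLab/IQuest-Coder-V1-14B-Instruct")` selects the `qwen3_coder` tool parser, no reasoning parser, and the Instruct sampling settings.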