---
license: apache-2.0
base_model: Qwen/Qwen3-Reranker-4B
tags:
- code-search
- reranker
- code-retrieval
- peft
- lora
language:
- en
- code
datasets:
- hq-bench/coreb
pipeline_tag: text-classification
library_name: transformers
---

[Project Page](https://hq-bench.github.io/coreb-page/)
[Paper](https://arxiv.org/abs/2605.04615)
[Dataset](https://huggingface.co/datasets/hq-bench/coreb)
[Code](https://github.com/hq-bench/coreb)

# CoREB-Reranker

**CoREB-Reranker** is a code reranker fine-tuned from [Qwen3-Reranker-4B](https://huggingface.co/Qwen/Qwen3-Reranker-4B) via LoRA on a mixed reranker corpus. It is the **only reranker we evaluate that achieves consistent gains across all three code search tasks** (text-to-code, code-to-text, and code-to-code).

## Highlights

- Fine-tuned from Qwen3-Reranker-4B using LoRA (rank=16, alpha=16) on **3.1M training samples** from a mixed corpus
- Evaluated on CoREB v202603 (problem-disjoint from training set, no data leakage)
- Achieves **positive reranking delta on all three tasks**, unlike all off-the-shelf rerankers tested

## Reranking Results (nDCG@10 Delta %)

Reranking delta on CoREB v202603, using C2LLM-7B as the first-stage retriever:

| Reranker | Text-to-Code | Code-to-Text | Code-to-Code |
|----------|:---:|:---:|:---:|
| Jina Reranker v2 | -8.3 | -22.4 | -8.8 |
| Jina Reranker v3 | -2.2 | -5.0 | -0.1 |
| Qwen3-Reranker-0.6B | -0.6 | -8.2 | -2.3 |
| Qwen3-Reranker-4B | -0.1 | -3.2 | +3.3 |
| **CoREB-Reranker (ours)** | **+1.1** | **+0.8** | **+5.1** |

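
The deltas above compare nDCG@10 before and after reranking. The following is a minimal sketch of that comparison, not the CoREB evaluation code. It assumes binary relevance labels, assumes every relevant document is in the candidate list, and assumes the delta is a relative change in percent; this card does not spell out the exact formula. The helper names `dcg_at_k`, `ndcg_at_k`, and `delta_percent` are hypothetical.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k relevance labels (in ranked order)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """nDCG@k: DCG of the ranking divided by the DCG of the ideal ordering.

    The ideal ordering is built from the same list, so this assumes all
    relevant documents are present among the candidates.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def delta_percent(before, after):
    """Relative change in percent (an assumed definition, not confirmed by this card)."""
    return 100.0 * (after - before) / before


# Toy ranking: the single relevant document moves from rank 3 to rank 1.
first_stage = [0, 0, 1, 0, 0]
reranked = [1, 0, 0, 0, 0]
before = ndcg_at_k(first_stage)
after = ndcg_at_k(reranked)
print(before, after, delta_percent(before, after))
```

A negative delta under this definition means the reranker lowered nDCG@10 relative to the first-stage ordering.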
## Training Details

- **Base model**: [Qwen/Qwen3-Reranker-4B](https://huggingface.co/Qwen/Qwen3-Reranker-4B)
- **Method**: LoRA (rank=16, alpha=16, dropout=0.05)
- **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Training data**: A mixed reranker corpus consisting of [CoREB v202602](https://huggingface.co/datasets/hq-bench/coreb), [CodeSearchNet](https://github.com/github/CodeSearchNet) (code-to-code, code-to-text, text-to-code), [APPS](https://github.com/hendrycks/apps), [CosQA](https://github.com/Jun-jie-Huang/CosQA), and [CodeFeedback](https://github.com/OpenCodeInterpreter/OpenCodeInterpreter) (single-turn and multi-turn). Each record is normalized into binary reranking examples (instruction, query, document, yes/no). Positives are duplicated twice; one easy negative and one hard negative are sampled per record.
- **Evaluation data**: CoREB v202603 (problem-disjoint from CoREB v202602 training split; covers a different contest time window)
- **Training samples**: ~3.1M binary reranking examples across text-to-code, code-to-text, and code-to-code tasks
- **Top-k retrieval for reranking**: 128 (the reranker rescores the top 128 candidates returned by the first-stage retriever)

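
The normalization described under training data can be sketched as follows. This is an illustration only, not the authors' preprocessing code. The record layout (`query`, `positive`, `hard_negatives` keys), the choice of a random corpus document as the easy negative, and the reading of "duplicated twice" as two copies of each positive are all assumptions. `build_examples` is a hypothetical name.

```python
import random


def build_examples(record, corpus, instruction, rng):
    """Turn one record into binary reranking examples (instruction, query, document, yes/no).

    Assumed for illustration: `record` has "query", "positive", and
    "hard_negatives" keys; an easy negative is a random unrelated document
    drawn from `corpus`.
    """
    def example(document, label):
        return {"instruction": instruction, "query": record["query"],
                "document": document, "label": label}

    # Assumed reading of "duplicated twice": the positive appears two times.
    examples = [example(record["positive"], "yes") for _ in range(2)]

    # One hard negative sampled from the record's own hard-negative pool.
    examples.append(example(rng.choice(record["hard_negatives"]), "no"))

    # One easy negative: any corpus document that is neither the positive nor a hard negative.
    excluded = {record["positive"], *record["hard_negatives"]}
    easy_pool = [doc for doc in corpus if doc not in excluded]
    examples.append(example(rng.choice(easy_pool), "no"))
    return examples


record = {
    "query": "binary search implementation",
    "positive": "def binary_search(arr, target): ...",
    "hard_negatives": ["def linear_search(arr, target): ..."],
}
corpus = [record["positive"], *record["hard_negatives"], "def parse_json(text): ..."]
instruction = "Given a natural language programming task, retrieve code that correctly solves or implements the task."

examples = build_examples(record, corpus, instruction, random.Random(0))
for ex in examples:
    print(ex["label"], ex["document"])
```

Under these assumptions each record yields four examples: two labeled yes and two labeled no.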
## Usage

CoREB-Reranker follows the same usage pattern as Qwen3-Reranker. The instruction is **task-specific**: use the one that matches your retrieval task.

```python
from enum import Enum

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


class Task(Enum):
    """Task-specific instructions passed to the reranker."""
    TEXT_TO_CODE = "Given a natural language programming task, retrieve code that correctly solves or implements the task."
    CODE_TO_CODE = "Given a code snippet, retrieve code that is semantically equivalent or solves the same task."
    CODE_TO_TEXT = "Given a code snippet, retrieve the natural language description or problem statement that best matches the code."


model_id = "hq-bench/coreb-code-reranker"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, trust_remote_code=True)
model.eval()

# Chat-template wrapper placed around each (instruction, query, document) triple.
PREFIX = '<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".<|im_end|>\n<|im_start|>user\n'
SUFFIX = "<|im_end|>\n<|im_start|>assistant\n"

# The relevance score is read from the next-token logits of "yes" and "no".
yes_id = tokenizer.convert_tokens_to_ids("yes")
no_id = tokenizer.convert_tokens_to_ids("no")


def score(query: str, document: str, task: Task) -> float:
    """Return the yes-minus-no logit difference; higher means more relevant."""
    prompt = f"{PREFIX}<Instruct>: {task.value}\n<Query>: {query}\n<Document>: {document}{SUFFIX}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=4096).to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1, :]
    return (logits[yes_id] - logits[no_id]).item()


# Text-to-Code: natural language query -> code
print(score(
    query="binary search implementation",
    document="def binary_search(arr, target):\n    lo, hi = 0, len(arr) - 1\n    ...",
    task=Task.TEXT_TO_CODE,
))

# Code-to-Code: code -> semantically equivalent code
print(score(
    query="def binary_search(arr, target): ...",
    document="int binarySearch(int[] arr, int target) { ... }",
    task=Task.CODE_TO_CODE,
))

# Code-to-Text: code -> problem description
print(score(
    query="def binary_search(arr, target): ...",
    document="Find the index of a target value in a sorted array using binary search.",
    task=Task.CODE_TO_TEXT,
))
```
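
The `score` function above returns a raw logit difference, which is unbounded. If a value in [0, 1] is more convenient, one option is to apply a sigmoid, which is mathematically the same as a two-way softmax over the yes and no logits. The sketch below only illustrates that identity with plain floats; `yes_probability` and `softmax_yes` are hypothetical helper names, not part of the released code.

```python
import math


def yes_probability(logit_diff: float) -> float:
    """Sigmoid of (yes_logit - no_logit), written to avoid overflow for large magnitudes."""
    if logit_diff >= 0:
        return 1.0 / (1.0 + math.exp(-logit_diff))
    e = math.exp(logit_diff)
    return e / (1.0 + e)


def softmax_yes(yes_logit: float, no_logit: float) -> float:
    """P(yes) under a softmax restricted to the yes/no logits."""
    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
    ey, en = math.exp(yes_logit - m), math.exp(no_logit - m)
    return ey / (ey + en)


# Both formulations give the same probability for the same pair of logits.
p_sigmoid = yes_probability(2.0 - (-1.0))
p_softmax = softmax_yes(2.0, -1.0)
print(p_sigmoid, p_softmax)
```

Because the sigmoid is monotonic, ranking by probability gives the same order as ranking by the raw logit difference.
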
For batch reranking with the CoREB evaluation pipeline, see the [CoREB repository](https://github.com/hq-bench/coreb).
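
For small experiments outside that pipeline, reranking amounts to scoring each first-stage candidate and sorting by score. The sketch below is a minimal version under stated assumptions, not the CoREB pipeline. `rerank` and `toy_score` are hypothetical names, and `toy_score` is a stand-in token-overlap scorer used only so the example runs without loading the model.

```python
from typing import Callable, List, Tuple


def rerank(query: str, candidates: List[str],
           score_fn: Callable[[str, str], float],
           top_k: int = 128) -> List[Tuple[str, float]]:
    """Score the first top_k candidates and return them sorted by descending score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates[:top_k]]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(query: str, document: str) -> float:
    """Stand-in scorer for illustration only: number of shared lowercase tokens."""
    return float(len(set(query.lower().split()) & set(document.lower().split())))


candidates = ["sort a list", "binary search in sorted array", "search tree"]
ranked = rerank("binary search", candidates, toy_score)
for doc, s in ranked:
    print(s, doc)
```

With the model loaded as in the usage example, the scorer could be passed as `lambda q, d: score(q, d, Task.TEXT_TO_CODE)`. The default `top_k=128` mirrors the reranking depth listed under Training Details.
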
## Citation

```bibtex
@article{xue2026coreb,
  title={Beyond Retrieval: A Multitask Benchmark and Reranker for Code Search},
  author={Xue, Siqiao and Liao, Zihan and Qin, Jin and Zhang, Ziyin and Mu, Yixiang and Zhou, Fan and Yu, Hang},
  journal={arXiv preprint arXiv:2605.04615},
  year={2026},
  url={https://arxiv.org/abs/2605.04615}
}
```