---
license: mit
language:
- en
- code
tags:
- code-search
- embeddings
- onnx
- sentence-similarity
- cqs
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed (ONNX export)

ONNX export of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) — a 137M-parameter code search embedder built on `Snowflake/snowflake-arctic-embed-m-long`. Exported for use with [cqs](https://github.com/jamie8johnson/cqs)'s ONNX Runtime embedding pipeline; no PyTorch dependency required.

This is a faithful conversion of the upstream weights — no fine-tuning, no quantization. License and behavior match the upstream model.

## Specs

- **Base:** `nomic-ai/CodeRankEmbed` (137M params, 768-dim, 8192 max seq)
- **Format:** ONNX (FP32)
- **Pooling:** Mean
- **Query prefix:** `Represent this query for searching relevant code: ` (required — see usage and the snippet below)
- **Document prefix:** none
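
The asymmetric prefix convention in practice (the prefix string is fixed; the example texts are illustrative):

```python
# Queries get the mandatory prefix; documents are embedded verbatim.
QUERY_PREFIX = "Represent this query for searching relevant code: "

query_text = QUERY_PREFIX + "parse a YAML config file"  # prefixed query
doc_text = "def load_config(path): ..."                 # document: no prefix
```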

## Production Eval (cqs v3.v2 fixture, 2026-05-01)

Run against cqs's production fixture (218 queries: 109 test + 109 dev) on the cqs codebase itself. All numbers use cqs's full hybrid-search stack (dense + FTS + SPLADE blend, name-boost, type-boost, MMR-off); the best score in each row is bolded:

| split | metric | BGE-large (1024-dim) | **CodeRankEmbed (768-dim)** | v9-200k (768-dim) |
|-------|--------|---------------------:|----------------------------:|------------------:|
| test | R@1 | 43.1% | 42.2% | **45.9%** |
| test | R@5 | 69.7% | 67.9% | **70.6%** |
| test | R@20 | **83.5%** | 79.8% | 80.7% |
| dev | R@1 | 45.9% | **47.7%** | 46.8% |
| dev | R@5 | **77.1%** | 69.7% | 68.8% |
| dev | R@20 | **86.2%** | 81.7% | 81.7% |
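
R@k here reads as standard top-k recall, assuming one gold chunk per query: the fraction of queries whose gold chunk lands in the top k results. A minimal sketch of the metric (hypothetical data shapes, not cqs's actual harness):

```python
def recall_at_k(results: list[tuple[list[str], str]], k: int) -> float:
    """results: one (ranked candidate ids, gold id) pair per query."""
    hits = sum(gold in ranked[:k] for ranked, gold in results)
    return hits / len(results)

# e.g. 74 hits over the 109 test queries at k=5 -> ~0.679, matching the table
```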

**Verdict:** edges out BGE-large on dev R@1, runs close on the test split, and trails on dev R@5/R@20. Best fit when you want a code-specialist embedder at roughly a third of BGE-large's parameter count without giving up too much on diverse natural-language queries. cqs ships it as an opt-in preset (not the default) — set `CQS_EMBEDDING_MODEL=nomic-coderank` or use `cqs slot create coderank --model nomic-coderank`.

## Usage

### With cqs

```bash
# Full reindex with this model
export CQS_EMBEDDING_MODEL=nomic-coderank
cqs index --force

# Or, for slot-based comparisons:
cqs slot create coderank --model nomic-coderank
cqs index --slot coderank --force
```

cqs handles the query-prefix wiring automatically. Documents are encoded without a prefix per the upstream convention.

### Direct ONNX

```python
import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

# Fetch the exported graph from this repo and open an inference session.
model_path = hf_hub_download("jamie8johnson/CodeRankEmbed-onnx", "model.onnx")
session = ort.InferenceSession(model_path)
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/CodeRankEmbed")

def embed(text: str) -> np.ndarray:
    inputs = tokenizer(text, return_tensors="np", padding=True, truncation=True, max_length=8192)
    hidden = session.run(None, dict(inputs))[0]  # (batch, seq, 768) token embeddings
    mask = inputs["attention_mask"][:, :, None].astype(hidden.dtype)
    pooled = (hidden * mask).sum(axis=1) / mask.sum(axis=1)        # mean pooling
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)  # L2-normalize

# Query prefix is REQUIRED; documents get no prefix.
query = "Represent this query for searching relevant code: find functions that validate email addresses"
code = "def validate_email(addr): ..."

similarity = (embed(query) @ embed(code).T).item()  # cosine similarity
```
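
Ranking candidate chunks from there is a matrix product over the normalized vectors (illustrative snippet reusing the `embed` helper defined above):

```python
import numpy as np

chunks = [
    "def validate_email(addr): ...",
    "def parse_config(path): ...",
    "class EmailSender: ...",
]
query_vec = embed("Represent this query for searching relevant code: validate an email address")
chunk_vecs = np.vstack([embed(c) for c in chunks])

scores = (query_vec @ chunk_vecs.T).ravel()  # cosine similarities
for idx in np.argsort(-scores):              # best match first
    print(f"{scores[idx]:.3f}  {chunks[idx]}")
```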

## License

MIT, inherited from the upstream `nomic-ai/CodeRankEmbed` model.

## Citation

Please cite the upstream model:

```bibtex
@misc{nomic-coderank-embed,
  author = {Nomic AI},
  title = {CodeRankEmbed},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```