---
license: mit
language:
- en
- code
tags:
- code-search
- embeddings
- onnx
- sentence-similarity
- cqs
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: nomic-ai/CodeRankEmbed
---
# CodeRankEmbed (ONNX export)
ONNX export of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) — a 137M-parameter code search embedder built on `Snowflake/snowflake-arctic-embed-m-long`. Exported for use with [cqs](https://github.com/jamie8johnson/cqs)'s ONNX Runtime embedding pipeline; no PyTorch dependency required.
This is a faithful conversion of the upstream weights — no fine-tuning, no quantization. License and behavior match the upstream model.
## Specs
- **Base:** `nomic-ai/CodeRankEmbed` (137M params, 768-dim, 8192 max seq)
- **Format:** ONNX (FP32)
- **Pooling:** Mean
- **Query prefix:** `Represent this query for searching relevant code: ` (required; see the sketch below and Usage)
- **Document prefix:** none
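The prefix asymmetry is easy to get wrong: queries carry the prefix, documents never do. A minimal sketch of the convention (the helper names below are illustrative only, not part of this repo or of cqs):
```python
QUERY_PREFIX = "Represent this query for searching relevant code: "

def format_query(text: str) -> str:
    # Natural-language queries must carry the prefix before tokenization/embedding
    return QUERY_PREFIX + text

def format_document(source: str) -> str:
    # Code chunks are embedded as-is, with no prefix
    return source

print(format_query("find functions that validate email addresses"))
print(format_document("def validate_email(addr): ..."))
```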
## Production Eval (cqs v3.v2 fixture, 2026-05-01)
Run against cqs's production fixture (218 queries: 109 test + 109 dev) on the cqs codebase itself. Numbers are with cqs's full hybrid-search stack (dense + FTS + SPLADE blend, name-boost, type-boost, MMR-off):
| split | metric | BGE-large (1024-dim) | **CodeRankEmbed (768-dim)** | v9-200k (768-dim) |
|-------|--------|---------------------:|----------------------------:|------------------:|
| test | R@1 | 43.1% | 42.2% | **45.9%** |
| test | R@5 | 69.7% | 67.9% | **70.6%** |
| test | R@20 | **83.5%** | 79.8% | 80.7% |
| dev | R@1 | 45.9% | **47.7%** | 46.8% |
| dev | R@5 | **77.1%** | 69.7% | 68.8% |
| dev | R@20 | **86.2%** | 81.7% | 81.7% |
**Verdict:** CodeRankEmbed edges out BGE-large on dev R@1, stays close on the test split, and trails on dev R@5/R@20. It is the best fit when you want a code-specialist embedder at less than half of BGE-large's parameter count without giving up too much on diverse natural-language queries. cqs ships it as an opt-in preset (not the default): set `CQS_EMBEDDING_MODEL=nomic-coderank` or run `cqs slot create coderank --model nomic-coderank`.
## Usage
### With cqs
```bash
# Full reindex with this model
export CQS_EMBEDDING_MODEL=nomic-coderank
cqs index --force
# Or, for slot-based comparisons:
cqs slot create coderank --model nomic-coderank
cqs index --slot coderank --force
```
cqs handles the query-prefix wiring automatically. Documents are encoded without a prefix per the upstream convention.
### Direct ONNX
```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# model.onnx: the exported graph from this repo (jamie8johnson/CodeRankEmbed-onnx), downloaded locally
ort_session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/CodeRankEmbed")

# Query prefix is REQUIRED; documents are encoded without a prefix
query = "Represent this query for searching relevant code: find functions that validate email addresses"
code = "def validate_email(addr): ..."

q_inputs = tokenizer(query, return_tensors="np", padding=True, truncation=True, max_length=8192)
# Feed only the inputs the ONNX graph declares (the tokenizer may emit extras such as token_type_ids)
onnx_input_names = {i.name for i in ort_session.get_inputs()}
q_out = ort_session.run(None, {k: v for k, v in q_inputs.items() if k in onnx_input_names})[0]

# Mean-pool over the token dimension (masking padding) and L2-normalize for cosine similarity
q_mask = q_inputs["attention_mask"][..., None]
q_emb = (q_out * q_mask).sum(axis=1) / q_mask.sum(axis=1)
q_emb = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
```
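To score the query against the document, encode the document the same way (no prefix), apply the same pooling and normalization, and take a dot product. A continuation sketch reusing the variables defined above:
```python
# Encode the document exactly like the query, but with no prefix
d_inputs = tokenizer(code, return_tensors="np", padding=True, truncation=True, max_length=8192)
d_out = ort_session.run(None, {k: v for k, v in d_inputs.items() if k in onnx_input_names})[0]

d_mask = d_inputs["attention_mask"][..., None]
d_emb = (d_out * d_mask).sum(axis=1) / d_mask.sum(axis=1)
d_emb = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)

# Cosine similarity is just a dot product once both sides are L2-normalized
score = float(q_emb @ d_emb.T)
print(f"query-document similarity: {score:.4f}")
```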
## License
MIT, inherited from the upstream `nomic-ai/CodeRankEmbed` model.
## Citation
Please cite the upstream model:
```
@misc{nomic-coderank-embed,
  author    = {Nomic AI},
  title     = {CodeRankEmbed},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```