---
license: mit
language:
- en
- code
tags:
- code-search
- embeddings
- onnx
- sentence-similarity
- cqs
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed (ONNX export)

ONNX export of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) — a 137M-parameter code-search embedder built on `Snowflake/snowflake-arctic-embed-m-long`. Exported for use with [cqs](https://github.com/jamie8johnson/cqs)'s ONNX Runtime embedding pipeline; no PyTorch dependency required.

This is a faithful conversion of the upstream weights — no fine-tuning, no quantization. License and behavior match the upstream model.

## Specs

- **Base:** `nomic-ai/CodeRankEmbed` (137M params, 768-dim, 8192 max sequence length)
- **Format:** ONNX (FP32)
- **Pooling:** mean
- **Query prefix:** `Represent this query for searching relevant code: ` (required — see usage)
- **Document prefix:** none

## Production Eval (cqs v3.v2 fixture, 2026-05-01)

Run against cqs's production fixture (218 queries: 109 test + 109 dev) on the cqs codebase itself. Numbers are with cqs's full hybrid-search stack (dense + FTS + SPLADE blend, name-boost, type-boost, MMR-off); best score per row in bold:

| split | metric | BGE-large (1024-dim) | **CodeRankEmbed (768-dim)** | v9-200k (768-dim) |
|-------|--------|---------------------:|----------------------------:|------------------:|
| test  | R@1    | 43.1%                | 42.2%                       | **45.9%**         |
| test  | R@5    | 69.7%                | 67.9%                       | **70.6%**         |
| test  | R@20   | **83.5%**            | 79.8%                       | 80.7%             |
| dev   | R@1    | 45.9%                | **47.7%**                   | 46.8%             |
| dev   | R@5    | **77.1%**            | 69.7%                       | 68.8%             |
| dev   | R@20   | **86.2%**            | 81.7%                       | 81.7%             |

**Verdict:** edges out BGE-large on dev R@1, stays close on the test split, and trails it on dev R@5/R@20. It is the best fit when you want a code-specialist embedder at roughly 1/3 the BGE-large parameter count without giving up too much on diverse natural-language queries. cqs ships it as an opt-in preset (not the default) — set `CQS_EMBEDDING_MODEL=nomic-coderank` or use `cqs slot create coderank --model nomic-coderank`.

## Usage

### With cqs

```bash
# Full reindex with this model
export CQS_EMBEDDING_MODEL=nomic-coderank
cqs index --force

# Or, for slot-based comparisons:
cqs slot create coderank --model nomic-coderank
cqs index --slot coderank --force
```

cqs handles the query-prefix wiring automatically; documents are encoded without a prefix, per the upstream convention.

### Direct ONNX

```python
import onnxruntime as ort
from transformers import AutoTokenizer
import numpy as np

ort_session = ort.InferenceSession("model.onnx")  # the exported model.onnx from this repo
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/CodeRankEmbed")

# Query prefix is REQUIRED; documents get no prefix.
query = "Represent this query for searching relevant code: find functions that validate email addresses"
code = "def validate_email(addr): ..."  # encode documents the same way, just without the prefix

q_inputs = tokenizer(query, return_tensors="np", padding=True, truncation=True, max_length=8192)
onnx_names = {i.name for i in ort_session.get_inputs()}  # pass only the inputs the graph expects
q_out = ort_session.run(None, {k: v for k, v in q_inputs.items() if k in onnx_names})

# Mean-pool over the token dimension and L2-normalize for cosine similarity.
hidden = q_out[0]  # token embeddings, shape (batch, seq_len, 768)
mask = q_inputs["attention_mask"][..., None].astype(hidden.dtype)
q_emb = (hidden * mask).sum(axis=1) / mask.sum(axis=1)
q_emb = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
```

## License

MIT, inherited from the upstream `nomic-ai/CodeRankEmbed` model.

## Citation

Please cite the upstream model:

```bibtex
@misc{nomic-coderank-embed,
  author    = {Nomic AI},
  title     = {CodeRankEmbed},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```