---
license: mit
language:
- en
- code
tags:
- code-search
- embeddings
- onnx
- sentence-similarity
- cqs
library_name: sentence-transformers
pipeline_tag: sentence-similarity
base_model: nomic-ai/CodeRankEmbed
---

# CodeRankEmbed (ONNX export)

ONNX export of [nomic-ai/CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) — a 137M-parameter code search embedder built on `Snowflake/snowflake-arctic-embed-m-long`. Exported for use with [cqs](https://github.com/jamie8johnson/cqs)'s ONNX Runtime embedding pipeline; no PyTorch dependency required.

This is a faithful conversion of the upstream weights — no fine-tuning, no quantization. License and behavior match the upstream model.

## Specs

- **Base:** `nomic-ai/CodeRankEmbed` (137M params, 768-dim, 8192 max seq)
- **Format:** ONNX (FP32)
- **Pooling:** Mean
- **Query prefix:** `Represent this query for searching relevant code: ` (required — see usage and the snippet below)
- **Document prefix:** none
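
The asymmetric prefix convention in practice (the prefix string is fixed; the example texts are illustrative):

```python
# Queries get the mandatory prefix; documents are embedded verbatim.
QUERY_PREFIX = "Represent this query for searching relevant code: "

query_text = QUERY_PREFIX + "parse a YAML config file"  # prefixed query
doc_text = "def load_config(path): ..."                 # document: no prefix
```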

## Production Eval (cqs v3.v2 fixture, 2026-05-01)

Run against cqs's production fixture (218 queries: 109 test + 109 dev) on the cqs codebase itself. All numbers use cqs's full hybrid-search stack (dense + FTS + SPLADE blend, name-boost, type-boost, MMR-off); the best score in each row is bolded:

| split | metric | BGE-large (1024-dim) | **CodeRankEmbed (768-dim)** | v9-200k (768-dim) |
|-------|--------|---------------------:|----------------------------:|------------------:|
| test | R@1 | 43.1% | 42.2% | **45.9%** |
| test | R@5 | 69.7% | 67.9% | **70.6%** |
| test | R@20 | **83.5%** | 79.8% | 80.7% |
| dev | R@1 | 45.9% | **47.7%** | 46.8% |
| dev | R@5 | **77.1%** | 69.7% | 68.8% |
| dev | R@20 | **86.2%** | 81.7% | 81.7% |
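
R@k here reads as standard top-k recall, assuming one gold chunk per query: the fraction of queries whose gold chunk lands in the top k results. A minimal sketch of the metric (hypothetical data shapes, not cqs's actual harness):

```python
def recall_at_k(results: list[tuple[list[str], str]], k: int) -> float:
    """results: one (ranked candidate ids, gold id) pair per query."""
    hits = sum(gold in ranked[:k] for ranked, gold in results)
    return hits / len(results)

# e.g. 74 hits over the 109 test queries at k=5 -> ~0.679, matching the table
```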

**Verdict:** edges out BGE-large on dev R@1, runs close on the test split, and trails on dev R@5/R@20. Best fit when you want a code-specialist embedder at roughly a third of BGE-large's parameter count without giving up too much on diverse natural-language queries. cqs ships it as an opt-in preset (not the default) — set `CQS_EMBEDDING_MODEL=nomic-coderank` or use `cqs slot create coderank --model nomic-coderank`.

## Usage

### With cqs

```bash
# Full reindex with this model
export CQS_EMBEDDING_MODEL=nomic-coderank
cqs index --force

# Or, for slot-based comparisons:
cqs slot create coderank --model nomic-coderank
cqs index --slot coderank --force
```

cqs handles the query-prefix wiring automatically. Documents are encoded without a prefix per the upstream convention.

### Direct ONNX

```python
import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

# Fetch the exported graph from this repo and open an inference session.
model_path = hf_hub_download("jamie8johnson/CodeRankEmbed-onnx", "model.onnx")
session = ort.InferenceSession(model_path)
tokenizer = AutoTokenizer.from_pretrained("nomic-ai/CodeRankEmbed")

def embed(text: str) -> np.ndarray:
    inputs = tokenizer(text, return_tensors="np", padding=True, truncation=True, max_length=8192)
    hidden = session.run(None, dict(inputs))[0]  # (batch, seq, 768) token embeddings
    mask = inputs["attention_mask"][:, :, None].astype(hidden.dtype)
    pooled = (hidden * mask).sum(axis=1) / mask.sum(axis=1)        # mean pooling
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)  # L2-normalize

# Query prefix is REQUIRED; documents get no prefix.
query = "Represent this query for searching relevant code: find functions that validate email addresses"
code = "def validate_email(addr): ..."

similarity = (embed(query) @ embed(code).T).item()  # cosine similarity
```
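
Ranking candidate chunks from there is a matrix product over the normalized vectors (illustrative snippet reusing the `embed` helper defined above):

```python
import numpy as np

chunks = [
    "def validate_email(addr): ...",
    "def parse_config(path): ...",
    "class EmailSender: ...",
]
query_vec = embed("Represent this query for searching relevant code: validate an email address")
chunk_vecs = np.vstack([embed(c) for c in chunks])

scores = (query_vec @ chunk_vecs.T).ravel()  # cosine similarities
for idx in np.argsort(-scores):              # best match first
    print(f"{scores[idx]:.3f}  {chunks[idx]}")
```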

## License

MIT, inherited from the upstream `nomic-ai/CodeRankEmbed` model.

## Citation

Please cite the upstream model:

```bibtex
@misc{nomic-coderank-embed,
  author = {Nomic AI},
  title = {CodeRankEmbed},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/nomic-ai/CodeRankEmbed}
}
```