---
license: cc-by-nc-4.0
language:
- en
- multilingual
base_model:
- Qwen/Qwen3-4B
pipeline_tag: feature-extraction
tags:
- finance
- legal
- healthcare
- code
- stem
- medical
- multilingual
library_name: sentence-transformers
model_max_length: 32768
---
| |
| <img src="https://i.imgur.com/oxvhvQu.png"/> |
|
|
| # Releasing zeroentropy/zembed-1 |
|
|
| In retrieval systems, [embedding models determine the quality of your search](https://www.zeroentropy.dev/articles/how-to-overcome-poor-search-results-with-the-right-embedding-solution). |
|
|
However, the strongest embedding models have so far been closed-source and proprietary. At ZeroEntropy, we've trained a SOTA 4B open-weight multilingual embedding model that outperforms every competitor we benchmarked, and we're launching it here on Hugging Face.
|
|
This model [outperforms](https://huggingface.co/zeroentropy/zembed-1#evaluations) `OpenAI text-embedding-3-large`, `Cohere Embed v4`, `gemini-embedding-001`, and `voyage-4-nano` across finance, healthcare, legal, conversational, manufacturing, code, and STEM.
|
|
| zembed-1 is distilled directly from our SOTA reranker [zerank-2](https://huggingface.co/zeroentropy/zerank-2) using our [zELO methodology](https://arxiv.org/abs/2509.12541), which models relevance scores as adjusted [Elo ratings](https://en.wikipedia.org/wiki/Elo_rating_system). Standard contrastive training on binary labels can't match this signal. See [our blog post](https://www.zeroentropy.dev/articles/introducing-zembed-1-the-worlds-best-multilingual-text-embedding-model) for details. |
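As background, an Elo rating gap maps to an expected win probability; zELO's adjustments for relevance scoring are detailed in the linked paper. A minimal sketch of the vanilla Elo expected-score formula (the 400-point scale is the standard chess convention, not necessarily what zELO uses):

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under standard Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

print(elo_expected(1400, 1400))            # 0.5: equal ratings, even odds
print(round(elo_expected(1600, 1400), 4))  # 0.7597: a 200-point edge
```

In the retrieval setting, a document's "rating" relative to a query plays the role of player strength, giving a graded relevance signal rather than a binary label.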
|
|
The model supports flexible dimension projections (2560, 1280, 640, 320, 160, 80, 40) and quantization down to binary, which together compress a full-precision 2560-dimensional float32 vector (10 KB) to under 128 bytes with a controlled accuracy trade-off. See our Technical Report (coming soon!) for details on the projection method. zembed-1 is multilingual from the ground up, with over half the training data in non-English languages.
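To illustrate the storage arithmetic, here is a generic sketch in NumPy of truncating an embedding to a smaller dimension and then quantizing to one bit per component; this is an illustration of the general technique, not necessarily zembed-1's exact projection method:

```python
import numpy as np

rng = np.random.default_rng(0)
# A full-precision embedding: 2560 dims * 4 bytes (float32) = 10240 bytes.
full = rng.standard_normal(2560).astype(np.float32)

# Truncate to a smaller projection dimension, then re-normalize.
dim = 640
truncated = full[:dim] / np.linalg.norm(full[:dim])

# Binary quantization: keep only the sign of each component,
# packed 8 components per byte -> 640 / 8 = 80 bytes.
packed = np.packbits(truncated > 0)

print(full.nbytes, packed.nbytes)  # 10240 80
```

Similarity over the packed vectors is then typically computed with Hamming distance, which is why the accuracy trade-off stays controlled at high dimensions.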
|
|
| This model is released under a non-commercial license. If you'd like a commercial license, please contact us at contact@zeroentropy.dev. |
|
|
| ## Model Details |
|
|
| | Property | Value | |
| |---|---| |
| | Parameters | 4B | |
| | Context Length | 32,768 tokens (32k) | |
| | Base Model | Qwen/Qwen3-4B | |
| | Embedding Dimensions | 2560, 1280, 640, 320, 160, 80, 40 | |
| | License | CC-BY-NC-4.0 | |
|
|
| ## How to Use |
|
|
| ```python |
| from sentence_transformers import SentenceTransformer |
| |
| # Initialize model |
| model = SentenceTransformer( |
| "zeroentropy/zembed-1", |
| trust_remote_code=True, |
| model_kwargs={"torch_dtype": "bfloat16"}, |
| ) |
| |
| # Define query and documents |
| query = "What is backpropagation?" |
| documents = [ |
| "Backpropagation is a fundamental algorithm for training neural networks by computing gradients.", |
| "Gradient descent is used to optimize model parameters during the training process.", |
| "Neural network training relies on efficient computation of derivatives through backpropagation.", |
| ] |
| |
| # Encode query and documents (uses task-specific prompts automatically) |
| query_embeddings = model.encode_query(query) |
| document_embeddings = model.encode_document(documents) |
| # (2560,) (3, 2560) |
| |
| # Compute cosine similarities |
| similarities = model.similarity(query_embeddings, document_embeddings) |
| # tensor([[0.7525, 0.5670, 0.6835]]) |
| ``` |
|
|
| The model can also be used through ZeroEntropy's [/models/embed](https://docs.zeroentropy.dev/api-reference/models/embed) endpoint. |
|
|
| ## Evaluations |
|
|
NDCG@10 scores for `zembed-1` and competing embedding models, averaged across public and private benchmarks per domain. Full per-benchmark breakdown [here](https://docs.google.com/spreadsheets/d/1qFXGZLMg6-O5tVLIJS3tpf5QNJxCHiiQtj35dZub4vY/edit?gid=0#gid=0).
|
|
| Domain | ZeroEntropy zembed-1 | voyage-4-nano | Qwen3 4B | Cohere Embed v4 | gemini-embedding-001 | jina-v5-small | OpenAI Large | bge-m3 |
| |------------------|----------------------|---------------|----------|-----------------|-------------------|---------------|--------------|--------| |
| | Finance | **0.4476** | 0.4227 | 0.3715 | 0.3670 | 0.3291 | 0.3576 | 0.3291 | 0.3085 | |
| | Healthcare | **0.6260** | 0.5356 | 0.5134 | 0.4750 | 0.5008 | 0.5132 | 0.5315 | 0.3620 | |
| | Legal | **0.6723** | 0.5957 | 0.5858 | 0.5894 | 0.6069 | 0.5716 | 0.5099 | 0.5207 | |
| | Conversational | **0.5385** | 0.4045 | 0.4034 | 0.4244 | 0.4247 | 0.4430 | 0.3988 | 0.3296 | |
| | Manufacturing | **0.5556** | 0.4857 | 0.4932 | 0.4919 | 0.4664 | 0.4725 | 0.4736 | 0.3736 | |
| | Web Search | 0.6165 | 0.5977 | 0.6914 | **0.7242** | 0.5881 | 0.6772 | 0.6750 | 0.6311 | |
| | Code | **0.6452** | 0.6415 | 0.6379 | 0.6277 | 0.6305 | 0.6354 | 0.6155 | 0.5584 | |
| | STEM & Math | **0.5283** | 0.5012 | 0.5219 | 0.4698 | 0.4840 | 0.3780 | 0.3905 | 0.3399 | |
| | Enterprise | **0.3750** | 0.3600 | 0.2935 | 0.2915 | 0.3224 | 0.3012 | 0.3307 | 0.2213 | |
| | **Average** | **0.5561** | **0.5050** | **0.5013** | **0.4957** | **0.4837** | **0.4833** | **0.4727** | **0.4050** | |
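For reference, NDCG@10 rewards placing relevant documents near the top of the ranking, discounting gains logarithmically by position and normalizing against the ideal ordering. A minimal sketch:

```python
import math

def dcg_at_k(relevances: list[float], k: int = 10) -> float:
    """Discounted cumulative gain: position i contributes rel / log2(i + 2)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels of the top-5 retrieved documents, in ranked order.
print(round(ndcg_at_k([1, 0, 1, 0, 0]), 4))  # 0.9197
```

A score of 1.0 means the retrieved ranking already orders documents by relevance; misplacing a relevant document lowers the score more the further it falls.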
|
|
| <img src="assets/zembed_eval_chart.png" alt="Bar chart comparing zembed-1 NDCG@10 scores against competing embedding models across domains" width="1000"/> |