Text Ranking
sentence-transformers
Safetensors
English
qwen3
finance
legal
code
stem
medical
custom_code
Instructions to use zeroentropy/zerank-1-reranker with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use zeroentropy/zerank-1-reranker with sentence-transformers:

    from sentence_transformers import CrossEncoder

    model = CrossEncoder("zeroentropy/zerank-1-reranker", trust_remote_code=True)

    query = "Which planet is known as the Red Planet?"
    passages = [
        "Venus is often called Earth's twin because of its similar size and proximity.",
        "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
        "Jupiter, the largest planet in our solar system, has a prominent red spot.",
        "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
    ]

    scores = model.predict([(query, passage) for passage in passages])
    print(scores)

- Notebooks
- Google Colab
- Kaggle
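The snippet above prints raw scores, one per (query, passage) pair in input order. Turning those into a ranking is a sort. The following is a minimal sketch, assuming that a higher score means a more relevant passage; `rank_passages` is a hypothetical helper, and the score values are made-up placeholders rather than real model output.

```python
from typing import List, Sequence, Tuple


def rank_passages(
    passages: Sequence[str],
    scores: Sequence[float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Pair each passage with its score and return the top_k, highest score first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    paired = zip(passages, (float(s) for s in scores))
    return sorted(paired, key=lambda item: item[1], reverse=True)[:top_k]


passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

# Placeholder scores for illustration only; real values would come from
# model.predict(...) in the snippet above.
placeholder_scores = [0.12, 0.97, 0.35, 0.08]

ranked = rank_passages(passages, placeholder_scores, top_k=2)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

Keeping the sort outside the model call means the same helper works for scores from any source, including a hosted reranking endpoint.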
Add model_max_length (32768) to YAML and Model Details section
#6
by dilawarm - opened
README.md
CHANGED
@@ -12,6 +12,7 @@ tags:
 - stem
 - medical
 library_name: sentence-transformers
+model_max_length: 32768
 ---
 
 <img src="https://i.imgur.com/oxvhvQu.png"/>
@@ -30,6 +31,15 @@ This model is released under a non-commercial license. If you'd like a commercia
 
 For this model's smaller twin, see [zerank-1-small](https://huggingface.co/zeroentropy/zerank-1-small), which we've fully open-sourced under an Apache 2.0 License.
 
+## Model Details
+
+| Property | Value |
+|---|---|
+| Parameters | 4B |
+| Context Length | 32,768 tokens (32k) |
+| Base Model | Qwen/Qwen3-4B |
+| License | CC-BY-NC-4.0 |
+
 ## How to Use
 
 ```python
@@ -43,7 +53,6 @@ query_documents = [
 ]
 
 scores = model.predict(query_documents)
-
 print(scores)
 ```
 
@@ -53,15 +62,13 @@ The model can also be inferenced using ZeroEntropy's [/models/rerank](https://do
 
 NDCG@10 scores between `zerank-1` and competing closed-source proprietary rerankers. Since we are evaluating rerankers, OpenAI's `text-embedding-3-small` is used as an initial retriever for the Top 100 candidate documents.
 
-| Task
+| Task | Embedding | cohere-rerank-v3.5 | Salesforce/Llama-rank-v1 | zerank-1-small | **zerank-1** |
 |----------------|-----------|--------------------|--------------------------|----------------|--------------|
-| Code
-| Conversational |
-| Finance
-| Legal
-| Medical
-| STEM
-
-Comparing BM25 and Hybrid Search without and with zerank-1:
-
-<img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/2GPVHFrI39FspnSNklhsM.png" alt="Description" width="400"/> <img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/dwYo2D7hoL8QiE8u3yqr9.png" alt="Description" width="400"/>
+| Code | 0.678 | 0.724 | 0.694 | 0.730 | **0.754** |
+| Conversational | 0.250 | 0.571 | 0.484 | 0.556 | **0.596** |
+| Finance | 0.839 | 0.824 | 0.828 | 0.861 | **0.894** |
+| Legal | 0.703 | 0.804 | 0.767 | 0.817 | **0.821** |
+| Medical | 0.619 | 0.750 | 0.719 | 0.773 | **0.796** |
+| STEM | 0.401 | 0.510 | 0.595 | 0.680 | **0.694** |
+
+Comparing BM25 and Hybrid Search without and with zerank-1: [image: Description] [image: Description]
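The benchmark table in the README reports NDCG@10. As a reading aid, here is a minimal sketch of how NDCG@k can be computed for one query. It assumes the common linear-gain form, `rel / log2(rank + 1)`, and computes the ideal ordering from the same candidate list. The README does not state which NDCG variant or relevance labels were used, so this is an illustration and not the authors' evaluation code. The function names are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    # enumerate() is 0-based, so the 1-based rank r gives a discount of log2(r + 1).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    # Assumption: the ideal ordering is built from the same candidates,
    # i.e. every relevant document was retrieved in the first stage.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of documents in the order a reranker returned them
# (illustrative values, not taken from any benchmark).
reranked_relevances = [0, 1, 0, 1]
print(f"NDCG@10 = {ndcg_at_k(reranked_relevances, k=10):.3f}")
```

A per-task score like those in the table would then be the mean of this value over all queries in the task, again assuming the standard convention.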