Add model_max_length (32768) to YAML and Model Details section
#6
by dilawarm - opened
README.md
CHANGED
````diff
@@ -12,6 +12,7 @@ tags:
 - stem
 - medical
 library_name: sentence-transformers
+model_max_length: 32768
 ---
 
 <img src="https://i.imgur.com/oxvhvQu.png"/>
@@ -30,6 +31,15 @@ This model is released under a non-commercial license. If you'd like a commercia
 
 For this model's smaller twin, see [zerank-1-small](https://huggingface.co/zeroentropy/zerank-1-small), which we've fully open-sourced under an Apache 2.0 License.
 
+## Model Details
+
+| Property | Value |
+|---|---|
+| Parameters | 4B |
+| Context Length | 32,768 tokens (32k) |
+| Base Model | Qwen/Qwen3-4B |
+| License | CC-BY-NC-4.0 |
+
 ## How to Use
 
 ```python
@@ -43,7 +53,6 @@ query_documents = [
 ]
 
 scores = model.predict(query_documents)
-
 print(scores)
 ```
 
@@ -53,15 +62,13 @@ The model can also be inferenced using ZeroEntropy's [/models/rerank](https://do
 
 NDCG@10 scores between `zerank-1` and competing closed-source proprietary rerankers. Since we are evaluating rerankers, OpenAI's `text-embedding-3-small` is used as an initial retriever for the Top 100 candidate documents.
 
-| Task           |
+| Task           | Embedding | cohere-rerank-v3.5 | Salesforce/Llama-rank-v1 | zerank-1-small | **zerank-1** |
 |----------------|-----------|--------------------|--------------------------|----------------|--------------|
-| Code           |
-| Conversational |
-| Finance        |
-| Legal          |
-| Medical        |
-| STEM           |
-
-Comparing BM25 and Hybrid Search without and with zerank-1:
-
-<img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/2GPVHFrI39FspnSNklhsM.png" alt="Description" width="400"/> <img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/dwYo2D7hoL8QiE8u3yqr9.png" alt="Description" width="400"/>
+| Code           | 0.678     | 0.724              | 0.694                    | 0.730          | **0.754**    |
+| Conversational | 0.250     | 0.571              | 0.484                    | 0.556          | **0.596**    |
+| Finance        | 0.839     | 0.824              | 0.828                    | 0.861          | **0.894**    |
+| Legal          | 0.703     | 0.804              | 0.767                    | 0.817          | **0.821**    |
+| Medical        | 0.619     | 0.750              | 0.719                    | 0.773          | **0.796**    |
+| STEM           | 0.401     | 0.510              | 0.595                    | 0.680          | **0.694**    |
+
+Comparing BM25 and Hybrid Search without and with zerank-1: <img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/2GPVHFrI39FspnSNklhsM.png" alt="Description" width="400"/> <img src="https://cdn-uploads.huggingface.co/production/uploads/67776f9dcd9c9435499eafc8/dwYo2D7hoL8QiE8u3yqr9.png" alt="Description" width="400"/>
````
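For context on the truncated `How to Use` hunk: the README calls `model.predict(query_documents)` on (query, document) pairs. A minimal runnable sketch of that call pattern, with a `fake_predict` stub standing in for `CrossEncoder("zeroentropy/zerank-1").predict` so it runs without downloading the 4B model (the pairs and the stub's scoring rule are illustrative, not from the model card):

```python
# Sketch of the reranking call pattern from the README's "How to Use" section.
# `fake_predict` is a placeholder for the real model's predict(); the
# (query, document) pairs below are made up for illustration.

query_documents = [
    ("What does a reranker do?", "A reranker rescores candidate documents for a query."),
    ("What does a reranker do?", "Bananas are rich in potassium."),
]

def fake_predict(pairs):
    # Placeholder scoring: count query words that also appear in the document.
    # The real model returns one learned relevance score per pair.
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in pairs
    ]

scores = fake_predict(query_documents)
print(scores)

# A consumer would then keep the highest-scoring document for the query:
best = max(zip(query_documents, scores), key=lambda pair_score: pair_score[1])
print(best[0][1])
```

The real `predict` returns one relevance score per pair in input order, so the same sort-by-score step applies unchanged.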
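The benchmark table reports NDCG@10. As a quick reference for how that metric is computed, here is a minimal NDCG@k implementation; the relevance labels in the example are made up for illustration, not taken from the evaluation:

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k ranked relevance labels."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the ranking divided by DCG of the ideal (sorted) ranking."""
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# Hypothetical relevance labels of candidates in the order a reranker returned them:
ranking = [3, 2, 3, 0, 1, 2]
print(round(ndcg_at_k(ranking, k=10), 4))  # → 0.9608
```

A perfect ordering scores 1.0, so the table's values (e.g. 0.754 for `zerank-1` on Code) measure how close each reranker's top-10 ordering is to the ideal one.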