Add model_max_length (32768) to YAML and Model Details section
README.md
@@ -12,6 +12,7 @@ tags:
 - stem
 - medical
 library_name: sentence-transformers
+model_max_length: 32768
 ---
 
 <img src="https://i.imgur.com/oxvhvQu.png"/>
@@ -28,6 +29,15 @@ At ZeroEntropy we've developed an innovative multi-stage pipeline that models qu
 
 This model is released under a non-commercial license. If you'd like a commercial license, please contact us at contact@zeroentropy.dev.
 
+## Model Details
+
+| Property | Value |
+|---|---|
+| Parameters | 4B |
+| Context Length | 32,768 tokens (32k) |
+| Base Model | Qwen/Qwen3-4B |
+| License | CC-BY-NC-4.0 |
+
 ## How to Use
 
 ```python
@@ -41,7 +51,6 @@ query_documents = [
 ]
 
 scores = model.predict(query_documents)
-
 print(scores)
 ```
 
@@ -51,15 +60,15 @@ The model can also be inferenced using ZeroEntropy's [/models/rerank](https://do
 
 NDCG@10 scores between `zerank-2` and competing closed-source proprietary rerankers. Since we are evaluating rerankers, OpenAI's `text-embedding-3-small` is used as an initial retriever for the Top 100 candidate documents.
 
-| Domain
+| Domain           | OpenAI embeddings | ZeroEntropy zerank-2 | ZeroEntropy zerank-1 | Gemini 2.5 Flash (Listwise) | Cohere rerank-3.5 |
 |------------------|-------------------|----------------------|----------------------|-----------------------------|-------------------|
-| Web
-| Conversational
-| STEM & Logic
-| Code
-| Legal
-| Biomedical
-| Finance
-| **Average**
-
-
+| Web              | 0.3819            | **0.6346**           | 0.6069               | 0.5765                      | 0.5594            |
+| Conversational   | 0.4305            | **0.6140**           | 0.5801               | 0.6021                      | 0.5648            |
+| STEM & Logic     | 0.3744            | **0.6521**           | 0.6283               | 0.5447                      | 0.5418            |
+| Code             | 0.4582            | **0.6528**           | 0.6310               | 0.6128                      | 0.5364            |
+| Legal            | 0.4101            | **0.6644**           | 0.6222               | 0.5565                      | 0.5257            |
+| Biomedical       | 0.4783            | **0.7217**           | 0.6967               | 0.5371                      | 0.6246            |
+| Finance          | 0.6232            | 0.7600               | 0.7539               | **0.7694**                  | 0.7402            |
+| **Average**      | **0.4509**        | **0.6714**           | **0.6456**           | **0.5999**                  | **0.5847**        |
+
+Graph showing the same table
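The NDCG@10 metric reported in the benchmark table can be sketched in a few lines of Python. This is a generic textbook implementation with illustrative relevance labels, not ZeroEntropy's evaluation code:

```python
import math

def dcg_at_k(relevances: list[float], k: int = 10) -> float:
    # Discounted cumulative gain: each result's relevance is discounted
    # by the log of its rank (rank 1 -> log2(2), rank 2 -> log2(3), ...).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    # Normalize by the DCG of the ideal ordering (results sorted by
    # relevance), so a perfect ranking scores exactly 1.0.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Toy ranking: relevant documents at ranks 1 and 3 out of five retrieved.
print(round(ndcg_at_k([1, 0, 1, 0, 0]), 4))  # -> 0.9197
```

A reranker improves NDCG@10 by moving relevant documents from lower ranks (where the log discount is large) into the top positions.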
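The **Average** row added in the last hunk is the unweighted mean of the seven domain scores. A quick recomputation of two columns (values copied from the table; the helper name is illustrative):

```python
# Per-domain NDCG@10 scores, in table order:
# Web, Conversational, STEM & Logic, Code, Legal, Biomedical, Finance.
zerank2 = [0.6346, 0.6140, 0.6521, 0.6528, 0.6644, 0.7217, 0.7600]
openai_embed = [0.3819, 0.4305, 0.3744, 0.4582, 0.4101, 0.4783, 0.6232]

def mean4(xs: list[float]) -> float:
    # Unweighted mean across the seven domains, rounded to table precision.
    return round(sum(xs) / len(xs), 4)

print(mean4(zerank2), mean4(openai_embed))  # -> 0.6714 0.4509
```

Both results match the table's **Average** row.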