| | --- |
| | pipeline_tag: sentence-similarity |
| | tags: |
| | - sentence-similarity |
| | - sentence-transformers |
| | license: mit |
| | language: |
| | - multilingual |
| | - af |
| | - am |
| | - ar |
| | - as |
| | - az |
| | - be |
| | - bg |
| | - bn |
| | - br |
| | - bs |
| | - ca |
| | - cs |
| | - cy |
| | - da |
| | - de |
| | - el |
| | - en |
| | - eo |
| | - es |
| | - et |
| | - eu |
| | - fa |
| | - fi |
| | - fr |
| | - fy |
| | - ga |
| | - gd |
| | - gl |
| | - gu |
| | - ha |
| | - he |
| | - hi |
| | - hr |
| | - hu |
| | - hy |
| | - id |
| | - is |
| | - it |
| | - ja |
| | - jv |
| | - ka |
| | - kk |
| | - km |
| | - kn |
| | - ko |
| | - ku |
| | - ky |
| | - la |
| | - lo |
| | - lt |
| | - lv |
| | - mg |
| | - mk |
| | - ml |
| | - mn |
| | - mr |
| | - ms |
| | - my |
| | - ne |
| | - nl |
| | - no |
| | - om |
| | - or |
| | - pa |
| | - pl |
| | - ps |
| | - pt |
| | - ro |
| | - ru |
| | - sa |
| | - sd |
| | - si |
| | - sk |
| | - sl |
| | - so |
| | - sq |
| | - sr |
| | - su |
| | - sv |
| | - sw |
| | - ta |
| | - te |
| | - th |
| | - tl |
| | - tr |
| | - ug |
| | - uk |
| | - ur |
| | - uz |
| | - vi |
| | - xh |
| | - yi |
| | - zh |
| | --- |
| | |
| | A quantized version of [multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small). Quantization was performed per-layer under the same conditions as our ELSERv2 model, as described [here](https://www.elastic.co/search-labs/blog/articles/introducing-elser-v2-part-1#quantization). |
| |
|
| | [Text Embeddings by Weakly-Supervised Contrastive Pre-training](https://arxiv.org/pdf/2212.03533.pdf). |
| | Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022 |
| |
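To illustrate what per-layer quantization means in general, the sketch below quantizes one layer's weights with a single scale shared by the whole layer. It is not the procedure used for this model: the bit width (8), the symmetric scheme, and the names `quantize_layer` and `dequantize_layer` are assumptions made for illustration, since the model card does not state these details.

```python
def quantize_layer(weights, num_bits=8):
    """Symmetric quantization with one scale for the whole layer (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to the valid range.
    quantized = [
        [max(-qmax, min(qmax, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


def dequantize_layer(quantized, scale):
    """Map integer values back to approximate floating-point weights."""
    return [[v * scale for v in row] for row in quantized]


# Hypothetical toy weights, not taken from the model.
layer = [[0.12, -0.50, 0.33], [0.04, 0.25, -0.41]]
q, scale = quantize_layer(layer)
restored = dequantize_layer(q, scale)

# Largest absolute difference between original and restored weights.
max_error = max(
    abs(a - b) for ra, rb in zip(layer, restored) for a, b in zip(ra, rb)
)
print(q, scale, max_error)
```

The rounding error per weight is bounded by half the scale, which is why small quality changes like those reported in the benchmarks below are plausible after quantization.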
|
| | ## Benchmarks |
| |
|
| | We ran several small benchmarks to assess changes in both quality and inference latency relative to the original baseline model. |
| |
|
| | ### Quality |
| |
|
| | Measuring NDCG@10 on the dev split of the MIRACL datasets for select languages, we see mostly marginal changes in quality for the quantized model. The exception is yo, where the score drops from 0.56193 to 0.48934. |
| |
|
| | | model | de | yo | ru | ar | es | th | |
| | | --- | --- | --- | --- | --- | --- | --- | |
| | | multilingual-e5-small | 0.75862 | 0.56193 | 0.80309 | 0.82778 | 0.81672 | 0.85072 | |
| | | multilingual-e5-small-optimized | 0.75992 | 0.48934 | 0.79668 | 0.82017 | 0.8135 | 0.84316 | |
| |
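For readers unfamiliar with the metric, here is a minimal sketch of how NDCG@10 can be computed for a single query. It assumes linear gain (the relevance label itself) and a log2 rank discount, which is a common convention; the exact evaluation tooling used for the table above is not stated, and the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k results (linear gain assumed)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances, judged_relevances, k=10):
    """NDCG@k: DCG of the system ranking divided by the DCG of an ideal ranking.

    ranked_relevances: relevance labels of the returned documents, in rank order.
    judged_relevances: relevance labels of all judged documents for the query.
    """
    ideal_dcg = dcg_at_k(sorted(judged_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical query: three relevant documents exist, two were retrieved
# at ranks 1 and 3.
score = ndcg_at_k([1, 0, 1, 0, 0], [1, 1, 1])
print(score)
```

A benchmark score such as those in the table is typically the mean of this per-query value over all queries in the split.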
|
| | To test English out-of-domain performance, we used the test split of several datasets from the BEIR evaluation. Measuring NDCG@10, we see a larger drop on SCIFACT (about 0.022) and marginal drops on the other datasets evaluated. |
| |
|
| | | model | FIQA | SCIFACT | nfcorpus | |
| | | --- | --- | --- | --- | |
| | | multilingual-e5-small | 0.33126 | 0.677 | 0.31004 | |
| | | multilingual-e5-small-optimized | 0.31734 | 0.65484 | 0.30126 | |
| |
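The size of each change can be read directly off the table above. The short sketch below computes the absolute and relative difference in NDCG@10 per dataset; the scores are copied from the table, while the variable names are only illustrative.

```python
# NDCG@10 scores copied from the BEIR table above.
baseline = {"FIQA": 0.33126, "SCIFACT": 0.677, "nfcorpus": 0.31004}
optimized = {"FIQA": 0.31734, "SCIFACT": 0.65484, "nfcorpus": 0.30126}

changes = {}
for dataset, base_score in baseline.items():
    absolute = optimized[dataset] - base_score
    relative = absolute / base_score  # fraction of the baseline score
    changes[dataset] = (absolute, relative)
    print(f"{dataset}: {absolute:+.5f} absolute, {relative:+.2%} relative")

# Dataset with the largest absolute drop in NDCG@10.
largest_drop = min(changes, key=lambda name: changes[name][0])
```

In absolute terms the largest drop is on SCIFACT, consistent with the description above.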
|
| | ### Performance |
| |
|
| | Using a PyTorch model traced for Linux and Intel CPUs, we benchmarked inference with inputs of various lengths. Overall, the optimized model reduces inference time by roughly 20-55%, with the largest gains on the shortest inputs. |
| |
|
| | | input length (characters) | multilingual-e5-small | multilingual-e5-small-optimized | latency reduction | |
| | | --- | --- | --- | --- | |
| | | 0 - 50 | 0.0181 | 0.00826 | 54.36% | |
| | | 50 - 100 | 0.0275 | 0.0164 | 40.36% | |
| | | 100 - 150 | 0.0366 | 0.0237 | 35.25% | |
| | | 150 - 200 | 0.0435 | 0.0301 | 30.80% | |
| | | 200 - 250 | 0.0514 | 0.0379 | 26.26% | |
| | | 250 - 300 | 0.0569 | 0.043 | 24.43% | |
| | | 300 - 350 | 0.0663 | 0.0513 | 22.62% | |
| | | 350 - 400 | 0.0737 | 0.0576 | 21.85% | |
| |
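The percentages in the last column of the table above are consistent with the relative reduction in inference time, `(baseline - optimized) / baseline`. The sketch below recomputes them from the table values. The unit of the timings is not stated in the table, and the equivalent speedup factor `baseline / optimized` is included only as an alternative way to read the same numbers.

```python
# (input length bucket, baseline time, optimized time, reported percentage)
# All values are copied from the table above.
rows = [
    ("0 - 50", 0.0181, 0.00826, 54.36),
    ("50 - 100", 0.0275, 0.0164, 40.36),
    ("100 - 150", 0.0366, 0.0237, 35.25),
    ("150 - 200", 0.0435, 0.0301, 30.80),
    ("200 - 250", 0.0514, 0.0379, 26.26),
    ("250 - 300", 0.0569, 0.043, 24.43),
    ("300 - 350", 0.0663, 0.0513, 22.62),
    ("350 - 400", 0.0737, 0.0576, 21.85),
]

results = []
for bucket, base_time, opt_time, reported in rows:
    reduction_pct = (base_time - opt_time) / base_time * 100
    speedup_factor = base_time / opt_time  # how many times faster
    results.append((bucket, reduction_pct, speedup_factor, reported))
    print(f"{bucket}: {reduction_pct:.2f}% less time, {speedup_factor:.2f}x faster")
```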
|
| | ### Disclaimer |
| |
|
| | This e5 model, as defined, hosted, integrated and used in conjunction with our other Elastic Software, is covered by our standard warranty. |
| |
|