---
license: apache-2.0
pipeline_tag: text-ranking
library_name: lightning-ir
base_model:
- google-bert/bert-base-uncased
tags:
- bi-encoder
---

# Lightning IR ColBERT

This model is a ColBERT[^1] model fine-tuned using [Lightning IR](https://github.com/webis-de/lightning-ir).

See the [Lightning IR Model Zoo](https://webis-de.github.io/lightning-ir/models.html) for a comparison with other models.

## Reproduction

To reproduce the model training, install Lightning IR and run the following command using the [fine-tune.yaml](./configs/fine-tune.yaml) configuration file:

```bash
lightning-ir fit --config fine-tune.yaml
```

To index MS MARCO passages, use the following command and the [index.yaml](./configs/index.yaml) configuration file:

```bash
lightning-ir index --config index.yaml
```

After indexing, to evaluate the model on TREC Deep Learning 2019 and 2020, use the following command and the [search.yaml](./configs/search.yaml) configuration file:

```bash
lightning-ir search --config search.yaml
```

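Because searching depends on a finished index, the two steps above can be chained in a small wrapper script. The following is only a sketch under stated assumptions: it assumes the configuration files live in a `configs` directory, and the `CONFIG_DIR` and `DRY_RUN` variables as well as the `run_step` and `index_then_search` functions are hypothetical helpers, not part of Lightning IR. Only the `lightning-ir index` and `lightning-ir search` invocations come from this model card.

```bash
#!/usr/bin/env bash
# Sketch: run the index and search steps back to back.
# Assumptions (not from the model card): the YAML files sit in ./configs,
# and the variable and function names below are made up for illustration.
set -euo pipefail

CONFIG_DIR="${CONFIG_DIR:-configs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run_step() {
  # $1 is a lightning-ir subcommand (index or search), $2 the config file name.
  local subcommand="$1" config="${CONFIG_DIR}/$2"
  if [[ ! -f "$config" ]]; then
    echo "missing config: $config" >&2
    return 1
  fi
  if [[ "$DRY_RUN" == "1" ]]; then
    echo "lightning-ir $subcommand --config $config"
  else
    lightning-ir "$subcommand" --config "$config"
  fi
}

index_then_search() {
  # Searching needs the index, so stop if indexing fails.
  run_step index index.yaml && run_step search search.yaml
}
```

Calling `index_then_search` with the default `DRY_RUN=1` prints the two commands in order without running them; setting `DRY_RUN=0` would execute them, provided Lightning IR is installed and the configuration files exist.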

[^1]: Khattab and Zaharia, [ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT](https://dl.acm.org/doi/abs/10.1145/3397271.3401075)