---
base_model:
- google-bert/bert-base-uncased
datasets:
- microsoft/ms_marco
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: feature-extraction
---

# Model Card
This is the official model from the paper [Hypencoder: Hypernetworks for Information Retrieval](https://arxiv.org/abs/2502.05364).

## Model Details
This is a Hypencoder Dual Encoder. It contains two trunks: the text encoder and the Hypencoder. The text encoder converts items into 768-dimensional vectors, while the Hypencoder converts query text into a small neural network that takes the 768-dimensional vector from the text encoder as input and outputs a relevance score. To use this model, please take a look at the [GitHub](https://github.com/jfkback/hypencoder-paper) page, which contains the required code and details on how to run the model.
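
In other words, relevance scoring amounts to running a passage embedding through a query-specific network. Here is a minimal conceptual sketch of that flow; the names `text_encoder`, `hypencoder`, and `q_net` are illustrative placeholders, not the library API (see the Quick Start below for the actual interface):

```python
# Conceptual sketch only -- illustrative names, not the actual API.
passage_vector = text_encoder(passage)  # passage -> 768-dimensional vector

# The Hypencoder turns the query into a small neural network (a "q-net")
# whose weights are generated from the query text.
q_net = hypencoder(query)

# The q-net maps the passage vector to a single relevance score.
score = q_net(passage_vector)
```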

### Model Variants
We released the four models used in the paper. The models are identical except that the small neural networks, which we refer to as q-nets, have different numbers of hidden layers.

| Huggingface Repo | Number of Layers |
|:------------------:|:------------------:|
| [jfkback/hypencoder.2_layer](https://huggingface.co/jfkback/hypencoder.2_layer) | 2 |
| [jfkback/hypencoder.4_layer](https://huggingface.co/jfkback/hypencoder.4_layer) | 4 |
| [jfkback/hypencoder.6_layer](https://huggingface.co/jfkback/hypencoder.6_layer) | 6 |
| [jfkback/hypencoder.8_layer](https://huggingface.co/jfkback/hypencoder.8_layer) | 8 |
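
All variants share the same interface, so choosing one only changes the repository id passed to `from_pretrained`. For example (same loading call as in the Quick Start below):

```python
from hypencoder_cb.modeling.hypencoder import HypencoderDualEncoder

# Swap the repository id to pick a q-net depth (2, 4, 6, or 8 hidden layers).
dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.2_layer")
```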

## Quick Start
#### Using the pretrained Hypencoders as stand-alone models
```python
from hypencoder_cb.modeling.hypencoder import Hypencoder, HypencoderDualEncoder, TextEncoder
from transformers import AutoTokenizer

dual_encoder = HypencoderDualEncoder.from_pretrained("jfkback/hypencoder.6_layer")
tokenizer = AutoTokenizer.from_pretrained("jfkback/hypencoder.6_layer")

query_encoder: Hypencoder = dual_encoder.query_encoder
passage_encoder: TextEncoder = dual_encoder.passage_encoder

queries = [
    "how many states are there in india",
    "when do concussion symptoms appear",
]

passages = [
    "India has 28 states and 8 union territories.",
    "Concussion symptoms can appear immediately or up to 72 hours after the injury.",
]

query_inputs = tokenizer(queries, return_tensors="pt", padding=True, truncation=True)
passage_inputs = tokenizer(passages, return_tensors="pt", padding=True, truncation=True)

q_nets = query_encoder(input_ids=query_inputs["input_ids"], attention_mask=query_inputs["attention_mask"]).representation
passage_embeddings = passage_encoder(input_ids=passage_inputs["input_ids"], attention_mask=passage_inputs["attention_mask"]).representation

# passage_embeddings has shape (2, 768), but the q_nets expect input of shape
# (num_queries, num_items_per_query, input_hidden_size), so we need to reshape
# the passage embeddings.

# In the simple case where each q_net only takes one passage, we can just
# reshape the passage embeddings to (num_queries, 1, input_hidden_size).
passage_embeddings_single = passage_embeddings.unsqueeze(1)
scores = q_nets(passage_embeddings_single)  # Shape (2, 1, 1)
# [
#   [[-12.1192]],
#   [[-13.5832]]
# ]

# In the case where each q_net takes both passages, we can reshape the
# passage embeddings to (num_queries, 2, input_hidden_size).
passage_embeddings_double = passage_embeddings.repeat(2, 1).reshape(2, 2, -1)
scores = q_nets(passage_embeddings_double)  # Shape (2, 2, 1)
# [
#   [[-12.1192], [-32.7046]],
#   [[-34.0934], [-13.5832]]
# ]
```
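
The same broadcasting extends to ranking a larger candidate set: give every q-net the full set of passage embeddings and sort by score. A short sketch, assuming (per the example above) that the q-nets accept a `(num_queries, num_items_per_query, input_hidden_size)` tensor and that higher scores indicate higher relevance:

```python
import torch

# Score every passage with every query's q-net.
num_queries = len(queries)
candidates = passage_embeddings.unsqueeze(0).expand(num_queries, -1, -1)
scores = q_nets(candidates).squeeze(-1)  # Shape (num_queries, num_passages)

# Rank passages per query; higher score = more relevant.
ranking = torch.argsort(scores, dim=1, descending=True)
```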

## Citation
**BibTeX:**
```
@misc{killingback2025hypencoderhypernetworksinformationretrieval,
      title={Hypencoder: Hypernetworks for Information Retrieval},
      author={Julian Killingback and Hansi Zeng and Hamed Zamani},
      year={2025},
      eprint={2502.05364},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2502.05364},
}
```