AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2
This repository contains the model presented in the paper AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2.
GitHub Repository: ucrbioinfo/AbAffinity
AbAffinity is a large language model that predicts the binding affinity of scFv antibody sequences against the SARS-CoV-2 HR2 peptide, a peptide common to all SARS-CoV-2 variants. It takes the antibody heavy and light chain sequences as input, joined into a single scFv sequence.
You can install AbAffinity from Hugging Face:
pip install git+https://huggingface.co/faisalashraf/abaffinity
Alternatively, clone the repository and install it from a local folder:
git lfs install
git clone https://huggingface.co/faisalashraf/abaffinity
cd abaffinity
pip install .
Here's a quick example to get started:
from abaffinity import AbAffinity
# Example usage
abmodel = AbAffinity()
# The model takes a complete scFv sequence as input: the heavy and light chains joined by a linker sequence.
# Use the `make_scFv()` method to build the complete scFv sequence from the heavy chain and light chain sequences.
heavy_seq = 'EVQLVESGAEVKKPGASVKVSCKASGYTFTSYGISWVRQAPGQGLEWMGWISAYNGNTNYAQKLQGRVTMTTDTSTSTAYMELRSLRSDDTAVYYCARVGRGVIDHWGQGTLVTVSS'
light_seq = 'SSELTQDPAVSVALGQTVRITCEGDSLDYYYANWYQQKPGQAPILVIYGKNNRPSGIADRFSGSNSGDTSSLIITGAQAEDEADYYCSSRDSSGFEVTFGAGTKLTVL'
scFv_seq = abmodel.make_scFv(heavy_seq, light_seq)
print(scFv_seq) # Output: EVQLVESGAEVKKPGASVKVSCKASGYTFTSYGISWVRQAPGQGLEWMGWISAYNGNTNYAQKLQGRVTMTTDTSTSTAYMELRSLRSDDTAVYYCARVGRGVIDHWGQGTLVTVSSGGGGSGGGGSGGGGSSSELTQDPAVSVALGQTVRITCEGDSLDYYYANWYQQKPGQAPILVIYGKNNRPSGIADRFSGSNSGDTSSLIITGAQAEDEADYYCSSRDSSGFEVTFGAGTKLTVL
# Use the `get_affinity()` method to get the predicted binding affinity of the antibody sequence.
pred_affinity = abmodel.get_affinity(scFv_seq)
print(pred_affinity) # Output: tensor([3.1595])
# Use the `get_embeddings()` method to get the embeddings for an input sequence.
# `mode='res'` returns residue-wise embeddings; `mode='seq'` returns a single sequence embedding.
res_emb = abmodel.get_embeddings(scFv_seq, mode='res')
print(res_emb.shape) # Output: torch.Size([258, 1280])
seq_emb = abmodel.get_embeddings(scFv_seq, mode='seq')
print(seq_emb.shape) # Output: torch.Size([1280])
# Use the `get_contact_map()` method to get the contact map of the given antibody sequence.
# Use `mode='VH-VL'` to plot the contacts for the heavy chain and light chain separately, or `mode='scFv'` to plot a single contact map for the entire scFv sequence.
contacts = abmodel.get_contact_map(scFv_seq, mode='scFv')
print(contacts.shape) # Output: contact map figure, (240, 240)
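The printed output of `make_scFv()` in the example above suggests that the heavy and light chains are joined by the 15-residue linker `GGGGSGGGGSGGGGS`. As a minimal sketch under that assumption, the helpers below reproduce the concatenation and compute the index range of each region, which may be useful for slicing a contact map into heavy chain and light chain blocks. The names `build_scfv`, `scfv_regions`, and `ASSUMED_LINKER` are hypothetical and are not part of the `abaffinity` package; the linker is inferred from the example output only, not from the library's source.

```python
# Illustrative sketch only: these helpers are NOT part of the abaffinity package.
# The linker is an assumption inferred from the example output of make_scFv().
ASSUMED_LINKER = "GGGGSGGGGSGGGGS"


def build_scfv(heavy: str, light: str, linker: str = ASSUMED_LINKER) -> str:
    """Concatenate heavy chain, linker, and light chain into one scFv string."""
    return heavy + linker + light


def scfv_regions(heavy: str, light: str, linker: str = ASSUMED_LINKER) -> dict:
    """Return half-open (start, end) index ranges for each region of the scFv."""
    heavy_end = len(heavy)
    light_start = heavy_end + len(linker)
    return {
        "heavy": (0, heavy_end),
        "linker": (heavy_end, light_start),
        "light": (light_start, light_start + len(light)),
    }


if __name__ == "__main__":
    # Sequences copied from the quick-start example.
    heavy_seq = "EVQLVESGAEVKKPGASVKVSCKASGYTFTSYGISWVRQAPGQGLEWMGWISAYNGNTNYAQKLQGRVTMTTDTSTSTAYMELRSLRSDDTAVYYCARVGRGVIDHWGQGTLVTVSS"
    light_seq = "SSELTQDPAVSVALGQTVRITCEGDSLDYYYANWYQQKPGQAPILVIYGKNNRPSGIADRFSGSNSGDTSSLIITGAQAEDEADYYCSSRDSSGFEVTFGAGTKLTVL"

    scfv = build_scfv(heavy_seq, light_seq)
    regions = scfv_regions(heavy_seq, light_seq)

    # Slicing the scFv with the computed ranges recovers the original chains.
    h_start, h_end = regions["heavy"]
    l_start, l_end = regions["light"]
    assert scfv[h_start:h_end] == heavy_seq
    assert scfv[l_start:l_end] == light_seq
    print(len(scfv), regions)
```

The ranges are half-open, so they can be used directly as Python slices, for example to index the rows and columns of a contact map whose side length equals `len(scfv)`. For real predictions, prefer the model's own `make_scFv()` method, since the linker used here is only an inference.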
This project is licensed under the MIT License.
If you find this work useful, please cite:
@article{ashraf2024large,
title={A Large Language Model Guides the Affinity Maturation of Variant Antibodies Generated by Combinatorial Optimization},
author={Ashraf, Faisal Bin and Zhang, Zihao and Paco, Karen and Mendivil, Mariana P and Lay, Jordan A and Ray, Animesh and Lonardi, Stefano},
journal={bioRxiv},
pages={2024--12},
year={2024},
publisher={Cold Spring Harbor Laboratory}
}
@article{ashraf2026abaffinity,
title={AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2},
author={Ashraf, Faisal Bin and Ray, Animesh and Lonardi, Stefano},
journal={arXiv preprint arXiv:2603.04480},
year={2026}
}