DeepPharm: Multi-Modal Transfer Learning for Drug-Target Affinity Prediction

Model Description

DeepPharm is a multi-modal deep learning framework for predicting protein–ligand binding affinity ($pK$). It combines the following components (a minimal sketch of the gated fusion follows the list):

  • GATv2 molecular graph encoder (3 layers, 4 heads)
  • ECFP4 fingerprint MLP encoder (2048→128)
  • Gated Fusion mechanism for adaptive ligand representation
  • ESM-2 protein language model (150M params, fine-tuned)
  • Stacked Cross-Attention (2 layers, 4 heads) for drug-protein interaction
  • Residual Prediction Head with SiLU activation
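
As a rough illustration of the gated fusion, the sketch below blends the GATv2 graph embedding with the ECFP4 MLP embedding through a learned sigmoid gate. The class name, the shared 128-dim embedding size, and the exact gating form are assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Hypothetical gated fusion: a per-dimension sigmoid gate decides how
    much to trust the graph embedding vs. the fingerprint embedding."""

    def __init__(self, dim: int = 128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, h_graph: torch.Tensor, h_fp: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([h_graph, h_fp], dim=-1))  # values in (0, 1)
        return g * h_graph + (1 - g) * h_fp  # adaptive ligand representation
```

When the gate saturates near 1, the ligand representation leans on the graph encoder; near 0, on the fingerprint MLP.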

Two Modes of Operation

| Mode | Task | Input | Output |
|------|------|-------|--------|
| Mode A | Supervised affinity prediction | Drug SMILES + protein sequence | pK value |
| Mode B | Weakly supervised drug repurposing | Drug + disease signature | Ranked candidates |
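
As one plausible reading of Mode B (not an actual DeepPharm API), the sketch below layers repurposing on top of Mode A: each candidate drug is scored against a disease's target signature and ranked by mean predicted affinity. `predict_pk`, `drugs`, and `disease_targets` are hypothetical names.

```python
from typing import Callable

def rank_candidates(
    drugs: list[str],
    disease_targets: list[str],
    predict_pk: Callable[[str, str], float],  # hypothetical Mode A scorer: (SMILES, sequence) -> pK
) -> list[str]:
    """Mode B sketch: rank drugs by mean predicted affinity across the
    disease's target signature (a guilt-by-association prior)."""
    scores = {
        drug: sum(predict_pk(drug, seq) for seq in disease_targets) / len(disease_targets)
        for drug in drugs
    }
    return sorted(drugs, key=scores.get, reverse=True)
```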

Performance

Systematic Ablation (PDBbind v2020, $N_{test}=3{,}775$)

| Config | RMSE ↓ | Pearson ↑ | Spearman ↑ |
|--------|--------|-----------|------------|
| V1 Baseline (ESM-35M) | 1.266 | 0.743 | 0.743 |
| V2 Architecture | 1.258 | 0.748 | 0.746 |
| V2 + CosineWR | 1.244 | 0.753 | 0.750 |
| V2 + ESM-150M (Best) | 1.229 | 0.762 | 0.760 |
| V2 + EMA | 1.247 | 0.753 | 0.753 |

Five-Seed Ensemble (Best Configuration)

| Metric | Mean ± Std |
|--------|------------|
| RMSE | 1.246 ± 0.005 |
| Pearson r | 0.751 ± 0.002 |
| Spearman ρ | 0.750 ± 0.002 |

The coefficient of variation (CV = std / mean) is below 0.4% for every metric, e.g. 0.005 / 1.246 ≈ 0.4% for RMSE, confirming high reproducibility across seeds.

Baselines (all re-implemented on the same split)

| Model | RMSE ↓ | Pearson ↑ |
|-------|--------|-----------|
| DeepDTA (CNN) | 1.48 | 0.61 |
| GraphDTA (GCN) | 1.39 | 0.67 |
| MolCLR* | 1.30 | 0.74 |
| DrugBAN | 1.28 | 0.76 |
| DeepPharm V2 | 1.23 | 0.76 |

Intended Use

  • High-throughput virtual screening of drug candidates
  • Binding affinity prediction for drug-target pairs
  • Hypothesis generation for drug repurposing in orphan diseases
  • Research and academic purposes

Limitations

  • 2D topological encoder; cannot distinguish stereoisomers
  • Trained on PDBbind v2020, which overrepresents kinases
  • Mode B uses drug priors (guilt-by-association), not zero-shot inference
  • Predictions require experimental validation

Training Details

  • Dataset: PDBbind v2020 General Set (15,100 train / 3,775 test, seed=42)
  • Hardware: 1× NVIDIA H100 80 GB
  • Optimizer: AdamW (backbone LR: 5e-6, head LR: 8e-4)
  • Scheduler: CosineAnnealing with Warm Restarts ($T_0$=10, $T_{mult}$=2)
  • Loss: MSE + 0.3·RankingLoss + 0.2·HuberLoss (a sketch of this setup follows the list)
  • Training time: ~11 min/epoch (ESM-2 150M), best checkpoint at epoch 18
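
The sketch below shows one way to wire up the optimizer, scheduler, and composite loss listed above, assuming the model exposes separate backbone and head parameter groups. The placeholder modules and the pairwise hinge form of the ranking term are assumptions; the card does not specify the exact RankingLoss.

```python
import torch
import torch.nn as nn

backbone = nn.Linear(64, 32)  # placeholder for the ESM-2 150M encoder
head = nn.Linear(32, 1)       # placeholder for the residual prediction head

optimizer = torch.optim.AdamW([
    {"params": backbone.parameters(), "lr": 5e-6},  # fine-tuned backbone
    {"params": head.parameters(), "lr": 8e-4},      # prediction head
])
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, T_mult=2
)

mse, huber = nn.MSELoss(), nn.HuberLoss()

def ranking_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Pairwise hinge over all pairs in the batch: penalize predictions
    # whose ordering disagrees with the ground-truth ordering.
    dp = pred.unsqueeze(0) - pred.unsqueeze(1)
    dt = target.unsqueeze(0) - target.unsqueeze(1)
    return torch.relu(1.0 - dp * torch.sign(dt)).mean()

def total_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    return mse(pred, target) + 0.3 * ranking_loss(pred, target) + 0.2 * huber(pred, target)
```

In a training loop, `total_loss` is backpropagated each step, while `scheduler.step()` advances the cosine schedule with restarts at epochs 10, 30, 70, and so on (periods of 10, 20, 40, … given $T_0$=10, $T_{mult}$=2).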

Available Checkpoints

| File | Description | RMSE |
|------|-------------|------|
| best_v2_esm150m.pt | Best V2 model (ESM-2 150M) | 1.229 |
| best_v1_esm35m.pt | V1 Baseline (ESM-2 35M) | 1.266 |

How to Use

```python
from huggingface_hub import hf_hub_download
import torch

# Download the best checkpoint from the Hub
path = hf_hub_download("chamoso/DeepPharm", "best_v2_esm150m.pt")

# Load the checkpoint in PyTorch
checkpoint = torch.load(path, map_location="cpu")
```

For full inference with data preprocessing:

```bash
git clone https://github.com/chamoso/DeepPharm.git
cd DeepPharm
python scripts/predict.py \
    --checkpoint weights/best_v2_esm150m.pt \
    --smiles "CC(=O)Oc1ccccc1C(=O)O" \
    --sequence "MKTAYIAKQRQISFVKSHFSRQLE..."
```


Citation

Preprint coming soon.

License

MIT License
