DeepPharm: Multi-Modal Transfer Learning for Drug-Target Affinity Prediction

Model Description

DeepPharm is a multi-modal deep learning framework for predicting protein–ligand binding affinity ($pK$). It combines the following components (a minimal sketch of the gated fusion follows the list):

  • GATv2 molecular graph encoder (3 layers, 4 heads)
  • ECFP4 fingerprint MLP encoder (2048→128)
  • Gated Fusion mechanism for adaptive ligand representation
  • ESM-2 protein language model (150M params, fine-tuned)
  • Stacked Cross-Attention (2 layers, 4 heads) for drug-protein interaction
  • Residual Prediction Head with SiLU activation
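
As a rough illustration of the gated fusion, the sketch below blends the GATv2 graph embedding with the ECFP4 MLP embedding through a learned sigmoid gate. The class name, the shared 128-dim embedding size, and the exact gating form are assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Hypothetical gated fusion: a per-dimension sigmoid gate decides how
    much to trust the graph embedding vs. the fingerprint embedding."""

    def __init__(self, dim: int = 128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, h_graph: torch.Tensor, h_fp: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([h_graph, h_fp], dim=-1))  # values in (0, 1)
        return g * h_graph + (1 - g) * h_fp  # adaptive ligand representation
```

When the gate saturates near 1, the ligand representation leans on the graph encoder; near 0, on the fingerprint MLP.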

Two Modes of Operation

| Mode | Task | Input | Output |
|------|------|-------|--------|
| Mode A | Supervised affinity prediction | Drug SMILES + protein sequence | pK value |
| Mode B | Weakly supervised drug repurposing | Drug + disease signature | Ranked candidates |
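
As one plausible reading of Mode B (not an actual DeepPharm API), the sketch below layers repurposing on top of Mode A: each candidate drug is scored against a disease's target signature and ranked by mean predicted affinity. `predict_pk`, `drugs`, and `disease_targets` are hypothetical names.

```python
from typing import Callable

def rank_candidates(
    drugs: list[str],
    disease_targets: list[str],
    predict_pk: Callable[[str, str], float],  # hypothetical Mode A scorer: (SMILES, sequence) -> pK
) -> list[str]:
    """Mode B sketch: rank drugs by mean predicted affinity across the
    disease's target signature (a guilt-by-association prior)."""
    scores = {
        drug: sum(predict_pk(drug, seq) for seq in disease_targets) / len(disease_targets)
        for drug in drugs
    }
    return sorted(drugs, key=scores.get, reverse=True)
```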

Performance

Systematic Ablation (PDBbind v2020, $N_{test}=3{,}775$)

| Config | RMSE ↓ | Pearson ↑ | Spearman ↑ |
|--------|--------|-----------|------------|
| V1 Baseline (ESM-35M) | 1.266 | 0.743 | 0.743 |
| V2 Architecture | 1.258 | 0.748 | 0.746 |
| V2 + CosineWR | 1.244 | 0.753 | 0.750 |
| V2 + ESM-150M (Best) | 1.229 | 0.762 | 0.760 |
| V2 + EMA | 1.247 | 0.753 | 0.753 |

Five-Seed Ensemble (Best Configuration)

| Metric | Mean ± Std |
|--------|------------|
| RMSE | 1.246 ± 0.005 |
| Pearson r | 0.751 ± 0.002 |
| Spearman ρ | 0.750 ± 0.002 |

The coefficient of variation (CV = std / mean) is below 0.4% for every metric, e.g. 0.005 / 1.246 ≈ 0.4% for RMSE, confirming high reproducibility across seeds.

Baselines (all re-implemented on the same split)

| Model | RMSE ↓ | Pearson ↑ |
|-------|--------|-----------|
| DeepDTA (CNN) | 1.48 | 0.61 |
| GraphDTA (GCN) | 1.39 | 0.67 |
| MolCLR* | 1.30 | 0.74 |
| DrugBAN | 1.28 | 0.76 |
| DeepPharm V2 | 1.23 | 0.76 |

Intended Use

  • High-throughput virtual screening of drug candidates
  • Binding affinity prediction for drug-target pairs
  • Hypothesis generation for drug repurposing in orphan diseases
  • Research and academic purposes

Limitations

  • 2D topological encoder; cannot distinguish stereoisomers
  • Trained on PDBbind v2020, which overrepresents kinases
  • Mode B uses drug priors (guilt-by-association), not zero-shot inference
  • Predictions require experimental validation

Training Details

  • Dataset: PDBbind v2020 General Set (15,100 train / 3,775 test, seed=42)
  • Hardware: 1× NVIDIA H100 80 GB
  • Optimizer: AdamW (backbone LR: 5e-6, head LR: 8e-4)
  • Scheduler: CosineAnnealing with Warm Restarts ($T_0$=10, $T_{mult}$=2)
  • Loss: MSE + 0.3·RankingLoss + 0.2·HuberLoss (a sketch of this setup follows the list)
  • Training time: ~11 min/epoch (ESM-2 150M), best checkpoint at epoch 18
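
The sketch below shows one way to wire up the optimizer, scheduler, and composite loss listed above, assuming the model exposes separate backbone and head parameter groups. The placeholder modules and the pairwise hinge form of the ranking term are assumptions; the card does not specify the exact RankingLoss.

```python
import torch
import torch.nn as nn

backbone = nn.Linear(64, 32)  # placeholder for the ESM-2 150M encoder
head = nn.Linear(32, 1)       # placeholder for the residual prediction head

optimizer = torch.optim.AdamW([
    {"params": backbone.parameters(), "lr": 5e-6},  # fine-tuned backbone
    {"params": head.parameters(), "lr": 8e-4},      # prediction head
])
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=10, T_mult=2
)

mse, huber = nn.MSELoss(), nn.HuberLoss()

def ranking_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Pairwise hinge over all pairs in the batch: penalize predictions
    # whose ordering disagrees with the ground-truth ordering.
    dp = pred.unsqueeze(0) - pred.unsqueeze(1)
    dt = target.unsqueeze(0) - target.unsqueeze(1)
    return torch.relu(1.0 - dp * torch.sign(dt)).mean()

def total_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    return mse(pred, target) + 0.3 * ranking_loss(pred, target) + 0.2 * huber(pred, target)
```

In a training loop, `total_loss` is backpropagated each step, while `scheduler.step()` advances the cosine schedule with restarts at epochs 10, 30, 70, and so on (periods of 10, 20, 40, … given $T_0$=10, $T_{mult}$=2).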

Available Checkpoints

| File | Description | RMSE |
|------|-------------|------|
| best_v2_esm150m.pt | Best V2 model (ESM-2 150M) | 1.229 |
| best_v1_esm35m.pt | V1 Baseline (ESM-2 35M) | 1.266 |

How to Use

```python
from huggingface_hub import hf_hub_download
import torch

# Download the best checkpoint from the Hub
path = hf_hub_download("chamoso/DeepPharm", "best_v2_esm150m.pt")

# Load the checkpoint in PyTorch
checkpoint = torch.load(path, map_location="cpu")
```

For full inference with data preprocessing:

```bash
git clone https://github.com/chamoso/DeepPharm.git
cd DeepPharm
python scripts/predict.py \
    --checkpoint weights/best_v2_esm150m.pt \
    --smiles "CC(=O)Oc1ccccc1C(=O)O" \
    --sequence "MKTAYIAKQRQISFVKSHFSRQLE..."
```


Citation

Preprint coming soon.

License

MIT License
