---
library_name: pytorch
framework: pytorch
tags:
- pytorch
- pytorch-lightning
- bioinformatics
- rna-binding-proteins
- explainability
- alternative-splicing
- deep-learning
license: mit
---

# DeepRBP Predictor (pretrained)

This repository provides a **pretrained DeepRBP predictor model**. DeepRBP is a deep learning framework designed to infer **RNA-binding protein (RBP)–transcript and RBP–gene regulatory relationships** from expression data.

DeepRBP was introduced in the following preprint:

> **DeepRBP: A deep neural network for inferring splicing regulation**
> https://doi.org/10.1101/2024.04.11.589004

The model is intended to be used **directly for inference and explainability**, without retraining.

---

## Model overview

DeepRBP is composed of two conceptual stages:

1. **Prediction stage**
   A neural network predicts transcript abundances from:
   - RBP expression
   - Gene expression

2. **Explainability stage**
   Feature attribution methods (e.g., DeepLIFT) are applied to the trained predictor to compute:
   - Transcript × RBP (TxRBP) scores
   - Gene × RBP (GxRBP) scores

This repository contains **only the pretrained predictor and the preprocessing artifacts** required to use it.

---

## Files in this repository

⚠️ **All files are required for correct inference and explainability.**

| File | Description |
|-----|-------------|
| `model.ckpt` | PyTorch Lightning checkpoint of the pretrained DeepRBP predictor |
| `scaler.joblib` | Fitted input scaler used during model training |
| `sigma.npy` | Scaling parameter required to reconstruct transcript abundance values |
| `DeepRBP_feature_spec.xlsx` | Feature manifest defining the RBPs/genes/transcripts and their exact order |

The scaler and sigma are **part of the trained model state** and must be used together with the checkpoint.

The feature specification file is part of the **model compatibility contract**: input matrices must be aligned to the same feature set **and order** used during training.
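
As a quick sanity check before loading anything, the following minimal sketch verifies that a local directory contains the four files listed in the table above. The helper names `missing_artifacts` and `check_artifacts` are hypothetical and are not part of the DeepRBP code base; loading the checkpoint, scaler, and sigma themselves is handled by the main DeepRBP repository.

```python
from pathlib import Path

# File names exactly as listed in the table of this model card.
REQUIRED_FILES = (
    "model.ckpt",
    "scaler.joblib",
    "sigma.npy",
    "DeepRBP_feature_spec.xlsx",
)


def missing_artifacts(model_dir):
    """Return the required file names that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def check_artifacts(model_dir):
    """Raise FileNotFoundError if any required artifact is missing."""
    missing = missing_artifacts(model_dir)
    if missing:
        raise FileNotFoundError(
            f"Missing DeepRBP artifacts in {model_dir}: {', '.join(missing)}"
        )
```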

---

## Intended use

This pretrained model is intended for:

- Computing transcript abundance predictions
- Running explainability analyses (e.g., DeepLIFT-based attribution)
- Identifying candidate RBP–transcript and RBP–gene regulatory relationships
- Downstream biological interpretation and hypothesis generation
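
As an illustration of the candidate-identification step, the sketch below ranks the RBPs for a single transcript by the absolute value of their attribution scores. It assumes that a larger absolute score marks a stronger candidate; the scoring and thresholding actually used by DeepRBP are defined in the main repository and may differ. The function name `top_rbps`, the RBP labels, and the score values are made up for illustration.

```python
def top_rbps(scores_by_rbp, k=3):
    """Return the k (RBP, score) pairs with the largest absolute score."""
    ranked = sorted(
        scores_by_rbp.items(),
        key=lambda item: abs(item[1]),  # rank by magnitude, keep the sign
        reverse=True,
    )
    return ranked[:k]


# Hypothetical attribution scores for one transcript (illustrative values only).
example_scores = {"RBP_A": 0.12, "RBP_B": -0.85, "RBP_C": 0.40, "RBP_D": -0.03}
print(top_rbps(example_scores, k=2))  # [('RBP_B', -0.85), ('RBP_C', 0.4)]
```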

Typical applications include:
- Cancer transcriptomics (e.g., TCGA)
- Perturbation studies (e.g., RBP knockdowns)
- Comparative regulatory analyses across conditions

---

## Usage

This repository **does not provide a standalone inference script**.

Please refer to the **main DeepRBP code repository** for:
- Data preprocessing
- Model loading
- Running prediction and explainability pipelines

👉 **Main repository:**
https://github.com/ML4BM-Lab/DeepRBP

The main repository contains:
- End-to-end examples
- Command-line interfaces
- Explainability workflows
- Validation pipelines

---

## Reproducibility notes

- The model was trained on public datasets (TCGA).
- The provided scaler and sigma ensure:
  - consistent input normalization,
  - comparable predictions and attribution scores across users.
- The provided feature specification (`DeepRBP_feature_spec.xlsx`) defines the exact feature set and ordering used during training.
  Using inputs that are not aligned to this specification will break compatibility and comparability.
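
The ordering requirement can be sketched as follows, assuming the feature names have already been read from `DeepRBP_feature_spec.xlsx` into a Python list (the layout of that file is not described in this card, so the reading step is left out). The function name `align_to_spec` and the feature names in the example are placeholders, not entries taken from the specification. With pandas, a similar reordering is usually done with `DataFrame.reindex(columns=...)` after the same missing-feature check.

```python
def align_to_spec(values_by_feature, spec_order):
    """Reorder one sample's values to match the training feature order.

    values_by_feature: dict mapping feature name -> expression value.
    spec_order: list of feature names in the order used during training.
    Features not named in the spec are dropped; missing ones raise KeyError.
    """
    missing = [name for name in spec_order if name not in values_by_feature]
    if missing:
        raise KeyError(
            f"{len(missing)} required feature(s) missing, e.g. {missing[:5]}"
        )
    return [values_by_feature[name] for name in spec_order]


# Placeholder feature names; the real ones come from the specification file.
spec_order = ["FEATURE_1", "FEATURE_2", "FEATURE_3"]
sample = {"FEATURE_3": 3.0, "FEATURE_1": 1.0, "EXTRA": 9.0, "FEATURE_2": 2.0}
print(align_to_spec(sample, spec_order))  # [1.0, 2.0, 3.0]
```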

---

## Limitations

- The model was trained on bulk RNA-seq data and may not generalize to:
  - single-cell RNA-seq
  - extremely low-coverage datasets
- Predictions represent **associations**, not direct causal regulation.
- Experimental validation is required before drawing biological conclusions.

---

## License

This model is released under the **MIT License**.

You are free to use, modify and redistribute it, provided that the license and copyright notice are preserved.

---

## Citation

If you use DeepRBP in your work, please cite:

DeepRBP: A deep neural network for inferring splicing regulation
bioRxiv (2024)
https://doi.org/10.1101/2024.04.11.589004