AlexanderKroll/foldvision-encoder

Feature Extraction

structural-biology

representation-learning


AlexanderKroll committed on Feb 16

Commit 7372d4e · verified · 1 Parent(s): 248ae61

Update README.md

Files changed (1)

README.md +1 -41

README.md CHANGED

@@ -23,7 +23,7 @@ Typical downstream tasks (with finetuning heads):
 - Protein-only regression/classification.
 - PSI (**protein-small molecule interactions**) prediction when combined with a SMILES encoder.
-GitHub code: [foldvision_github](https://github.com/<YOUR_ORG_OR_USER>/foldvision_github)
 ## Model Details
@@ -33,18 +33,6 @@ GitHub code: [foldvision_github](https://github.com/<YOUR_ORG_OR_USER>/foldvisio
 - Input channels: 5 atom-type channels (`C`, `N`, `S`, `O`, `P`)
 - Output: `(B, 1024)` embedding
-## Intended Use
-Use this model to compute protein structure embeddings for:
-- similarity and retrieval workflows,
-- downstream supervised tasks (classification/regression),
-- multimodal PSI pipelines with a molecule language model.
-## Out-of-Scope Use
-- Clinical decision making.
-- Any safety-critical use without task-specific validation.
-- Interpretation as direct biochemical or medical truth without experimental verification.
 ## Input and Preprocessing
@@ -84,34 +72,6 @@ FoldVision pipelines support repeated runs with random 3D rotations (test-time a
   - per-run predictions can be used to inspect spread/uncertainty,
   - averaged predictions are recommended for reporting.
-## Training and Evaluation Data
-Please document here the exact datasets used for pretraining and downstream evaluation.
-Example datasets referenced in this repository:
-- PTEN activity
-- SPOT
-- Davis
-- small dummy data files for smoke tests (not representative for benchmarking)
-## Metrics
-Report the official metrics from your manuscript for your release version.
-Suggested metrics by task:
-- Regression: Spearman, Pearson, MAE, RMSE, R2
-- Binary: Accuracy, MCC, ROC-AUC
-## Limitations
-- Performance depends strongly on preprocessing consistency.
-- Rotational augmentation can change single-run outputs; use multi-run means for stability.
-- Generalization to new protein families/domains must be validated per task.
-## Risks and Biases
-- Dataset composition can bias performance across protein classes.
-- Downstream labels and splits can introduce benchmark-specific bias.
 ## Citation

 - Protein-only regression/classification.
 - PSI (**protein-small molecule interactions**) prediction when combined with a SMILES encoder.
+GitHub code: [foldvision_github](https://github.com/AlexanderKroll/foldvision)