AlexanderKroll committed on
Commit 7372d4e · verified · 1 Parent(s): 248ae61

Update README.md

Files changed (1)
  1. README.md +1 -41
README.md CHANGED
@@ -23,7 +23,7 @@ Typical downstream tasks (with finetuning heads):
  - Protein-only regression/classification.
  - PSI (**protein-small molecule interactions**) prediction when combined with a SMILES encoder.
 
- GitHub code: [foldvision_github](https://github.com/<YOUR_ORG_OR_USER>/foldvision_github)
+ GitHub code: [foldvision_github](https://github.com/AlexanderKroll/foldvision)
 
  ## Model Details
 
@@ -33,18 +33,6 @@ GitHub code: [foldvision_github](https://github.com/<YOUR_ORG_OR_USER>/foldvisio
  - Input channels: 5 atom-type channels (`C`, `N`, `S`, `O`, `P`)
  - Output: `(B, 1024)` embedding
 
- ## Intended Use
-
- Use this model to compute protein structure embeddings for:
- - similarity and retrieval workflows,
- - downstream supervised tasks (classification/regression),
- - multimodal PSI pipelines with a molecule language model.
-
- ## Out-of-Scope Use
-
- - Clinical decision making.
- - Any safety-critical use without task-specific validation.
- - Interpretation as direct biochemical or medical truth without experimental verification.
 
  ## Input and Preprocessing
 
@@ -84,34 +72,6 @@ FoldVision pipelines support repeated runs with random 3D rotations (test-time a
  - per-run predictions can be used to inspect spread/uncertainty,
  - averaged predictions are recommended for reporting.
 
- ## Training and Evaluation Data
-
- Please document here the exact datasets used for pretraining and downstream evaluation.
-
- Example datasets referenced in this repository:
- - PTEN activity
- - SPOT
- - Davis
- - small dummy data files for smoke tests (not representative for benchmarking)
-
- ## Metrics
-
- Report the official metrics from your manuscript for your release version.
-
- Suggested metrics by task:
- - Regression: Spearman, Pearson, MAE, RMSE, R2
- - Binary: Accuracy, MCC, ROC-AUC
-
- ## Limitations
-
- - Performance depends strongly on preprocessing consistency.
- - Rotational augmentation can change single-run outputs; use multi-run means for stability.
- - Generalization to new protein families/domains must be validated per task.
-
- ## Risks and Biases
-
- - Dataset composition can bias performance across protein classes.
- - Downstream labels and splits can introduce benchmark-specific bias.
 
  ## Citation
 
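
The context lines kept in the last hunk recommend inspecting per-run predictions for spread and reporting the average across runs with random 3D rotations. A minimal sketch of that aggregation step is shown here; `aggregate_runs` and the example numbers are hypothetical and not part of the FoldVision code, and the predictions are assumed to have already been produced by separate augmented runs.

```python
import statistics
from typing import Sequence


def aggregate_runs(per_run_predictions: Sequence[Sequence[float]]) -> list[dict[str, float]]:
    """Summarize predictions from several augmented runs (hypothetical helper).

    per_run_predictions[r][i] is the prediction for sample i in run r.
    For each sample, returns the mean across runs (for reporting) and the
    sample standard deviation (a rough spread/uncertainty indicator).
    """
    if not per_run_predictions:
        raise ValueError("need at least one run")
    n_samples = len(per_run_predictions[0])
    if any(len(run) != n_samples for run in per_run_predictions):
        raise ValueError("all runs must cover the same samples")

    summary = []
    for i in range(n_samples):
        values = [run[i] for run in per_run_predictions]
        # stdev needs at least two values; report 0.0 for a single run.
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        summary.append({"mean": statistics.fmean(values), "std": spread})
    return summary


# Made-up outputs of three runs (one random rotation each) on two proteins.
runs = [[0.80, 0.10], [0.90, 0.20], [1.00, 0.30]]
result = aggregate_runs(runs)
```

A large `std` relative to the `mean` for a given sample suggests that its prediction is sensitive to orientation, which is the case where a single-run value is least reliable.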