Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ tags: []
 
 FastESM is a Huggingface compatible plug in version of ESM2-650M rewritten with a newer PyTorch attention implementation.
 
-To enhance the weights with longer context and better fp16 support, we trained ESM2-650 50000 additional steps with a traditional MLM objective (20% masking) in fp16 mixed precision on [OMGprot50](tattabio/OMG_prot50) up to sequence length of **2048**.
+To enhance the weights with longer context and better fp16 support, we trained ESM2-650 50000 additional steps with a traditional MLM objective (20% masking) in fp16 mixed precision on [OMGprot50](https://huggingface.co/datasets/tattabio/OMG_prot50) up to sequence length of **2048**.
 
 Outputting attention maps (or the contact prediction head) is not natively possible with SDPA. You can still pass ```output_attentions``` to have attention calculated manually and returned.
 Various other optimizations also make the base implementation slightly different than the one in transformers.
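For readers unfamiliar with the "traditional MLM objective (20% masking)" mentioned in the README text above, the following is a minimal, standard-library-only sketch of how such masking is commonly set up. It is not the authors' training code. The 80/10/10 split between mask token, random residue, and unchanged residue is the usual BERT-style convention and is an assumption here, as are the helper name `mask_sequence`, the `<mask>` token string, and the example sequence.

```python
import random

# The 20 standard amino acids; used for the random-replacement branch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK_TOKEN = "<mask>"  # assumed token string, for illustration only


def mask_sequence(sequence, mask_rate=0.20, seed=0):
    """Select about mask_rate of the positions and corrupt them for MLM.

    Returns (tokens, labels). labels[i] holds the original residue at
    selected positions and None elsewhere, so a loss would only be
    computed on the selected positions.
    """
    rng = random.Random(seed)
    tokens = list(sequence)
    labels = [None] * len(tokens)
    if not tokens:
        return tokens, labels

    # Pick a fixed number of positions (at least one) without replacement.
    n_select = min(len(tokens), max(1, round(mask_rate * len(tokens))))
    for i in rng.sample(range(len(tokens)), n_select):
        labels[i] = tokens[i]
        roll = rng.random()
        if roll < 0.8:
            # Assumed convention: 80% of selected positions become the mask token.
            tokens[i] = MASK_TOKEN
        elif roll < 0.9:
            # Assumed convention: 10% become a random residue.
            tokens[i] = rng.choice(AMINO_ACIDS)
        # Remaining 10%: the residue is left unchanged but still predicted.
    return tokens, labels


# Hypothetical example sequence.
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
masked_tokens, labels = mask_sequence(sequence)
print("".join(t if t != MASK_TOKEN else "_" for t in masked_tokens))
```

In an actual Transformers training setup the same role is typically played by a data collator with a masking probability of 0.2 rather than a hand-written function; the sketch only makes the selection and corruption steps explicit.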