Improve model card: add metadata, paper, and code links
Hi, I'm Niels from the Hugging Face community science team. This PR improves the model card for the Cuttlefish-Encoder by adding the `graph-ml` pipeline tag, providing links to the original paper and official GitHub repository, and adding the BibTeX citation. These updates make the model more discoverable and provide researchers with the necessary context for usage and attribution.
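As a rough illustration of what the metadata change amounts to, the sketch below reads a model card's YAML front matter with a deliberately simple, standard-library-only parser and exposes the `pipeline_tag` and `tags` fields. The helper name `read_front_matter` and the inline `CARD` text are hypothetical and exist only for this example; the parser handles flat `key: value` pairs and `- item` lists, not full YAML.

```python
def read_front_matter(card_text: str) -> dict:
    """Parse flat `key: value` pairs and `- item` lists between the leading `---` fences."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the card
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing fence ends the metadata block
            break
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if isinstance(meta.get(current_key), list):
                meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value if value else []
    return meta


# Hypothetical card text mirroring the front matter this PR introduces.
CARD = """---
license: apache-2.0
pipeline_tag: graph-ml
tags:
- biology
- protein
- molecule
- dna
- rna
- graph-neural-network
---

# Cuttlefish-Encoder
"""

meta = read_front_matter(CARD)
print(meta["pipeline_tag"], meta["tags"])
```

For real cards a full YAML parser is the safer choice; this reader is only meant to make the shape of the metadata concrete.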
README.md
CHANGED
|
@@ -1,26 +1,34 @@
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
tags:
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
---
|
| 11 |
|
| 12 |
# Cuttlefish-Encoder
|
| 13 |
|
| 14 |
-
Graph encoder component of
|
| 15 |
|
| 16 |
## Usage
|
| 17 |
|
| 18 |
```python
|
| 19 |
from huggingface_hub import snapshot_download
|
| 20 |
encoder_dir = snapshot_download("zihaojing/Cuttlefish-Encoder")
|
| 21 |
|
| 22 |
# Load via the Cuttlefish codebase
|
| 23 |
-
# See https://github.com/
|
| 24 |
```
|
| 25 |
|
| 26 |
## Pretraining data
|
|
@@ -32,9 +40,9 @@ Pretrained on **[Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaoj
|
|
| 32 |
|
| 33 |
## Model details
|
| 34 |
|
| 35 |
-
- Architecture: All-atom graph encoder with
|
| 36 |
-
- Encoder hidden dim: 256
|
| 37 |
-
- Modalities: molecule, protein, dna, rna
|
| 38 |
|
| 39 |
## Related resources
|
| 40 |
|
|
@@ -43,3 +51,15 @@ Pretrained on **[Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaoj
|
|
| 43 |
| Full Cuttlefish LLM | [zihaojing/Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish) |
|
| 44 |
| SFT instruction data | [zihaojing/Cuttlefish-SFT-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-SFT-Data) |
|
| 45 |
| Encoder pretraining data | [zihaojing/Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data) |
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
+
pipeline_tag: graph-ml
|
| 4 |
tags:
|
| 5 |
+
- biology
|
| 6 |
+
- protein
|
| 7 |
+
- molecule
|
| 8 |
+
- dna
|
| 9 |
+
- rna
|
| 10 |
+
- graph-neural-network
|
| 11 |
---
|
| 12 |
|
| 13 |
# Cuttlefish-Encoder
|
| 14 |
|
| 15 |
+
Graph encoder component of **Cuttlefish**, a unified all-atom LLM that grounds language reasoning in geometric cues while scaling modality tokens with structural complexity.
|
| 16 |
+
|
| 17 |
+
This model was presented in the paper [Scaling-Aware Adapter for Structure-Grounded LLM Reasoning](https://arxiv.org/abs/2602.02780).
|
| 18 |
+
|
| 19 |
+
- **Code:** [GitHub - zihao-jing/Cuttlefish](https://github.com/zihao-jing/Cuttlefish)
|
| 20 |
+
- **Pretrained with:** Masked reconstruction on all-atom structures.
|
| 21 |
|
| 22 |
## Usage
|
| 23 |
|
| 24 |
+
You can download the encoder using the `huggingface_hub` library:
|
| 25 |
+
|
| 26 |
```python
|
| 27 |
from huggingface_hub import snapshot_download
|
| 28 |
encoder_dir = snapshot_download("zihaojing/Cuttlefish-Encoder")
|
| 29 |
|
| 30 |
# Load via the Cuttlefish codebase
|
| 31 |
+
# See https://github.com/zihao-jing/Cuttlefish for full usage
|
| 32 |
```
|
| 33 |
|
| 34 |
## Pretraining data
|
|
|
|
| 40 |
|
| 41 |
## Model details
|
| 42 |
|
| 43 |
+
- **Architecture**: All-atom graph encoder with Scaling-Aware Patching.
|
| 44 |
+
- **Encoder hidden dim**: 256
|
| 45 |
+
- **Modalities**: molecule, protein, dna, rna
|
| 46 |
|
| 47 |
## Related resources
|
| 48 |
|
|
|
|
| 51 |
| Full Cuttlefish LLM | [zihaojing/Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish) |
|
| 52 |
| SFT instruction data | [zihaojing/Cuttlefish-SFT-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-SFT-Data) |
|
| 53 |
| Encoder pretraining data | [zihaojing/Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data) |
|
| 54 |
+
|
| 55 |
+
## Citation
|
| 56 |
+
|
| 57 |
+
```bibtex
|
| 58 |
+
@inproceedings{jing2026cuttlefish,
|
| 59 |
+
title = {Cuttlefish: Scaling-Aware Adapter for Structure-Grounded LLM Reasoning},
|
| 60 |
+
author = {Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Li, Yan Yi and Sun, Yan Table, Boyu and Hu, Pingzhao},
|
| 61 |
+
booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
|
| 62 |
+
year = {2026},
|
| 63 |
+
url = {https://arxiv.org/abs/2602.02780}
|
| 64 |
+
}
|
| 65 |
+
```
|
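The Usage snippet in the updated card stops once the snapshot is downloaded and defers loading to the Cuttlefish codebase. As a minimal sketch of a step in between, the helper below lists whatever files a downloaded snapshot directory contains, so you can see what the repository ships before wiring it into that codebase. `list_snapshot_files` is a hypothetical helper, not part of `huggingface_hub` or Cuttlefish, and nothing here assumes particular file names in the repository.

```python
from pathlib import Path


def list_snapshot_files(encoder_dir: str) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under encoder_dir."""
    root = Path(encoder_dir)
    files = []
    for path in sorted(root.rglob("*")):
        # Skip directories; symlinked files (as used by the Hub cache) are followed.
        if path.is_file():
            files.append((path.relative_to(root).as_posix(), path.stat().st_size))
    return files
```

With the `encoder_dir` returned by `snapshot_download` in the card's snippet, `for name, size in list_snapshot_files(encoder_dir): print(name, size)` prints one line per file.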