---
license: apache-2.0
pipeline_tag: graph-ml
tags:
- biology
- protein
- molecule
- dna
- rna
- graph-neural-network
---
# Cuttlefish-Encoder
Graph encoder component of **Cuttlefish**, a unified all-atom LLM that grounds language reasoning in geometric cues while scaling modality tokens with structural complexity.
This model was presented in the paper [Scaling-Aware Adapter for Structure-Grounded LLM Reasoning](https://arxiv.org/abs/2602.02780).
- **Code:** [GitHub - zihao-jing/Cuttlefish](https://github.com/zihao-jing/Cuttlefish)
- **Pretrained with:** Masked reconstruction on all-atom structures.
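The card does not say which quantities are masked during pretraining. As a rough illustration of masked reconstruction in general, the sketch below hides a random subset of atom element labels and scores predictions only at the hidden positions. The names `MASK_TOKEN`, `mask_atoms`, `reconstruction_accuracy`, and the 15% default ratio are assumptions made for this example, not the Cuttlefish training objective.

```python
import random
from typing import List, Sequence, Tuple

MASK_TOKEN = "<mask>"  # hypothetical placeholder, not a Cuttlefish token


def mask_atoms(
    elements: Sequence[str], mask_ratio: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[int]]:
    """Hide a random subset of atom labels (expects a non-empty sequence).

    Returns the corrupted label list and the sorted indices that were masked.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(mask_ratio * len(elements)))  # always mask at least one atom
    masked_idx = sorted(rng.sample(range(len(elements)), n_mask))
    corrupted = list(elements)
    for i in masked_idx:
        corrupted[i] = MASK_TOKEN
    return corrupted, masked_idx


def reconstruction_accuracy(
    predicted: Sequence[str], original: Sequence[str], masked_idx: Sequence[int]
) -> float:
    """Fraction of masked positions where the prediction matches the original label."""
    correct = sum(predicted[i] == original[i] for i in masked_idx)
    return correct / len(masked_idx)
```

In an actual training loop, a model would receive the corrupted graph and its predictions would be scored by a differentiable loss instead of exact-match accuracy.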
## Usage
You can download the encoder using the `huggingface_hub` library:
```python
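from huggingface_hub import snapshot_download

# Download the full model repo into the local Hugging Face cache and return its path
encoder_dir = snapshot_download("zihaojing/Cuttlefish-Encoder")

# The checkpoint is loaded through the Cuttlefish codebase;
# see https://github.com/zihao-jing/Cuttlefish for full usage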
```
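Before wiring the checkpoint into the Cuttlefish codebase, it can help to check what the snapshot actually contains. The helper below is a minimal sketch using only the standard library; the name `list_snapshot_files` is made up for this example and is not part of `huggingface_hub` or Cuttlefish.

```python
from pathlib import Path
from typing import List, Tuple


def list_snapshot_files(snapshot_dir: str) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for every file under snapshot_dir."""
    root = Path(snapshot_dir)
    files = []
    for path in root.rglob("*"):
        # is_file() and stat() follow symlinks, which the Hugging Face cache uses
        if path.is_file():
            files.append((path.relative_to(root).as_posix(), path.stat().st_size))
    return sorted(files)
```

Calling `list_snapshot_files(encoder_dir)` on the path returned by `snapshot_download` gives a quick inventory of the downloaded files and their sizes.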
## Pretraining data
Pretrained on **[Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data)**, covering:
- Molecules (SMILES → 3D graph)
- Proteins (PDB/CIF → all-atom graph)
- DNA and RNA sequences
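The card does not describe how structures are converted into graphs. As a rough illustration of the "PDB → all-atom graph" idea, the sketch below reads `ATOM`/`HETATM` records using the fixed-width PDB columns and connects atoms that lie within a distance cutoff. The function names and the 1.9 Å cutoff are assumptions made for this example, not the Cuttlefish preprocessing; the GitHub repository is the reference for the real pipeline.

```python
import math
from typing import List, Tuple

Atom = Tuple[str, Tuple[float, float, float]]  # (element symbol, (x, y, z) in Å)


def parse_pdb_atoms(pdb_text: str) -> List[Atom]:
    """Extract element and coordinates from ATOM/HETATM records of a PDB file."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # PDB is fixed-width: x, y, z occupy columns 31-38, 39-46, 47-54
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            # Element symbol is in columns 77-78; crude fallback to the
            # first letter of the atom name when that field is missing
            element = line[76:78].strip() or line[12:16].strip()[:1]
            atoms.append((element, (x, y, z)))
    return atoms


def build_edges(atoms: List[Atom], cutoff: float = 1.9) -> List[Tuple[int, int]]:
    """Connect every atom pair whose distance is at most `cutoff` (assumed value)."""
    edges = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            if math.dist(atoms[i][1], atoms[j][1]) <= cutoff:
                edges.append((i, j))
    return edges
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a small example; large structures would need a spatial index or a neighbor-list library.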
## Model details
- **Architecture**: All-atom graph encoder with Scaling-Aware Patching.
- **Encoder hidden dim**: 256
- **Modalities**: molecule, protein, dna, rna
## Related resources
| Resource | Link |
|---|---|
| Full Cuttlefish LLM | [zihaojing/Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish) |
| SFT instruction data | [zihaojing/Cuttlefish-SFT-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-SFT-Data) |
| Encoder pretraining data | [zihaojing/Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data) |
## Citation
```bibtex
@inproceedings{jing2026cuttlefish,
title = {Cuttlefish: Scaling-Aware Adapter for Structure-Grounded LLM Reasoning},
author = {Jing, Zihao and Zeng, Qiuhao and Fang, Ruiyi and Li, Yan Yi and Sun, Yan Table, Boyu and Hu, Pingzhao},
booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
year = {2026},
url = {https://arxiv.org/abs/2602.02780}
}
```