| license: apache-2.0 | |
| tags: | |
| - biology | |
| <div align="center"> | |
| <img src="https://cdn-uploads.huggingface.co/production/uploads/65a9e8563b9e1f0f308378b7/H2qI2OOSl-KqOlg01fRGR.png" width="50%" /> | |
| </div> | |
| # OneGenome-Rice (OGR) | |
| OGR is a generative genomic foundation model for AI-driven precision breeding and functional genomics in rice. It processes DNA sequences up to **1 million** base pairs long and uses a **Mixture-of-Experts (MoE)** architecture with **1.25B** total parameters. It was pre-trained on a curated corpus of **422** rice genomes spanning cultivated and wild *Oryza* diversity. | |
| For usage instructions, further details, and examples, see the [OGR GitHub repository](https://github.com/zhejianglab/OneGenome-Rice). | |
| The table below summarizes the model scale and key architecture hyperparameters. | |
| <div align="center"> | |
| <table> | |
| <thead> | |
| <tr> | |
| <th align="center"><strong>Model Specification</strong></th> | |
| <th align="center"><strong>OneGenome-Rice (OGR)</strong></th> | |
| </tr> | |
| </thead> | |
| <tbody> | |
| <tr> | |
| <td align="center" colspan="2"><strong>Model Scale</strong></td> | |
| </tr> | |
| <tr> | |
| <td align="center">Total Parameters</td> | |
| <td align="center">1.25B</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Activated Parameters</td> | |
| <td align="center">0.33B</td> | |
| </tr> | |
| <tr> | |
| <td align="center" colspan="2"><strong>Architecture</strong></td> | |
| </tr> | |
| <tr> | |
| <td align="center">Architecture</td> | |
| <td align="center">MoE</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Number of Experts</td> | |
| <td align="center">8</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Selected Experts per Token</td> | |
| <td align="center">2</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Number of Layers</td> | |
| <td align="center">12</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Attention Hidden Dimension</td> | |
| <td align="center">1024</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Number of Attention Heads</td> | |
| <td align="center">16 (GQA, 8 KV groups)</td> | |
| </tr> | |
| <tr> | |
| <td align="center">MoE Hidden Dimension (per Expert)</td> | |
| <td align="center">4096</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Vocabulary Size</td> | |
| <td align="center">128 (padded)</td> | |
| </tr> | |
| <tr> | |
| <td align="center">Context Length</td> | |
| <td align="center">up to 1 Mb (1 million base pairs)</td> | |
| </tr> | |
| </tbody> | |
| </table> | |
| </div> |
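
As a rough sanity check, the figures in the table can be related to each other with a back-of-the-envelope parameter count. The sketch below is not the authors' accounting. It assumes a gated three-matrix feed-forward block per expert, no bias terms, untied input and output embeddings, and a model width equal to the attention hidden dimension. It also ignores normalization weights. The `OGRSpec` class and `estimate_params` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OGRSpec:
    """Values copied from the specification table above."""
    hidden_dim: int = 1024          # Attention Hidden Dimension
    num_layers: int = 12
    num_heads: int = 16
    num_kv_groups: int = 8          # GQA key/value groups
    num_experts: int = 8
    experts_per_token: int = 2
    expert_hidden_dim: int = 4096   # MoE Hidden Dimension (per Expert)
    vocab_size: int = 128           # padded


def estimate_params(spec: OGRSpec, experts_used: int) -> int:
    """Rough weight count with `experts_used` experts counted per layer.

    Assumptions (not stated in the model card): gated three-matrix expert
    FFN, no biases, norm weights ignored, untied input/output embeddings.
    """
    head_dim = spec.hidden_dim // spec.num_heads
    kv_dim = spec.num_kv_groups * head_dim
    # Q and O projections are hidden x hidden; K and V are hidden x kv_dim.
    attention = 2 * spec.hidden_dim * spec.hidden_dim + 2 * spec.hidden_dim * kv_dim
    expert = 3 * spec.hidden_dim * spec.expert_hidden_dim
    router = spec.hidden_dim * spec.num_experts
    per_layer = attention + experts_used * expert + router
    embeddings = 2 * spec.vocab_size * spec.hidden_dim
    return spec.num_layers * per_layer + embeddings


spec = OGRSpec()
total = estimate_params(spec, spec.num_experts)          # all 8 experts
active = estimate_params(spec, spec.experts_per_token)   # 2 routed experts
print(f"estimated total params:     {total / 1e9:.2f}B")
print(f"estimated activated params: {active / 1e9:.2f}B")
print(f"activated fraction:         {active / total:.1%}")
```

Under these assumptions the estimate comes to about 1.25B total and 0.34B activated parameters, close to the 1.25B and 0.33B listed in the table. The small gap in the activated count may come from a different counting convention or from layer details this sketch does not model.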