	---
	license: apache-2.0
	tags:
	- biology
	---
	<div align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65a9e8563b9e1f0f308378b7/H2qI2OOSl-KqOlg01fRGR.png" width="50%" />
	</div>

	# OneGenome-Rice (OGR)

	OneGenome-Rice (OGR) is a generative genomic foundation model for AI-driven precision breeding and functional genomics in rice. It uses a Mixture-of-Experts (MoE) architecture with 1.25B total parameters, of which 0.33B are activated per token, and is trained to process DNA sequences up to 1 million base pairs in length. It was pre-trained on a curated corpus of 422 rice genomes spanning cultivated and wild *Oryza* diversity.
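
To make the MoE design concrete, here is a minimal sketch of top-2 routing over 8 experts, the configuration listed in the specification table. It is an illustration only: the gating formula (a softmax restricted to the selected experts) is a common choice in MoE models and is an assumption here, not something this card documents, and `route_token` is a hypothetical helper name.

```python
import math

NUM_EXPERTS = 8  # from the specification table
TOP_K = 2        # experts selected per token, from the specification table


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and return [(expert_index, weight), ...].

    Assumption: weights are a softmax over the selected logits only.
    OGR's actual gating scheme is not described in this card.
    """
    if len(router_logits) < top_k:
        raise ValueError("need at least top_k router logits")
    # Indices of the top_k largest logits, highest first.
    selected = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:top_k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in selected)
    exps = [math.exp(router_logits[i] - peak) for i in selected]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(selected, exps)]


# Made-up router scores for a single token, one score per expert.
example_logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.3, 1.0]
assert len(example_logits) == NUM_EXPERTS
routing = route_token(example_logits)
for expert_index, weight in routing:
    print(f"expert {expert_index}: weight {weight:.3f}")
```

Only the two selected experts run their feed-forward computation for that token, which is why the activated parameter count is much smaller than the total.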

	For instructions, details, and examples, see the project repository [OGR GitHub](https://github.com/zhejianglab/OneGenome-Rice).

	The table below summarizes the model scale and key architecture hyperparameters.

	<div align="center">

	<table>
	<thead>
	<tr>
	<th align="center"><strong>Model Specification</strong></th>
	<th align="center"><strong>OneGenome-Rice (OGR)</strong></th>
	</tr>
	</thead>
	<tbody>
	<tr>
	<td align="center" colspan="2"><strong>Model Scale</strong></td>
	</tr>
	<tr>
	<td align="center">Total Parameters</td>
	<td align="center">1.25B</td>
	</tr>
	<tr>
	<td align="center">Activated Parameters</td>
	<td align="center">0.33B</td>
	</tr>
	<tr>
	<td align="center" colspan="2"><strong>Architecture</strong></td>
	</tr>
	<tr>
	<td align="center">Architecture Type</td>
	<td align="center">MoE</td>
	</tr>
	<tr>
	<td align="center">Number of Experts</td>
	<td align="center">8</td>
	</tr>
	<tr>
	<td align="center">Selected Experts per Token</td>
	<td align="center">2</td>
	</tr>
	<tr>
	<td align="center">Number of Layers</td>
	<td align="center">12</td>
	</tr>
	<tr>
	<td align="center">Attention Hidden Dimension</td>
	<td align="center">1024</td>
	</tr>
	<tr>
	<td align="center">Number of Attention Heads</td>
	<td align="center">16 (GQA, 8 KV groups)</td>
	</tr>
	<tr>
	<td align="center">MoE Hidden Dimension (per Expert)</td>
	<td align="center">4096</td>
	</tr>
	<tr>
	<td align="center">Vocabulary Size</td>
	<td align="center">128 (padded)</td>
	</tr>
	<tr>
	<td align="center">Context Length</td>
	<td align="center">up to 1 Mb (1 million base pairs)</td>
	</tr>
	</tbody>
	</table>

	</div>
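
As a rough sanity check, the hyperparameters in the table can be combined into a back-of-the-envelope parameter count. The sketch below rests on several assumptions that this card does not state: a gated feed-forward layer with three weight matrices per expert (SwiGLU-style), a head dimension of hidden size divided by head count (64), untied input and output embeddings, and no bias or normalization parameters. All function names are hypothetical.

```python
# Hyperparameters taken from the specification table.
HIDDEN = 1024          # attention hidden dimension
LAYERS = 12
HEADS = 16
KV_GROUPS = 8          # GQA key/value groups
EXPERTS = 8
ACTIVE_EXPERTS = 2     # selected experts per token
EXPERT_HIDDEN = 4096   # MoE hidden dimension per expert
VOCAB = 128            # padded vocabulary size

# Assumptions (NOT stated in the model card).
HEAD_DIM = HIDDEN // HEADS   # 64
FFN_MATRICES = 3             # gate, up, and down projections per expert
UNTIED_EMBEDDINGS = True     # separate input embedding and output head


def attention_params():
    """Query, key, value, and output projections for one layer (no biases)."""
    q = HIDDEN * HEADS * HEAD_DIM
    kv = 2 * HIDDEN * KV_GROUPS * HEAD_DIM
    o = HEADS * HEAD_DIM * HIDDEN
    return q + kv + o


def expert_params():
    """Weights of a single expert feed-forward block."""
    return FFN_MATRICES * HIDDEN * EXPERT_HIDDEN


def router_params():
    """Linear router that scores every expert for each token."""
    return HIDDEN * EXPERTS


def embedding_params():
    return VOCAB * HIDDEN * (2 if UNTIED_EMBEDDINGS else 1)


def count_params(experts_per_layer):
    """Parameter count when `experts_per_layer` experts are counted in each layer."""
    per_layer = (
        attention_params()
        + router_params()
        + experts_per_layer * expert_params()
    )
    return LAYERS * per_layer + embedding_params()


total = count_params(EXPERTS)
activated = count_params(ACTIVE_EXPERTS)
print(f"total     ~ {total / 1e9:.2f}B")
print(f"activated ~ {activated / 1e9:.2f}B")
```

Under these assumptions the estimate comes to about 1.25B total parameters, matching the table, and about 0.34B activated parameters, close to the reported 0.33B. The small gap likely comes from counting conventions or architectural details not covered by this card.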