This repository contains AWS Inferentia2 and neuronx compatible checkpoints for [Mistral-Large-Instruct](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407). You can find detailed information about the base model on its [Model Card](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407).
|
|
This model has been exported to the Neuron format using the specific `input_shapes` and compiler parameters detailed below.

It has been compiled to run on an inf2.48xlarge instance on AWS. The inf2.48xlarge provides 24 NeuronCores, and this compilation uses all 24.
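As a minimal sketch of how these settings map onto an `optimum-neuron` export, the snippet below collects them into the shapes and compiler arguments that `NeuronModelForCausalLM.from_pretrained` accepts. The actual load call is left commented out because it requires an inf2.48xlarge with the Neuron SDK and `optimum-neuron` installed; the exact keyword names are assumptions based on the `optimum-neuron` API, not verified against this checkpoint.

```python
# Compilation settings mirroring this checkpoint's card.
input_shapes = {"batch_size": 4, "sequence_length": 4096}
compiler_args = {"num_cores": 24, "auto_cast_type": "bf16"}

# Loading sketch (run on an inf2.48xlarge with optimum-neuron installed):
# from optimum.neuron import NeuronModelForCausalLM
# model = NeuronModelForCausalLM.from_pretrained(
#     "mistralai/Mistral-Large-Instruct-2407",  # base model id from this card
#     export=True,
#     **input_shapes,
#     **compiler_args,
# )
```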
SEQUENCE_LENGTH = 4096

BATCH_SIZE = 4

NUM_CORES = 24

PRECISION = "bf16"

---
license: other
license_name: mrl
license_link: LICENSE
language:
- en
base_model:
- mistralai/Mistral-Large-Instruct-2407
pipeline_tag: text-generation
---