---
license: cc-by-nc-4.0
base_model: bigcode/starcoder2-7b
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- code
- starcoder
- bigcode
- sft
- 7b
---
# Starcoder-2-chat
Starcoder-2-chat is an instruction-tuned version of [bigcode/starcoder2-7b](https://huggingface.co/bigcode/starcoder2-7b), fine-tuned with LoRA on the [glaiveai/glaive-code-assistant-v2](https://huggingface.co/datasets/glaiveai/glaive-code-assistant-v2) dataset.
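
A minimal inference sketch with `transformers` is shown below. The repository id is a placeholder, since this card does not state one; replace it with the model's actual Hub id.

```python
# Minimal inference sketch. "your-org/starcoder2-chat" is a placeholder repo id;
# substitute this model's actual Hugging Face Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/starcoder2-chat"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # adjust dtype/device map to your hardware
    device_map="auto",
)

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```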
## 🏆 Evaluation results

Thanks to [Muhammad Bin Usman](https://www.linkedin.com/in/muhammad-bin-usman/) for running evals on Starcoder-2-chat.
| Benchmark | Score |
| --- | --- |
| HumanEval | 0.3232 |
| HumanEval+ | 0.2561 |
| Instruct-HumanEval | 0.3232 |
### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-7
- train_batch_size: 2
- eval_batch_size: not specified
- seed: not specified
- gradient_accumulation_steps: 8
- total_train_batch_size: 16 (2 × 8, assuming a single device)
- optimizer: paged AdamW (32-bit)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 1
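
As a rough illustration, these values map onto a `transformers` + `peft` setup like the sketch below. The `LoraConfig` values (rank, alpha, target modules) and the dataset wiring are assumptions, since the card does not specify them; only the `TrainingArguments` reflect the list above.

```python
# Sketch of how the listed hyperparameters map onto a transformers + peft
# LoRA fine-tuning setup. LoraConfig values are assumptions; the card only
# specifies the TrainingArguments below.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder2-7b")

lora_config = LoraConfig(
    r=16,           # assumption: rank not given in the card
    lora_alpha=32,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="starcoder2-chat",
    learning_rate=5e-7,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=1,
)
# Train with your preferred trainer (e.g. trl's SFTTrainer) on
# glaiveai/glaive-code-assistant-v2; dataset wiring omitted here.
```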
### Framework versions

- Transformers 4.39.0.dev0
- Peft 0.9.1.dev0
- Datasets 2.18.0
- torch 2.2.0
- accelerate 0.27.2