May I ask where the training code is, and whether the training data includes the ABAP language?
You can find the training codebase here: https://github.com/bigcode-project/Megatron-LM/tree/multi-query-attention
There's also a repo for fine-tuning with PEFT or DeepSpeed: https://github.com/bigcode-project/starcoder
We didn't include ABAP in the training data; the full list of included languages is in Table 1 of the paper. ABAP is in the Stack dataset, though, if you want to try fine-tuning.
Thank you very much for your reply.
Do you have any suitable parameter suggestions for fine-tuning?
You can start from the default parameters in the repo and tune them if needed
If I use the ABAP data from the Stack dataset, roughly how long would fine-tuning take? I have an A100 80GB machine and would like to estimate the budget first.
I saw a file for fine-tuning StarCoder. Is this file for fine-tuning StarCoderBase into StarCoder?
https://github.com/bigcode-project/Megatron-LM/blob/finetune-starcoder/examples/finetune_bigcode_model.slurm
Where can I find the paths referenced in the file?
STARCODER_PATH=/fsx/boomcode/starcoder/
CHECKPOINT_PATH=/fsx/boomcode/starcoderpy/$SLURM_JOB_ID
TOKENIZER_FILE=/fsx/boomcode/tokenizer-starcoder/tokenizer.json
WEIGHTS_TRAIN=/fsx/boomcode/datamix_python/train_data_paths.txt.tmp
WEIGHTS_VALID=/fsx/boomcode/datamix_python/valid_data_paths.txt.tmp
DATA_PATH=/fsx/boomcode/tokenized/python/
I have the same question: where can I get or generate these files?
WEIGHTS_TRAIN=/fsx/boomcode/datamix_python/train_data_paths.txt.tmp
WEIGHTS_VALID=/fsx/boomcode/datamix_python/valid_data_paths.txt.tmp
To generate the data weights, you can use this repo: https://github.com/bigcode-project/bigcode-data-mix#2---substitute-the-data-path
For short or non-distributed training runs (one A100 in your case), using PEFT as described at https://github.com/bigcode-project/starcoder would be faster and easier to set up. Full fine-tuning could be expensive: for reference, fine-tuning StarCoderBase on 35B Python tokens to get StarCoder took ~2 days on 512 GPUs. ABAP has much less data than Python, so it would take much less time, but full fine-tuning could still be slow on a single A100.
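For a budget assessment, the reference figures quoted above (35B tokens, ~2 days, 512 GPUs) can be scaled linearly to a smaller dataset. This is only a back-of-the-envelope sketch: the ABAP token count is a hypothetical placeholder, not a measured value, and linear scaling assumes the same per-GPU throughput as the reference run, which a single-GPU setup may not reach.

```python
# Rough fine-tuning time estimate, scaled from the reference run quoted above.
REF_TOKENS = 35e9  # Python tokens in the StarCoder fine-tune (from the reply)
REF_GPUS = 512     # GPUs used in the reference run (from the reply)
REF_DAYS = 2.0     # approximate wall-clock days (from the reply)


def gpu_hours_per_billion_tokens() -> float:
    """GPU-hours spent per billion tokens in the reference run."""
    total_gpu_hours = REF_GPUS * REF_DAYS * 24
    return total_gpu_hours / (REF_TOKENS / 1e9)


def estimate_days(n_tokens: float, n_gpus: int = 1) -> float:
    """Wall-clock days for n_tokens on n_gpus, assuming linear scaling."""
    gpu_hours = gpu_hours_per_billion_tokens() * (n_tokens / 1e9)
    return gpu_hours / n_gpus / 24


if __name__ == "__main__":
    # Hypothetical placeholder: replace with the real ABAP token count
    # after tokenizing your data.
    hypothetical_abap_tokens = 0.1e9
    print(f"{gpu_hours_per_billion_tokens():.0f} GPU-hours per 1B tokens")
    print(f"{estimate_days(hypothetical_abap_tokens):.1f} days on 1 GPU")
```

The reference run works out to roughly 700 GPU-hours per billion tokens, so the actual token count of your dataset is the number that drives the budget.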
Thanks, but how do I generate the files in the "gpt2-preprocessed_content_with_meta_document" folder from the raw .parquet files?
You need to tokenize the data with Megatron-LM; see their README.
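Megatron-LM's preprocessing step reads JSON-lines input, so the raw .parquet files first need to be converted. Below is a minimal sketch of that conversion, not the maintainers' actual pipeline. It assumes the text lives in a column named `content` (as in the Stack dataset); the folder name above suggests the original run used a different key, so adjust `text_key` as needed. The function names are hypothetical, and pandas with a parquet engine such as pyarrow is assumed to be installed.

```python
import json
from typing import Iterable, Mapping


def write_jsonl(records: Iterable[Mapping], out_path: str,
                text_key: str = "content") -> int:
    """Write one JSON object per line, skipping records with no text.

    Returns the number of records written.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for rec in records:
            text = rec.get(text_key)
            if not text:
                continue  # skip missing or empty documents
            f.write(json.dumps({text_key: text}, ensure_ascii=False) + "\n")
            written += 1
    return written


def parquet_to_jsonl(parquet_path: str, out_path: str,
                     text_key: str = "content") -> int:
    """Read one parquet shard with pandas and dump its text column to JSONL."""
    import pandas as pd  # assumption: pandas plus pyarrow/fastparquet installed

    df = pd.read_parquet(parquet_path, columns=[text_key])
    return write_jsonl(df.to_dict(orient="records"), out_path, text_key)
```

The resulting .jsonl file can then be passed to the Megatron-LM preprocessing script; the exact flags and tokenizer settings are in its README.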
What directory does CHECKPOINT_PATH refer to? Where can I find CHECKPOINT_PATH for StarCoder?
It's the path where the checkpoints are saved; for us it was specific to our cluster.