---
datasets:
- vikp/python_code_instructions_filtered
---

Code Llama 7B fine-tuned for 1 epoch on a subset of the Python code instructions dataset. It scores `.62` on HumanEval with greedy decoding (matched to Code Llama pass@1).

To run inference, set `trust_remote_code=True` so that the right rope theta value is picked up:

```
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vikp/llama_coder")
# trust_remote_code=True is needed to pick up the right rope theta value
model = AutoModelForCausalLM.from_pretrained("vikp/llama_coder", trust_remote_code=True)

# Prompt: the start of a Python function for the model to complete
text = tokenizer.bos_token + """\
import socket

def ping_exponential_backoff(host: str):""".lstrip()

tokens = tokenizer(text, return_tensors="pt")
# Low-temperature sampling, capped at 128 new tokens
output = model.generate(**tokens, max_new_tokens=128, do_sample=True, temperature=0.1, top_p=1.0)
print(tokenizer.decode(output[0], skip_special_tokens=True).strip())
```
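
To give a sense of why the rope theta value matters, the sketch below computes rotary embedding inverse frequencies for two bases. It assumes the standard RoPE formula `theta^(-2i/d)`, a head dimension of 128 (4096 hidden size over 32 heads, as in Llama 7B), and that the value in question is the 1,000,000 base used by Code Llama rather than the Llama 2 default of 10,000. None of these numbers come from this model card, and `rope_inv_freq` and `longest_wavelength` are illustrative helper names. A larger base stretches the slowest rotation over many more token positions, so loading the model with the wrong base would feed it positional signals it was not trained on.

```python
import math

HEAD_DIM = 128  # assumption: 4096 hidden size / 32 attention heads


def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Standard RoPE inverse frequencies: theta^(-2i/d) for each dimension pair i."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, theta: float) -> float:
    """Wavelength, in token positions, of the slowest-rotating dimension pair."""
    return 2 * math.pi / rope_inv_freq(head_dim, theta)[-1]


# Compare the Llama 2 default base with the (assumed) Code Llama base.
for theta in (10_000.0, 1_000_000.0):
    wavelength = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:,.0f}: longest wavelength ~ {wavelength:,.0f} tokens")
```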

You can reproduce the benchmark results with the BigCode evaluation harness:

```
git clone https://github.com/bigcode-project/bigcode-evaluation-harness.git
cd bigcode-evaluation-harness
pip install -e .
```

```
accelerate launch main.py \
  --model vikp/instruct_llama_7b \
  --tasks humaneval \
  --max_length_generation 1024 \
  --temperature 0 \
  --do_sample False \
  --n_samples 1 \
  --precision fp16 \
  --allow_code_execution \
  --save_generations \
  --use_auth_token \
  --trust_remote_code
```
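
With `--n_samples 1` and greedy decoding, pass@1 reduces to the fraction of problems whose single completion passes its unit tests. Below is a minimal sketch of the unbiased pass@k estimator from the HumanEval paper, assuming the harness computes the metric this way. The function names are illustrative and the result list is hypothetical data, not output from this model:

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that passed the tests, k: budget.
    """
    if n - c < k:
        # Fewer than k failures: every size-k subset contains a passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average the per-problem estimates over the whole benchmark."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Hypothetical results for 5 problems, one greedy sample each
# (1 = the completion passed its tests, 0 = it failed).
greedy_results = [1, 0, 1, 1, 0]
print(benchmark_pass_at_k(greedy_results, n=1, k=1))  # 0.6
```

With more than one sample per problem, the same function applies with a larger `n`.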