---
license: apache-2.0
datasets:
- bigcode/starcoderdata
---
# Model Card for DeciCoder-6B

DeciCoder-6B is a 6-billion-parameter, decoder-only code completion model
trained on the Python, Java, JavaScript, Rust, C++, C, and C# subsets of the [StarCoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
The model uses variable Grouped Query Attention and has a context window of 2k
tokens. It was trained with a Fill-in-the-Middle objective. The model's
architecture was generated by AutoNAC, Deci's proprietary Neural Architecture
Search-based technology.

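Because the model was trained with a Fill-in-the-Middle objective, it can in principle be prompted with the code before and after a gap and asked to produce the missing middle. The sketch below only shows how such a prompt might be assembled. It assumes the StarCoder-style sentinel tokens `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` in prefix-suffix-middle order; this card does not list the sentinel tokens, so check the tokenizer's special tokens before relying on them. The helper name `build_fim_prompt` is hypothetical.

```python
# Assumed sentinel tokens (StarCoder convention); not confirmed by this model card.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code around the gap in prefix-suffix-middle order."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# The model would be expected to generate the expression that belongs after "return ".
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
print(prompt)
```

The resulting string would be tokenized and passed to `generate` like any other prompt; the text produced after the final sentinel is the proposed infill.
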
## Model Details

- **Developed by:** Deci
- **Model type:** DeciCoder-6B is an auto-regressive language model based on the transformer decoder architecture, using variable Grouped Query Attention.
- **Language(s):** Python, Java, JavaScript, Rust, C++, C, C#, Go
- **License:** Model checkpoints are licensed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license.

## Documentation

- Blog Post: [Introducing DeciCoder-6B: Code LLM Engineered for Accuracy & Cost Efficiency At Scale](https://deci.ai/blog/decicoder-6b-the-best-multi-language-code-generation-llm-in-its-class/)
- Tutorial: [How to Run DeciCoder-6B on Qualcomm Cloud AI 100](https://github.com/quic/cloud-ai-sdk/tree/1.12/models/language_processing/decoder)
- Google Colab [Notebook](http://bit.ly/DeciCoder-6B-Notebook-1)
- Run DeciCoder on [AWS DL2q instances using the Qualcomm Cloud AI Platform SDK](https://bit.ly/Amazon-EC2-DL2q-Instance)
- Questions: Feel free to contact us via our [Discord Community](https://discord.com/invite/p9ecgRhDR8/)!

## Model Architecture

| Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads |
|:----------|:----------|:----------|:----------|:----------|
| 6B | 32 | 32 | 2k | Variable |

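To make the "Variable" entry concrete: in Grouped Query Attention, several query heads share one key/value head, and a variable scheme lets the number of key/value heads differ from layer to layer. The following is a minimal sketch of that bookkeeping, not Deci's implementation. The per-layer schedule is invented for illustration, since this card does not publish the actual `num_key_value_heads` values, and both function names are hypothetical.

```python
def kv_head_for_query_head(query_head: int, num_heads: int, num_kv_heads: int) -> int:
    """Return the index of the key/value head that a given query head reads from."""
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    group_size = num_heads // num_kv_heads  # query heads sharing one KV head
    return query_head // group_size


def kv_cache_ratio(kv_heads_per_layer: list[int], num_heads: int) -> float:
    """KV-cache size relative to multi-head attention (one KV head per query head)."""
    return sum(kv_heads_per_layer) / (num_heads * len(kv_heads_per_layer))


NUM_HEADS = 32  # from the table above
# Hypothetical schedule for the 32 layers; NOT the real DeciCoder-6B configuration.
kv_heads_per_layer = [4] * 8 + [2] * 16 + [1] * 8

print(kv_head_for_query_head(9, NUM_HEADS, num_kv_heads=4))
print(kv_cache_ratio(kv_heads_per_layer, NUM_HEADS))
```

Fewer key/value heads in a layer means a smaller key/value cache for that layer, which is the usual motivation for choosing the count per layer rather than globally.
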
- **Decoder layer:** Variable Grouped Query Attention
- **Position Embeddings:** Rotary Position Embeddings [Su et al., 2021](https://arxiv.org/abs/2104.09864)

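As a rough illustration of the rotary position embeddings cited above, the sketch below rotates consecutive pairs of a vector by a position-dependent angle, following the formulation in the paper. It is a pure-Python toy, not the model's code: the base of 10000 is the paper's default and is assumed here, and real implementations may pair the dimensions differently.

```python
import math


def rotary_embed(vec: list[float], position: int, base: float = 10000.0) -> list[float]:
    """Rotate each pair (vec[i], vec[i+1]) by the angle position * base**(-i / d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


q = [0.1, 0.2, 0.3, 0.4]
k = [0.5, -0.6, 0.7, 0.8]
# Both scores use a query/key offset of 2 positions, so they should agree
# up to floating-point error.
print(dot(rotary_embed(q, 5), rotary_embed(k, 3)))
print(dot(rotary_embed(q, 12), rotary_embed(k, 10)))
```

The attention score between a rotated query and a rotated key depends only on the distance between their positions, which is the property that makes the scheme a relative position encoding.
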
### How to Use

```python
# pip install -q transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-6B"
# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# trust_remote_code=True allows loading the custom model code hosted with the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True
).to(device)

# Tokenize the prompt, generate up to 100 new tokens, and decode the result.
inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

### Attribution

DeciCoder-6B was trained on the StarCoder Training Dataset, filtered for
Python, Java, JavaScript, Ruby, Rust, C++, C, and C#. For additional information, please
refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata).

### Limitations

The model was trained on source code in Python, Java,
JavaScript, Rust, C++, C, C#, and Go. The primary natural language in the source is English, although
other languages are also present. The model can produce code snippets
given some context, but there is no assurance that the resulting
code will function as expected. It might be suboptimal or contain bugs or
even exploits.

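Since generated code is not guaranteed to work, one cheap first filter for Python completions is to check that they at least parse before they are shown or run. The sketch below uses the standard-library `ast` module; the helper name `is_valid_python` is hypothetical. It only catches syntax errors and says nothing about correctness or safety, so it complements review and testing rather than replacing them.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if the source parses as Python, without executing it."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as null bytes on some Python versions.
        return False
    return True


good_completion = "def print_hello_world():\n    print('Hello, world!')\n"
truncated_completion = "def print_hello_world(:\n"

print(is_valid_python(good_completion))
print(is_valid_python(truncated_completion))
```
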
## Evaluation

DeciCoder-6B's pass@1 scores on MultiPL-HumanEval:

| Python | JavaScript | Java | C++ | C# | Rust | Go |
|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
| 33.3% | 29.3% | 30.3% | 29.93% | 20.31% | 20.5% | 77.47% |

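For readers unfamiliar with the metric, pass@k is commonly computed with the unbiased estimator introduced alongside HumanEval: generate `n` samples per problem, count the `c` samples that pass the unit tests, and estimate the probability that at least one of `k` randomly drawn samples passes. The sketch below implements that standard estimator; this card does not state the sampling settings behind the scores above, so the values of `n` and `c` here are purely illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k)."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every draw of k samples must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 10 samples for one problem, 3 of which pass.
print(pass_at_k(n=10, c=3, k=1))
```

For `k = 1` the estimator reduces to `c / n`, the fraction of samples that pass; the benchmark score is the mean of this value over all problems.
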
### Runtime Benchmarks

| Inference Tool | Hardware | Prompt Length | Generation Length | Throughput (tokens/sec) |
|:----------|:----------|:----------|:----------|:----------|
| Qualcomm Cloud AI 100 SDK | Qualcomm Cloud AI 100 | 1024 | 1024 | 531.3 |

- Measured for maximal batch size on the device

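The card does not describe how the throughput figure was measured. As a rough sketch of one common convention, the helper below times a single batched generation call and divides the number of generated tokens by the elapsed time. The name `measure_throughput` and the stand-in workload are hypothetical, and whether prompt tokens should be counted is an assumption you would need to settle for your own benchmark.

```python
import time
from typing import Callable


def measure_throughput(generate: Callable[[], None], batch_size: int, generation_length: int) -> float:
    """Return generated tokens per second for one timed call to `generate`."""
    start = time.perf_counter()
    generate()  # expected to produce batch_size * generation_length tokens
    elapsed = time.perf_counter() - start
    return batch_size * generation_length / elapsed


# Stand-in workload for illustration; replace with a real batched generation call.
tokens_per_second = measure_throughput(lambda: time.sleep(0.05), batch_size=2, generation_length=100)
print(f"{tokens_per_second:.1f} tokens/sec")
```

In practice you would run a few warm-up calls first and average over several timed runs, since a single measurement can be noisy.
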
## How to Cite

Please cite this model using the following format.

```bibtex
@misc{DeciFoundationModels,
title = {DeciCoder-6B},
author = {DeciAI Research Team},
year = {2024},
url = {https://huggingface.co/deci/decicoder-6B},
}
```