| --- |
| license: apache-2.0 |
| language: |
| - en |
| base_model: |
| - Qwen/Qwen2.5-Coder-0.5B |
| pipeline_tag: text-generation |
| datasets: |
| - seniruk/git-diff_to_commit_msg_large |
| --- |
| |
| # Hi, I’m Seniru Epasinghe 👋 |
|
|
| I’m an AI undergraduate and an AI enthusiast, working on machine learning projects and open-source contributions. |
| I enjoy exploring AI pipelines, natural language processing, and building tools that make development easier. |
|
|
| --- |
|
|
| ## 🌐 Connect with me |
|
|
| - [Hugging Face](https://huggingface.co/seniruk)
| - [Medium](https://medium.com/@senirukepasinghe)
| - [LinkedIn](https://www.linkedin.com/in/seniru-epasinghe-b34b86232/)
| - [GitHub](https://github.com/seth2k2)
|
|
|
|
| ### Qwen2.5-Coder-0.5B fine-tuned on 100,000 rows of a custom dataset of Git diffs and their corresponding commit messages
| - [Dataset on Hugging Face](https://huggingface.co/datasets/seniruk/git-diff_to_commit_msg_large)
| - [Dataset on Kaggle](https://www.kaggle.com/datasets/seniruepasinghe/git-diff-to-commit-msg-large)
|
|
| ### Each row of the dataset was formatted as shown below for fine-tuning Qwen2.5-Coder, so use the same prompt format at inference time for best results
| ``` |
| """Generate a concise and meaningful commit message based on the provided Git diff. |
| |
| ### Git Diff: |
| {Git diff from dataset} |
| |
| ### Commit Message:""" |
| ``` |
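As a rough illustration of the template above, the sketch below fills it with a diff in Python. The names `PROMPT_TEMPLATE` and `build_commit_prompt` are hypothetical helpers introduced here, not part of the model or dataset tooling, and the exact whitespace used when the dataset was built is assumed to match the template as shown.

```python
# Training-time prompt template, copied from the format shown above.
# The {diff} placeholder name is an assumption made for this sketch.
PROMPT_TEMPLATE = (
    "Generate a concise and meaningful commit message based on the provided Git diff.\n"
    "\n"
    "### Git Diff:\n"
    "{diff}\n"
    "\n"
    "### Commit Message:"
)


def build_commit_prompt(git_diff: str) -> str:
    """Insert a Git diff into the template (hypothetical helper)."""
    # The prompt deliberately ends right after "### Commit Message:"
    # so that the model's completion is the commit message itself.
    return PROMPT_TEMPLATE.format(diff=git_diff)


if __name__ == "__main__":
    print(build_commit_prompt("diff --git a/app.py b/app.py"))
```

Keeping the template in one constant makes it easier to keep the inference prompt identical to the one used during fine-tuning.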
|
|
| ### Inference code for the GGUF model
|
|
| ``` |
| from llama_cpp import Llama |
| |
| modelGGUF = Llama.from_pretrained( |
| repo_id="seniruk/qwen2.5coder-0.5B_commit_msg", |
| filename="qwen0.5-finetuned.gguf", |
| rope_scaling={"type": "linear", "factor": 2.0}, |
| chat_format=None, # Disables any chat formatting |
| n_ctx=32768, # Set the context size explicitly |
| ) |
| |
| # Define the commit message prompt (plain completion format, same wording as the fine-tuning prompt) |
| commit_prompt = """Generate a concise and meaningful commit message based on the provided Git diff. |
| |
| ### Git Diff: |
| {} |
| |
| ### Commit Message:""" # The prompt ends here so the model completes the message. |
| |
| # Git diff example for commit message generation |
| git_diff_example = """ |
| diff --git a/index.html b/index.html |
| index 89abcde..f123456 100644 |
| --- a/index.html |
| +++ b/index.html |
| @@ -5,16 +5,6 @@ <body> |
| <h1>Welcome to My Page</h1> |
| |
| - <table border="1"> |
| - <tr> |
| - <th>Name</th> |
| - <th>Age</th> |
| - </tr> |
| - <tr> |
| - <td>John Doe</td> |
| - <td>30</td> |
| - </tr> |
| - </table> |
| |
| + <p>This is a newly added paragraph replacing the table.</p> |
| </body> |
| </html> |
| """ |
| |
| # Prepare the raw input prompt |
| input_prompt = commit_prompt.format(git_diff_example) |
| |
| # Generate commit message |
| output = modelGGUF( |
| input_prompt, |
| max_tokens=64, |
| temperature=0.6, # Balanced randomness |
| top_p=0.8, # Controls nucleus sampling |
| top_k=50, # Limits vocabulary selection |
| ) |
| |
| # Extract the generated text from the completion and print it |
| commit_message = output["choices"][0]["text"].strip() |
| |
| print(f"\nGenerated Commit Message:\n{commit_message}") |
| ``` |
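Because the model is prompted as a plain text completion, it may in some cases keep generating past the commit message, for example by starting another `###` section. The sketch below shows one possible way to trim such output. `clean_commit_message` is a hypothetical helper, and treating `###` as the cut-off marker is an assumption based on the prompt format, not documented behavior of the model.

```python
def clean_commit_message(raw_text: str) -> str:
    """Return the text before the next '###' section marker (hypothetical helper)."""
    # Assumption: anything after a new "###" header is a continuation
    # of the prompt pattern rather than part of the commit message.
    message = raw_text.split("###", 1)[0]
    return message.strip()


if __name__ == "__main__":
    sample = " Replace table with paragraph\n\n### Git Diff:\n..."
    print(clean_commit_message(sample))
```

It could be applied to the raw completion text, for example `clean_commit_message(output["choices"][0]["text"])`.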