---
language: code
tags:
- code
- translation
- codet5
- vbnet
- csharp
- programming
- source-code
datasets:
- custom
license: mit
library_name: transformers
pipeline_tag: translation
model_type: codet5
---
# CodeT5 VB.NET → C# Translator

This is a fine-tuned version of [Salesforce/CodeT5-base](https://huggingface.co/Salesforce/codet5-base) for translating VB.NET to C#.

---

# Evaluation Metrics

**BLEU Score:** 0.4506
- 1-gram: 0.6698
- 2-gram: 0.5402
- 3-gram: 0.4656
- 4-gram: 0.4132
- Brevity penalty: 0.8773
- Length ratio: 0.8843

**ROUGE Scores:**
- ROUGE-1: 0.5836
- ROUGE-2: 0.4586
- ROUGE-L: 0.5378
- ROUGE-Lsum: 0.5781

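The BLEU figures above fit together in a simple way, and a short sketch may help when reading them. It assumes the standard BLEU definition (brevity penalty times the geometric mean of the n-gram precisions) and that "length ratio" means output length divided by reference length; the model card does not say which implementation produced the scores. The function names are illustrative only. Under these assumptions the reported values reproduce each other to within rounding.

```python
import math

# Values copied from the metrics listed above.
precisions = [0.6698, 0.5402, 0.4656, 0.4132]  # 1-gram to 4-gram precision
length_ratio = 0.8843  # assumed: output length / reference length


def brevity_penalty(ratio: float) -> float:
    """Standard BLEU brevity penalty; only outputs shorter than the reference are penalized."""
    return 1.0 if ratio >= 1.0 else math.exp(1.0 - 1.0 / ratio)


def bleu(ngram_precisions, ratio: float) -> float:
    """Brevity penalty multiplied by the geometric mean of the n-gram precisions."""
    log_mean = sum(math.log(p) for p in ngram_precisions) / len(ngram_precisions)
    return brevity_penalty(ratio) * math.exp(log_mean)


bp = brevity_penalty(length_ratio)
score = bleu(precisions, length_ratio)
print(f"brevity penalty: {bp:.4f}")
print(f"BLEU: {score:.4f}")
```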
---

# Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "{repo_id}" is a template placeholder; replace it with this model's Hub repository ID.
model = AutoModelForSeq2SeqLM.from_pretrained("{repo_id}")
tokenizer = AutoTokenizer.from_pretrained("{repo_id}")

vb_code = "Dim x As Integer = 5"
# Prefix the input with the translation task prompt.
inputs = tokenizer(f"translate VB.NET to C#: {vb_code}", return_tensors="pt")
# Set an explicit output budget so longer translations are not cut off;
# 256 is an arbitrary value, adjust it to your inputs.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

# Dataset Format

Training data was in JSONL with fields:
- `"vb_code"`: VB.NET input
- `"csharp_code"`: corresponding C# output

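As a rough illustration of the format described above, the sketch below parses JSONL records with the two fields and turns them into (input, target) pairs. The sample records are made up for this example and are not taken from the training set, `load_pairs` is a hypothetical helper, and prepending the task prefix from the usage example is an assumption about how the training inputs were built.

```python
import json

# Assumption: training inputs used the same task prefix as the usage example.
PREFIX = "translate VB.NET to C#: "


def load_pairs(lines):
    """Parse JSONL lines into (model_input, target) pairs, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        pairs.append((PREFIX + record["vb_code"], record["csharp_code"]))
    return pairs


# Hypothetical records, for illustration only.
sample_lines = [
    '{"vb_code": "Dim x As Integer = 5", "csharp_code": "int x = 5;"}',
    "",
    '{"vb_code": "Dim name As String = \\"Ada\\"", "csharp_code": "string name = \\"Ada\\";"}',
]

pairs = load_pairs(sample_lines)
for source, target in pairs:
    print(source, "=>", target)
```

With a real file, the same helper could be called as `load_pairs(open(path, encoding="utf-8"))`, where `path` points to your own JSONL file.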
# License

MIT