---
datasets:
- GetSoloTech/Code-Reasoning
base_model:
- google/gemma-3-4b-it
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
- competitive-programming
- code-reasoning
- programming
- algorithms
- problem-solving
---

# GetSoloTech/Gemma3-Code-Reasoning-4B

A finetuned version of google/gemma-3-4b-it, optimized for competitive programming and code reasoning tasks. It was trained on the [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning) dataset to improve its ability to solve complex programming problems with detailed reasoning.

## Model Overview

This model is a **LoRA-finetuned** version of [gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) with the following specifications:

- **Base Model**: gemma-3-4b-it (4.0B parameters)
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Training Dataset**: GetSoloTech/Code-Reasoning
- **Training Framework**: Unsloth with QLoRA
- **Context Length**: 4096 tokens
- **Model Type**: Causal Language Model with Thinking Capabilities

## Key Features

- **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
- **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
- **High-Quality Solutions**: Trained on solutions with ≥50% test case pass rates
- **Structured Output**: Optimized for generating well-reasoned programming solutions
- **Efficient Training**: Uses LoRA adapters for efficient parameter updates

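To make the efficiency point concrete, the sketch below counts the trainable parameters LoRA adds to a single linear layer and compares that with updating the full weight matrix. The layer size (2560 x 2560) and the rank (16) are illustrative assumptions; this card does not state the adapter rank or which layers were adapted.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed values for illustration only; not taken from the actual training run.
d_in, d_out, rank = 2560, 2560, 16

full = full_finetune_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under these assumed sizes the adapter trains 81,920 weights for the layer instead of 6,553,600, about 1.25% of the full matrix.
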
### Dataset Statistics
- **Split**: Python
- **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
- **Quality Filter**: Only correctly solved problems with ≥85% test case pass rates

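As a rough illustration of a pass-rate quality filter, the following sketch keeps only records whose pass rate meets a threshold. The record layout and the `pass_rate` field name are hypothetical; check the dataset card for the actual schema.

```python
def filter_by_pass_rate(records, min_pass_rate=0.85):
    """Keep records whose fraction of passed test cases is at least min_pass_rate."""
    return [r for r in records if r["pass_rate"] >= min_pass_rate]


# Hypothetical records, for illustration only.
samples = [
    {"id": "a", "pass_rate": 1.0},
    {"id": "b", "pass_rate": 0.85},
    {"id": "c", "pass_rate": 0.6},
]

kept = filter_by_pass_rate(samples)
print([r["id"] for r in kept])  # ['a', 'b']
```
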
## Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GetSoloTech/Gemma3-Code-Reasoning-4B"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Prepare input for a competitive programming problem
messages = [
    {"role": "system", "content": "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful."},
    {"role": "user", "content": "Your programming problem here..."},
]

# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a solution; do_sample=True ensures the sampling settings below are applied
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    do_sample=True,
    temperature=1.0,
    top_p=0.95,
    top_k=64,
)

# Drop the prompt tokens and decode only the newly generated ones
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print(content)
```

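The decoded output usually mixes reasoning with the final program. A minimal sketch of one way to pull out the code is shown below, assuming the model wraps its solution in a Markdown fence that is either unlabelled or labelled `python`/`py`; the helper name `extract_last_code_block` is hypothetical and the output format is not guaranteed by this card.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can itself sit inside a fence

# Matches an unlabelled or python-labelled fenced block and captures its body.
_CODE_BLOCK = re.compile(
    FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_last_code_block(text: str):
    """Return the body of the last fenced block in text, or None if there is none."""
    blocks = _CODE_BLOCK.findall(text)
    return blocks[-1].strip() if blocks else None


# Hypothetical model output, for illustration only.
sample = (
    "First, read both integers.\n"
    + FENCE + "python\na, b = map(int, input().split())\nprint(a + b)\n" + FENCE
    + "\nThis runs in O(1)."
)
print(extract_last_code_block(sample))
```

Taking the last block is a design choice: when the model drafts code during its reasoning and then restates a final answer, the final answer tends to come last. Blocks labelled with other languages are not handled by this pattern.
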
## Performance Expectations

This finetuned model is expected to show improved performance on:

- **Competitive Programming Problems**: Better understanding of problem constraints and requirements
- **Code Generation**: More accurate and efficient solutions
- **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
- **Solution Completeness**: More comprehensive solutions with proper edge case handling

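One way to check these expectations on your own problems is to measure the fraction of test cases a generated solution passes. The sketch below is a minimal harness under stated assumptions: the solution is a Python program that reads stdin and writes stdout, test cases are `(input, expected_output)` string pairs, and outputs are compared token by token. The function name `pass_rate` and the sample problem are hypothetical. It executes the code directly, so run untrusted model output only inside a sandbox.

```python
import subprocess
import sys


def pass_rate(solution_code: str, test_cases, timeout_s: float = 5.0) -> float:
    """Run solution_code on each (stdin, expected_stdout) pair and return the pass fraction."""
    if not test_cases:
        return 0.0
    passed = 0
    for stdin_text, expected in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", solution_code],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            continue  # a timeout counts as a failed test case
        # Compare whitespace-separated tokens so trailing newlines do not matter.
        if result.returncode == 0 and result.stdout.split() == expected.split():
            passed += 1
    return passed / len(test_cases)


# Hypothetical solution and tests, for illustration only.
solution = "a, b = map(int, input().split())\nprint(a + b)"
tests = [("1 2\n", "3\n"), ("5 7\n", "12\n"), ("0 0\n", "1\n")]
print(f"pass rate: {pass_rate(solution, tests):.2f}")
```
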
## Recommended Settings

- **Temperature**: 1.0
- **Top-p**: 0.95
- **Top-k**: 64
- **Max New Tokens**: 4096 (adjust based on problem complexity)

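These settings can be kept in one place and overridden per problem. The helper below is a small sketch; `generation_kwargs` is a hypothetical name, and `do_sample=True` is added on the assumption that sampling should be enabled explicitly, since `transformers` only applies temperature, top-p, and top-k when sampling.

```python
# Sampling settings recommended in this card, plus do_sample (an added assumption).
RECOMMENDED_SAMPLING = {
    "do_sample": True,
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 64,
}


def generation_kwargs(max_new_tokens: int = 4096, **overrides):
    """Return keyword arguments for model.generate, with optional per-call overrides."""
    kwargs = dict(RECOMMENDED_SAMPLING, max_new_tokens=max_new_tokens)
    kwargs.update(overrides)
    return kwargs


# A shorter budget for an easy problem; everything else stays at the recommended values.
print(generation_kwargs(max_new_tokens=1024))
```

With the `model` and `model_inputs` objects from the Basic Inference example, this would be used as `model.generate(**model_inputs, **generation_kwargs(max_new_tokens=1024))`.
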
## Related Resources

- **Base Model**: [gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it)
- **Training Dataset**: [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth)
- **Original Dataset**: [OpenCodeReasoning-2](https://huggingface.co/datasets/nvidia/OpenCodeReasoning-2)

## Contributing

This model was created using the Unsloth framework and the Code-Reasoning dataset. For questions about a specific component, see:
- The base model: [Gemma3 Huggingface](https://huggingface.co/google/gemma-3-4b-it)
- The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- The training framework: [Unsloth Documentation](https://docs.unsloth.ai/)

## Acknowledgments

- **Gemma Team** for the excellent base model
- **Unsloth Team** for the efficient training framework
- **NVIDIA Research** for the original OpenCodeReasoning-2 dataset

## Contact

For questions about this finetuned model, please open an issue in the repository.

---

**Note**: This model is specifically optimized for competitive programming and code reasoning tasks.