---
base_model:
- beyoru/EvolLLM
tags:
- text-generation-inference
- transformers
- qwen3
- code
- tool
- agent
- evolution
- merge
- RL
- grpo
license: apache-2.0
language:
- en
---

This model is a fine-tuned Qwen model trained with a custom reinforcement learning (RL) framework that rewards it for producing solutions that pass automated test cases, similar to how programming tasks are evaluated on LeetCode.

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
</p>

Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, which is intended to promote generalization and reasoning ability in algorithmic problem-solving.
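
To make the idea of a test-case-based reward concrete, here is a minimal sketch in Python. It is not the training code used for this model: the function name `reward_from_tests`, the `(args, expected)` test-case format, and the choice of "fraction of tests passed" as the reward are all assumptions made for illustration.

```python
from typing import Any, Sequence, Tuple

# Hypothetical test-case format: (positional arguments, expected return value).
TestCase = Tuple[Tuple[Any, ...], Any]


def reward_from_tests(
    candidate_source: str,
    entry_point: str,
    test_cases: Sequence[TestCase],
) -> float:
    """Score a generated solution by the fraction of test cases it passes.

    Illustrative only. This runs the candidate in-process with exec();
    a real pipeline would need a sandbox with time and memory limits.
    """
    if not test_cases:
        return 0.0

    namespace: dict = {}
    try:
        # Define the candidate's functions in an isolated namespace.
        exec(candidate_source, namespace)
        func = namespace[entry_point]
    except Exception:
        # Code that fails to parse, crashes on import, or lacks the
        # expected function earns no reward.
        return 0.0

    if not callable(func):
        return 0.0

    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A runtime error on a test case counts as a failure.
            pass

    return passed / len(test_cases)


# Usage: no reference solution is needed, only the tests themselves.
candidate = "def add(a, b):\n    return a + b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(reward_from_tests(candidate, "add", tests))  # 1.0
```

The reward depends only on observed behavior, so any solution that passes the tests is scored the same regardless of how it is written. Whether the actual framework uses a fractional or an all-or-nothing reward is not stated here.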