README.md · beyoru/MinCoder-4B-Expert at main

	---
	base_model:
	- beyoru/EvolLLM
	tags:
	- text-generation-inference
	- transformers
	- qwen3
	- code
	- tool
	- agent
	- evolution
	- merge
	- RL
	- grpo
	license: apache-2.0
	language:
	- en
	---

	This model is a fine-tuned Qwen model trained with a custom reinforcement learning (RL) framework. The framework rewards the model for producing solutions that pass automated test cases, similar to how programming submissions are evaluated on LeetCode.

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
	</p>


	Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, an approach intended to promote generalization and reasoning ability in algorithmic problem solving.
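
To make the idea of a test-case-based reward concrete, here is a minimal sketch of how such a signal could be computed. It is not the framework used to train this model, which is not described here. The function name `reward_from_tests`, the `solve` entry point, and the "fraction of tests passed" scoring are all assumptions chosen for illustration. A real setup would also sandbox the generated code and enforce time limits.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical test-case format: (positional arguments, expected return value).
Case = Tuple[tuple, Any]


def reward_from_tests(solution_code: str, tests: List[Case],
                      entry_point: str = "solve") -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0).

    Illustrative only: runs the code with exec() in-process, with no
    sandboxing or timeout, so it must not be used on untrusted code.
    """
    namespace: Dict[str, Any] = {}
    try:
        # Define the candidate's functions in an isolated namespace.
        exec(solution_code, namespace)
    except Exception:
        # Code that does not even load (e.g. a syntax error) earns nothing.
        return 0.0

    func = namespace.get(entry_point)
    if not callable(func) or not tests:
        return 0.0

    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A runtime error counts as a failed test case.
            pass
    return passed / len(tests)


# Example: two candidate completions scored against the same tests.
tests = [((2, 3), 5), ((-1, 1), 0), ((0, 0), 0)]
good = "def solve(a, b):\n    return a + b\n"
bad = "def solve(a, b):\n    return a * b\n"

print(reward_from_tests(good, tests))  # 1.0
print(reward_from_tests(bad, tests))
```

No reference solution is needed to score a completion: only the inputs and expected outputs of the tests. The partially wrong candidate still receives some reward because it happens to pass one of the three cases. A stricter all-or-nothing reward is an equally plausible design.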