shawnxzhu committed
Commit f4144d4 · verified · 1 parent: 8234e1e

Update README.md

Files changed (1): README.md (+2, −2)
README.md CHANGED

```diff
@@ -7,7 +7,7 @@ tags:
 - CodeScaler
 license: mit
 datasets:
-- LARK-Lab/CodeScalerPair-52K
+- LARK-Lab/CodeScalerPair-51K
 language:
 - en
 base_model:
@@ -44,7 +44,7 @@ base_model:
 
 We propose **CodeScaler**, an execution-free reward model designed to scale both reinforcement learning training and test-time inference for code generation. **CodeScaler** is trained on carefully curated preference data derived from verified code problems and incorporates syntax-aware code extraction and validity-preserving reward shaping to ensure stable and robust optimization.
 
-This model is the official CodeScaler-1.7B trained from Skywork/Skywork-Reward-V2-Qwen3-1.7B on [LARK-Lab/CodeScalerPair-52K](https://huggingface.co/datasets/LARK-Lab/CodeScalerPair-52K).
+This model is the official CodeScaler-1.7B trained from Skywork/Skywork-Reward-V2-Qwen3-1.7B on [LARK-Lab/CodeScalerPair-51K](https://huggingface.co/datasets/LARK-Lab/CodeScalerPair-51K).
 
 ## Performance on RM-Bench
 | Model | Code | Chat | Math | Safety | Easy | Normal | Hard | Avg |
```
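The README text above describes scaling test-time inference with a reward model. A minimal sketch of the usual best-of-n reranking pattern, where the scorer is a stand-in for a real reward-model forward pass (the `toy_score` heuristic, the `best_of_n` helper, and the candidate snippets are all illustrative assumptions, not CodeScaler's actual API):

```python
import ast


def best_of_n(candidates, score):
    """Return the candidate with the highest reward score."""
    return max(candidates, key=score)


def toy_score(code: str) -> float:
    # Toy stand-in for a reward model: an execution-free validity check,
    # loosely in the spirit of syntax-aware scoring. Real usage would run
    # the reward model on (prompt, completion) pairs instead.
    try:
        ast.parse(code)
        return float(len(code))  # break ties by length among valid snippets
    except SyntaxError:
        return float("-inf")


candidates = [
    "def add(a, b) return a + b",        # invalid: missing colon
    "def add(a, b):\n    return a + b",  # valid Python
]
best = best_of_n(candidates, toy_score)
```

In practice the candidates would come from sampling the policy model n times, and the reward model scores each sample so the highest-reward completion is returned.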