SciJudge-4B

Update: A newer release is available at SciJudge-4B-2605. We recommend using the newer release for current experiments and comparisons.

SciJudge-4B is a Qwen3-4B-Instruct-2507 model fine-tuned for scientific paper evaluation. Given two papers' titles, abstracts, and publication dates, it predicts which paper has higher citation impact.

This model was released as part of the paper AI Can Learn Scientific Taste (arXiv:2603.14473). The accompanying benchmark dataset is SciJudgeBench.

Resources: Project page and GitHub repository.

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "OpenMOSS-Team/SciJudge-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant. You first think about the reasoning process in your mind and then provide the user with the answer."},
    {"role": "user", "content": "Today is 2025-12-10. Based on the titles, abstracts, and publication dates of the following two papers A and B, determine which paper has a higher citation count.\nShow your reasoning process in <reason> </reason> tags. And return the final answer in <answer> </answer> tags. The final answer should contain only 'A' or 'B'.\n\nPaper A:\nTitle: ...\nAbstract: ...\nDate: ...\n\nPaper B:\nTitle: ...\nAbstract: ...\nDate: ..."}
]

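# Render the chat messages into a single prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature, top_p, and top_k only take effect when sampling is enabled, so set do_sample explicitly.
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7, top_p=0.8, top_k=20)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)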
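
The user message has a fixed layout, and the reply is expected to carry its verdict in <answer> tags, so both ends can be wrapped in small helpers. The following is a minimal sketch, not part of the released code: `build_user_prompt`, `parse_answer`, and the dictionary keys `title`, `abstract`, and `date` are hypothetical names chosen for illustration. The template text is copied from the usage example above.

```python
import re
from typing import Mapping, Optional

# Same wording as the user message in the usage example; only the fields vary.
PROMPT_TEMPLATE = (
    "Today is {today}. Based on the titles, abstracts, and publication dates of "
    "the following two papers A and B, determine which paper has a higher citation count.\n"
    "Show your reasoning process in <reason> </reason> tags. And return the final answer "
    "in <answer> </answer> tags. The final answer should contain only 'A' or 'B'.\n\n"
    "Paper A:\nTitle: {title_a}\nAbstract: {abstract_a}\nDate: {date_a}\n\n"
    "Paper B:\nTitle: {title_b}\nAbstract: {abstract_b}\nDate: {date_b}"
)

# Accept optional whitespace around the letter inside the tags.
_ANSWER_RE = re.compile(r"<answer>\s*([AB])\s*</answer>", re.IGNORECASE)


def build_user_prompt(today: str, paper_a: Mapping[str, str], paper_b: Mapping[str, str]) -> str:
    """Fill the prompt template from two paper records (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        today=today,
        title_a=paper_a["title"], abstract_a=paper_a["abstract"], date_a=paper_a["date"],
        title_b=paper_b["title"], abstract_b=paper_b["abstract"], date_b=paper_b["date"],
    )


def parse_answer(response: str) -> Optional[str]:
    """Return 'A' or 'B' from the last <answer> tag, or None if no valid tag is found."""
    matches = _ANSWER_RE.findall(response)
    return matches[-1].upper() if matches else None
```

Returning None instead of raising lets the caller decide how to handle a truncated or malformed generation, for example by resampling.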

Training Details

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Training method: GRPO with DAPO loss
  • Reward: external preference reward for citation-based pairwise judgment
  • Precision: bfloat16
  • KL coefficient: 0.03
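
To make the training signal more concrete, the sketch below shows one way a pairwise reward could feed GRPO's group-relative advantages. It rests on assumptions that the details above do not confirm: the reward is taken to be 1.0 when the tagged answer names the more-cited paper and 0.0 otherwise, and advantages are normalized with the population standard deviation. The names `pairwise_reward` and `group_advantages` are hypothetical, and the DAPO loss and KL term are not shown.

```python
import re
from statistics import mean, pstdev
from typing import List


def pairwise_reward(response: str, more_cited: str) -> float:
    """Assumed binary reward: 1.0 if the last <answer> tag matches the more-cited paper."""
    found = re.findall(r"<answer>\s*([AB])\s*</answer>", response)
    return 1.0 if found and found[-1] == more_cited else 0.0


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of responses sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses for one paper pair in which paper A is the more-cited one.
responses = [
    "<reason>...</reason><answer>A</answer>",
    "<reason>...</reason><answer>B</answer>",
    "<reason>...</reason>",  # missing answer tag, scored as incorrect
    "<reason>...</reason><answer>A</answer>",
]
rewards = [pairwise_reward(r, "A") for r in responses]
advantages = group_advantages(rewards)
```

When every response in a group earns the same reward, all advantages are zero and that group contributes no gradient signal.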

Citation

@misc{tong2026ailearnscientifictaste,
      title={AI Can Learn Scientific Taste}, 
      author={Jingqi Tong and Mingzhe Li and Hangcheng Li and Yongzhuo Yang and Yurong Mou and Weijie Ma and Zhiheng Xi and Hongji Chen and Xiaoran Liu and Qinyuan Cheng and Ming Zhang and Qiguang Chen and Weifeng Ge and Qipeng Guo and Tianlei Ying and Tianxiang Sun and Yining Zheng and Xinchi Chen and Jun Zhao and Ning Ding and Xuanjing Huang and Yugang Jiang and Xipeng Qiu},
      year={2026},
      eprint={2603.14473},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.14473}, 
}