---
license: apache-2.0
language:
- en
tags:
- peer-review
- scientific-review
- lora
- qwen
- fine-tuned
- icml2026
pipeline_tag: text-generation
---

# FirstPass: Grounding AI Scientific Judgment in Multi-Round Editorial Outcomes

Official model weights for the **ICML 2026 AI4Science Workshop** papers:
- **Research Track**: "FirstPass: Grounding AI Scientific Judgment in Multi-Round Editorial Outcomes"
- **Dataset Competition Track**: "FIRSTPASS: A Multi-Domain, Multi-Round Peer Review Dataset Grounded in Real Editorial Outcomes"

**Authors:** Prabhjot Singh, Somnath Luitel, Manmeet Singh, Josh Durkee

---

## What's in this repo

| Folder | Task | Description |
|---|---|---|
| `cls_revision_prediction/` | Classification | Predicts Standard (2-round) vs. Extended (3+ round) editorial outcome |
| `sft_review_generation/` | Generation | Generates full scientific peer reviews |
| `checkpoints_v4/cls_adapter/final/` | Classification (alt path) | Same classification adapter, stored under the checkpoints folder |
| `checkpoints_v4/sft_adapter/final/` | Generation (alt path) | Same SFT adapter, stored under the checkpoints folder |

**Use `cls_revision_prediction/` and `sft_review_generation/` directly**; those are the clean final adapters.

---

## Key Results

| Task | Metric | Score |
|---|---|---|
| Revision-Cycle Prediction | Accuracy | **80.5%** |
| Revision-Cycle Prediction | F1-macro | **78.2%** |
| Review Generation | ROUGE-L | **0.154** |
| Review Generation | Avg. length | 1,187 words |

The masking finding: without response-only loss masking, accuracy is 62.0% (below the majority baseline); with masking, it rises to 80.5%. A likely reason is that, without masking, the loss is dominated by the many input tokens rather than the few label tokens. For long-input/short-output classification, masking is a prerequisite, not an optimization trick.
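As a rough illustration of what response-only loss masking means, the sketch below builds a label sequence in which prompt positions are replaced by `-100`, the value that PyTorch's cross-entropy loss ignores by default. The helper name `mask_prompt_labels` and the toy token ids are hypothetical; this is not the authors' training code, only a minimal sketch of the idea.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def mask_prompt_labels(input_ids, prompt_len, ignore_index=IGNORE_INDEX):
    """Return labels that supervise only the response tokens.

    input_ids:  token ids for prompt + response, as a flat list
    prompt_len: number of leading tokens that belong to the prompt
    """
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must lie between 0 and len(input_ids)")
    # Prompt positions get ignore_index so they contribute nothing to the loss;
    # response positions keep their token ids as targets.
    return [ignore_index] * prompt_len + list(input_ids[prompt_len:])


# Toy example (made-up ids): a 6-token prompt followed by a 2-token label.
input_ids = [11, 12, 13, 14, 15, 16, 901, 902]
labels = mask_prompt_labels(input_ids, prompt_len=6)
print(labels)  # [-100, -100, -100, -100, -100, -100, 901, 902]
```

In a real manuscript-to-label example the prompt may run to thousands of tokens while the label is only a few, which is why the unmasked loss can be dominated by the input.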

---

## Quick Start

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen2.5-7B-Instruct"  # or whichever base was used
repo_id = "prabhjotschugh/FirstPass-Models"
subfolder = "cls_revision_prediction"  # or "sft_review_generation"

# The adapters live in subfolders of the repo, so pass `subfolder`
# instead of appending the folder name to the repo id.
# If the adapter folder has no tokenizer files, load the tokenizer from base_model.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, repo_id, subfolder=subfolder)
```
---

## Dataset

[firstpass-peer-review](https://huggingface.co/datasets/Prabhjotschugh/firstpass-peer-review): 3,668 multi-round peer-review dialogues from *Nature Communications* across 5 domains (biology, chemistry, neuroscience, physics, earth science).
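A minimal sketch of loading the dataset with the `datasets` library, assuming the repo can be read by `load_dataset` with its default configuration. The helper name `load_firstpass` is hypothetical, and split and column names are not documented here, so inspect the returned object before relying on them.

```python
# Dataset repo id taken from the link above.
DATASET_ID = "Prabhjotschugh/firstpass-peer-review"


def load_firstpass(split=None):
    """Load the dataset from the Hugging Face Hub.

    With split=None, `load_dataset` returns every available split.
    Split and column names are assumptions to verify on the dataset page.
    """
    # Imported lazily so this snippet can be read or imported
    # without the `datasets` package installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


# Usage (downloads data, requires network access):
#   data = load_firstpass()
#   print(data)  # shows the available splits, column names, and row counts
```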