---
license: apache-2.0
language:
- en
tags:
- peer-review
- scientific-review
- lora
- qwen
- fine-tuned
- icml2026
pipeline_tag: text-generation
---

# FirstPass: Grounding AI Scientific Judgment in Multi-Round Editorial Outcomes

Official model weights for the **ICML 2026 AI4Science Workshop** papers:

- **Research Track**: "FirstPass: Grounding AI Scientific Judgment in Multi-Round Editorial Outcomes"
- **Dataset Competition Track**: "FIRSTPASS: A Multi-Domain, Multi-Round Peer Review Dataset Grounded in Real Editorial Outcomes"

**Authors:** Prabhjot Singh, Somnath Luitel, Manmeet Singh, Josh Durkee

---

## 🗂️ What's in this repo

| Folder | Task | Description |
|---|---|---|
| `cls_revision_prediction/` | Classification | Predicts Standard (2-round) vs Extended (3+ round) editorial outcome |
| `sft_review_generation/` | Generation | Generates full scientific peer reviews |
| `checkpoints_v4/cls_adapter/final/` | Classification (alt path) | Same cls adapter via checkpoints folder |
| `checkpoints_v4/sft_adapter/final/` | Generation (alt path) | Same sft adapter via checkpoints folder |

**Use `cls_revision_prediction/` and `sft_review_generation/` directly**: those are the clean final adapters.

---

## 📊 Key Results

| Task | Metric | Score |
|---|---|---|
| Revision-Cycle Prediction | Accuracy | **80.5%** |
| Revision-Cycle Prediction | F1-macro | **78.2%** |
| Review Generation | ROUGE-L | **0.154** |
| Review Generation | Avg. length | 1,187 words |

The masking finding: without response-only loss masking → 62.0% accuracy (below majority baseline). With masking → 80.5%. For long-input/short-output classification, masking is an architectural prerequisite, not an optimization trick.
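
To make the masking finding concrete, here is a minimal sketch of what response-only loss masking usually means in causal-LM fine-tuning. It is an illustration under stated assumptions, not the authors' training code: the helper names `build_labels` and `supervised_fraction` are hypothetical, and the token ids are toy values. The one fixed convention is the label value `-100`, which PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def build_labels(prompt_ids, response_ids, mask_prompt=True):
    """Concatenate prompt and response tokens and build the training labels.

    With mask_prompt=True, prompt positions get IGNORE_INDEX, so only the
    response tokens contribute to the loss (response-only masking).
    Hypothetical helper for illustration only.
    """
    input_ids = list(prompt_ids) + list(response_ids)
    if mask_prompt:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    else:
        labels = list(input_ids)
    return input_ids, labels


def supervised_fraction(labels):
    """Fraction of positions that actually contribute to the loss."""
    return sum(1 for t in labels if t != IGNORE_INDEX) / len(labels)


# Toy example: a long manuscript prompt (1000 tokens) and a 2-token class label.
prompt_ids = list(range(1000))
response_ids = [7, 8]

_, masked = build_labels(prompt_ids, response_ids)
_, unmasked = build_labels(prompt_ids, response_ids, mask_prompt=False)

print(f"masked:   {supervised_fraction(masked):.4f} of positions supervised")
print(f"unmasked: {supervised_fraction(unmasked):.4f} of positions supervised")
```

Without masking, nearly all of the loss in this toy setup comes from reproducing the prompt, and the label tokens are a tiny share of the training signal. That is one plausible reading of why the unmasked run lands below the majority baseline.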
---

## 🚀 Quick Start

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen2.5-7B-Instruct"  # or whichever base was used
repo_id = "prabhjotschugh/FirstPass-Models"
subfolder = "cls_revision_prediction"  # or "sft_review_generation"

# The adapters live in subfolders of the repo, so pass the folder via
# `subfolder` instead of appending it to the repo id.
# If the adapter folder ships no tokenizer files, load the tokenizer
# from `base_model` instead.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, repo_id, subfolder=subfolder)
```

---

## 📥 Dataset

[🤗 firstpass-peer-review](https://huggingface.co/datasets/Prabhjotschugh/firstpass-peer-review): 3,668 multi-round peer-review dialogues from *Nature Communications* across 5 domains (biology, chemistry, neuroscience, physics, earth science).
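
A minimal sketch for pulling the dataset with the `datasets` library, assuming the repo loads with its default configuration. The repo id comes from the link above; the helper name `load_firstpass` is hypothetical, and split and column names are not documented here, so inspect the returned object rather than relying on any particular schema.

```python
DATASET_REPO = "Prabhjotschugh/firstpass-peer-review"  # repo id from the dataset link


def load_firstpass():
    """Download the dataset from the Hugging Face Hub and return it.

    Hypothetical helper; assumes the default configuration is loadable.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_REPO)
```

Calling `print(load_firstpass())` lists the available splits, column names, and row counts, which is a quick way to check the structure before writing any preprocessing code.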