---
language: en
license: apache-2.0
base_model: microsoft/prophetnet-large-uncased
tags:
- summarization
- research-paper
- seq2seq
- prophetnet
- lora
- peft
datasets:
- custom
metrics:
- rouge
- bertscore
---

# ProphetNet-Large-Summarization

A fine-tuned version of [microsoft/prophetnet-large-uncased](https://huggingface.co/microsoft/prophetnet-large-uncased) that condenses research paper sections into concise summaries. It is the first stage of a two-stage **Research Paper Simplifier** pipeline.


## Model Description

This model takes a section of a research paper as input and generates a plain-language summary. It was fine-tuned with LoRA (via PEFT) on a 4-bit quantized base model for efficient training.


## Pipeline

```
Research Paper ──► [ProphetNet-Large-Summarization] ──► Summary ──► [ProphetNet-Large-Story-Generation] ──► Story
```

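The two stages in the diagram can be chained with a few lines of glue code. The sketch below is only an illustration: `run_pipeline`, `summarize`, and `tell_story` are hypothetical names, and each stage is passed in as a plain callable so that the wiring is independent of how the models are loaded.

```python
from typing import Callable, Dict


def run_pipeline(
    section: str,
    summarize: Callable[[str], str],
    tell_story: Callable[[str], str],
) -> Dict[str, str]:
    """Run both stages in order and keep the intermediate summary."""
    summary = summarize(section)   # stage 1: paper section -> summary
    story = tell_story(summary)    # stage 2: summary -> story
    return {"summary": summary, "story": story}
```

In practice `summarize` would wrap this model and `tell_story` would wrap the story-generation model listed under Related Models.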

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | microsoft/prophetnet-large-uncased |
| Task | Summarization |
| Max input length | 2048 tokens |
| Max target length | 256 tokens |
| Learning rate | 3e-5 |
| Batch size | 2 |
| Gradient accumulation steps | 4 |
| Warmup steps | 1500 |
| Weight decay | 0.01 |
| Fine-tuning method | LoRA (r=16, alpha=64, targets: query_proj, value_proj) |
| Quantization | 4-bit NF4 (bitsandbytes) |

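The table implies two derived quantities, the effective batch size and the LoRA scaling factor. The sketch below works them out and shows one way the LoRA and quantization settings might be expressed with `peft` and `transformers`. It is not the author's training script: `build_lora_model` is a hypothetical helper, a single training device is assumed, and any setting not listed in the table is left at the library default.

```python
# Values copied from the table above.
PER_DEVICE_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 4
LORA_R = 16
LORA_ALPHA = 64

# Examples contributing to each optimizer step (assumes a single device).
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRAD_ACCUM_STEPS  # 2 * 4 = 8

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = LORA_ALPHA / LORA_R  # 64 / 16 = 4.0


def build_lora_model(base_id: str = "microsoft/prophetnet-large-uncased"):
    """Hypothetical helper: load the base model in 4-bit NF4 and attach LoRA adapters."""
    # Imported lazily so the arithmetic above runs without these packages installed.
    from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
    model = AutoModelForSeq2SeqLM.from_pretrained(base_id, quantization_config=quant_config)
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=LORA_R,
        lora_alpha=LORA_ALPHA,
        target_modules=["query_proj", "value_proj"],
        task_type=TaskType.SEQ_2_SEQ_LM,
    )
    return get_peft_model(model, lora_config)
```

Calling `build_lora_model()` requires a CUDA-capable GPU with `bitsandbytes` installed, and it downloads the base checkpoint.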

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned tokenizer and model from the Hub.
model_id = "harsharajkumar273/ProphetNet-Large-Summarization"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Your research paper section here..."

# Request a summary shorter than one tenth of the input's word count.
word_count = len(text.split())
prompt = f"Summarize this part of the research paper to less than {word_count // 10} words:\n{text}"

# Truncate to the 2048-token input limit, then decode with beam search.
inputs = tokenizer(prompt, return_tensors="pt", max_length=2048, truncation=True)
outputs = model.generate(**inputs, max_length=256, num_beams=4)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```

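The prompt construction in the snippet above can be factored into a small helper so that every input uses the same wording. This is a sketch rather than part of the released code: `build_prompt`, `PROMPT_TEMPLATE`, and the `ratio` parameter are hypothetical names, and the template text is copied from the usage example.

```python
PROMPT_TEMPLATE = "Summarize this part of the research paper to less than {limit} words:\n{text}"


def build_prompt(text: str, ratio: int = 10) -> str:
    """Build the summarization prompt with a word limit of len(words) // ratio."""
    limit = len(text.split()) // ratio
    return PROMPT_TEMPLATE.format(limit=limit, text=text)
```

Inputs shorter than ten words produce a limit of 0, exactly as in the usage example, so very short sections may need separate handling.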

## Evaluation Metrics

The model was evaluated with ROUGE and BERTScore on a held-out test split comprising 10% of the data.

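For intuition about what ROUGE measures, the sketch below computes a simplified ROUGE-1 F1 score from unigram overlap. It is illustrative only and is not the evaluation code behind this model card: `rouge1_f1` is a hypothetical helper that splits on whitespace and lowercases, without the stemming or tokenization rules of standard ROUGE packages.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

A candidate that copies two words from a six-word reference scores a precision of 1.0 and a recall of 1/3, which gives an F1 of 0.5.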

## Related Models

- [harsharajkumar273/Bart-Base-Summarization](https://huggingface.co/harsharajkumar273/Bart-Base-Summarization)
- [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization)
- [harsharajkumar273/ProphetNet-Large-Story-Generation](https://huggingface.co/harsharajkumar273/ProphetNet-Large-Story-Generation) (next stage)