	---
	language: en
	license: apache-2.0
	base_model: microsoft/prophetnet-large-uncased
	tags:
	- summarization
	- research-paper
	- seq2seq
	- prophetnet
	- lora
	- peft
	datasets:
	- custom
	metrics:
	- rouge
	- bertscore
	---

	# ProphetNet-Large-Summarization

	A fine-tuned version of [microsoft/prophetnet-large-uncased](https://huggingface.co/microsoft/prophetnet-large-uncased) that condenses sections of research papers into concise summaries. It is the first stage of a two-stage Research Paper Simplifier pipeline.

	## Model Description

	This model takes a section of a research paper as input and generates a plain-language summary. It was fine-tuned using LoRA (PEFT) with 4-bit quantization to keep training memory-efficient.

	## Pipeline

	```
	Research Paper ──► [ProphetNet-Large-Summarization] ──► Summary ──► [ProphetNet-Large-Story-Generation] ──► Story
	```
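
The two stages compose as ordinary function calls. The following is a minimal sketch, assuming each stage is wrapped in a callable that maps text to text. `run_pipeline`, `Stage`, `fake_summarize`, and `fake_story` are hypothetical names introduced for illustration, and the stand-in functions replace the real model calls so the example runs without downloading anything.

```python
from typing import Callable, Dict

# Hypothetical alias: any text-to-text stage, e.g. a wrapper around model.generate.
Stage = Callable[[str], str]


def run_pipeline(paper_section: str, summarize: Stage, tell_story: Stage) -> Dict[str, str]:
    """Chain the two stages and keep the intermediate summary for inspection."""
    summary = summarize(paper_section)
    story = tell_story(summary)
    return {"summary": summary, "story": story}


# Stand-in stages (NOT the real models) so the sketch is self-contained.
def fake_summarize(text: str) -> str:
    # Keep only the first sentence as a toy "summary".
    return text.split(".")[0] + "."


def fake_story(summary: str) -> str:
    # Prefix the summary to mimic a story-generation step.
    return "Once upon a time: " + summary


result = run_pipeline("We propose a method. It works well.", fake_summarize, fake_story)
print(result["summary"])
print(result["story"])
```

Returning the intermediate summary alongside the story makes it easier to tell which stage is responsible when the final output looks wrong.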

	## Training Details

	| Parameter | Value |
	|-----------|-------|
	| Base model | microsoft/prophetnet-large-uncased |
	| Task | Summarization |
	| Max input length | 2048 tokens |
	| Max target length | 256 tokens |
	| Learning rate | 3e-5 |
	| Batch size | 2 |
	| Gradient accumulation steps | 4 |
	| Warmup steps | 1500 |
	| Weight decay | 0.01 |
	| Fine-tuning method | LoRA (r=16, alpha=64, targets: query_proj, value_proj) |
	| Quantization | 4-bit NF4 (bitsandbytes) |
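
A few derived quantities follow directly from the table. The sketch below works them out under stated assumptions: the batch size is taken to be per device, LoRA's standard `alpha / r` scaling is assumed, and `hidden_size = 1024` is an assumed projection width for the base model that this card does not state.

```python
# Values copied from the training table.
batch_size = 2
grad_accum_steps = 4
lora_r = 16
lora_alpha = 64

# Assumption (not stated in this card): width of the attention projections.
hidden_size = 1024

# Sequences contributing to each optimizer step, per device.
effective_batch_size = batch_size * grad_accum_steps

# Standard LoRA scales its low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

# A rank-r adapter on a (d_out x d_in) weight adds r * (d_in + d_out) parameters.
lora_params_per_proj = lora_r * (hidden_size + hidden_size)
full_params_per_proj = hidden_size * hidden_size
trainable_fraction = lora_params_per_proj / full_params_per_proj

print(f"effective batch size: {effective_batch_size}")
print(f"LoRA scaling (alpha / r): {lora_scaling}")
print(f"adapter params per projection: {lora_params_per_proj}")
print(f"fraction of a full projection: {trainable_fraction:.4%}")
```

Under these assumptions, each adapted projection trains only a few percent of the weights a full fine-tune would update, which is why LoRA pairs well with a small batch size and gradient accumulation.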

	## Usage

	```python
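	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	# Load the tokenizer and model from the Hub.
	model_id = "harsharajkumar273/ProphetNet-Large-Summarization"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

	text = "Your research paper section here..."

	# Ask for a summary of roughly one tenth of the input length.
	# max(1, ...) avoids a "less than 0 words" prompt for very short inputs.
	word_count = len(text.split())
	target_words = max(1, word_count // 10)
	prompt = f"Summarize this part of the research paper to less than {target_words} words:\n{text}"

	# Truncate the input to the maximum input length used in training.
	inputs = tokenizer(prompt, return_tensors="pt", max_length=2048, truncation=True)

	# Beam search, capped at the maximum target length used in training.
	outputs = model.generate(**inputs, max_length=256, num_beams=4)
	summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(summary)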
	```

	## Evaluation Metrics

	The model was evaluated with ROUGE and BERTScore on a held-out 10% test split.
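
To make the ROUGE metric concrete, here is a simplified, pure-Python sketch of ROUGE-1 F1 (unigram overlap). It is an illustration only and not the evaluation code behind this card: it assumes lowercased whitespace tokenization with no stemming, and `rouge1_f1` is a hypothetical helper name. Standard ROUGE packages differ in tokenization and normalization details, so their scores will not match exactly.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "the model summarizes research papers",
    "the model summarizes papers",
)
print(f"ROUGE-1 F1: {score:.4f}")
```

BERTScore complements this kind of surface overlap by comparing contextual embeddings, so a paraphrased summary can still score well.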

	## Related Models

	- [harsharajkumar273/Bart-Base-Summarization](https://huggingface.co/harsharajkumar273/Bart-Base-Summarization)
	- [harsharajkumar273/T5-Base-Summarization](https://huggingface.co/harsharajkumar273/T5-Base-Summarization)
	- [harsharajkumar273/ProphetNet-Large-Story-Generation](https://huggingface.co/harsharajkumar273/ProphetNet-Large-Story-Generation) (next stage of the pipeline)