# Model Name


This model predicts whether a chat message should earn participation points. It was developed for the FEV Participation Points project, which studied an intervention in which elementary and middle school tutors received guidance on awarding participation points during chat-based math tutoring sessions.
|
|
---
|
|
## Training Details


### Base model


BERT base model, fine-tuned for binary sequence classification.
|
|
### Datasets

The dataset consisted of a subset of 1,000 messages whose utterance includes the word "point".


| Dataset | Split | Size | Source | Notes |
|---------|-------|------|--------|-------|
| Tutor math chats | train | 1,000 | Shared by tutoring provider | Contains only utterances with the word "point" |
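The subset construction can be sketched as follows. The example utterances and the exact regular expression are illustrative only; the real dataset and its schema are not shown in this card.

```python
import re

# Illustrative chat utterances standing in for the raw tutor chat log.
messages = [
    "Great job, you earned a point!",
    "What is 3 + 4?",
    "I'll add two participation points for that effort.",
]

# Keep only utterances containing the word "point" (case-insensitive),
# mirroring the filtering step described above.
pattern = re.compile(r"\bpoints?\b", re.IGNORECASE)
subset = [m for m in messages if pattern.search(m)]
print(len(subset))  # 2
```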
|
|
|
|
### Hyperparameters


| Parameter | Value |
|-----------|-------|
| Learning rate | 1e-5 |
| Batch size | 8 |
| Optimizer | AdamW (beta1=0.9, beta2=0.999, epsilon=1e-8) |
| Epochs / Steps | 20 epochs with early stopping on minority-class F1 |
| Warmup | 0 |
| Weight decay | 0.01 |
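The early-stopping criterion above can be sketched as the loop below. The patience value and the validation F1 scores are made-up assumptions for illustration; the card does not state the actual patience used.

```python
# Track the minority-class ("no") F1 on a validation split after each epoch and
# stop once it fails to improve for PATIENCE consecutive epochs.
# PATIENCE and the scores below are illustrative assumptions.
PATIENCE = 3
MAX_EPOCHS = 20

val_f1_no = [0.80, 0.85, 0.90, 0.89, 0.91, 0.91, 0.90, 0.90]

best_f1, best_epoch, bad_epochs = 0.0, 0, 0
for epoch in range(1, MAX_EPOCHS + 1):
    if epoch > len(val_f1_no):
        break
    f1 = val_f1_no[epoch - 1]
    if f1 > best_f1:
        best_f1, best_epoch, bad_epochs = f1, epoch, 0  # new best checkpoint
    else:
        bad_epochs += 1
        if bad_epochs >= PATIENCE:
            break  # stop early and keep the best checkpoint

print(best_epoch, best_f1)  # 5 0.91
```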

---

## Evaluation

### Results

| Model | Dataset | Split | Metric | Score |
|-------|---------|-------|--------|-------|
| This model | Subset of math messages with points awarded | test | F1 (yes) | 0.9943 |
| This model | Subset of math messages with points awarded | test | F1 (no) | 0.9583 |
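The reported scores are per-class F1. A minimal sketch of the computation, on made-up labels rather than the project's actual test set:

```python
def f1_for_class(y_true, y_pred, cls):
    """Per-class F1: harmonic mean of precision and recall for one label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative labels only.
y_true = ["yes", "yes", "no", "no", "yes"]
y_pred = ["yes", "yes", "no", "yes", "yes"]
print(round(f1_for_class(y_true, y_pred, "yes"), 4))  # 0.8571
print(round(f1_for_class(y_true, y_pred, "no"), 4))   # 0.6667
```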
| |
### Limitations and Caveats

- The model is highly specific to tasks related to FEV Participation Points.
- The model was trained only on messages that include the word "point", so predictions on other utterances may be unreliable.

---

## How to Use

### Message Structure

The classifier predicts on a single message in isolation, with no preceding context or following utterances.
| |
| |
### Running instructions

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_dir = "model_outputs"  # or a specific checkpoint folder
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()  # disable dropout for inference

text = "Your message here"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():  # no gradients needed at inference time
    logits = model(**inputs).logits
pred_id = logits.argmax(dim=-1).item()
label = {0: "no", 1: "yes"}[pred_id]  # id-to-label mapping assumed from training
print(label)
```
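To report a confidence alongside the predicted label, the logits can be passed through a softmax. A minimal sketch with made-up logits standing in for `model(**inputs).logits[0].tolist()`:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits; the real values come from the model call above.
logits = [-1.2, 2.3]
probs = softmax(logits)
pred_id = probs.index(max(probs))
label = {0: "no", 1: "yes"}[pred_id]
print(label, round(probs[pred_id], 3))  # yes 0.971
```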

---

## Code and Maintainers

**Repository:** https://github.com/scale-nssa/fev_partpoints_nlp
**Maintainers / Contributors:** FEV Participation Points team (lead: JP Martinez)
|
|
---


## Bias and Fairness


The dataset does not include demographic information about tutors or students, so fairness across demographic groups could not be evaluated.
|
|
---


## License


This model is released under [License Name](https://example.com/license).


---
|
|