This model is a fine-tuned version of roberta-large that classifies tutor utterances according to the teacher talk moves described at talkmoves.com. It identifies whether a tutor's message falls into one of the following three categories, or none of them: classroom management, pressing for accuracy or reasoning, and restating or revoicing.
This model is trained on text of tutoring sessions from three different tutoring providers, annotated by two raters.
| Class | Examples | Percentage |
|---|---|---|
| None | 987 | 53.4% |
| Classroom management | 284 | 15.4% |
| Pressing for accuracy or reasoning | 474 | 25.6% |
| Restating or revoicing | 104 | 5.6% |
The model is trained on utterances in the following format: [PRETEXT] {3 previous messages} [TEXT] {target message},
where [PRETEXT] and [TEXT] are special tokens.
Names are anonymized, message text is lowercased, and leading and trailing whitespace is removed.
Example:
[PRETEXT] tutor: hello there [student]
tutor: what is the answer to this problem?
student: the answer is 6 [TEXT] tutor: why do you say the answer is 6?
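The input format above can be sketched as a small helper. This is a minimal illustration, not the authors' preprocessing code: the function name `format_input`, the `(speaker, text)` tuple layout, and the newline separator between context messages are assumptions inferred from the example, and name anonymization is assumed to have been done beforehand.

```python
def format_input(context, target, n_context=3):
    """Build a model input string from a conversation (hypothetical helper).

    context: list of (speaker, text) tuples preceding the target message.
    target:  (speaker, text) tuple for the tutor message to classify.
    Assumes names were already replaced by placeholders such as [STUDENT].
    """
    def clean(speaker, text):
        # Lowercase and strip leading/trailing whitespace, as described above.
        return f"{speaker}: {text.strip().lower()}"

    # Keep only the last n_context messages as pretext (assumed newline-joined).
    pretext = "\n".join(clean(s, t) for s, t in context[-n_context:])
    return f"[PRETEXT] {pretext} [TEXT] {clean(*target)}"


context = [
    ("tutor", "Hello there [STUDENT] "),
    ("tutor", "What is the answer to this problem?"),
    ("student", "The answer is 6"),
]
target = ("tutor", "Why do you say the answer is 6?")
print(format_input(context, target))
```

With these inputs the helper reproduces the example string shown above. If the real preprocessing joins context messages with a different separator, only the `"\n".join(...)` line would need to change.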
Test set results (264 examples):
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| None | 0.9368 | 0.9271 | 0.9319 | 192 |
| Classroom Management | 0.6800 | 0.7727 | 0.7234 | 22 |
| Pressing for Accuracy or Reasoning | 0.8605 | 0.8810 | 0.8706 | 42 |
| Restating or Revoicing | 0.3333 | 0.2500 | 0.2857 | 8 |
| Macro Average | 0.7027 | 0.7077 | 0.7029 | 264 |
| Weighted Average | 0.8850 | 0.8864 | 0.8852 | 264 |
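As a sanity check on how the two summary rows relate to the per-class rows, the sketch below recomputes them from the table. It assumes the usual definitions: the macro average is the unweighted mean over the four classes, and the weighted average weights each class by its support. The variable names are illustrative only.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
rows = {
    "None": (0.9368, 0.9271, 0.9319, 192),
    "Classroom Management": (0.6800, 0.7727, 0.7234, 22),
    "Pressing for Accuracy or Reasoning": (0.8605, 0.8810, 0.8706, 42),
    "Restating or Revoicing": (0.3333, 0.2500, 0.2857, 8),
}

total_support = sum(r[3] for r in rows.values())

def macro(idx):
    # Unweighted mean: every class counts equally, regardless of size.
    return sum(r[idx] for r in rows.values()) / len(rows)

def weighted(idx):
    # Support-weighted mean: large classes dominate the result.
    return sum(r[idx] * r[3] for r in rows.values()) / total_support

for name, idx in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro={macro(idx):.4f} weighted={weighted(idx):.4f}")
```

The gap between the macro and weighted F1 scores comes mostly from the small "Restating or Revoicing" class (8 test examples), which counts as a quarter of the macro average but only about 3% of the weighted one.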