| ---
|
| license: mit
|
| language:
|
| - en
|
| tags:
|
| - intent-classification
|
| - tinybert
|
| - cnn
|
| - education
|
| - nlp
|
| pipeline_tag: text-classification
|
| ---
|
|
|
# IntentClassifier: TinyBERT-CNN
|
|
|
| A lightweight intent classification model for educational AI tutoring systems. Built on **TinyBERT** (huawei-noah/TinyBERT_General_4L_312D) with a **CNN classification head**, optimized for real-time student intent detection.
|
|
|
| ## Model Description
|
|
|
| This model classifies student utterances into 5 pedagogical intent categories:
|
|
|
| Label ID | Intent | Description |
|----------|--------|-------------|
| 0 | **On-Topic Question** | Questions related to the current learning material |
| 1 | **Off-Topic Question** | Questions unrelated to the current topic |
| 2 | **Emotional-State** | Expressions of frustration, confusion, excitement, etc. |
| 3 | **Pace-Related** | Requests to speed up, slow down, or adjust pacing |
| 4 | **Repeat/Clarification** | Requests to repeat or clarify previous explanations |
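For downstream post-processing, the label IDs above can be mapped to intent names with a plain dict (the name `ID2INTENT` is illustrative, not part of the shipped API; the names match the table exactly):

```python
# Label-ID to intent-name mapping, as listed in the table above.
# The dict name ID2INTENT is illustrative, not part of the shipped API.
ID2INTENT = {
    0: "On-Topic Question",
    1: "Off-Topic Question",
    2: "Emotional-State",
    3: "Pace-Related",
    4: "Repeat/Clarification",
}

print(ID2INTENT[3])  # Pace-Related
```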
|
|
|
| ## Architecture
|
|
|
- **Backbone**: TinyBERT (4-layer, 312-dim), a compact BERT variant
|
| - **Head**: Multi-kernel CNN (filter sizes 2, 3, 4) + BatchNorm + FC hidden layer
|
| - **Parameters**: ~14.5M total
|
| - **Inference**: <50ms on CPU
|
|
|
| ```
|
TinyBERT → CNN (multi-kernel) → BatchNorm → MaxPool → FC(768→128) → FC(128→5)
|
| ```
|
|
|
| ## Performance
|
|
|
| Metric | Score |
|--------|-------|
| **Accuracy** | 99.6% |
| **F1 Score** | 99.6% |
| **Precision** | 99.6% |
| **Recall** | 99.6% |
| **Test Loss** | 0.069 |
|
|
|
| ### Per-Class Performance
|
|
|
| Intent | Precision | Recall | F1 | Support |
|--------|-----------|--------|----|---------|
| On-Topic Question | 0.993 | 0.997 | 0.995 | 300 |
| Off-Topic Question | 0.997 | 0.993 | 0.995 | 300 |
| Emotional-State | 0.993 | 1.000 | 0.997 | 300 |
| Pace-Related | 0.997 | 0.993 | 0.995 | 300 |
| Repeat/Clarification | 1.000 | 0.997 | 0.998 | 300 |
|
|
|
| ## Training Details
|
|
|
| - **Epochs**: 7 (early stopping with patience=5)
|
| - **Batch Size**: 16
|
| - **Optimizer**: AdamW with discriminative fine-tuning (BERT LR: 2e-5, Head LR: 1e-3)
|
| - **Scheduler**: Warmup + Cosine decay
|
| - **Label Smoothing**: 0.1
|
| - **Training Duration**: ~212 seconds
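The optimizer setup described above can be sketched as follows. This is a torch-only approximation, not the shipped `train.py` (which may use `transformers` scheduler utilities instead); the toy model, warmup step count, and total step count are placeholders.

```python
import math
import torch
import torch.nn as nn

# Toy stand-in for the real model: a "bert" backbone and a classification "head".
model = nn.ModuleDict({
    "bert": nn.Linear(312, 312),
    "head": nn.Linear(312, 5),
})

# Discriminative fine-tuning: small LR for the backbone, larger LR for the head.
optimizer = torch.optim.AdamW([
    {"params": model["bert"].parameters(), "lr": 2e-5},
    {"params": model["head"].parameters(), "lr": 1e-3},
])

warmup_steps, total_steps = 100, 1000  # placeholder values

def warmup_cosine(step):
    # Linear warmup, then cosine decay toward 0; applied as a
    # multiplier on each parameter group's base LR.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_cosine)

# Label smoothing as listed above.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
```

Each training step would then run `optimizer.step()` followed by `scheduler.step()`.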
|
|
|
| ## Usage
|
|
|
```python
from TinyBert import IntentClassifier

# Initialize
classifier = IntentClassifier(num_classes=5)
classifier.load_model("prod_tinybert.pt")

# Predict
texts = ["Can you explain that again?"]
contexts = ["topic:Python Loops"]
predictions, probabilities = classifier.predict(texts, contexts)

intent_names = ['On-Topic Question', 'Off-Topic Question', 'Emotional-State',
                'Pace-Related', 'Repeat/Clarification']
print(f"Predicted: {intent_names[predictions[0]]}")
print(f"Confidence: {probabilities[0][predictions[0]]:.4f}")
```
|
|
|
| ### Compound Sentence Splitting
|
|
|
| The model includes a `CompoundSentenceSplitter` that can detect and split compound questions:
|
|
|
```python
from TinyBert import CompoundSentenceSplitter

splitter = CompoundSentenceSplitter()
questions = splitter.split_compound_question("What is a loop and how do I use it?")
# Returns: ["What is a loop?", "how do I use it?"]
```
|
|
|
| ## Files
|
|
|
| File | Description |
|------|-------------|
| `TinyBert.py` | Model architecture, IntentClassifier wrapper, CompoundSentenceSplitter |
| `train.py` | Full training pipeline with early stopping and metrics |
| `auto_trainer.py` | Automated retraining pipeline |
| `dataset_generator.py` | Synthetic data generation |
| `test_suite.py` | Comprehensive test suite |
| `prod_tinybert.pt` | Production model weights |
| `best_tinybert.pt` | Best checkpoint from latest training |
| `training_results.json` | Detailed training metrics and history |
| `data/` | Training, validation, and test datasets |
|
|
|
| ## Requirements
|
|
|
| ```
|
| torch
|
| transformers
|
| pandas
|
| numpy
|
| scikit-learn
|
| tqdm
|
| ```
|
|
|
| ## Citation
|
|
|
| Part of the **AI-Powered Personalized Learning Platform** graduation project.
|
|
|