Toxic Comment Classification with Transformer Optimization
This project demonstrates a pipeline for binary classification of toxic comments. The models were trained and evaluated on the Jigsaw Toxic Comment Classification dataset, with the domain-specific Toxic-BERT model as the primary architecture.
Project Overview
- Objective: To build an efficient binary toxicity classifier using state-of-the-art NLP models.
- Model Type: Binary classification (Toxic vs. Non-Toxic).
- Dataset: Jigsaw Toxic Comment Classification Challenge.
- Scope: Includes data visualization, model benchmarking, and size reduction for deployment.
Technical Workflow
1. Data Preprocessing & EDA
- Labeling: Multi-label categories (toxic, severe_toxic, obscene, threat, insult, identity_hate) were condensed into a single binary 'is_toxic' label.
- Balancing: The dataset was sampled down to 16,000 toxic and 16,000 non-toxic comments, giving a balanced 32,000-sample dataset.
- Cleaning: Newline characters were removed to standardize the text input for transformer tokenizers.
- Visualization: Word clouds were generated for both classes to identify the most frequent terms associated with toxic and non-toxic speech.
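The labeling, balancing, and cleaning steps above can be sketched with pandas as follows. This is a minimal sketch, not the project's actual code: the function name `build_balanced_binary`, the `seed` parameter, and the toy data are illustrative assumptions, while the column names follow the Jigsaw `train.csv` layout.

```python
import pandas as pd

# The six label columns of the Jigsaw train.csv file.
LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def build_balanced_binary(df, n_per_class, seed=42):
    """Hypothetical helper: binary label, newline cleanup, balanced sampling."""
    out = df.copy()
    # A comment counts as toxic if any of the six categories is flagged.
    out["is_toxic"] = out[LABEL_COLS].any(axis=1).astype(int)
    # Replace newline characters with spaces before tokenization.
    out["comment_text"] = (
        out["comment_text"].str.replace(r"[\r\n]+", " ", regex=True).str.strip()
    )
    # Draw the same number of rows from each class.
    toxic = out[out["is_toxic"] == 1].sample(n=n_per_class, random_state=seed)
    clean = out[out["is_toxic"] == 0].sample(n=n_per_class, random_state=seed)
    # Concatenate and shuffle so the classes are interleaved.
    balanced = pd.concat([toxic, clean]).sample(frac=1, random_state=seed)
    return balanced[["comment_text", "is_toxic"]].reset_index(drop=True)


# Toy data standing in for the real dataset (assumption, for illustration only).
toy = pd.DataFrame({
    "comment_text": ["you are\nawful", "nice work", "thanks\r\nfor the fix", "I will hurt you"],
    "toxic": [1, 0, 0, 1],
    "severe_toxic": [0, 0, 0, 0],
    "obscene": [0, 0, 0, 0],
    "threat": [0, 0, 0, 1],
    "insult": [1, 0, 0, 0],
    "identity_hate": [0, 0, 0, 0],
})
balanced = build_balanced_binary(toy, n_per_class=2)
```

With the real data, `n_per_class=16000` would reproduce the 16,000 + 16,000 split described above.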
2. Embedding Benchmarking
The project evaluated 15 different embedding sets across two categories:
- Light Models: Includes DistilBERT, MiniLM, ALBERT, and ELECTRA-Small.
- Heavy Models: Includes BERT, RoBERTa, DeBERTa, XLNet, and domain-specific models like Toxic-BERT and HateBERT.
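The source does not say how token-level transformer outputs were turned into one embedding per comment. One common choice is masked mean pooling, sketched below as an assumption. The function name `mean_pool` is hypothetical; in practice the two inputs would come from a `transformers` tokenizer and model (`attention_mask` and `last_hidden_state`).

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sequence, ignoring padding positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over hidden dims.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Guard against division by zero for an all-padding row.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


# Tiny hand-made example: two real tokens and one padding token.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
embedding = mean_pool(hidden, mask)  # the padded token does not affect the mean
```

The resulting fixed-size vectors are what a downstream classifier such as Logistic Regression would consume.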
3. Model Performance Results
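Each embedding set was evaluated with Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF) classifiers. The table reports AUC for LR and the RBF-kernel SVM, and accuracy (ACC) for the linear SVM and RF, for three of the embeddings.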
| Embedding | LR_AUC | LinearSVM_ACC | RBF_AUC | RF_ACC |
|---|---|---|---|---|
| Toxic-BERT_transformer_emb | 0.997022 | 0.979531 | 0.991532 | 0.979375 |
| HateBERT_transformer_emb | 0.967701 | 0.901875 | 0.965530 | 0.852344 |
| DistilBERT_transformer_emb | 0.967614 | 0.898906 | 0.967362 | 0.878125 |
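A benchmark of this kind can be sketched with scikit-learn as shown below. This is an illustration under assumptions rather than the project's code: the helper `evaluate_embedding`, the 80/20 stratified split, the hyperparameters, and the synthetic embeddings are all hypothetical, and only two of the four reported metrics are computed to keep the sketch short.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC


def evaluate_embedding(X, y, seed=42):
    """Hypothetical helper: fit simple classifiers on precomputed embeddings."""
    # Assumed 80/20 stratified split; the source does not state the split used.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    svm = LinearSVC().fit(X_train, y_train)
    return {
        # AUC is computed from the predicted probability of the toxic class.
        "LR_AUC": roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1]),
        # Accuracy is computed from hard predictions.
        "LinearSVM_ACC": accuracy_score(y_test, svm.predict(X_test)),
    }


# Synthetic, well-separated "embeddings" standing in for real model outputs.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 200)
X = rng.normal(size=(400, 16)) + y[:, None] * 1.5
scores = evaluate_embedding(X, y)
```

Running the same helper once per embedding set would produce one table row per model.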
Optimization Techniques
4. Dynamic Quantization
To optimize the teacher model (Toxic-BERT) for CPU inference, dynamic quantization was applied to convert weights from FP32 to INT8.
- Size Reduction: The model size decreased from 438.01 MB to 181.49 MB.
- Accuracy Retention: The quantized model maintained a Test AUC of 0.9966, showing negligible performance loss despite a size reduction of roughly 59%.
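PyTorch exposes dynamic quantization through `torch.ao.quantization.quantize_dynamic`. The sketch below applies it to a small stand-in network rather than Toxic-BERT, so the layer sizes, the helper `state_dict_size_mb`, and the resulting numbers are illustrative assumptions and not the project's measurements.

```python
import io

import torch
import torch.nn as nn


def state_dict_size_mb(model):
    """Hypothetical helper: serialized size of a model's weights in MB (1e6 bytes)."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return buffer.getbuffer().nbytes / 1e6


# Stand-in for a transformer's feed-forward layers (assumption, not Toxic-BERT).
fp32_model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
    nn.Linear(768, 1),
)
fp32_model.eval()

# Convert nn.Linear weights to INT8; activations are quantized on the fly at inference.
int8_model = torch.ao.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)

fp32_mb = state_dict_size_mb(fp32_model)
int8_mb = state_dict_size_mb(int8_model)
print(f"FP32: {fp32_mb:.2f} MB, INT8: {int8_mb:.2f} MB")
```

With this API only the listed module types are converted, so a transformer's embedding tables would stay in FP32. That is one plausible reason the reported reduction is closer to 59% than the 75% a full FP32-to-INT8 conversion would give.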
5. Knowledge Distillation
A smaller student model (DistilBERT) was trained to mimic the behavior of the Toxic-BERT teacher.
- Loss Function: A custom Binary Knowledge Distillation loss was used, combining Kullback-Leibler (KL) divergence for soft teacher probabilities and Cross-Entropy for hard labels.
- Student Performance: Reached a Validation AUC of 0.9866 after 5 training epochs.
- Final Footprint: The student model is 267.86 MB, about 39% smaller than the roughly 438 MB teacher model.
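A binary distillation loss of the kind described above might look like the following sketch. The source does not give the exact formulation, so the function name `binary_kd_loss`, the single-logit output, the `temperature` and `alpha` parameters, and their default values are all assumptions.

```python
import torch
import torch.nn.functional as F


def binary_kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Hypothetical binary KD loss: KL on softened probabilities plus hard-label BCE."""
    eps = 1e-7
    # Temperature-softened toxic-class probabilities for teacher and student.
    p_t = torch.sigmoid(teacher_logits / temperature).clamp(eps, 1 - eps)
    p_s = torch.sigmoid(student_logits / temperature).clamp(eps, 1 - eps)
    # KL divergence between two Bernoulli distributions, teacher relative to student.
    kl = p_t * (p_t.log() - p_s.log()) + (1 - p_t) * ((1 - p_t).log() - (1 - p_s).log())
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    soft_loss = kl.mean() * temperature ** 2
    # Standard binary cross-entropy against the ground-truth labels.
    hard_loss = F.binary_cross_entropy_with_logits(student_logits, labels.float())
    return alpha * soft_loss + (1 - alpha) * hard_loss


# Toy logits and labels for illustration only.
teacher = torch.tensor([2.0, -1.0, 0.5])
labels = torch.tensor([1, 0, 1])
loss_matching = binary_kd_loss(teacher, teacher, labels)
loss_opposed = binary_kd_loss(-teacher, teacher, labels)
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the weighted hard-label term remains, which is why `loss_matching` is lower than `loss_opposed` here.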
Requirements
- torch
- transformers
- sentence-transformers
- pandas
- numpy
- matplotlib
- wordcloud
- scikit-learn
Model tree for manjt/toxic_comment_classifier
- Base model: distilbert/distilbert-base-uncased