Toxic Comment Classification with Transformer Optimization

This project builds a pipeline for binary classification of toxic comments. Models were trained and evaluated on the Jigsaw Toxic Comment Classification dataset, with the domain-specific Toxic-BERT model as the primary architecture.

Project Overview

Objective: To build an efficient binary toxicity classifier using state-of-the-art NLP models.
Model Type: Binary classification (Toxic vs. Non-Toxic).
Dataset: Jigsaw Toxic Comment Classification Challenge.
Scope: Includes data visualization, model benchmarking, and size reduction for deployment.

Technical Workflow

1. Data Preprocessing & EDA

Labeling: Multi-label categories (toxic, severe_toxic, obscene, threat, insult, identity_hate) were condensed into a single binary 'is_toxic' label.
Balancing: The dataset was sampled to include 16,000 toxic and 16,000 non-toxic comments to ensure a balanced 32,000-sample training set.
Cleaning: Newline characters were removed to standardize the text input for transformer tokenizers.
Visualization: Word clouds were generated for both classes to identify the most frequent terms associated with toxic and non-toxic speech.
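
The labeling, balancing, and cleaning steps above can be sketched with pandas. This is a minimal illustration rather than the project's actual code: the helper name `build_balanced_binary`, the random seed, and the choice to replace newlines with a space (instead of deleting them outright) are assumptions. The column names follow the Jigsaw `train.csv` layout.

```python
import pandas as pd

# Label columns as listed above; "comment_text" is the Jigsaw text column.
LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def build_balanced_binary(df: pd.DataFrame, n_per_class: int, seed: int = 42) -> pd.DataFrame:
    """Hypothetical helper: binarize labels, clean text, and balance classes."""
    out = df.copy()
    # A comment counts as toxic if any of the six labels is set.
    out["is_toxic"] = (out[LABEL_COLS].max(axis=1) > 0).astype(int)
    # Replace runs of newline characters with one space (assumed cleaning rule).
    out["comment_text"] = (
        out["comment_text"].str.replace(r"[\r\n]+", " ", regex=True).str.strip()
    )
    # Draw the same number of rows from each class, then shuffle.
    parts = [
        out[out["is_toxic"] == label].sample(n=n_per_class, random_state=seed)
        for label in (1, 0)
    ]
    return pd.concat(parts).sample(frac=1, random_state=seed).reset_index(drop=True)


# Toy data standing in for train.csv; the project used 16,000 rows per class.
toy = pd.DataFrame({
    "comment_text": ["you are\nawful", "this is crap", "thanks\n\nfor the fix",
                     "ok", "idiot", "see talk page"],
    "toxic": [1, 0, 0, 0, 1, 0],
    "severe_toxic": [0, 0, 0, 0, 0, 0],
    "obscene": [0, 1, 0, 0, 0, 0],
    "threat": [0, 0, 0, 0, 0, 0],
    "insult": [0, 0, 0, 0, 1, 0],
    "identity_hate": [0, 0, 0, 0, 0, 0],
})
balanced = build_balanced_binary(toy, n_per_class=2)
print(balanced["is_toxic"].value_counts().to_dict())
```

With the real dataset, the same call with `n_per_class=16000` would yield the 32,000-sample set described above.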

2. Embedding Benchmarking

The project evaluated 15 different embedding sets across two categories:

Light Models: Includes DistilBERT, MiniLM, ALBERT, and ELECTRA-Small.
Heavy Models: Includes BERT, RoBERTa, DeBERTa, XLNet, and domain-specific models like Toxic-BERT and HateBERT.
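
The source does not state how a fixed-size embedding was derived from each transformer. One common approach, shown here purely as an assumption, is masked mean pooling over the final hidden states, so that padding tokens do not affect the sentence vector. The function name `mean_pool` and the dummy tensors are illustrative only.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padded positions.

    last_hidden_state: (batch, seq_len, hidden)
    attention_mask:    (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Clamp avoids division by zero for an all-padding row.
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts


# Dummy batch: one sequence with two real tokens and one padding token.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
pooled = mean_pool(hidden, mask)
print(pooled)
```

In practice `hidden` would be the `last_hidden_state` returned by a `transformers` model and `mask` the tokenizer's `attention_mask`.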

3. Model Performance Results

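Each embedding set was evaluated with Logistic Regression (LR), Support Vector Machines (SVM, with linear and RBF kernels), and Random Forest (RF) classifiers. The table reports AUC for LR and the RBF SVM, and accuracy (ACC) for the linear SVM and RF.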

Embedding	LR_AUC	LinearSVM_ACC	RBF_AUC	RF_ACC
Toxic-BERT_transformer_emb	0.997022	0.979531	0.991532	0.979375
HateBERT_transformer_emb	0.967701	0.901875	0.965530	0.852344
DistilBERT_transformer_emb	0.967614	0.898906	0.967362	0.878125
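
A benchmark of this shape can be sketched with scikit-learn. The helper name `benchmark_embedding`, the 80/20 split, the hyperparameters, and the synthetic features standing in for real embeddings are all assumptions; only the four metric names come from the table above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC


def benchmark_embedding(X, y, seed=42):
    """Hypothetical helper: score one embedding matrix with four classifiers."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    lin_svm = LinearSVC().fit(X_tr, y_tr)
    rbf_svm = SVC(kernel="rbf").fit(X_tr, y_tr)
    rf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X_tr, y_tr)
    return {
        # AUC needs a continuous score: class-1 probability or decision margin.
        "LR_AUC": roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]),
        "LinearSVM_ACC": accuracy_score(y_te, lin_svm.predict(X_te)),
        "RBF_AUC": roc_auc_score(y_te, rbf_svm.decision_function(X_te)),
        "RF_ACC": accuracy_score(y_te, rf.predict(X_te)),
    }


# Synthetic features in place of transformer embeddings.
X, y = make_classification(n_samples=400, n_features=32, random_state=0)
scores = benchmark_embedding(X, y)
print(scores)
```

Running the helper once per embedding set and stacking the returned dictionaries would give one table row per model.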

Optimization Techniques

4. Dynamic Quantization

To optimize the teacher model (Toxic-BERT) for CPU inference, dynamic quantization was applied to convert weights from FP32 to INT8.

Size Reduction: The model size decreased from 438.01 MB to 181.49 MB.
Accuracy Retention: The quantized model maintained a Test AUC of 0.9966, a negligible loss in performance despite a size reduction of roughly 58.6%.
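
Dynamic quantization of this kind can be sketched with PyTorch's `quantize_dynamic`. The small `nn.Sequential` model below is a stand-in so the example runs without downloading Toxic-BERT, and the helper `state_dict_size_mb` is hypothetical. Whether the project quantized only `nn.Linear` layers is an assumption.

```python
import io

import torch
import torch.nn as nn


def state_dict_size_mb(model: nn.Module) -> float:
    """Serialize the state dict in memory and return its size in MB."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return buffer.getbuffer().nbytes / 1e6


# Toy stand-in for a transformer: linear layers only.
fp32_model = nn.Sequential(nn.Linear(768, 768), nn.ReLU(), nn.Linear(768, 2)).eval()

# Dynamic quantization stores nn.Linear weights as INT8;
# activations are quantized on the fly at inference time.
int8_model = torch.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)

fp32_mb = state_dict_size_mb(fp32_model)
int8_mb = state_dict_size_mb(int8_model)
reduction_pct = 100.0 * (fp32_mb - int8_mb) / fp32_mb
print(f"{fp32_mb:.2f} MB -> {int8_mb:.2f} MB ({reduction_pct:.1f}% smaller)")
```

On a full BERT-style model under this setup, modules other than `nn.Linear` (such as the embedding tables) would remain in FP32, which is one plausible reason the overall reduction is smaller than the 4x that INT8 weights alone would suggest.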

5. Knowledge Distillation

A smaller student model (DistilBERT) was trained to mimic the behavior of the Toxic-BERT teacher.

Loss Function: A custom Binary Knowledge Distillation loss was used, combining Kullback-Leibler (KL) divergence for soft teacher probabilities and Cross-Entropy for hard labels.
Student Performance: Reached a Validation AUC of 0.9866 after 5 training epochs.
Final Footprint: The student model is 267.86 MB, about 39% smaller than the original 438.03 MB teacher model.
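
A loss that combines a KL term on soft teacher probabilities with Cross-Entropy on hard labels can be sketched as follows. The exact formulation used in the project is not given, so the two-class logit layout, the `temperature` and `alpha` parameters, and the function name `distillation_loss` are all assumptions.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Hypothetical KD loss: alpha * KL(teacher || student) + (1 - alpha) * CE."""
    # Soften both distributions with the same temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # kl_div expects log-probabilities as input and probabilities as target;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    # Standard cross-entropy against the hard 0/1 labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce


# Dummy batch of two comments with [non-toxic, toxic] logits.
student_logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])
teacher_logits = torch.tensor([[-1.0, 2.0], [1.5, 0.5]])
labels = torch.tensor([0, 1])

loss_diff = distillation_loss(student_logits, teacher_logits, labels)
loss_same = distillation_loss(student_logits, student_logits, labels)
print(loss_diff.item(), loss_same.item())
```

When the student already matches the teacher, the KL term vanishes and only the weighted Cross-Entropy remains, so `alpha` controls how strongly the student is pulled toward the teacher versus the ground-truth labels.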

Requirements

torch
transformers
sentence-transformers
pandas, numpy
matplotlib, wordcloud
scikit-learn

Model Details

Repository: manjt/toxic_comment_classifier
Base Model: distilbert/distilbert-base-uncased (fine-tuned)
Model Size: 67M params
Tensor Type: F32
Format: Safetensors