Toxic Comment Classification with Transformer Optimization

This project demonstrates a pipeline for binary classification of toxic comments. The models were trained and evaluated on the Jigsaw Toxic Comment Classification dataset, with the domain-specific Toxic-BERT model as the primary architecture.

Project Overview

  • Objective: To build an efficient binary toxicity classifier using pretrained transformer models.
  • Model Type: Binary classification (Toxic vs. Non-Toxic).
  • Dataset: Jigsaw Toxic Comment Classification Challenge.
  • Scope: Includes data visualization, model benchmarking, and size reduction for deployment.

Technical Workflow

1. Data Preprocessing & EDA

  • Labeling: Multi-label categories (toxic, severe_toxic, obscene, threat, insult, identity_hate) were condensed into a single binary 'is_toxic' label.
  • Balancing: The dataset was sampled to include 16,000 toxic and 16,000 non-toxic comments to ensure a balanced 32,000-sample training set.
  • Cleaning: Newline characters were removed to standardize the text input for transformer tokenizers.
  • Visualization: Word clouds were generated for both classes to identify the most frequent terms associated with toxic and non-toxic speech.
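
The labeling, cleaning, and balancing steps above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the `comment_text` column and the six label columns follow the public Jigsaw CSV layout, while the helper name `make_balanced_binary`, the random seed, and the choice to replace each newline with a single space (rather than delete it outright) are assumptions.

```python
import pandas as pd

# The six multi-label columns that are collapsed into one binary label.
LABEL_COLS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def make_balanced_binary(df: pd.DataFrame, n_per_class: int, seed: int = 42) -> pd.DataFrame:
    """Hypothetical helper: binarize labels, strip newlines, balance the classes."""
    out = df.copy()

    # A comment counts as toxic if any of the six labels is set.
    out["is_toxic"] = (out[LABEL_COLS].max(axis=1) > 0).astype(int)

    # Replace each run of newlines (and surrounding whitespace) with one space.
    out["comment_text"] = (
        out["comment_text"].str.replace(r"\s*\n\s*", " ", regex=True).str.strip()
    )

    # Draw the same number of rows from each class, then shuffle.
    toxic = out[out["is_toxic"] == 1].sample(n=n_per_class, random_state=seed)
    clean = out[out["is_toxic"] == 0].sample(n=n_per_class, random_state=seed)
    balanced = pd.concat([toxic, clean]).sample(frac=1, random_state=seed)
    return balanced.reset_index(drop=True)


# Toy data standing in for the real training CSV.
toy = pd.DataFrame(
    {
        "comment_text": ["hello\nworld", "you are awful", "nice\n\nedit", "terrible insult"],
        "toxic": [0, 1, 0, 0],
        "severe_toxic": [0, 0, 0, 0],
        "obscene": [0, 0, 0, 0],
        "threat": [0, 0, 0, 0],
        "insult": [0, 0, 0, 1],
        "identity_hate": [0, 0, 0, 0],
    }
)
balanced = make_balanced_binary(toy, n_per_class=2)
```

With the real training data, a call such as `make_balanced_binary(df, n_per_class=16_000)` would yield the 32,000-row balanced set described above.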

2. Embedding Benchmarking

The project evaluated 15 different embedding sets across two categories:

  • Light Models: Includes DistilBERT, MiniLM, ALBERT, and ELECTRA-Small.
  • Heavy Models: Includes BERT, RoBERTa, DeBERTa, XLNet, and domain-specific models like Toxic-BERT and HateBERT.

3. Model Performance Results

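Each embedding set was used to train Logistic Regression (LR), Support Vector Machine (SVM, with linear and RBF kernels), and Random Forest (RF) classifiers. The table reports AUC or accuracy (ACC), as indicated by each column name, for three of the embedding sets: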

Embedding                    LR_AUC    LinearSVM_ACC  RBF_AUC   RF_ACC
Toxic-BERT_transformer_emb   0.997022  0.979531       0.991532  0.979375
HateBERT_transformer_emb     0.967701  0.901875       0.965530  0.852344
DistilBERT_transformer_emb   0.967614  0.898906       0.967362  0.878125
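
One way a row of this table could be produced is sketched below with scikit-learn. It assumes the embeddings are already available as a NumPy matrix; the function name `benchmark_embeddings`, the 80/20 stratified split, and all hyperparameters are illustrative assumptions, and the synthetic data merely stands in for real transformer embeddings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC


def benchmark_embeddings(X: np.ndarray, y: np.ndarray, seed: int = 42) -> dict:
    """Hypothetical helper: fit four classifiers on one embedding set and score them."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    linear_svm = LinearSVC().fit(X_tr, y_tr)
    rbf_svm = SVC(kernel="rbf").fit(X_tr, y_tr)
    rf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X_tr, y_tr)

    return {
        # AUC needs a continuous score: class-1 probability or decision margin.
        "LR_AUC": roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]),
        "LinearSVM_ACC": accuracy_score(y_te, linear_svm.predict(X_te)),
        "RBF_AUC": roc_auc_score(y_te, rbf_svm.decision_function(X_te)),
        "RF_ACC": accuracy_score(y_te, rf.predict(X_te)),
    }


# Synthetic, well-separated "embeddings" standing in for real model outputs.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 16)) + y[:, None] * 1.5
scores = benchmark_embeddings(X, y)
```

Running the same function once per embedding set and stacking the returned dictionaries would give a table with the column layout shown above.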

Optimization Techniques

4. Dynamic Quantization

To optimize the teacher model (Toxic-BERT) for CPU inference, dynamic quantization was applied to convert weights from FP32 to INT8.

  • Size Reduction: The model size decreased from 438.01 MB to 181.49 MB.
  • Accuracy Retention: The quantized model reached a Test AUC of 0.9966, showing negligible performance loss despite a size reduction of about 58.6%.
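
The quantization step can be illustrated with PyTorch's `quantize_dynamic`. The sketch below uses a small stack of linear layers as a stand-in for Toxic-BERT so that it runs without downloading any weights; the helper `state_dict_size_mb` and the use of 1 MB = 10^6 bytes are assumptions, and the project may have measured size differently.

```python
import io

import torch
from torch import nn


def state_dict_size_mb(model: nn.Module) -> float:
    """Hypothetical helper: serialized size of a model's weights in MB (10^6 bytes)."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return buffer.getbuffer().nbytes / 1e6


# Stand-in for the FP32 teacher: a few Linear layers with BERT-like widths.
fp32_model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.GELU(),
    nn.Linear(3072, 768),
    nn.Linear(768, 1),
).eval()

# Convert the weights of every nn.Linear to INT8; activations are quantized
# on the fly at inference time, so no calibration data is needed.
int8_model = torch.quantization.quantize_dynamic(
    fp32_model, {nn.Linear}, dtype=torch.qint8
)

fp32_mb = state_dict_size_mb(fp32_model)
int8_mb = state_dict_size_mb(int8_model)

# Both models accept the same input and return the same output shape.
with torch.no_grad():
    x = torch.randn(4, 768)
    fp32_out = fp32_model(x)
    int8_out = int8_model(x)
```

In this sketch only `nn.Linear` layers are converted. If the project did the same, embedding tables would stay in FP32, which would be consistent with a reduction smaller than the 4x that FP32-to-INT8 conversion alone suggests.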

5. Knowledge Distillation

A smaller student model (DistilBERT) was trained to mimic the behavior of the Toxic-BERT teacher.

  • Loss Function: A custom Binary Knowledge Distillation loss was used, combining Kullback-Leibler (KL) divergence for soft teacher probabilities and Cross-Entropy for hard labels.
  • Student Performance: Reached a Validation AUC of 0.9866 after 5 training epochs.
  • Final Footprint: The student model is 267.86 MB, about 39% smaller than the original 438.03 MB teacher model.
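
A loss of the kind described above could look like the following sketch. It is not the project's implementation: it assumes one logit per comment for both teacher and student, uses binary cross-entropy as the hard-label term, and introduces a temperature and a mixing weight `alpha` whose values here are arbitrary.

```python
import torch
import torch.nn.functional as F


def binary_kd_loss(
    student_logits: torch.Tensor,
    teacher_logits: torch.Tensor,
    labels: torch.Tensor,
    temperature: float = 2.0,
    alpha: float = 0.5,
) -> torch.Tensor:
    """Hypothetical binary distillation loss: alpha * KL(teacher || student) + (1 - alpha) * BCE."""
    t = temperature

    # Temperature-softened Bernoulli distributions, kept in log space for stability.
    log_p_s = F.logsigmoid(student_logits / t)
    log_q_s = F.logsigmoid(-student_logits / t)  # log(1 - p_student)
    log_p_t = F.logsigmoid(teacher_logits / t)
    log_q_t = F.logsigmoid(-teacher_logits / t)  # log(1 - p_teacher)
    p_t = log_p_t.exp()

    # KL divergence between the teacher and student Bernoulli distributions.
    kl = p_t * (log_p_t - log_p_s) + (1.0 - p_t) * (log_q_t - log_q_s)
    soft_loss = kl.mean() * t**2  # t^2 keeps gradient scale comparable across temperatures

    # Standard binary cross-entropy against the hard 0/1 labels.
    hard_loss = F.binary_cross_entropy_with_logits(student_logits, labels.float())

    return alpha * soft_loss + (1.0 - alpha) * hard_loss


# Toy batch: the teacher is confident, the student is not yet.
teacher_logits = torch.tensor([4.0, -3.0, 2.5, -5.0])
student_logits = torch.tensor([1.0, -0.5, 0.2, -1.0], requires_grad=True)
labels = torch.tensor([1, 0, 1, 0])

loss = binary_kd_loss(student_logits, teacher_logits, labels)
loss.backward()

# When the student matches the teacher exactly, the soft term vanishes.
matched = binary_kd_loss(teacher_logits, teacher_logits, labels, alpha=1.0)
```

Setting `alpha` closer to 1 makes the student follow the teacher's soft probabilities more closely, while values closer to 0 weight the ground-truth labels more heavily.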

Requirements

  • torch
  • transformers
  • sentence-transformers
  • pandas, numpy
  • matplotlib, wordcloud
  • scikit-learn