---
library_name: transformers
tags:
- text-classification
- distilbert
- command-classification
- intent-detection
- nlp
language:
- en
license: apache-2.0
metrics:
- accuracy
- f1
base_model: distilbert-base-uncased
pipeline_tag: text-classification
---

# DistilBERT Command Classifier

A fine-tuned DistilBERT model that classifies user commands and questions with high accuracy, including inputs that contain typos and phrasing variations.

## Model Details

### Model Description

This model is a fine-tuned version of `distilbert-base-uncased` specifically trained to classify various command types from user input. It's designed to handle natural language commands with typos, variations in phrasing, and different command intents.

- **Developed by:** jhonacmarvik
- **Model type:** Text Classification (Sequence Classification)
- **Language(s):** English
- **License:** Apache 2.0
- **Finetuned from model:** distilbert-base-uncased

### Model Sources

- **Base Model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)
- **Framework:** PyTorch + Transformers

## Uses

### Direct Use

This model can be directly used for:
- **Command intent classification** - Identify what action the user wants to perform
- **Voice assistant routing** - Route commands to appropriate handlers
- **Natural language interface control** - Control systems through natural language
- **Question vs Command detection** - Distinguish between questions and actionable commands

### Example Usage

```python
from transformers import pipeline

# Load the classifier
classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier",
    top_k=3
)

# Single prediction
result = classifier("Turn on all work lights")
print(result)
# Output: [
#   {'label': 'turn_on_lights', 'score': 0.9234},
#   {'label': 'increase_brightness', 'score': 0.0543},
#   {'label': 'turn_off_lights', 'score': 0.0123}
# ]

# Batch prediction
commands = [
    "Turn on all work lights",
    "Decrease the brightness",
    "What's the temperature?"
]
results = classifier(commands)
```

### Alternative Usage (Manual)

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained(
    "jhonacmarvik/distilbert-command-classifier"
)
tokenizer = AutoTokenizer.from_pretrained(
    "jhonacmarvik/distilbert-command-classifier"
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Tokenize
text = "Turn on all work lights"
tokens = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
tokens = {k: v.to(device) for k, v in tokens.items()}

# Predict
with torch.no_grad():
    outputs = model(**tokens)
    probs = torch.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probs, dim=-1).item()

print(f"Predicted: {model.config.id2label[predicted_class]}")
print(f"Confidence: {probs[0][predicted_class].item():.4f}")
```

### Downstream Use

Can be integrated into (see the dispatcher sketch below):
- Smart home systems
- Voice assistants
- Chatbots and conversational AI
- IoT device control interfaces
- Natural language command parsers
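
As an illustration of the routing use cases above, here is a minimal sketch of a label-to-handler dispatcher. The label names and handler bodies are hypothetical; the classifier is loaded with the same pipeline call as in the earlier example.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier",
)

# Hypothetical handlers; replace with your own integration code.
def turn_on_lights():
    print("Lights on")

def turn_off_lights():
    print("Lights off")

HANDLERS = {
    "turn_on_lights": turn_on_lights,
    "turn_off_lights": turn_off_lights,
}

def route(text: str) -> None:
    # Top-1 prediction; the pipeline returns one dict per input string.
    prediction = classifier(text)[0]
    handler = HANDLERS.get(prediction["label"])
    if handler is not None:
        handler()
    else:
        print(f"No handler registered for {prediction['label']!r}")

route("Turn on all work lights")
```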

### Out-of-Scope Use

This model is NOT suitable for:
- Commands outside its training vocabulary
- Languages other than English
- Sentiment analysis or emotion detection
- General text classification tasks unrelated to commands
- Safety-critical applications without human oversight

## Bias, Risks, and Limitations

- **Vocabulary Limitation:** The model is trained on specific command types and may not generalize to completely novel command categories
- **Typo Handling:** Although trained on variations containing typos, extreme misspellings may still reduce accuracy
- **Context Awareness:** The model processes single utterances and does not maintain conversation context
- **Language:** Only English commands are supported

### Recommendations

- Implement confidence thresholds (e.g., > 0.7) before executing commands
- Provide fallback mechanisms for low-confidence predictions (see the sketch after this list)
- Add a human-in-the-loop for critical operations
- Monitor model performance on production data and retrain periodically
- Test thoroughly with your specific use case before deployment
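
A minimal sketch of the thresholding and fallback recommendations, assuming the pipeline from the earlier examples; the 0.7 cutoff is illustrative and should be tuned on your own validation data.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier",
)

CONFIDENCE_THRESHOLD = 0.7  # illustrative; tune on validation data

def classify_with_fallback(text: str):
    # Top-1 label and score for a single input.
    prediction = classifier(text)[0]
    if prediction["score"] >= CONFIDENCE_THRESHOLD:
        return prediction["label"]
    # Low confidence: defer to a fallback instead of executing the command.
    return None

label = classify_with_fallback("Trn on the lihgts")
if label is None:
    print("Sorry, I didn't understand that. Could you rephrase?")
```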

## Training Details

### Training Data

- **Dataset:** Custom dataset of command variations with intentional typos and paraphrases
- **Size:** Multiple variations per command class
- **Format:** CSV with text variations and corresponding labels
- **Split:** 80% training, 20% validation (stratified)
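
For reference, a stratified 80/20 split as described above could be produced along these lines; the CSV filename and the `text`/`label` column names are assumptions, since the actual dataset is not published.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file and column names.
df = pd.read_csv("command_variations.csv")

train_df, val_df = train_test_split(
    df,
    test_size=0.2,          # 80% train / 20% validation
    stratify=df["label"],   # preserve class proportions in both splits
    random_state=42,
)
```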

### Training Procedure

#### Preprocessing

- Text converted to lowercase
- Tokenization using the DistilBERT tokenizer
- Maximum sequence length: 128 tokens
- Padding and truncation applied
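
The preprocessing settings above correspond roughly to the following tokenizer call (a sketch; note that the uncased DistilBERT tokenizer lowercases input on its own).

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

encoded = tokenizer(
    "Turn on all work lights",
    max_length=128,       # maximum sequence length used in training
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)
```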

#### Training Hyperparameters

- **Training regime:** FP32
- **Optimizer:** AdamW
- **Learning rate:** 2e-5
- **Warmup steps:** 100
- **Weight decay:** 0.01
- **Batch size:** 16 (per device)
- **Number of epochs:** 10
- **Early stopping patience:** 3 epochs
- **Evaluation strategy:** Per epoch
- **Best model selection:** Based on eval_loss
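
As a sketch, these hyperparameters map onto a Hugging Face `Trainer` configuration roughly as follows. The output path and number of labels are placeholders, and the datasets are omitted; the actual training script is not published.

```python
from transformers import (
    AutoModelForSequenceClassification,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

NUM_LABELS = 10  # hypothetical; set to the number of command classes

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=NUM_LABELS
)

training_args = TrainingArguments(
    output_dir="./results",        # placeholder path
    learning_rate=2e-5,
    warmup_steps=100,
    weight_decay=0.01,
    per_device_train_batch_size=16,
    num_train_epochs=10,
    eval_strategy="epoch",         # named evaluation_strategy on older releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# train_dataset and eval_dataset would be tokenized datasets (not shown).
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```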

#### Hardware & Software

- **Framework:** PyTorch + Transformers (Hugging Face)
- **Base model:** distilbert-base-uncased
- **Hardware:** GPU (CUDA-enabled) or CPU compatible

## Evaluation

### Metrics

The model was evaluated using:
- **Accuracy:** Overall classification accuracy
- **F1 Score:** Per-class and macro-averaged F1
- **Precision & Recall:** Per-class metrics
- **Confusion Matrix:** Visual representation of classification performance
- **ROC-AUC:** Per-class ROC curves
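
A sketch of a `compute_metrics` function covering the accuracy and F1 metrics above, assuming scikit-learn and the `Trainer` evaluation interface; the exact evaluation code is not published.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # eval_pred unpacks into model logits and ground-truth label ids.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1_macro": f1_score(labels, predictions, average="macro"),
    }
```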

### Results

The model achieves high accuracy on the validation set, with strong performance across all command classes. Exact metric values have not been published for this checkpoint; detailed metrics are available in the training outputs.

## How to Get Started

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="jhonacmarvik/distilbert-command-classifier"
)

result = classifier("Turn on the lights")
print(result)
```

### Production Deployment

For production use with a custom loading pattern:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class CommandClassifier:
    def __init__(self):
        model_path = "jhonacmarvik/distilbert-command-classifier"
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path)
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text: str, top_k: int = 3):
        tokens = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True)
        tokens = {k: v.to(self.device) for k, v in tokens.items()}

        with torch.no_grad():
            logits = self.model(**tokens).logits
            probs = torch.softmax(logits, dim=-1)
            # Guard against top_k exceeding the number of classes.
            top_probs, top_indices = torch.topk(probs, k=min(top_k, probs.shape[-1]))

        results = []
        for prob, idx in zip(top_probs[0], top_indices[0]):
            results.append({
                "label": self.model.config.id2label[idx.item()],
                "score": float(prob.item())
            })
        return results

# Usage
classifier = CommandClassifier()
result = classifier.predict("Turn on lights", top_k=3)
```
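
For higher throughput, inputs can be batched through a single forward pass. Below is a sketch of a batch helper built on the class above; it is a hypothetical extension, not part of the published model.

```python
def predict_batch(clf, texts, top_k=3):
    # Pad the whole batch to a common length and run one forward pass.
    tokens = clf.tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    tokens = {k: v.to(clf.device) for k, v in tokens.items()}

    with torch.no_grad():
        probs = torch.softmax(clf.model(**tokens).logits, dim=-1)
    top_probs, top_indices = torch.topk(probs, k=min(top_k, probs.shape[-1]))

    return [
        [
            {"label": clf.model.config.id2label[i.item()], "score": p.item()}
            for p, i in zip(row_probs, row_indices)
        ]
        for row_probs, row_indices in zip(top_probs, top_indices)
    ]

batch_results = predict_batch(classifier, ["Turn on lights", "Dim the lights"])
```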

## Environmental Impact

Training a single model of this size on standard GPU hardware has a small footprint compared to training large language models. The model uses the lightweight DistilBERT architecture, which has roughly 40% fewer parameters than BERT-base and is significantly faster.

- **Hardware Type:** GPU (CUDA-enabled)
- **Compute Region:** Not specified
- **Carbon Impact:** Minimal due to the efficient architecture

## Technical Specifications

### Model Architecture

- **Base Architecture:** DistilBERT (6 layers, hidden size 768, 12 attention heads)
- **Parameters:** ~66M
- **Classification Head:** Linear layer for multi-class classification
- **Dropout:** 0.1 (default DistilBERT configuration)
- **Activation:** GELU
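
To inspect the label set of the classification head without downloading the full weights, the configuration can be loaded on its own (a small sketch):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("jhonacmarvik/distilbert-command-classifier")
print(config.num_labels)  # number of command classes
print(config.id2label)    # mapping from class index to label name
```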

### Compute Infrastructure

#### Hardware

- Compatible with CPU and GPU (CUDA)
- Recommended: GPU with 4GB+ VRAM for faster inference
- Works on CPU for low-volume applications

#### Software

- Python 3.8+
- PyTorch 2.0+
- Transformers 4.30+
- CUDA 11.0+ (for GPU acceleration)

## Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{distilbert-command-classifier,
  author = {jhonacmarvik},
  title = {DistilBERT Command Classifier},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/jhonacmarvik/distilbert-command-classifier}}
}
```

## Model Card Authors

jhonacmarvik

## Model Card Contact

For questions or issues, please open an issue in the model repository or contact the author through Hugging Face.