---
license: apache-2.0
datasets:
- 4nkh/theme_data
language:
- en
metrics:
- precision
- f1
- recall
- accuracy
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
tags:
- multi-label
- theme_detection
- mentorship
- entrepreneurship
- startup success
- json automation
---
# Theme classification model (multi-label)

This repository contains a fine-tuned BERT model for classifying short texts into community-oriented themes. The model was trained locally and pushed to the Hugging Face Hub.

## Model details

- Model architecture: bert-base-uncased (fine-tuned)
- Problem type: multi-label classification
- Labels: `mentorship`, `entrepreneurship`, `startup success`
- Training data: `train_theme.jsonl` (included)
- Final evaluation (example run):
  - eval_loss: 0.1822
  - eval_micro/f1: 1.0
  - eval_macro/f1: 1.0
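The micro and macro F1 figures above aggregate per-label results in different ways. The sketch below shows the conventional definitions on toy data; it is not taken from the training script, whose metric code may differ, and the helper names `f1_from_counts` and `micro_macro_f1` are hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for multi-label rows of 0/1 flags."""
    n_labels = len(y_true[0])
    counts = [[0, 0, 0] for _ in range(n_labels)]  # per-label [tp, fp, fn]
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                counts[j][0] += 1
            elif p:
                counts[j][1] += 1
            elif t:
                counts[j][2] += 1
    # Micro: pool the counts over all labels, then compute a single F1.
    tp, fp, fn = (sum(c[k] for c in counts) for k in range(3))
    micro = f1_from_counts(tp, fp, fn)
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts) / n_labels
    return micro, macro


# Toy data only (not from the evaluation run); one column per label.
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

Micro F1 weights every individual label decision equally, while macro F1 weights every label equally, so the two diverge when some labels are rarer or harder than others.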

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo = "4nkh/theme_model"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)

texts = ["Our co-op paired first-time founders with veteran shop owners to troubleshoot setbacks."]
inputs = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

# Inference only, so skip gradient tracking.
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Multi-label: score each label independently with a sigmoid, not a softmax.
probs = torch.sigmoid(logits)
preds = (probs >= 0.5).int()
print("probs", probs.numpy(), "preds", preds.numpy())
```
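The predictions printed above are rows of 0/1 flags, one column per label. A minimal sketch of turning them into label names follows. The helper `decode_predictions` is hypothetical, and the mapping assumes the column order matches the order in which the labels are listed in this card, which has not been verified against the checkpoint.

```python
def decode_predictions(pred_rows, id2label):
    """Turn rows of 0/1 flags into lists of label names."""
    return [
        [id2label[i] for i, flag in enumerate(row) if flag]
        for row in pred_rows
    ]


# Assumed mapping: column order follows the label list in this card.
id2label = {0: "mentorship", 1: "entrepreneurship", 2: "startup success"}

# Stand-in rows for illustration; not real model output.
pred_rows = [[1, 1, 0], [0, 0, 1], [0, 0, 0]]
named = decode_predictions(pred_rows, id2label)
print(named)
```

With a loaded model, `pred_rows` would be `preds.tolist()` and the mapping would normally come from `model.config.id2label`, provided the checkpoint's config stores real label names rather than generic `LABEL_0`-style placeholders.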

## Notes

- The usage example applies a threshold of 0.5 to each label's sigmoid probability. Adjust the threshold per class as needed.
- To re-train or fine-tune further, see `train_theme_model.py` in this folder.
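Per-class thresholds can be sketched as below. The helper `apply_thresholds` is hypothetical, and the threshold values and probabilities are illustrative placeholders rather than tuned or measured numbers; in practice the thresholds would be chosen on a held-out validation set.

```python
def apply_thresholds(prob_rows, thresholds):
    """Compare each label's probability with that label's own threshold."""
    return [
        [int(p >= t) for p, t in zip(row, thresholds)]
        for row in prob_rows
    ]


# Illustrative values only: one threshold per label, in label order.
thresholds = [0.5, 0.4, 0.6]

# Stand-in probabilities for two texts; not real model output.
prob_rows = [[0.91, 0.45, 0.55], [0.20, 0.39, 0.61]]
preds = apply_thresholds(prob_rows, thresholds)
print(preds)
```

With the tensors from the usage example, the same comparison can be written as `(probs >= torch.tensor(thresholds)).int()`, which broadcasts the per-label thresholds across the batch.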

## License

This model is released under the Apache-2.0 license, as declared in the metadata at the top of this file.