# Model Card for ModernBERT-large fine-tuned on CoNLL-2003 (NER)
A Named Entity Recognition model based on answerdotai/ModernBERT-large, fine-tuned on the English CoNLL-2003 dataset. It identifies and classifies entities into four types: Person, Organization, Location, and Miscellaneous.
## Model Details
- Base model: answerdotai/ModernBERT-large
- Task: Token classification (NER)
- Dataset: lhoestq/conll2003 (CoNLL-2003 English)
- Number of labels: 9 (BIO format)
  - O (0)
  - B-PER (1), I-PER (2)
  - B-ORG (3), I-ORG (4)
  - B-LOC (5), I-LOC (6)
  - B-MISC (7), I-MISC (8)
- Training procedure: Fine-tuning with Optuna hyperparameter search (20 trials)
- Evaluation metric: seqeval (overall precision, recall, F1, accuracy)
## Label Mapping
| Label ID | Entity Tag |
|---|---|
| 0 | O |
| 1 | B-PER |
| 2 | I-PER |
| 3 | B-ORG |
| 4 | I-ORG |
| 5 | B-LOC |
| 6 | I-LOC |
| 7 | B-MISC |
| 8 | I-MISC |
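The table above can also be expressed as the `id2label` / `label2id` dictionaries conventionally stored in a Hugging Face model config (a minimal sketch following the usual naming convention, not copied from this repo's files):

```python
# BIO label mapping from the table above, in id2label / label2id form.
id2label = {
    0: "O",
    1: "B-PER", 2: "I-PER",
    3: "B-ORG", 4: "I-ORG",
    5: "B-LOC", 6: "I-LOC",
    7: "B-MISC", 8: "I-MISC",
}
label2id = {label: i for i, label in id2label.items()}

# Decoding a sequence of predicted label IDs back to tags:
predicted_ids = [1, 2, 0, 3, 0]
tags = [id2label[i] for i in predicted_ids]
print(tags)  # ['B-PER', 'I-PER', 'O', 'B-ORG', 'O']
```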
## Training Procedure

### Hyperparameter Search
An Optuna study (20 trials) maximized validation F1 over the following search space:
- Learning rate: [1e-5, 5e-4] (log scale)
- Batch size per device: [8, 16, 32]
- Number of epochs: [2, 6]
- Weight decay: [0.0, 0.1]
- Warmup ratio: [0.0, 0.2]
- Gradient accumulation steps: [1, 4]
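The objective function itself is not included in this card; the sketch below shows how a search space like the one above could be declared with Optuna's `suggest_*` API. A tiny stand-in trial class is included so the snippet runs without Optuna installed; all parameter names are illustrative:

```python
import math
import random

class StubTrial:
    """Minimal stand-in mirroring optuna.trial.Trial's suggest_* methods."""
    def suggest_float(self, name, low, high, log=False):
        if log:
            return math.exp(random.uniform(math.log(low), math.log(high)))
        return random.uniform(low, high)

    def suggest_int(self, name, low, high):
        return random.randint(low, high)

    def suggest_categorical(self, name, choices):
        return random.choice(choices)

def sample_hyperparameters(trial):
    # Search space as described in the model card.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 2, 6),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
        "warmup_ratio": trial.suggest_float("warmup_ratio", 0.0, 0.2),
        "gradient_accumulation_steps": trial.suggest_int(
            "gradient_accumulation_steps", 1, 4),
    }

params = sample_hyperparameters(StubTrial())
print(params)
```

With real Optuna, `sample_hyperparameters` would be called inside the objective, and `study.optimize(objective, n_trials=20)` would run the 20 trials while maximizing validation F1.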
Other fixed training arguments:
- Evaluation batch size: 8
- Max sequence length: 256
- Evaluation strategy: epoch
- Save strategy: epoch
- Best model selection based on validation F1
- Seed: 42
## Training Data
- Training set: CoNLL-2003 `train` split
- Validation set: CoNLL-2003 `validation` split (used for early stopping / best model selection)
- Test set: CoNLL-2003 `test` split (final evaluation)
## Tokenizer Alignment
During tokenization, the original tokens are split into subwords. Subword tokens that are continuations of the same word are assigned the inside label of the corresponding entity class, if applicable. For example, if “Microsoft” is tokenized into ["Micro", "##soft"] and the original tag is B-ORG, the first subword gets B-ORG and the second gets I-ORG. This is implemented in the align_labels function.
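The source of `align_labels` is not included in this card; a minimal sketch of the alignment logic described above, assuming the `word_ids()` output of a Hugging Face fast tokenizer (a list mapping each subword to its source word index, with `None` for special tokens), could look like this:

```python
# Sketch of subword label alignment (details of the card's align_labels assumed).
B2I = {"B-PER": "I-PER", "B-ORG": "I-ORG", "B-LOC": "I-LOC", "B-MISC": "I-MISC"}

def align_labels(word_labels, word_ids):
    aligned = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(-100)  # special tokens are ignored by the loss
        elif word_id != previous_word:
            aligned.append(word_labels[word_id])  # first subword keeps the word's tag
        else:
            tag = word_labels[word_id]
            aligned.append(B2I.get(tag, tag))  # continuation subwords get the I- tag
        previous_word = word_id
    return aligned

# "John Smith works at Microsoft": "Microsoft" splits into two subwords (word id 4).
labels = ["B-PER", "I-PER", "O", "O", "B-ORG"]
word_ids = [None, 0, 1, 2, 3, 4, 4, None]  # [CLS] ... [SEP]
print(align_labels(labels, word_ids))
# [-100, 'B-PER', 'I-PER', 'O', 'O', 'B-ORG', 'I-ORG', -100]
```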
## Evaluation Results
After hyperparameter search, the best trial achieved the following results on the test set:
- Precision: 0.87
- Recall: 0.91
- F1: 0.89
- Accuracy: 0.97
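These numbers come from seqeval's entity-level scoring. As a rough illustration of what entity-level F1 means, the simplified sketch below extracts `(type, start, end)` spans from BIO tags and scores exact span matches; the actual evaluation used the seqeval library, which also handles edge cases (e.g. `I-` tags with no preceding `B-`) that this sketch ignores:

```python
def extract_entities(tags):
    """Return a set of (type, start, end) spans from a BIO tag sequence."""
    entities = set()
    etype, start = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if etype is not None:
                entities.add((etype, start, i))
            etype, start = tag[2:], i
        elif tag.startswith("I-") and tag[2:] == etype:
            continue  # same entity keeps extending
        else:  # "O" or a type-mismatched I- ends the current entity
            if etype is not None:
                entities.add((etype, start, i))
            etype, start = None, None
    if etype is not None:
        entities.add((etype, start, len(tags)))
    return entities

def entity_f1(true_tags, pred_tags):
    """Exact-match entity-level F1 (simplified seqeval-style scoring)."""
    gold, pred = extract_entities(true_tags), extract_entities(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

gold = ["B-PER", "I-PER", "O", "B-ORG"]
pred = ["B-PER", "I-PER", "O", "O"]
print(entity_f1(gold, pred))  # one of two entities found exactly: F1 ≈ 0.667
```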
## How to Use

### Quick Pipeline
```python
from transformers import pipeline

ner = pipeline("token-classification", model="violetar/ner-model", aggregation_strategy="simple")

sentence = "John Smith works at Microsoft in New York."
results = ner(sentence)

for entity in results:
    print(f"{entity['word']} -> {entity['entity_group']} (score: {entity['score']:.2f})")
```