| | --- |
| | annotations_creators: |
| | - machine-generated |
| | language_creators: |
| | - machine-generated |
| | widget: |
| | - text: George Washington went to Washington. |
| | - text: What is the seventh tallest mountain in North America? |
| | tags: |
| | - named-entity-recognition |
| | - sequence-tagger-model |
| | datasets: |
| | - Babelscape/cner |
| | language: |
| | - en |
| | pretty_name: cner-model |
| | source_datasets: |
| | - original |
| | task_categories: |
| | - structure-prediction |
| | task_ids: |
| | - named-entity-recognition |
| | --- |
| | |
| | # CNER: Concept and Named Entity Recognition |
| | This is the model card for the NAACL 2024 paper [CNER: Concept and Named Entity Recognition](https://aclanthology.org/2024.naacl-long.461/). |
| | We fine-tuned a language model (DeBERTa-v3-base) for 1 epoch on our [CNER dataset](https://huggingface.co/datasets/Babelscape/cner) using the default Hugging Face hyperparameters, optimizer and architecture. As a result, the scores obtained with this model may differ from those reported in the paper. |
| | The resulting CNER model can jointly identify and classify concepts and named entities with fine-grained tags. |
| |
|
| | **If you use the model, please cite this work in your paper**: |
| |
|
| | ```bibtex |
| | @inproceedings{martinelli-etal-2024-cner, |
| | title = "{CNER}: Concept and Named Entity Recognition", |
| | author = "Martinelli, Giuliano and |
| | Molfese, Francesco and |
| | Tedeschi, Simone and |
| | Fern{\'a}ndez-Castro, Alberte and |
| | Navigli, Roberto", |
| | editor = "Duh, Kevin and |
| | Gomez, Helena and |
| | Bethard, Steven", |
| | booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)", |
| | month = jun, |
| | year = "2024", |
| | address = "Mexico City, Mexico", |
| | publisher = "Association for Computational Linguistics", |
| | url = "https://aclanthology.org/2024.naacl-long.461", |
| | pages = "8329--8344", |
| | } |
| | ``` |
| | |
| | The original repository for the paper can be found at [https://github.com/Babelscape/cner](https://github.com/Babelscape/cner). |
| | |
| | ## How to use |
| |
|
| | You can use this model with the Transformers NER *pipeline*. |
| |
|
| | ```python |
| | from transformers import AutoTokenizer, AutoModelForTokenClassification |
| | from transformers import pipeline |
| | |
| | tokenizer = AutoTokenizer.from_pretrained("Babelscape/cner-model") |
| | model = AutoModelForTokenClassification.from_pretrained("Babelscape/cner-model") |
| | |
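| | # aggregation_strategy="simple" merges word pieces into entity spans (it replaces the deprecated grouped_entities=True) |
| | nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple") |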
| | example = "What is the seventh tallest mountain in North America?" |
| | |
| | ner_results = nlp(example) |
| | print(ner_results) |
| | ``` |
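When entities are grouped, as in the snippet above, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start` and `end`. The following is a minimal sketch of how that output could be organised by label. The helper `group_by_label`, the `min_score` threshold and the sample data are all hypothetical: the labels and scores below are hand-written placeholders, not actual predictions of the model.

```python
from collections import defaultdict


def group_by_label(ner_results, min_score=0.0):
    """Map each predicted label to the list of text spans tagged with it.

    `ner_results` is expected to look like the output of a grouped
    token-classification pipeline: a list of dicts with the keys
    "entity_group", "score", "word", "start" and "end".
    """
    grouped = defaultdict(list)
    for item in ner_results:
        # Skip low-confidence spans (the threshold is an illustrative choice).
        if item["score"] >= min_score:
            grouped[item["entity_group"]].append(item["word"])
    return dict(grouped)


# Hypothetical output for the example sentence; labels and scores are made up.
sample_results = [
    {"entity_group": "LABEL_A", "score": 0.98, "word": "mountain", "start": 28, "end": 36},
    {"entity_group": "LABEL_B", "score": 0.95, "word": "North America", "start": 40, "end": 53},
    {"entity_group": "LABEL_A", "score": 0.40, "word": "seventh", "start": 12, "end": 19},
]

print(group_by_label(sample_results, min_score=0.5))
```

With a real run, `ner_results` from the snippet above can be passed in place of `sample_results`.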
| |
|
| | ## Classes |
| | <img src="https://cdn-uploads.huggingface.co/production/uploads/65e9ccd84ce78d665a50f78b/2K3NZ79go3Zjf3qFeHO0O.png" alt="CNER classes" /> |
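The class inventory can also be read programmatically. Token-classification models in Transformers normally store a label mapping in `config.id2label`; the sketch below assumes this model does too, and that its labels may carry `B-`/`I-` prefixes with `O` for untagged tokens. Both points are assumptions to verify against the actual configuration. The helper `list_classes` and the example mapping are hypothetical.

```python
def list_classes(id2label):
    """Return the sorted class names, dropping "O" and any B-/I- prefix."""
    classes = set()
    for label in id2label.values():
        if label == "O":
            continue
        # Strip a BIO prefix if present; leave other labels untouched.
        if label[:2] in ("B-", "I-"):
            label = label[2:]
        classes.add(label)
    return sorted(classes)


# Hypothetical mapping, for illustration only (not the model's real label set).
example_id2label = {0: "O", 1: "B-CLASS_X", 2: "I-CLASS_X", 3: "B-CLASS_Y", 4: "I-CLASS_Y"}
print(list_classes(example_id2label))

# With the real model (requires downloading its configuration):
# from transformers import AutoConfig
# config = AutoConfig.from_pretrained("Babelscape/cner-model")
# print(list_classes(config.id2label))
```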
| |
|
| | ## Licensing Information |
| | The contents of this repository are restricted to non-commercial research purposes only, under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright of the dataset contents and models belongs to the original copyright holders. |
| |
|
| | `microsoft/deberta-v3-base` is released under the [MIT license](https://choosealicense.com/licenses/mit/). |
| |
|