---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- allenai/scibert_scivocab_uncased
pipeline_tag: text-classification
---
# Model Card for Cbelem/scibert-certainty-classif

This is a text classification model.
It was fine-tuned to predict certainty ratings of scientific findings using a classification loss and a ranking loss.

We fine-tuned allenai/scibert_scivocab_uncased on the dataset made available by [Wührl et al. (2024): Understanding Fine-grained Distortions in Reports of Scientific Findings](https://aclanthology.org/2024.findings-acl.369/).
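
The card does not say which classification and ranking losses were used or how they were combined, so the following is only an illustrative sketch of one common setup: cross-entropy on the class logits plus a pairwise margin loss on a scalar score. The function names, the use of the expected class index as the score, the margin, and `ranking_weight` are all assumptions made for this example, not the authors' training code.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of class logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def cross_entropy(logits, label):
    """Classification term: negative log-probability of the true class."""
    return -log_softmax(logits)[label]

def expected_class(logits):
    """Scalar score (assumption): probability-weighted class index."""
    return sum(i * math.exp(lp) for i, lp in enumerate(log_softmax(logits)))

def margin_ranking(score_high, score_low, margin=0.5):
    """Ranking term: zero once the higher-rated item leads by at least `margin`."""
    return max(0.0, margin - (score_high - score_low))

def combined_loss(batch, ranking_weight=1.0, margin=0.5):
    """batch is a list of (logits, label) tuples; both weights are hypothetical."""
    ce = sum(cross_entropy(logits, label) for logits, label in batch) / len(batch)
    # Compare every pair whose first item has the strictly higher label.
    pair_losses = [
        margin_ranking(expected_class(la), expected_class(lb), margin)
        for la, ya in batch
        for lb, yb in batch
        if ya > yb
    ]
    rank = sum(pair_losses) / len(pair_losses) if pair_losses else 0.0
    return ce + ranking_weight * rank

# Made-up logits: the first batch orders the two items correctly, the second does not.
well_ordered = [([3.0, 0.0, -2.0], 0), ([-2.0, 0.0, 3.0], 2)]
mis_ordered = [([-2.0, 0.0, 3.0], 0), ([3.0, 0.0, -2.0], 2)]
print(combined_loss(well_ordered), combined_loss(mis_ordered))
```

In this toy setup the ranking term only contributes when two items with different labels are scored in the wrong order or too close together, which is the general motivation for pairing it with a per-item classification loss.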
## Model Details

### Model Description
- **Developed by:** Researchers at UCI with the goal of obtaining a reliable certainty scoring function.
- **Model type:** BERT
- **Language(s) (NLP):** English
- **Finetuned from model:** allenai/scibert_scivocab_uncased
## Uses

The model is meant to be used for estimating certainty scores. Because it was trained on sentence-level academic findings, we expect it to be reliable only within this domain.

The original dataset had only moderate inter-annotator agreement (Spearman correlation coefficient of 0.44), which suggests that predicting certainty scores is difficult even for humans.
We recommend that users validate the model's behavior on a small sample of their own data before scaling up evaluations.

The per-class F1 scores range from roughly 0.49 to 0.69 (see Results), which again reflects the difficulty of learning clear class boundaries.
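
One possible way to run such a check is to hand-label a small sample and compare the labels with the model's predictions. The sketch below computes accuracy and quadratic weighted kappa, the metric reported as `eval/qwk` under Results. The label lists are hypothetical placeholders, and the three ordinal classes 0 to 2 are assumed from the per-class metrics in this card.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples where the prediction matches the human label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def quadratic_weighted_kappa(y_true, y_pred, num_classes=3):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1."""
    n = len(y_true)
    # Confusion matrix: rows are human labels, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(num_classes))
                 for j in range(num_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Disagreements are penalized by squared distance between classes.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * true_hist[i] * pred_hist[j] / n
    if weighted_expected == 0.0:
        # Labels and predictions all fall in one shared class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected

# Hypothetical hand labels and model predictions for six sentences.
human_labels = [0, 1, 2, 2, 1, 0]
model_labels = [0, 1, 2, 1, 1, 0]

sample_accuracy = accuracy(human_labels, model_labels)
sample_qwk = quadratic_weighted_kappa(human_labels, model_labels)
print(f"accuracy={sample_accuracy:.3f}  qwk={sample_qwk:.3f}")
```

Values far below the figures reported under Results would suggest that the data of interest differs from the training domain.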
## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Cbelem/scibert-certainty-classif"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

texts = [
    "Compared with controls, taxi drivers had greater grey matter volume in the posterior hippocampi (Maguire et al.",
    "The study described in this paper focuses on gaze, but similar approaches can be used to understand the effects of other interactions that contribute to patient outcomes such as emotion.",
    '"The initial findings could have been explained by a correlation, that people with big hippocampi become taxi drivers," he says.',
    "We are less sure about a possible explanation for lower acceptance for mobile phone behaviors among professionals in the West.",
]

# Pad to a common length so the batch can be stacked into a single tensor.
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():  # gradients are not needed at inference time
    outputs = model(**inputs)

print(outputs.logits)  # one row of class logits per input text
```
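The model call returns raw class logits (available as the `logits` attribute of the output). A minimal sketch of turning one row of logits into class probabilities, a predicted class, and a single scalar score follows. The logit values are made up for illustration, and treating the class indices 0 to 2 as an ordinal certainty scale is an assumption: this card does not document the label mapping, so check `model.config.id2label` before interpreting the indices.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one row of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def summarize(logits):
    """Return (probabilities, predicted class index, expected class index)."""
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    # Assumption: classes are ordinal, so a probability-weighted index
    # can serve as a continuous certainty score.
    expected = sum(i * p for i, p in enumerate(probs))
    return probs, predicted, expected

# Hypothetical logits for two sentences (not real model output).
example_logits = [
    [2.0, 0.5, -1.0],
    [-0.5, 0.2, 1.5],
]

summaries = [summarize(row) for row in example_logits]
for probs, predicted, expected in summaries:
    rounded = [round(p, 3) for p in probs]
    print(f"probs={rounded}  predicted={predicted}  score={expected:.3f}")
```

With real model output, each row of `outputs.logits.tolist()` could be passed to `summarize` in place of the made-up values.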
## Training Details

### Training Data

TBD

### Training Procedure

TBD

#### Preprocessing [optional]

TBD

#### Training Hyperparameters

- **Training regime:** fp32
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

TBD
### Results

```
"train/learning_rate": 6.869747470432602e-7,
"train/loss": 0.562,
"train/global_step": 3000,
"eval/qwk": 0.5507,
"eval/loss": 0.9391,
"eval/accuracy": 0.6078,
"eval/balanced_accuracy": 0.3980,
"eval/f1_macro": 0.6006,
"eval/f1_class_0": 0.6211,
"eval/f1_class_1": 0.4932,
"eval/f1_class_2": 0.6875,
"eval/precision_macro": 0.6033,
"eval/precision_class_0": 0.6410,
"eval/precision_class_1": 0.5,
"eval/precision_class_2": 0.6689,
"eval/recall_macro": 0.5987,
"eval/recall_class_0": 0.6024,
"eval/recall_class_1": 0.4865,
"eval/recall_class_2": 0.7071,
"train_steps_per_second": 6.532,
```
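As a reading aid, the macro-averaged scores above are the unweighted means of the per-class scores, which the short check below reproduces from the reported numbers. It also shows that the reported `eval/balanced_accuracy` (0.3980), although lower than `eval/recall_macro` (0.5987), is numerically consistent with a chance-adjusted balanced accuracy, as computed for example by scikit-learn's `balanced_accuracy_score(..., adjusted=True)`. That interpretation is an inference from the numbers, not something this card states.

```python
# Per-class scores copied from the results listing above.
reported = {
    "f1": [0.6211, 0.4932, 0.6875],
    "precision": [0.6410, 0.5, 0.6689],
    "recall": [0.6024, 0.4865, 0.7071],
}

def macro(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)

macro_f1 = macro(reported["f1"])
macro_precision = macro(reported["precision"])
macro_recall = macro(reported["recall"])

# Chance-adjusted balanced accuracy: rescale macro recall so that random
# guessing over the classes maps to 0 and perfect recall maps to 1.
num_classes = 3
chance = 1 / num_classes
adjusted_balanced_accuracy = (macro_recall - chance) / (1 - chance)

print(f"macro F1        = {macro_f1:.4f}")
print(f"macro precision = {macro_precision:.4f}")
print(f"macro recall    = {macro_recall:.4f}")
print(f"adjusted balanced accuracy = {adjusted_balanced_accuracy:.4f}")
```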
#### Summary

## Technical Specifications [optional]

### Model Architecture and Objective

TBD

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software
Transformers, PyTorch, and Weights & Biases (wandb), the latter for running the hyperparameter sweep.

## Citation

TBD

## Model Card Authors

Catarina Belem (Cbelem)

## Model Card Contact

For more information, contact cbelem@uci.edu.