| --- |
| language: en |
| tags: |
| - frame-semantics |
| - semantic-role-labeling |
| - token-classification |
| - roberta |
| license: apache-2.0 |
| widget: |
| - text: "The cat sat on the mat" |
| text_pair: "sat" |
| --- |
| |
| # nixie1981/sem_frames |
| |
| ## Model Description |
| |
| This is a RoBERTa-based model fine-tuned for semantic frame detection, cast as a question-answering-style task. |
| Given a sentence and a target word, encoded together as a sequence pair, the model predicts the semantic frame that best describes the word's meaning in context. |
| |
| The model is trained on FrameNet 1.7 data and predicts one of 797 semantic frames. |
| |
| ## Usage |
| |
| ### Using the model directly |
| |
| ```python |
| from transformers import AutoTokenizer, AutoModelForSequenceClassification |
| import torch |
| |
| # Load model and tokenizer |
| tokenizer = AutoTokenizer.from_pretrained("nixie1981/sem_frames") |
| model = AutoModelForSequenceClassification.from_pretrained("nixie1981/sem_frames") |
| |
| # Example: Find frame for "sat" in "The cat sat on the mat" |
| sentence = "The cat sat on the mat" |
| target_word = "sat" |
| |
| # Tokenize the sentence and the target word as a sequence pair |
| inputs = tokenizer(sentence, target_word, return_tensors="pt") |
| |
| # Predict the frame (no gradients are needed for inference) |
| with torch.no_grad(): |
|     outputs = model(**inputs) |
| predicted_frame_id = torch.argmax(outputs.logits, dim=-1).item() |
| predicted_frame = model.config.id2label[predicted_frame_id] |
| |
| print(f"Frame for '{target_word}': {predicted_frame}") |
| ``` |
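
The snippet above reports only the single best frame. When a target word is ambiguous, it can help to look at the runner-up frames as well. The following is a minimal sketch of that idea; `top_k_frames` is a hypothetical helper (not part of this repository), and the scores and labels in the toy example are made up for illustration rather than taken from the model.

```python
import math


def top_k_frames(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs.

    `logits` is a flat list of floats, one score per frame label.
    `id2label` maps integer label ids to frame names.
    """
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank label ids by probability, highest first, and keep the top k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy example with made-up scores and labels (not real model output).
toy_id2label = {0: "Posture", 1: "Placing", 2: "_"}
toy_logits = [2.0, 0.5, -1.0]
for label, prob in top_k_frames(toy_logits, toy_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, the same helper could presumably be called as `top_k_frames(outputs.logits[0].tolist(), model.config.id2label)`, since the logits for a single input form one row of scores.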
| |
| ### Using the inference script |
| |
| ```python |
| from frame_finder.inference_qa import FrameFinderQA |
| |
| # Initialize |
| finder = FrameFinderQA(model_path="nixie1981/sem_frames") |
| |
| # Analyze text |
| text = "The cat sat on the mat" |
| result = finder.predict_text(text) |
| |
| for token, frame in zip(result['tokens'], result['frames']): |
|     print(f"{token}: {frame}") |
| ``` |
| |
| ## Training Data |
| |
| The model was trained on FrameNet 1.7. The training data includes: |
| - Semantic frame annotations for English text |
| - The 797 distinct semantic frames that make up the model's label set |
| - Thousands of annotated sentences |
| |
| ## Input Format |
| |
| The model expects two inputs: |
| 1. **Sentence**: The complete sentence containing the target word |
| 2. **Target word**: The specific word to classify |
| |
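| Format: `[CLS] sentence [SEP] target_word [SEP]` (schematic; with the standard RoBERTa tokenizer the special tokens are `<s>` and `</s>`, and the tokenizer inserts them automatically) |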
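
To make the pair layout concrete, here is a small sketch that renders it as a string. It assumes the checkpoint uses the standard RoBERTa special tokens (`<s>` to open the sequence and a doubled `</s>` between the two segments). `format_pair` is a hypothetical helper for illustration only: the real tokenizer works on subword ids and adds the special tokens itself.

```python
def format_pair(sentence, target_word, bos="<s>", sep="</s>"):
    """Render a sentence/target pair in a RoBERTa-style layout.

    Illustration only: this builds a readable string, it does not
    tokenize anything.
    """
    # RoBERTa separates the two segments with a doubled separator token.
    return f"{bos} {sentence} {sep}{sep} {target_word} {sep}"


print(format_pair("The cat sat on the mat", "sat"))
# prints: <s> The cat sat on the mat </s></s> sat </s>
```

To check what the loaded tokenizer actually produces, decoding the encoded pair with `tokenizer.decode(inputs["input_ids"][0])` should show the special tokens in place.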
| |
| ## Output |
| |
| The model outputs one of 797 semantic frame labels, or '_' for words without frames. |
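
Because most words in a sentence carry the `_` label, downstream code will often want only the frame-bearing tokens. The sketch below filters them out of parallel token and frame lists, the shape suggested by `result['tokens']` and `result['frames']` in the inference example. `frame_bearing_tokens` is a hypothetical helper, and the frames in the toy data are illustrative, not actual model predictions.

```python
NO_FRAME = "_"  # label this model card gives for words without frames


def frame_bearing_tokens(tokens, frames):
    """Pair tokens with frames and drop tokens labelled as having no frame."""
    if len(tokens) != len(frames):
        raise ValueError("tokens and frames must have the same length")
    return [(tok, fr) for tok, fr in zip(tokens, frames) if fr != NO_FRAME]


# Toy data with illustrative labels (not real model output).
tokens = ["The", "cat", "sat", "on", "the", "mat"]
frames = ["_", "Animals", "Posture", "_", "_", "_"]
for token, frame in frame_bearing_tokens(tokens, frames):
    print(f"{token}: {frame}")
```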
| |
| ## Performance |
| |
| |
| |
| ## Limitations |
| |
| - Trained only on English text |
| - Limited to FrameNet 1.7 frame inventory |
| - Performance may vary on domain-specific text |
| |
| ## Citation |
| |
| If you use this model, please cite: |
| |
| ```bibtex |
| @misc{frame-finder-qa, |
| title={QA-based Semantic Frame Finder}, |
| author={Your Name}, |
| year={2024}, |
| publisher={HuggingFace} |
| } |
| ``` |
| |
| ## License |
| |
| Apache 2.0 |
| |