---
license: mit
tags:
- vision
- video-classification
language:
- en
pipeline_tag: video-classification
---

# FAL - Framework For Automated Labeling Of Videos (FALVideoClassifier)

FAL (Framework for Automated Labeling Of Videos) is a custom video classification model developed by **SVECTOR** and fine-tuned on the **FAL-500** dataset. This model is designed for efficient video understanding and classification, leveraging state-of-the-art video processing techniques.

<img src="https://cdn-uploads.huggingface.co/production/uploads/6631e2b06d207536a4651738/Sf9tEMK8989JpQorvokT_.png" alt="Demo" width="560">

## Model Overview

Paper: https://github.com/SVECTOR-CORPORATION/FAL/blob/main/FAL.pdf

This model, referred to as `FALVideoClassifier`, is fine-tuned on the **FAL-500** dataset and optimized for automated video labeling tasks. It classifies a video into one of the 500 possible labels from the FAL-500 dataset.

This model was developed by **SVECTOR** as part of our initiative to advance automated video understanding and classification technologies.

## Intended Uses & Limitations

This model is designed for video classification: it assigns a video to one of the 500 classes in the FAL-500 dataset. Because the model was trained only on **FAL-500**, it may perform worse on data that differs significantly from that dataset.

### Intended Use:
- Automated video labeling
- Video content classification
- Research in video understanding and machine learning

### Limitations:
- Only trained on FAL-500
- May not generalize well to out-of-domain videos without further fine-tuning
- Requires videos to be pre-processed (such as resizing frames, normalization, etc.)

## How to Use

To use this model for video classification, follow these steps:

### Installation:

Ensure you have the necessary dependencies installed:

```bash
pip install torch torchvision transformers
```

### Code Example:

Here is an example Python code snippet for using the FAL model to classify a video:

```python
from transformers import AutoImageProcessor, FALVideoClassifierForVideoClassification
import numpy as np
import torch

# Simulate a sample video: 8 frames, each 3 x 224 x 224 (channels first, RGB)
video = list(np.random.randn(8, 3, 224, 224))

# Load the image processor and model
processor = AutoImageProcessor.from_pretrained("SVECTOR-CORPORATION/FAL")
model = FALVideoClassifierForVideoClassification.from_pretrained("SVECTOR-CORPORATION/FAL")

# Pre-process the video input
inputs = processor(video, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Find the predicted class (highest logit)
predicted_class_idx = logits.argmax(-1).item()

# Output the predicted label
print("Predicted class:", model.config.id2label[predicted_class_idx])
```

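The argmax in the example returns only the single best label. When a ranked shortlist is more useful, the logits can be converted to probabilities with a softmax and sorted. The following is a minimal sketch in plain Python; `top_k_labels` and the toy `id2label` mapping are hypothetical names used for illustration and are not part of the FAL API.

```python
import math

def top_k_labels(logits, id2label, k=5):
    """Return the k most probable (label, probability) pairs.

    `logits` is a flat list of floats (one per class) and `id2label`
    maps class index -> label name. Both names are illustrative.
    """
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Toy example with three made-up classes (not real FAL-500 labels).
toy_logits = [2.0, 0.5, 1.0]
toy_id2label = {0: "class_a", 1: "class_b", 2: "class_c"}
top2 = top_k_labels(toy_logits, toy_id2label, k=2)
```

With the model output from the code example, `logits[0].tolist()` and `model.config.id2label` could be passed in directly.
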
### Model Details:

- **Model Name**: `FALVideoClassifier`
- **Dataset Used**: FAL-500
- **Input Size**: 8 frames of size 224x224 with 3 color channels (RGB)

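Real videos usually contain far more than 8 frames, so a fixed number has to be selected before preprocessing. This card does not say how frames were sampled for FAL-500; the sketch below assumes simple uniform sampling across the clip, and `sample_frame_indices` is a hypothetical helper.

```python
def sample_frame_indices(total_frames, num_frames=8):
    """Pick `num_frames` indices spread evenly over a clip.

    Assumption: uniform sampling; the strategy used to train FAL
    is not documented here.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    if total_frames <= num_frames:
        # Short clip: repeat the last frame to pad up to num_frames.
        return [min(i, total_frames - 1) for i in range(num_frames)]
    # Take the centre of each of num_frames equal-length segments.
    segment = total_frames / num_frames
    return [int(segment * i + segment / 2) for i in range(num_frames)]

indices = sample_frame_indices(total_frames=80, num_frames=8)
```

The returned indices can then be used to pull the matching frames from whatever video decoder is in use before handing them to the processor.
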
### Configuration:

The `FALVideoClassifier` uses the following hyperparameters:

- `num_frames`: Number of frames in the video (e.g., 8)
- `num_labels`: The number of possible video classes (500 for FAL-500)
- `hidden_size`: Hidden size for transformer layers (768)
- `attention_probs_dropout_prob`: Dropout probability for attention layers (0.0)
- `hidden_dropout_prob`: Dropout probability for the hidden layers (0.0)
- `drop_path_rate`: Dropout rate for stochastic depth (0.0)

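For reference, the listed values can be collected in one place. The dataclass below is only an illustrative container, assuming the field names match the bullets; `FALConfigSketch` is a hypothetical name, and the real configuration is loaded from the checkpoint (available as `model.config` in the code example).

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class FALConfigSketch:
    """Illustrative container for the hyperparameters listed in this card."""
    num_frames: int = 8                        # frames per input clip
    num_labels: int = 500                      # FAL-500 classes
    hidden_size: int = 768                     # transformer hidden width
    attention_probs_dropout_prob: float = 0.0  # attention dropout
    hidden_dropout_prob: float = 0.0           # hidden-layer dropout
    drop_path_rate: float = 0.0                # stochastic depth rate

config = FALConfigSketch()
config_dict = asdict(config)
```
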
### Preprocessing:

Before feeding videos into the model, ensure the frames are properly pre-processed:

- Resize frames to `224x224`
- Normalize pixel values (use the processor loaded with the model, as shown in the code example)

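The processor handles these steps automatically, but it can help to see what they amount to. The sketch below uses NumPy only: nearest-neighbour resizing to 224x224, rescaling to [0, 1], and per-channel normalization. The mean and standard deviation are the common ImageNet values and are an assumption; the values actually used by the FAL processor should be read from its configuration. `preprocess_frame` is a hypothetical helper.

```python
import numpy as np

# Assumed normalization constants (ImageNet); not confirmed for FAL.
ASSUMED_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
ASSUMED_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess_frame(frame, size=224):
    """Resize an HxWx3 uint8 frame and return a normalized 3xSxS float array."""
    height, width, _ = frame.shape
    # Nearest-neighbour resize: map each output pixel to a source pixel.
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    resized = frame[rows[:, None], cols[None, :]]
    # Rescale to [0, 1], then normalize each channel.
    scaled = resized.astype(np.float32) / 255.0
    normalized = (scaled - ASSUMED_MEAN) / ASSUMED_STD
    # Move channels first to match the (3, 224, 224) per-frame layout.
    return normalized.transpose(2, 0, 1)

# Eight synthetic 480x640 RGB frames stacked into one clip.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(8, 480, 640, 3), dtype=np.uint8)
clip = np.stack([preprocess_frame(f) for f in frames])
```

In practice the processor is the safer choice, since it applies whatever resizing and normalization settings ship with the checkpoint.
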
## License

This project is licensed under the **SVECTOR Proprietary License**. Refer to the `LICENSE` file for more details.

This model is licensed under the **CC-BY-NC-4.0** license, which means it can be used for non-commercial purposes with proper attribution.

## Citation

If you use this model in your research or projects, please cite the following:

```bibtex
@misc{svector2024fal,
  title={FAL - Framework For Automated Labeling Of Videos (FALVideoClassifier)},
  author={SVECTOR},
  year={2024},
  url={https://www.svector.co.in}
}
```

FAL Paper & Details: https://github.com/SVECTOR-CORPORATION/FAL

## Contact

For any inquiries regarding this model or its implementation, you can contact the SVECTOR team at ai@svector.com.

---