# Training Data

### Dataset Sources

**The training dataset was constructed from two sources:**

Rock-Paper-Scissors dataset

* Source: Roboflow Universe
Custom gesture dataset

* Video parsed into frames at 10 frames per second
* Images manually selected and annotated
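The frame-parsing step above can be sketched in plain Python. This is an illustrative helper, not the authors' script: the function name and the 30 fps source rate are assumptions; sampling a video down to 10 frames per second amounts to keeping roughly every Nth frame.

```python
def sample_frame_indices(total_frames: int, src_fps: float,
                         target_fps: float = 10.0) -> list[int]:
    """Return indices of frames to keep so a video recorded at src_fps
    is sampled down to roughly target_fps frames per second."""
    if target_fps >= src_fps:
        return list(range(total_frames))  # nothing to drop
    step = src_fps / target_fps  # e.g. 30 fps -> 10 fps gives step 3.0
    indices, i = [], 0.0
    while round(i) < total_frames:
        indices.append(round(i))
        i += step
    return indices

# A 30 fps clip with 90 frames sampled at 10 fps keeps every 3rd frame.
print(sample_frame_indices(90, 30.0))  # → [0, 3, 6, ..., 87]
```

The kept frames would then be decoded and written out for manual selection and annotation.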
### Dataset Size

| Category         | Count     |
| ---------------- | --------- |
| Original Images  | 444       |
| Augmented Images | 1066      |
| Image Resolution | 512 × 512 |
### Class Distribution

| Class   | Gesture   | Annotation Count |
| ------- | --------- | ---------------- |
| Forward | Open Palm | 169              |
Dataset availability: https://universe.roboflow.com/b-data-497-ws/hand-gesture-c

* Limited diversity in backgrounds and lighting conditions
* Limited number of subjects (primarily one person)

*These factors may affect model generalization.*
# Training Procedure

### Framework

Training was performed using the Ultralytics YOLO framework.
### Model Architecture

Base model: YOLOv8n (Nano)

**Reasons for selection:**

* Lightweight architecture
* Low inference latency
* Lower hardware requirements
* Faster training times
* Suitable for real-time applications
### Training Configuration

| Parameter               | Value                        |
| ----------------------- | ---------------------------- |
| Epochs                  | 200 (training stopped early) |
| Early stopping patience | 10                           |
| Image size              | 512 × 512                    |
| Batch size              | 64                           |
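The configuration above maps directly onto the Ultralytics YOLO command-line interface. A sketch of what the training invocation could look like, assuming the `ultralytics` package is installed and a dataset definition at `data.yaml` (that path is an assumption, not from this README):

```shell
# Sketch only: the data.yaml path is assumed.
# epochs=200 is the upper bound; patience=10 enables early stopping.
yolo detect train model=yolov8n.pt data=data.yaml \
    epochs=200 patience=10 imgsz=512 batch=64
```

Starting from `yolov8n.pt` matches the fine-tuning setup described here: the pretrained Nano weights are loaded and updated on the custom gesture data.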
### Training Hardware

| Component     | Specification   |
| ------------- | --------------- |
| GPU           | A100 (High-RAM) |
| VRAM          | 80 GB           |
| Training Time | ~40 minutes     |
### Preprocessing Steps

* Images resized to 512 × 512
* Bounding box annotations normalized
* Augmented images generated before training