# FocusGuard

Real-time webcam-based focus detection system combining geometric feature extraction with machine learning classification. The pipeline extracts 17 facial features (EAR, gaze, head pose, PERCLOS, blink rate, etc.) from MediaPipe landmarks and classifies attentiveness using MLP and XGBoost models. Served via a React + FastAPI web application with live WebSocket video.

## 1. Project Structure

```
├── data/                Raw collected sessions (collected_<name>/*.npz)
├── data_preparation/    Data loading, cleaning, and exploration
├── notebooks/           Training notebooks (MLP, XGBoost) with LOPO evaluation
├── models/              Feature extraction modules and training scripts
├── checkpoints/         All saved weights (mlp_best.pt, xgboost_*_best.json, GRU, scalers)
├── evaluation/          Training logs and metrics (JSON)
├── ui/                  Live OpenCV demo and inference pipeline
├── src/                 React/Vite frontend source
├── static/              Built frontend (served by FastAPI)
├── app.py / main.py     FastAPI backend (API, WebSocket, DB)
├── requirements.txt     Python dependencies
└── package.json         Frontend dependencies
```

## 2. Setup

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Frontend (only needed if modifying the React app):

```bash
npm install
npm run build
cp -r dist/* static/
```

## 3. Running

**Web application (API + frontend):**

```bash
uvicorn app:app --host 0.0.0.0 --port 7860
```

Open http://localhost:7860 in a browser.

**Live camera demo (OpenCV):**

```bash
python ui/live_demo.py        # MLP mode (default)
python ui/live_demo.py --xgb  # XGBoost mode
```

**Training:**

```bash
python -m models.mlp.train      # MLP
python -m models.xgboost.train  # XGBoost
```

## 4. Dataset

- **9 participants**, each recorded via webcam with real-time labelling (focused / unfocused)
- **144,793 total samples**, 10 selected features, binary classification
- Collected using `python -m models.collect_features --name <name>`
- Stored as `.npz` files in `data/collected_<name>/`
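
A minimal sketch of reading one collected session back with NumPy. The array key names (`features`, `labels`) are assumptions for illustration; check `models/collect_features.py` for the keys actually written.

```python
import numpy as np

def load_session(path):
    """Load one recorded session as an (N, 10) feature matrix and (N,) labels.
    Key names are illustrative; the real ones are set by collect_features."""
    with np.load(path) as npz:
        return npz["features"], npz["labels"]

# Synthetic round-trip so the sketch is self-contained.
demo = "demo_session.npz"
rng = np.random.default_rng(0)
np.savez(demo,
         features=rng.normal(size=(5, 10)),
         labels=rng.integers(0, 2, size=5))

X, y = load_session(demo)
print(X.shape, y.shape)  # (5, 10) (5,)
```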

## 5. Models

| Model | Test Accuracy | Test F1 | ROC-AUC |
|-------|---------------|---------|---------|
| XGBoost (600 trees, depth 8, lr 0.149) | 95.87% | 0.959 | 0.991 |
| MLP (64→32, 30 epochs, lr 1e-3) | 92.92% | 0.929 | 0.971 |

Both models were evaluated on a held-out 15% stratified test split; LOPO (Leave-One-Person-Out) cross-validation is available in `notebooks/`.
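
The LOPO splits used in the notebooks can be sketched in a few lines: each fold holds out every sample from one participant, so the model is always tested on a person it never saw during training. Participant IDs below are synthetic.

```python
import numpy as np

def lopo_splits(groups):
    """Yield (person, train_idx, test_idx), one fold per unique participant."""
    groups = np.asarray(groups)
    for person in np.unique(groups):
        test = np.flatnonzero(groups == person)   # all samples of one person
        train = np.flatnonzero(groups != person)  # everyone else
        yield person, train, test

groups = np.repeat(np.arange(3), 4)  # 3 participants, 4 samples each
folds = list(lopo_splits(groups))
print(len(folds))  # 3 folds, one held-out participant each
```

Unlike a random stratified split, this measures generalisation to unseen people, which is the harder and more realistic setting for a focus detector.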

## 6. Feature Pipeline

1. **Face mesh** → MediaPipe 478-landmark detection
2. **Head pose** → solvePnP → yaw, pitch, roll, face score, gaze offset, head deviation
3. **Eye scorer** → EAR (left/right/avg), horizontal/vertical gaze ratio, MAR
4. **Temporal tracking** → PERCLOS, blink rate, closure duration, yawn duration
5. **Classification** → 10-feature vector → MLP or XGBoost → focused / unfocused
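
The Eye Aspect Ratio in step 3 follows the standard 6-point formulation: the two vertical eyelid distances divided by twice the horizontal eye width, so the value drops toward zero as the eye closes. The landmark coordinates below are illustrative; in the real pipeline they come from MediaPipe face-mesh indices.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) array of points [p1..p6]; low EAR => eye closed."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)

open_eye = np.array([[0, 0], [1, 1], [2, 1], [3, 0], [2, -1], [1, -1]], float)
closed_eye = open_eye * np.array([1.0, 0.1])  # flatten the eye vertically
print(eye_aspect_ratio(open_eye) > eye_aspect_ratio(closed_eye))  # True
```

PERCLOS (step 4) is then the fraction of recent frames whose EAR falls below a closure threshold, which is why the pipeline tracks it over a temporal window rather than per frame.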

## 7. Tech Stack

- **Backend:** Python, FastAPI, WebSocket, aiosqlite
- **Frontend:** React, Vite, TypeScript
- **ML:** PyTorch (MLP), XGBoost, scikit-learn
- **Vision:** MediaPipe, OpenCV