Yingtao-Zheng committed on
Commit fad97ce · 1 Parent(s): 24a5e7e

Add general README into folder /others

Files changed (1): others/README.md (+91 −0)
others/README.md ADDED
# FocusGuard

Real-time webcam-based focus detection system combining geometric feature extraction with machine learning classification. The pipeline extracts 17 facial features (EAR, gaze, head pose, PERCLOS, blink rate, etc.) from MediaPipe landmarks and classifies attentiveness using MLP and XGBoost models. Served via a React + FastAPI web application with live WebSocket video.

## 1. Project Structure

```
├── data/              Raw collected sessions (collected_<name>/*.npz)
├── data_preparation/  Data loading, cleaning, and exploration
├── notebooks/         Training notebooks (MLP, XGBoost) with LOPO evaluation
├── models/            Feature extraction modules and training scripts
├── checkpoints/       All saved weights (mlp_best.pt, xgboost_*_best.json, GRU, scalers)
├── evaluation/        Training logs and metrics (JSON)
├── ui/                Live OpenCV demo and inference pipeline
├── src/               React/Vite frontend source
├── static/            Built frontend (served by FastAPI)
├── app.py / main.py   FastAPI backend (API, WebSocket, DB)
├── requirements.txt   Python dependencies
└── package.json       Frontend dependencies
```

## 2. Setup

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Frontend (only needed if modifying the React app):

```bash
npm install
npm run build
cp -r dist/* static/
```

## 3. Running

**Web application (API + frontend):**

```bash
uvicorn app:app --host 0.0.0.0 --port 7860
```

Open http://localhost:7860 in a browser.

**Live camera demo (OpenCV):**

```bash
python ui/live_demo.py
python ui/live_demo.py --xgb   # XGBoost mode
```

**Training:**

```bash
python -m models.mlp.train       # MLP
python -m models.xgboost.train   # XGBoost
```

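As a rough illustration of what the MLP training entry point produces, here is a minimal PyTorch sketch of a binary focus classifier. The 10-feature input and the 64→32 hidden layout come from this README; the class name, activation choice, and everything else are assumptions, not the project's actual implementation.

```python
# Hedged sketch of a focus classifier MLP; the real architecture lives
# in models/mlp/train.py. Layer sizes (10 -> 64 -> 32 -> 1) are taken
# from the README's Models table; other details are assumptions.
import torch
import torch.nn as nn

class FocusMLP(nn.Module):
    def __init__(self, n_features: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),  # single logit: focused vs. unfocused
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = FocusMLP()
logits = model(torch.randn(4, 10))  # batch of 4 feature vectors
probs = torch.sigmoid(logits)       # per-sample P(focused)
```

Training against `checkpoints/mlp_best.pt` would then use a standard `BCEWithLogitsLoss` loop over the collected feature vectors.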
## 4. Dataset

- **9 participants**, each recorded via webcam with real-time labelling (focused / unfocused)
- **144,793 total samples**, 10 selected features, binary classification
- Collected using `python -m models.collect_features --name <name>`
- Stored as `.npz` files in `data/collected_<name>/`

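Loading the collected sessions back into a single dataset can be sketched as follows. The directory layout (`data/collected_<name>/*.npz`) is from this README, but the array key names `"features"` and `"labels"` are assumptions; the real keys are whatever `models.collect_features` writes.

```python
# Sketch: gather every session into one (X, y) dataset, keeping the
# participant folder name as a group label for LOPO splits.
# Assumes each .npz stores "features" and "labels" arrays (hypothetical
# key names -- check models/collect_features for the actual ones).
from pathlib import Path
import numpy as np

def load_sessions(data_dir: str = "data"):
    X_parts, y_parts, groups = [], [], []
    for npz_path in sorted(Path(data_dir).glob("collected_*/*.npz")):
        with np.load(npz_path) as f:
            X_parts.append(f["features"])
            y_parts.append(f["labels"])
            groups += [npz_path.parent.name] * len(f["labels"])
    return np.concatenate(X_parts), np.concatenate(y_parts), np.array(groups)
```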
## 5. Models

| Model | Test Accuracy | Test F1 | ROC-AUC |
|-------|---------------|---------|---------|
| XGBoost (600 trees, depth 8, lr 0.149) | 95.87% | 0.959 | 0.991 |
| MLP (64→32, 30 epochs, lr 1e-3) | 92.92% | 0.929 | 0.971 |

Both evaluated on a held-out 15% stratified test split. LOPO (Leave-One-Person-Out) cross-validation available in `notebooks/`.

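The LOPO protocol can be sketched with scikit-learn's `LeaveOneGroupOut`, using the participant as the group key. The stand-in classifier and metric below are illustrative only; the notebooks use the project's actual MLP and XGBoost models.

```python
# Sketch of Leave-One-Person-Out evaluation: each participant is held
# out in turn while the model trains on everyone else.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import LogisticRegression  # stand-in classifier
from sklearn.metrics import accuracy_score

def lopo_scores(X, y, groups):
    scores = {}
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
        person = groups[test_idx][0]
        scores[person] = accuracy_score(y[test_idx], clf.predict(X[test_idx]))
    return scores  # one held-out accuracy per participant
```

LOPO gives a more honest picture than a random split here, since frames from the same person are highly correlated.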
## 6. Feature Pipeline

1. **Face mesh** — MediaPipe 478-landmark detection
2. **Head pose** — solvePnP → yaw, pitch, roll, face score, gaze offset, head deviation
3. **Eye scorer** — EAR (left/right/avg), horizontal/vertical gaze ratio, MAR
4. **Temporal tracking** — PERCLOS, blink rate, closure duration, yawn duration
5. **Classification** — 10-feature vector → MLP or XGBoost → focused / unfocused

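Steps 3 and 4 can be sketched as follows: EAR is the standard six-point eye aspect ratio, and PERCLOS is the fraction of recent frames with the eye closed. The landmark ordering and the closure threshold here are assumptions, not the values the project's eye scorer actually uses.

```python
# Sketch of the EAR (step 3) and PERCLOS (step 4) computations.
# pts ordering p1..p6 and the 0.2 closed-eye threshold are assumed;
# the MediaPipe landmark indices used by the real eye scorer may differ.
import numpy as np

def eye_aspect_ratio(pts: np.ndarray) -> float:
    """pts: (6, 2) array of eye landmarks ordered p1..p6 around the contour."""
    vertical = np.linalg.norm(pts[1] - pts[5]) + np.linalg.norm(pts[2] - pts[4])
    horizontal = np.linalg.norm(pts[0] - pts[3])
    return float(vertical / (2.0 * horizontal))

def perclos(ear_history, closed_thresh: float = 0.2) -> float:
    """Fraction of recent frames with EAR below the closed-eye threshold."""
    ears = list(ear_history)
    return sum(e < closed_thresh for e in ears) / max(len(ears), 1)
```

EAR drops sharply when the eyelid closes, so thresholding it per frame and averaging over a sliding window yields PERCLOS, blink rate, and closure duration.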
## 7. Tech Stack

- **Backend:** Python, FastAPI, WebSocket, aiosqlite
- **Frontend:** React, Vite, TypeScript
- **ML:** PyTorch (MLP), XGBoost, scikit-learn
- **Vision:** MediaPipe, OpenCV