Spaces:
Running on CPU Upgrade
Running on CPU Upgrade
k22056537 commited on
Commit ·
8b47064
1
Parent(s): 6114098
chore: MLP pipeline, evaluation updates, feature importance, confusion matrices
Browse files- .gitignore +1 -0
- FOCUS_SCORE_EQUATIONS.md +0 -147
- checkpoints/{scaler_best.joblib → hybrid_combiner.joblib} +2 -2
- checkpoints/hybrid_focus_config.json +7 -3
- checkpoints/{model_best.joblib → meta_mlp.npz} +2 -2
- checkpoints/scaler_mlp.joblib +3 -0
- evaluation/README.md +9 -2
- evaluation/THRESHOLD_JUSTIFICATION.md +124 -7
- evaluation/feature_importance.py +187 -0
- evaluation/feature_selection_justification.md +54 -0
- evaluation/justify_thresholds.py +528 -18
- evaluation/plots/confusion_matrix_mlp.png +0 -0
- evaluation/plots/confusion_matrix_xgb.png +0 -0
- evaluation/plots/hybrid_xgb_weight_search.png +0 -0
- models/mlp/train.py +10 -2
- requirements.txt +1 -0
- ui/README.md +1 -1
- ui/live_demo.py +7 -8
- ui/pipeline.py +137 -59
.gitignore
CHANGED
|
@@ -41,3 +41,4 @@ test_focus_guard.db
|
|
| 41 |
static/
|
| 42 |
__pycache__/
|
| 43 |
docs/
|
|
|
|
|
|
| 41 |
static/
|
| 42 |
__pycache__/
|
| 43 |
docs/
|
| 44 |
+
docs
|
FOCUS_SCORE_EQUATIONS.md
DELETED
|
@@ -1,147 +0,0 @@
|
|
| 1 |
-
# How the focused/unfocused score is computed
|
| 2 |
-
|
| 3 |
-
The system outputs a **focus score** in `[0, 1]` and a binary **focused/unfocused** label. The label is derived from the score and a threshold. The exact equation depends on which pipeline (model) you use.
|
| 4 |
-
|
| 5 |
-
---
|
| 6 |
-
|
| 7 |
-
## 1. Final output (all pipelines)
|
| 8 |
-
|
| 9 |
-
- **`raw_score`** (or **`focus_score`** in Hybrid): value in `[0, 1]` after optional smoothing.
|
| 10 |
-
- **`is_focused`**: binary label.
|
| 11 |
-
|
| 12 |
-
**Equation:**
|
| 13 |
-
|
| 14 |
-
```text
|
| 15 |
-
is_focused = (smoothed_score >= threshold)
|
| 16 |
-
```
|
| 17 |
-
|
| 18 |
-
- **Smoothed score:** the pipeline may apply an exponential moving average (EMA) to the raw score; that smoothed value is what you see as `raw_score` / `focus_score` in the API.
|
| 19 |
-
- **Threshold:** set in the UI (sensitivity) or in pipeline config; typical default **0.5** or **0.55**.
|
| 20 |
-
|
| 21 |
-
So: **focus score** is the continuous value; **focused vs unfocused** is **score ≥ threshold** vs **score < threshold**.
|
| 22 |
-
|
| 23 |
-
---
|
| 24 |
-
|
| 25 |
-
## 2. Geometric pipeline (rule-based, no ML)
|
| 26 |
-
|
| 27 |
-
**Raw score (before smoothing):**
|
| 28 |
-
|
| 29 |
-
```text
|
| 30 |
-
raw = α · s_face + β · s_eye
|
| 31 |
-
```
|
| 32 |
-
|
| 33 |
-
- Default: **α = 0.4**, **β = 0.6** (face weight 40%, eye weight 60%).
|
| 34 |
-
- If **yawning** (MAR > 0.55): **raw = 0**.
|
| 35 |
-
|
| 36 |
-
**Face score `s_face`** (head pose, from `HeadPoseEstimator`):
|
| 37 |
-
|
| 38 |
-
- **deviation** = √( yaw² + pitch² + (0.5·roll)² )
|
| 39 |
-
- **t** = min( deviation / max_angle , 1 ), with **max_angle = 22°** (default).
|
| 40 |
-
- **s_face** = 0.5 · (1 + cos(π · t))
|
| 41 |
-
→ 1 when head is straight, 0 when deviation ≥ max_angle.
|
| 42 |
-
|
| 43 |
-
**Eye score `s_eye`** (from `EyeBehaviourScorer`):
|
| 44 |
-
|
| 45 |
-
- **EAR** = Eye Aspect Ratio (from landmarks); use **min(left_ear, right_ear)**.
|
| 46 |
-
- **ear_s** = linear map of EAR to [0,1] between `ear_closed=0.16` and `ear_open=0.30`.
|
| 47 |
-
- **Gaze:** horizontal/vertical gaze ratios from iris position; **offset** = distance from center (0.5, 0.5).
|
| 48 |
-
- **gaze_s** = 0.5 · (1 + cos(π · t)), with **t** = min( offset / gaze_max_offset , 1 ), **gaze_max_offset = 0.28**.
|
| 49 |
-
- **s_eye** = ear_s · gaze_s (or just ear_s if ear_s < 0.3).
|
| 50 |
-
|
| 51 |
-
Then:
|
| 52 |
-
|
| 53 |
-
```text
|
| 54 |
-
smoothed_score = EMA(raw)
|
| 55 |
-
is_focused = (smoothed_score >= threshold)
|
| 56 |
-
```
|
| 57 |
-
|
| 58 |
-
---
|
| 59 |
-
|
| 60 |
-
## 3. MLP pipeline
|
| 61 |
-
|
| 62 |
-
- Features are extracted (same 17-d feature vector as in training), clipped, then optionally extended (magnitudes, velocities, variances) and scaled with the **training-time scaler**.
|
| 63 |
-
- The MLP outputs either:
|
| 64 |
-
- **Probability of class 1 (focused):** `mlp_prob = predict_proba(X_sc)[0, 1]`, or
|
| 65 |
-
- If no `predict_proba`: **mlp_prob = 1 if predict(X_sc) == 1 else 0**.
|
| 66 |
-
|
| 67 |
-
**Equations:**
|
| 68 |
-
|
| 69 |
-
```text
|
| 70 |
-
raw_score = mlp_prob (clipped to [0, 1])
|
| 71 |
-
smoothed_score = EMA(raw_score)
|
| 72 |
-
is_focused = (smoothed_score >= threshold)
|
| 73 |
-
```
|
| 74 |
-
|
| 75 |
-
So the **focus score** is the **MLP’s estimated probability of being focused** (after optional smoothing).
|
| 76 |
-
|
| 77 |
-
---
|
| 78 |
-
|
| 79 |
-
## 4. XGBoost pipeline
|
| 80 |
-
|
| 81 |
-
- Same feature extraction and clipping; uses the **same feature subset** as in XGBoost training (no runtime magnitude/velocity extension).
|
| 82 |
-
- **prob** = `predict_proba(X)[0]` → **[P(unfocused), P(focused)]**.
|
| 83 |
-
|
| 84 |
-
**Equations:**
|
| 85 |
-
|
| 86 |
-
```text
|
| 87 |
-
raw_score = prob[1] (probability of focused class)
|
| 88 |
-
smoothed_score = EMA(raw_score)
|
| 89 |
-
is_focused = (smoothed_score >= threshold)
|
| 90 |
-
```
|
| 91 |
-
|
| 92 |
-
So the **focus score** is the **XGBoost probability of the focused class**.
|
| 93 |
-
|
| 94 |
-
---
|
| 95 |
-
|
| 96 |
-
## 5. Hybrid pipeline (MLP + geometric)
|
| 97 |
-
|
| 98 |
-
Combines the MLP’s probability with a geometric score, then applies a single threshold.
|
| 99 |
-
|
| 100 |
-
**Geometric part:**
|
| 101 |
-
|
| 102 |
-
```text
|
| 103 |
-
geo_score = geo_face_weight · s_face + geo_eye_weight · s_eye
|
| 104 |
-
```
|
| 105 |
-
|
| 106 |
-
- Default: **geo_face_weight = 0.4**, **geo_eye_weight = 0.6**.
|
| 107 |
-
- **s_face** and **s_eye** as in the Geometric pipeline (with optional yawn veto: if yawning, **geo_score = 0**).
|
| 108 |
-
- **geo_score** is clipped to [0, 1].
|
| 109 |
-
|
| 110 |
-
**MLP part:** same as MLP pipeline → **mlp_prob** in [0, 1].
|
| 111 |
-
|
| 112 |
-
**Combined focus score (default weights):**
|
| 113 |
-
|
| 114 |
-
```text
|
| 115 |
-
focus_score = w_mlp · mlp_prob + w_geo · geo_score
|
| 116 |
-
```
|
| 117 |
-
|
| 118 |
-
- Default: **w_mlp = 0.7**, **w_geo = 0.3** (after normalising so weights sum to 1).
|
| 119 |
-
- **focus_score** is clipped to [0, 1], then smoothed.
|
| 120 |
-
|
| 121 |
-
**Equations:**
|
| 122 |
-
|
| 123 |
-
```text
|
| 124 |
-
focus_score = clip( w_mlp · mlp_prob + w_geo · geo_score , 0 , 1 )
|
| 125 |
-
smoothed_score = EMA(focus_score)
|
| 126 |
-
is_focused = (smoothed_score >= threshold)
|
| 127 |
-
```
|
| 128 |
-
|
| 129 |
-
Default **threshold** in hybrid config is **0.55**.
|
| 130 |
-
|
| 131 |
-
---
|
| 132 |
-
|
| 133 |
-
## 6. Summary table
|
| 134 |
-
|
| 135 |
-
| Pipeline | Raw score formula | Focused condition |
|
| 136 |
-
|-----------|--------------------------------------|-----------------------------|
|
| 137 |
-
| Geometric | α·s_face + β·s_eye (0 if yawn) | smoothed ≥ threshold |
|
| 138 |
-
| MLP | MLP P(focused) | smoothed ≥ threshold |
|
| 139 |
-
| XGBoost | XGB P(focused) | smoothed ≥ threshold |
|
| 140 |
-
| Hybrid | w_mlp·mlp_prob + w_geo·geo_score | smoothed ≥ threshold |
|
| 141 |
-
|
| 142 |
-
**s_face** = head-pose score (cosine of normalised deviation).
|
| 143 |
-
**s_eye** = eye score (EAR × gaze score, or blend with CNN).
|
| 144 |
-
**geo_score** = geo_face_weight·s_face + geo_eye_weight·s_eye (with optional yawn veto).
|
| 145 |
-
**EMA** = exponential moving average (e.g. α=0.3) for temporal smoothing.
|
| 146 |
-
|
| 147 |
-
So: **focus score** is always a number in [0, 1]; **focused vs unfocused** is **score ≥ threshold** in all pipelines.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
checkpoints/{scaler_best.joblib → hybrid_combiner.joblib}
RENAMED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7e460c6ca8d2cadf37727456401a0d63028ba23cb6401f0835d869abfa2e053c
|
| 3 |
+
size 965
|
checkpoints/hybrid_focus_config.json
CHANGED
|
@@ -1,10 +1,14 @@
|
|
| 1 |
{
|
|
|
|
| 2 |
"w_mlp": 0.3,
|
|
|
|
| 3 |
"w_geo": 0.7,
|
| 4 |
-
"threshold": 0.
|
| 5 |
"use_yawn_veto": true,
|
| 6 |
"geo_face_weight": 0.7,
|
| 7 |
"geo_eye_weight": 0.3,
|
| 8 |
"mar_yawn_threshold": 0.55,
|
| 9 |
-
"metric": "f1"
|
| 10 |
-
|
|
|
|
|
|
|
|
|
| 1 |
{
|
| 2 |
+
"use_xgb": true,
|
| 3 |
"w_mlp": 0.3,
|
| 4 |
+
"w_xgb": 0.3,
|
| 5 |
"w_geo": 0.7,
|
| 6 |
+
"threshold": 0.46117913373775393,
|
| 7 |
"use_yawn_veto": true,
|
| 8 |
"geo_face_weight": 0.7,
|
| 9 |
"geo_eye_weight": 0.3,
|
| 10 |
"mar_yawn_threshold": 0.55,
|
| 11 |
+
"metric": "f1",
|
| 12 |
+
"combiner": "logistic",
|
| 13 |
+
"combiner_path": "/Users/mohammedalketbi22/GAP/Final/checkpoints/hybrid_combiner.joblib"
|
| 14 |
+
}
|
checkpoints/{model_best.joblib → meta_mlp.npz}
RENAMED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4771c61cdf0711aa640b4d600a0851d344414cd16c1c2f75afc90e3c6135d14b
|
| 3 |
+
size 840
|
checkpoints/scaler_mlp.joblib
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:2038d5b051d4de303c5688b1b861a0b53b1307a52b9447bfa48e8c7ace749329
|
| 3 |
+
size 823
|
evaluation/README.md
CHANGED
|
@@ -8,7 +8,9 @@ Training logs, threshold analysis, and performance metrics.
|
|
| 8 |
logs/ # training run logs (JSON)
|
| 9 |
plots/ # threshold justification figures (ROC, weight search, EAR/MAR)
|
| 10 |
justify_thresholds.py # LOPO analysis script
|
| 11 |
-
|
|
|
|
|
|
|
| 12 |
```
|
| 13 |
|
| 14 |
**Logs (when present):**
|
|
@@ -64,9 +66,14 @@ From repo root, with venv active. The script runs LOPO over 9 participants (~145
|
|
| 64 |
|
| 65 |
Takes ~10–15 minutes. Re-run after changing data or pipeline weights (e.g. geometric face/eye); hybrid optimal w_mlp depends on the geometric sub-score weights.
|
| 66 |
|
| 67 |
-
## 4.
|
|
|
|
|
|
|
|
|
|
|
|
|
| 68 |
|
| 69 |
- `python -m models.mlp.train` → MLP log in `logs/`
|
| 70 |
- `python -m models.xgboost.train` → XGBoost log in `logs/`
|
| 71 |
- `python -m evaluation.justify_thresholds` → plots + THRESHOLD_JUSTIFICATION.md
|
|
|
|
| 72 |
- Notebooks in `notebooks/` can also write logs here
|
|
|
|
| 8 |
logs/ # training run logs (JSON)
|
| 9 |
plots/ # threshold justification figures (ROC, weight search, EAR/MAR)
|
| 10 |
justify_thresholds.py # LOPO analysis script
|
| 11 |
+
feature_importance.py # XGBoost importance + leave-one-out ablation
|
| 12 |
+
THRESHOLD_JUSTIFICATION.md # report (auto-generated by justify_thresholds)
|
| 13 |
+
feature_selection_justification.md # report (auto-generated by feature_importance)
|
| 14 |
```
|
| 15 |
|
| 16 |
**Logs (when present):**
|
|
|
|
| 66 |
|
| 67 |
Takes ~10–15 minutes. Re-run after changing data or pipeline weights (e.g. geometric face/eye); hybrid optimal w_mlp depends on the geometric sub-score weights.
|
| 68 |
|
| 69 |
+
## 4. Feature selection justification
|
| 70 |
+
|
| 71 |
+
Run `python -m evaluation.feature_importance` to compute XGBoost gain-based importance for the 10 face_orientation features and a leave-one-feature-out LOPO ablation. Writes **feature_selection_justification.md** with tables. Use this to justify the 10-of-17 feature set (ablation + importance; see PAPER_AUDIT §2.7).
|
| 72 |
+
|
| 73 |
+
## 5. Generated by
|
| 74 |
|
| 75 |
- `python -m models.mlp.train` → MLP log in `logs/`
|
| 76 |
- `python -m models.xgboost.train` → XGBoost log in `logs/`
|
| 77 |
- `python -m evaluation.justify_thresholds` → plots + THRESHOLD_JUSTIFICATION.md
|
| 78 |
+
- `python -m evaluation.feature_importance` → feature_selection_justification.md
|
| 79 |
- Notebooks in `notebooks/` can also write logs here
|
evaluation/THRESHOLD_JUSTIFICATION.md
CHANGED
|
@@ -15,7 +15,92 @@ Thresholds selected via **Youden's J statistic** (J = sensitivity + specificity
|
|
| 15 |
|
| 16 |

|
| 17 |
|
| 18 |
-
## 2.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 19 |
|
| 20 |
Grid search over face weight alpha in {0.2 ... 0.8}. Eye weight = 1 - alpha. Threshold per fold via Youden's J.
|
| 21 |
|
|
@@ -33,9 +118,9 @@ Grid search over face weight alpha in {0.2 ... 0.8}. Eye weight = 1 - alpha. Thr
|
|
| 33 |
|
| 34 |

|
| 35 |
|
| 36 |
-
##
|
| 37 |
|
| 38 |
-
Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. Geometric sub-score uses same weights as geometric pipeline (face=0.7, eye=0.3).
|
| 39 |
|
| 40 |
| MLP Weight (w_mlp) | Mean LOPO F1 |
|
| 41 |
|-------------------:|-------------:|
|
|
@@ -46,11 +131,43 @@ Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. Geometric sub-score
|
|
| 46 |
| 0.7 | 0.8039 |
|
| 47 |
| 0.8 | 0.8016 |
|
| 48 |
|
| 49 |
-
**Best:** w_mlp = 0.3 (MLP 30%, geometric 70%)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 50 |
|
| 51 |
-
|
| 52 |
|
| 53 |
-
##
|
| 54 |
|
| 55 |
### EAR (Eye Aspect Ratio)
|
| 56 |
|
|
@@ -76,7 +193,7 @@ Between 0.16 and 0.30 the `_ear_score` function linearly interpolates from 0 to
|
|
| 76 |
|
| 77 |

|
| 78 |
|
| 79 |
-
##
|
| 80 |
|
| 81 |
| Constant | Value | Rationale |
|
| 82 |
|----------|------:|-----------|
|
|
|
|
| 15 |
|
| 16 |

|
| 17 |
|
| 18 |
+
## 2. Precision, Recall and Tradeoff
|
| 19 |
+
|
| 20 |
+
At the optimal threshold (Youden's J), pooled over all LOPO held-out predictions:
|
| 21 |
+
|
| 22 |
+
| Model | Threshold | Precision | Recall | F1 | Accuracy |
|
| 23 |
+
|-------|----------:|----------:|-------:|---:|---------:|
|
| 24 |
+
| MLP | 0.228 | 0.8187 | 0.9008 | 0.8578 | 0.8164 |
|
| 25 |
+
| XGBoost | 0.377 | 0.8426 | 0.8750 | 0.8585 | 0.8228 |
|
| 26 |
+
|
| 27 |
+
Higher threshold → fewer positive predictions → higher precision, lower recall. Youden's J picks the threshold that balances sensitivity and specificity (recall for the positive class and true negative rate).
|
| 28 |
+
|
| 29 |
+
## 3. Confusion Matrix (Pooled LOPO)
|
| 30 |
+
|
| 31 |
+
At optimal threshold. Rows = true label, columns = predicted label (0 = unfocused, 1 = focused).
|
| 32 |
+
|
| 33 |
+
### MLP
|
| 34 |
+
|
| 35 |
+
| | Pred 0 | Pred 1 |
|
| 36 |
+
|--|-------:|-------:|
|
| 37 |
+
| **True 0** | 38065 (TN) | 17750 (FP) |
|
| 38 |
+
| **True 1** | 8831 (FN) | 80147 (TP) |
|
| 39 |
+
|
| 40 |
+
TN=38065, FP=17750, FN=8831, TP=80147.
|
| 41 |
+
|
| 42 |
+
### XGBoost
|
| 43 |
+
|
| 44 |
+
| | Pred 0 | Pred 1 |
|
| 45 |
+
|--|-------:|-------:|
|
| 46 |
+
| **True 0** | 41271 (TN) | 14544 (FP) |
|
| 47 |
+
| **True 1** | 11118 (FN) | 77860 (TP) |
|
| 48 |
+
|
| 49 |
+
TN=41271, FP=14544, FN=11118, TP=77860.
|
| 50 |
+
|
| 51 |
+

|
| 52 |
+
|
| 53 |
+

|
| 54 |
+
|
| 55 |
+
## 4. Per-Person Performance Variance (LOPO)
|
| 56 |
+
|
| 57 |
+
One fold per left-out person; metrics at optimal threshold.
|
| 58 |
+
|
| 59 |
+
### MLP — per held-out person
|
| 60 |
+
|
| 61 |
+
| Person | Accuracy | F1 | Precision | Recall |
|
| 62 |
+
|--------|---------:|---:|----------:|-------:|
|
| 63 |
+
| Abdelrahman | 0.8628 | 0.9029 | 0.8760 | 0.9314 |
|
| 64 |
+
| Jarek | 0.8400 | 0.8770 | 0.8909 | 0.8635 |
|
| 65 |
+
| Junhao | 0.8872 | 0.8986 | 0.8354 | 0.9723 |
|
| 66 |
+
| Kexin | 0.7941 | 0.8123 | 0.7965 | 0.8288 |
|
| 67 |
+
| Langyuan | 0.5877 | 0.6169 | 0.4972 | 0.8126 |
|
| 68 |
+
| Mohamed | 0.8432 | 0.8653 | 0.7931 | 0.9519 |
|
| 69 |
+
| Yingtao | 0.8794 | 0.9263 | 0.9217 | 0.9309 |
|
| 70 |
+
| ayten | 0.8307 | 0.8986 | 0.8558 | 0.9459 |
|
| 71 |
+
| saba | 0.9192 | 0.9243 | 0.9260 | 0.9226 |
|
| 72 |
+
|
| 73 |
+
### XGBoost — per held-out person
|
| 74 |
+
|
| 75 |
+
| Person | Accuracy | F1 | Precision | Recall |
|
| 76 |
+
|--------|---------:|---:|----------:|-------:|
|
| 77 |
+
| Abdelrahman | 0.8601 | 0.8959 | 0.9129 | 0.8795 |
|
| 78 |
+
| Jarek | 0.8680 | 0.8993 | 0.9070 | 0.8917 |
|
| 79 |
+
| Junhao | 0.9099 | 0.9180 | 0.8627 | 0.9810 |
|
| 80 |
+
| Kexin | 0.7363 | 0.7385 | 0.7906 | 0.6928 |
|
| 81 |
+
| Langyuan | 0.6738 | 0.6945 | 0.5625 | 0.9074 |
|
| 82 |
+
| Mohamed | 0.8868 | 0.8988 | 0.8529 | 0.9498 |
|
| 83 |
+
| Yingtao | 0.8711 | 0.9195 | 0.9347 | 0.9048 |
|
| 84 |
+
| ayten | 0.8451 | 0.9070 | 0.8654 | 0.9528 |
|
| 85 |
+
| saba | 0.9393 | 0.9421 | 0.9615 | 0.9235 |
|
| 86 |
+
|
| 87 |
+
### Summary across persons
|
| 88 |
+
|
| 89 |
+
| Model | Accuracy mean ± std | F1 mean ± std | Precision mean ± std | Recall mean ± std |
|
| 90 |
+
|-------|---------------------|---------------|----------------------|-------------------|
|
| 91 |
+
| MLP | 0.8271 ± 0.0968 | 0.8580 ± 0.0968 | 0.8214 ± 0.1307 | 0.9067 ± 0.0572 |
|
| 92 |
+
| XGBoost | 0.8434 ± 0.0847 | 0.8682 ± 0.0879 | 0.8500 ± 0.1191 | 0.8981 ± 0.0836 |
|
| 93 |
+
|
| 94 |
+
## 5. Confidence Intervals (95%, LOPO over 9 persons)
|
| 95 |
+
|
| 96 |
+
Mean ± half-width of 95% t-interval (df=8) for each metric across the 9 left-out persons.
|
| 97 |
+
|
| 98 |
+
| Model | F1 | Accuracy | Precision | Recall |
|
| 99 |
+
|-------|---:|--------:|----------:|-------:|
|
| 100 |
+
| MLP | 0.8580 [0.7835, 0.9326] | 0.8271 [0.7526, 0.9017] | 0.8214 [0.7207, 0.9221] | 0.9067 [0.8626, 0.9507] |
|
| 101 |
+
| XGBoost | 0.8682 [0.8005, 0.9358] | 0.8434 [0.7781, 0.9086] | 0.8500 [0.7583, 0.9417] | 0.8981 [0.8338, 0.9625] |
|
| 102 |
+
|
| 103 |
+
## 6. Geometric Pipeline Weights (s_face vs s_eye)
|
| 104 |
|
| 105 |
Grid search over face weight alpha in {0.2 ... 0.8}. Eye weight = 1 - alpha. Threshold per fold via Youden's J.
|
| 106 |
|
|
|
|
| 118 |
|
| 119 |

|
| 120 |
|
| 121 |
+
## 7. Hybrid Pipeline: MLP vs Geometric
|
| 122 |
|
| 123 |
+
Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. Geometric sub-score uses same weights as geometric pipeline (face=0.7, eye=0.3).
|
| 124 |
|
| 125 |
| MLP Weight (w_mlp) | Mean LOPO F1 |
|
| 126 |
|-------------------:|-------------:|
|
|
|
|
| 131 |
| 0.7 | 0.8039 |
|
| 132 |
| 0.8 | 0.8016 |
|
| 133 |
|
| 134 |
+
**Best:** w_mlp = 0.3 (MLP 30%, geometric 70%) → mean LOPO F1 = 0.8409
|
| 135 |
+
|
| 136 |
+

|
| 137 |
+
|
| 138 |
+
## 8. Hybrid Pipeline: XGBoost vs Geometric
|
| 139 |
+
|
| 140 |
+
Same grid over w_xgb in {0.3 ... 0.8}. w_geo = 1 - w_xgb.
|
| 141 |
+
|
| 142 |
+
| XGBoost Weight (w_xgb) | Mean LOPO F1 |
|
| 143 |
+
|-----------------------:|-------------:|
|
| 144 |
+
| 0.3 | 0.8639 **<-- selected** |
|
| 145 |
+
| 0.4 | 0.8552 |
|
| 146 |
+
| 0.5 | 0.8451 |
|
| 147 |
+
| 0.6 | 0.8419 |
|
| 148 |
+
| 0.7 | 0.8382 |
|
| 149 |
+
| 0.8 | 0.8353 |
|
| 150 |
+
|
| 151 |
+
**Best:** w_xgb = 0.3 → mean LOPO F1 = 0.8639
|
| 152 |
+
|
| 153 |
+

|
| 154 |
+
|
| 155 |
+
### Which hybrid is used in the app?
|
| 156 |
+
|
| 157 |
+
**XGBoost hybrid is better** (F1 = 0.8639 vs MLP hybrid F1 = 0.8409).
|
| 158 |
+
|
| 159 |
+
### Logistic regression combiner (replaces heuristic weights)
|
| 160 |
+
|
| 161 |
+
Instead of a fixed linear blend (e.g. 0.3·ML + 0.7·geo), a **logistic regression** combines model probability and geometric score: meta-features = [model_prob, geo_score], trained on the same LOPO splits. Threshold from Youden's J on combiner output.
|
| 162 |
+
|
| 163 |
+
| Method | Mean LOPO F1 |
|
| 164 |
+
|--------|-------------:|
|
| 165 |
+
| Heuristic weight grid (best w) | 0.8639 |
|
| 166 |
+
| **LR combiner** | **0.8241** |
|
| 167 |
|
| 168 |
+
The app uses the saved LR combiner when `combiner_path` is set in `hybrid_focus_config.json`.
|
| 169 |
|
| 170 |
+
## 5. Eye and Mouth Aspect Ratio Thresholds
|
| 171 |
|
| 172 |
### EAR (Eye Aspect Ratio)
|
| 173 |
|
|
|
|
| 193 |
|
| 194 |

|
| 195 |
|
| 196 |
+
## 10. Other Constants
|
| 197 |
|
| 198 |
| Constant | Value | Rationale |
|
| 199 |
|----------|------:|-----------|
|
evaluation/feature_importance.py
ADDED
|
@@ -0,0 +1,187 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Feature importance and leave-one-feature-out ablation for the 10 face_orientation features.
|
| 3 |
+
Run: python -m evaluation.feature_importance
|
| 4 |
+
|
| 5 |
+
Outputs:
|
| 6 |
+
- XGBoost gain-based importance (from trained checkpoint)
|
| 7 |
+
- Leave-one-feature-out LOPO F1 (ablation): drop each feature in turn, report mean LOPO F1.
|
| 8 |
+
- Writes evaluation/feature_selection_justification.md
|
| 9 |
+
"""
|
| 10 |
+
|
| 11 |
+
import os
|
| 12 |
+
import sys
|
| 13 |
+
|
| 14 |
+
import numpy as np
|
| 15 |
+
from sklearn.preprocessing import StandardScaler
|
| 16 |
+
from sklearn.metrics import f1_score
|
| 17 |
+
from xgboost import XGBClassifier
|
| 18 |
+
|
| 19 |
+
_PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
|
| 20 |
+
if _PROJECT_ROOT not in sys.path:
|
| 21 |
+
sys.path.insert(0, _PROJECT_ROOT)
|
| 22 |
+
|
| 23 |
+
from data_preparation.prepare_dataset import load_per_person, SELECTED_FEATURES
|
| 24 |
+
|
| 25 |
+
SEED = 42
|
| 26 |
+
FEATURES = SELECTED_FEATURES["face_orientation"]
|
| 27 |
+
|
| 28 |
+
|
| 29 |
+
def _resolve_xgb_path():
|
| 30 |
+
p = os.path.join(_PROJECT_ROOT, "models", "xgboost", "checkpoints", "face_orientation_best.json")
|
| 31 |
+
if os.path.isfile(p):
|
| 32 |
+
return p
|
| 33 |
+
return os.path.join(_PROJECT_ROOT, "checkpoints", "xgboost_face_orientation_best.json")
|
| 34 |
+
|
| 35 |
+
|
| 36 |
+
def xgb_feature_importance():
|
| 37 |
+
"""Load trained XGBoost and return gain-based importance for the 10 features."""
|
| 38 |
+
path = _resolve_xgb_path()
|
| 39 |
+
if not os.path.isfile(path):
|
| 40 |
+
print(f"[WARN] No XGBoost checkpoint at {path}; skip importance.")
|
| 41 |
+
return None
|
| 42 |
+
model = XGBClassifier()
|
| 43 |
+
model.load_model(path)
|
| 44 |
+
imp = model.get_booster().get_score(importance_type="gain")
|
| 45 |
+
# Booster uses f0, f1, ...; we use same order as FEATURES (training order)
|
| 46 |
+
by_idx = {int(k.replace("f", "")): v for k, v in imp.items() if k.startswith("f")}
|
| 47 |
+
order = [by_idx.get(i, 0.0) for i in range(len(FEATURES))]
|
| 48 |
+
return dict(zip(FEATURES, order))
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
def run_ablation_lopo():
|
| 52 |
+
"""Leave-one-feature-out: for each feature, train XGBoost on the other 9 with LOPO, report mean F1."""
|
| 53 |
+
by_person, _, _ = load_per_person("face_orientation")
|
| 54 |
+
persons = sorted(by_person.keys())
|
| 55 |
+
n_folds = len(persons)
|
| 56 |
+
|
| 57 |
+
results = {}
|
| 58 |
+
for drop_feat in FEATURES:
|
| 59 |
+
idx_keep = [i for i, f in enumerate(FEATURES) if f != drop_feat]
|
| 60 |
+
f1s = []
|
| 61 |
+
for held_out in persons:
|
| 62 |
+
train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
|
| 63 |
+
train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
|
| 64 |
+
X_test, y_test = by_person[held_out]
|
| 65 |
+
|
| 66 |
+
X_tr = train_X[:, idx_keep]
|
| 67 |
+
X_te = X_test[:, idx_keep]
|
| 68 |
+
scaler = StandardScaler().fit(X_tr)
|
| 69 |
+
X_tr_sc = scaler.transform(X_tr)
|
| 70 |
+
X_te_sc = scaler.transform(X_te)
|
| 71 |
+
|
| 72 |
+
xgb = XGBClassifier(
|
| 73 |
+
n_estimators=600, max_depth=8, learning_rate=0.05,
|
| 74 |
+
subsample=0.8, colsample_bytree=0.8,
|
| 75 |
+
reg_alpha=0.1, reg_lambda=1.0,
|
| 76 |
+
use_label_encoder=False, eval_metric="logloss",
|
| 77 |
+
random_state=SEED, verbosity=0,
|
| 78 |
+
)
|
| 79 |
+
xgb.fit(X_tr_sc, train_y)
|
| 80 |
+
pred = xgb.predict(X_te_sc)
|
| 81 |
+
f1s.append(f1_score(y_test, pred, average="weighted"))
|
| 82 |
+
results[drop_feat] = np.mean(f1s)
|
| 83 |
+
return results
|
| 84 |
+
|
| 85 |
+
|
| 86 |
+
def run_baseline_lopo_f1():
|
| 87 |
+
"""Full 10-feature LOPO mean F1 for reference."""
|
| 88 |
+
by_person, _, _ = load_per_person("face_orientation")
|
| 89 |
+
persons = sorted(by_person.keys())
|
| 90 |
+
f1s = []
|
| 91 |
+
for held_out in persons:
|
| 92 |
+
train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
|
| 93 |
+
train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
|
| 94 |
+
X_test, y_test = by_person[held_out]
|
| 95 |
+
scaler = StandardScaler().fit(train_X)
|
| 96 |
+
X_tr_sc = scaler.transform(train_X)
|
| 97 |
+
X_te_sc = scaler.transform(X_test)
|
| 98 |
+
xgb = XGBClassifier(
|
| 99 |
+
n_estimators=600, max_depth=8, learning_rate=0.05,
|
| 100 |
+
subsample=0.8, colsample_bytree=0.8,
|
| 101 |
+
reg_alpha=0.1, reg_lambda=1.0,
|
| 102 |
+
use_label_encoder=False, eval_metric="logloss",
|
| 103 |
+
random_state=SEED, verbosity=0,
|
| 104 |
+
)
|
| 105 |
+
xgb.fit(X_tr_sc, train_y)
|
| 106 |
+
pred = xgb.predict(X_te_sc)
|
| 107 |
+
f1s.append(f1_score(y_test, pred, average="weighted"))
|
| 108 |
+
return np.mean(f1s)
|
| 109 |
+
|
| 110 |
+
|
| 111 |
+
def main():
|
| 112 |
+
print("=== Feature importance (XGBoost gain) ===")
|
| 113 |
+
imp = xgb_feature_importance()
|
| 114 |
+
if imp:
|
| 115 |
+
for name in FEATURES:
|
| 116 |
+
print(f" {name}: {imp.get(name, 0):.2f}")
|
| 117 |
+
order = sorted(imp.items(), key=lambda x: -x[1])
|
| 118 |
+
print(" Top-5 by gain:", [x[0] for x in order[:5]])
|
| 119 |
+
|
| 120 |
+
print("\n=== Leave-one-feature-out ablation (LOPO mean F1) ===")
|
| 121 |
+
baseline = run_baseline_lopo_f1()
|
| 122 |
+
print(f" Baseline (all 10 features) mean LOPO F1: {baseline:.4f}")
|
| 123 |
+
ablation = run_ablation_lopo()
|
| 124 |
+
for feat in FEATURES:
|
| 125 |
+
delta = baseline - ablation[feat]
|
| 126 |
+
print(f" drop {feat}: F1={ablation[feat]:.4f} (Δ={delta:+.4f})")
|
| 127 |
+
worst_drop = min(ablation.items(), key=lambda x: x[1])
|
| 128 |
+
print(f" Largest F1 drop when dropping: {worst_drop[0]} (F1={worst_drop[1]:.4f})")
|
| 129 |
+
|
| 130 |
+
out_dir = os.path.join(_PROJECT_ROOT, "evaluation")
|
| 131 |
+
out_path = os.path.join(out_dir, "feature_selection_justification.md")
|
| 132 |
+
lines = [
|
| 133 |
+
"# Feature selection justification",
|
| 134 |
+
"",
|
| 135 |
+
"The face_orientation model uses 10 of 17 extracted features. This document summarises empirical support.",
|
| 136 |
+
"",
|
| 137 |
+
"## 1. Domain rationale",
|
| 138 |
+
"",
|
| 139 |
+
"The 10 features were chosen to cover three channels:",
|
| 140 |
+
"- **Head pose:** head_deviation, s_face, pitch",
|
| 141 |
+
"- **Eye state:** ear_left, ear_right, ear_avg, perclos",
|
| 142 |
+
"- **Gaze:** h_gaze, gaze_offset, s_eye",
|
| 143 |
+
"",
|
| 144 |
+
"Excluded: v_gaze (noisy), mar (rare events), yaw/roll (redundant with head_deviation/s_face), blink_rate/closure_duration/yawn_duration (temporal overlap with perclos).",
|
| 145 |
+
"",
|
| 146 |
+
"## 2. XGBoost feature importance (gain)",
|
| 147 |
+
"",
|
| 148 |
+
"From the trained XGBoost checkpoint (gain on the 10 features):",
|
| 149 |
+
"",
|
| 150 |
+
"| Feature | Gain |",
|
| 151 |
+
"|---------|------|",
|
| 152 |
+
]
|
| 153 |
+
if imp:
|
| 154 |
+
for name in FEATURES:
|
| 155 |
+
lines.append(f"| {name} | {imp.get(name, 0):.2f} |")
|
| 156 |
+
order = sorted(imp.items(), key=lambda x: -x[1])
|
| 157 |
+
lines.append("")
|
| 158 |
+
lines.append(f"**Top 5 by gain:** {', '.join(x[0] for x in order[:5])}.")
|
| 159 |
+
else:
|
| 160 |
+
lines.append("(Run with XGBoost checkpoint to populate.)")
|
| 161 |
+
lines.extend([
|
| 162 |
+
"",
|
| 163 |
+
"## 3. Leave-one-feature-out ablation (LOPO)",
|
| 164 |
+
"",
|
| 165 |
+
f"Baseline (all 10 features) mean LOPO F1: **{baseline:.4f}**.",
|
| 166 |
+
"",
|
| 167 |
+
"| Feature dropped | Mean LOPO F1 | Δ vs baseline |",
|
| 168 |
+
"|------------------|--------------|---------------|",
|
| 169 |
+
])
|
| 170 |
+
for feat in FEATURES:
|
| 171 |
+
delta = baseline - ablation[feat]
|
| 172 |
+
lines.append(f"| {feat} | {ablation[feat]:.4f} | {delta:+.4f} |")
|
| 173 |
+
worst_drop = min(ablation.items(), key=lambda x: x[1])
|
| 174 |
+
lines.append("")
|
| 175 |
+
lines.append(f"Dropping **{worst_drop[0]}** hurts most (F1={worst_drop[1]:.4f}), consistent with it being important.")
|
| 176 |
+
lines.append("")
|
| 177 |
+
lines.append("## 4. Conclusion")
|
| 178 |
+
lines.append("")
|
| 179 |
+
lines.append("Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) leave-one-out ablation. SHAP or correlation-based pruning can be added in future work.")
|
| 180 |
+
lines.append("")
|
| 181 |
+
with open(out_path, "w", encoding="utf-8") as f:
|
| 182 |
+
f.write("\n".join(lines))
|
| 183 |
+
print(f"\nReport written to {out_path}")
|
| 184 |
+
|
| 185 |
+
|
| 186 |
+
if __name__ == "__main__":
|
| 187 |
+
main()
|
evaluation/feature_selection_justification.md
ADDED
|
@@ -0,0 +1,54 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Feature selection justification
|
| 2 |
+
|
| 3 |
+
The face_orientation model uses 10 of 17 extracted features. This document summarises empirical support.
|
| 4 |
+
|
| 5 |
+
## 1. Domain rationale
|
| 6 |
+
|
| 7 |
+
The 10 features were chosen to cover three channels:
|
| 8 |
+
- **Head pose:** head_deviation, s_face, pitch
|
| 9 |
+
- **Eye state:** ear_left, ear_right, ear_avg, perclos
|
| 10 |
+
- **Gaze:** h_gaze, gaze_offset, s_eye
|
| 11 |
+
|
| 12 |
+
Excluded: v_gaze (noisy), mar (rare events), yaw/roll (redundant with head_deviation/s_face), blink_rate/closure_duration/yawn_duration (temporal overlap with perclos).
|
| 13 |
+
|
| 14 |
+
## 2. XGBoost feature importance (gain)
|
| 15 |
+
|
| 16 |
+
From the trained XGBoost checkpoint (gain on the 10 features):
|
| 17 |
+
|
| 18 |
+
| Feature | Gain |
|
| 19 |
+
|---------|------|
|
| 20 |
+
| head_deviation | 8.83 |
|
| 21 |
+
| s_face | 10.27 |
|
| 22 |
+
| s_eye | 2.18 |
|
| 23 |
+
| h_gaze | 4.99 |
|
| 24 |
+
| pitch | 4.64 |
|
| 25 |
+
| ear_left | 3.57 |
|
| 26 |
+
| ear_avg | 6.96 |
|
| 27 |
+
| ear_right | 9.54 |
|
| 28 |
+
| gaze_offset | 1.80 |
|
| 29 |
+
| perclos | 5.68 |
|
| 30 |
+
|
| 31 |
+
**Top 5 by gain:** s_face, ear_right, head_deviation, ear_avg, perclos.
|
| 32 |
+
|
| 33 |
+
## 3. Leave-one-feature-out ablation (LOPO)
|
| 34 |
+
|
| 35 |
+
Baseline (all 10 features) mean LOPO F1: **0.8327**.
|
| 36 |
+
|
| 37 |
+
| Feature dropped | Mean LOPO F1 | Δ vs baseline |
|
| 38 |
+
|------------------|--------------|---------------|
|
| 39 |
+
| head_deviation | 0.8395 | -0.0068 |
|
| 40 |
+
| s_face | 0.8390 | -0.0063 |
|
| 41 |
+
| s_eye | 0.8342 | -0.0015 |
|
| 42 |
+
| h_gaze | 0.8244 | +0.0083 |
|
| 43 |
+
| pitch | 0.8250 | +0.0077 |
|
| 44 |
+
| ear_left | 0.8326 | +0.0001 |
|
| 45 |
+
| ear_avg | 0.8350 | -0.0023 |
|
| 46 |
+
| ear_right | 0.8344 | -0.0017 |
|
| 47 |
+
| gaze_offset | 0.8351 | -0.0024 |
|
| 48 |
+
| perclos | 0.8258 | +0.0069 |
|
| 49 |
+
|
| 50 |
+
Dropping **h_gaze** hurts most (F1=0.8244), consistent with it being important.
|
| 51 |
+
|
| 52 |
+
## 4. Conclusion
|
| 53 |
+
|
| 54 |
+
Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) leave-one-out ablation. SHAP or correlation-based pruning can be added in future work.
|
evaluation/justify_thresholds.py
CHANGED
|
@@ -8,9 +8,19 @@ import numpy as np
|
|
| 8 |
import matplotlib
|
| 9 |
matplotlib.use("Agg")
|
| 10 |
import matplotlib.pyplot as plt
|
|
|
|
|
|
|
| 11 |
from sklearn.neural_network import MLPClassifier
|
| 12 |
from sklearn.preprocessing import StandardScaler
|
| 13 |
-
from sklearn.metrics import
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 14 |
from xgboost import XGBClassifier
|
| 15 |
|
| 16 |
_PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
|
|
@@ -56,7 +66,8 @@ def run_lopo_models():
|
|
| 56 |
by_person, _, _ = load_per_person("face_orientation")
|
| 57 |
persons = sorted(by_person.keys())
|
| 58 |
|
| 59 |
-
results = {"mlp": {"y": [], "p": []
|
|
|
|
| 60 |
|
| 61 |
for i, held_out in enumerate(persons):
|
| 62 |
X_test, y_test = by_person[held_out]
|
|
@@ -77,6 +88,8 @@ def run_lopo_models():
|
|
| 77 |
mlp_prob = mlp.predict_proba(X_te_sc)[:, 1]
|
| 78 |
results["mlp"]["y"].append(y_test)
|
| 79 |
results["mlp"]["p"].append(mlp_prob)
|
|
|
|
|
|
|
| 80 |
|
| 81 |
xgb = XGBClassifier(
|
| 82 |
n_estimators=600, max_depth=8, learning_rate=0.05,
|
|
@@ -89,11 +102,14 @@ def run_lopo_models():
|
|
| 89 |
xgb_prob = xgb.predict_proba(X_te_sc)[:, 1]
|
| 90 |
results["xgb"]["y"].append(y_test)
|
| 91 |
results["xgb"]["p"].append(xgb_prob)
|
|
|
|
|
|
|
| 92 |
|
| 93 |
print(f" fold {i+1}/{len(persons)}: held out {held_out} "
|
| 94 |
f"({X_test.shape[0]} samples)")
|
| 95 |
|
| 96 |
-
|
|
|
|
| 97 |
results[key]["y"] = np.concatenate(results[key]["y"])
|
| 98 |
results[key]["p"] = np.concatenate(results[key]["p"])
|
| 99 |
|
|
@@ -126,6 +142,129 @@ def analyse_model_thresholds(results):
|
|
| 126 |
return model_stats
|
| 127 |
|
| 128 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 129 |
def run_geo_weight_search():
|
| 130 |
print("\n=== Geometric weight grid search ===")
|
| 131 |
|
|
@@ -252,6 +391,191 @@ def run_hybrid_weight_search(lopo_results):
|
|
| 252 |
return dict(mean_f1), best_w
|
| 253 |
|
| 254 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 255 |
def plot_distributions():
|
| 256 |
print("\n=== EAR / MAR distributions ===")
|
| 257 |
npz_files = sorted(glob.glob(os.path.join(_PROJECT_ROOT, "data", "collected_*", "*.npz")))
|
|
@@ -326,7 +650,11 @@ def plot_distributions():
|
|
| 326 |
return stats
|
| 327 |
|
| 328 |
|
| 329 |
-
def write_report(model_stats, geo_f1, best_alpha,
|
|
|
|
|
|
|
|
|
|
|
|
|
| 330 |
lines = []
|
| 331 |
lines.append("# Threshold Justification Report")
|
| 332 |
lines.append("")
|
|
@@ -351,7 +679,91 @@ def write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats)
|
|
| 351 |
lines.append("")
|
| 352 |
lines.append("")
|
| 353 |
|
| 354 |
-
lines.append("## 2.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 355 |
lines.append("")
|
| 356 |
lines.append("Grid search over face weight alpha in {0.2 ... 0.8}. "
|
| 357 |
"Eye weight = 1 - alpha. Threshold per fold via Youden's J.")
|
|
@@ -368,25 +780,68 @@ def write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats)
|
|
| 368 |
lines.append("")
|
| 369 |
lines.append("")
|
| 370 |
|
| 371 |
-
lines.append("##
|
| 372 |
lines.append("")
|
| 373 |
lines.append("Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. "
|
| 374 |
-
"Geometric sub-score uses same weights as geometric pipeline (face=0.7, eye=0.3).
|
| 375 |
-
"If you change geometric weights, re-run this script — optimal w_mlp can shift.")
|
| 376 |
lines.append("")
|
| 377 |
lines.append("| MLP Weight (w_mlp) | Mean LOPO F1 |")
|
| 378 |
lines.append("|-------------------:|-------------:|")
|
| 379 |
-
for w in sorted(
|
| 380 |
-
marker = " **<-- selected**" if w ==
|
| 381 |
-
lines.append(f"| {w:.1f} | {
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 382 |
lines.append("")
|
| 383 |
-
lines.append(
|
| 384 |
-
|
|
|
|
|
|
|
|
|
|
| 385 |
lines.append("")
|
| 386 |
-
lines.append("
|
|
|
|
|
|
|
| 387 |
lines.append("")
|
| 388 |
|
| 389 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 390 |
lines.append("")
|
| 391 |
lines.append("### EAR (Eye Aspect Ratio)")
|
| 392 |
lines.append("")
|
|
@@ -419,7 +874,7 @@ def write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats)
|
|
| 419 |
lines.append("")
|
| 420 |
lines.append("")
|
| 421 |
|
| 422 |
-
lines.append("##
|
| 423 |
lines.append("")
|
| 424 |
lines.append("| Constant | Value | Rationale |")
|
| 425 |
lines.append("|----------|------:|-----------|")
|
|
@@ -446,16 +901,71 @@ def write_report(model_stats, geo_f1, best_alpha, hybrid_f1, best_w, dist_stats)
|
|
| 446 |
print(f"\nReport written to {REPORT_PATH}")
|
| 447 |
|
| 448 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 449 |
def main():
|
| 450 |
os.makedirs(PLOTS_DIR, exist_ok=True)
|
| 451 |
|
| 452 |
lopo_results = run_lopo_models()
|
| 453 |
model_stats = analyse_model_thresholds(lopo_results)
|
|
|
|
|
|
|
| 454 |
geo_f1, best_alpha = run_geo_weight_search()
|
| 455 |
-
|
|
|
|
| 456 |
dist_stats = plot_distributions()
|
| 457 |
|
| 458 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 459 |
print("\nDone.")
|
| 460 |
|
| 461 |
|
|
|
|
| 8 |
import matplotlib
|
| 9 |
matplotlib.use("Agg")
|
| 10 |
import matplotlib.pyplot as plt
|
| 11 |
+
import joblib
|
| 12 |
+
from sklearn.linear_model import LogisticRegression
|
| 13 |
from sklearn.neural_network import MLPClassifier
|
| 14 |
from sklearn.preprocessing import StandardScaler
|
| 15 |
+
from sklearn.metrics import (
|
| 16 |
+
roc_curve,
|
| 17 |
+
roc_auc_score,
|
| 18 |
+
f1_score,
|
| 19 |
+
precision_score,
|
| 20 |
+
recall_score,
|
| 21 |
+
accuracy_score,
|
| 22 |
+
confusion_matrix,
|
| 23 |
+
)
|
| 24 |
from xgboost import XGBClassifier
|
| 25 |
|
| 26 |
_PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
|
|
|
|
| 66 |
by_person, _, _ = load_per_person("face_orientation")
|
| 67 |
persons = sorted(by_person.keys())
|
| 68 |
|
| 69 |
+
results = {"mlp": {"y": [], "p": [], "y_folds": [], "p_folds": []},
|
| 70 |
+
"xgb": {"y": [], "p": [], "y_folds": [], "p_folds": []}}
|
| 71 |
|
| 72 |
for i, held_out in enumerate(persons):
|
| 73 |
X_test, y_test = by_person[held_out]
|
|
|
|
| 88 |
mlp_prob = mlp.predict_proba(X_te_sc)[:, 1]
|
| 89 |
results["mlp"]["y"].append(y_test)
|
| 90 |
results["mlp"]["p"].append(mlp_prob)
|
| 91 |
+
results["mlp"]["y_folds"].append(y_test)
|
| 92 |
+
results["mlp"]["p_folds"].append(mlp_prob)
|
| 93 |
|
| 94 |
xgb = XGBClassifier(
|
| 95 |
n_estimators=600, max_depth=8, learning_rate=0.05,
|
|
|
|
| 102 |
xgb_prob = xgb.predict_proba(X_te_sc)[:, 1]
|
| 103 |
results["xgb"]["y"].append(y_test)
|
| 104 |
results["xgb"]["p"].append(xgb_prob)
|
| 105 |
+
results["xgb"]["y_folds"].append(y_test)
|
| 106 |
+
results["xgb"]["p_folds"].append(xgb_prob)
|
| 107 |
|
| 108 |
print(f" fold {i+1}/{len(persons)}: held out {held_out} "
|
| 109 |
f"({X_test.shape[0]} samples)")
|
| 110 |
|
| 111 |
+
results["persons"] = persons
|
| 112 |
+
for key in ("mlp", "xgb"):
|
| 113 |
results[key]["y"] = np.concatenate(results[key]["y"])
|
| 114 |
results[key]["p"] = np.concatenate(results[key]["p"])
|
| 115 |
|
|
|
|
| 142 |
return model_stats
|
| 143 |
|
| 144 |
|
| 145 |
+
def _ci_95_t(n):
|
| 146 |
+
"""95% CI half-width multiplier (t-distribution, df=n-1). Approximate for small n."""
|
| 147 |
+
if n <= 1:
|
| 148 |
+
return 0.0
|
| 149 |
+
df = n - 1
|
| 150 |
+
t_975 = [0, 12.71, 4.30, 3.18, 2.78, 2.57, 2.45, 2.37, 2.31]
|
| 151 |
+
if df < len(t_975):
|
| 152 |
+
return float(t_975[df])
|
| 153 |
+
if df <= 30:
|
| 154 |
+
return 2.0 + (30 - df) / 100
|
| 155 |
+
return 1.96
|
| 156 |
+
|
| 157 |
+
|
| 158 |
+
def analyse_precision_recall_confusion(results, model_stats):
|
| 159 |
+
"""Precision/recall at optimal threshold, pooled confusion matrix, per-fold metrics, 95% CIs."""
|
| 160 |
+
print("\n=== Precision, recall, confusion matrix, per-person variance ===")
|
| 161 |
+
from sklearn.metrics import precision_recall_curve, average_precision_score
|
| 162 |
+
|
| 163 |
+
extended = {}
|
| 164 |
+
persons = results["persons"]
|
| 165 |
+
n_folds = len(persons)
|
| 166 |
+
|
| 167 |
+
for name, label in [("mlp", "MLP"), ("xgb", "XGBoost")]:
|
| 168 |
+
y_all = results[name]["y"]
|
| 169 |
+
p_all = results[name]["p"]
|
| 170 |
+
y_folds = results[name]["y_folds"]
|
| 171 |
+
p_folds = results[name]["p_folds"]
|
| 172 |
+
opt_t = model_stats[name]["opt_threshold"]
|
| 173 |
+
|
| 174 |
+
y_pred = (p_all >= opt_t).astype(int)
|
| 175 |
+
prec_pooled = precision_score(y_all, y_pred, zero_division=0)
|
| 176 |
+
rec_pooled = recall_score(y_all, y_pred, zero_division=0)
|
| 177 |
+
acc_pooled = accuracy_score(y_all, y_pred)
|
| 178 |
+
cm = confusion_matrix(y_all, y_pred)
|
| 179 |
+
if cm.shape == (2, 2):
|
| 180 |
+
tn, fp, fn, tp = cm.ravel()
|
| 181 |
+
else:
|
| 182 |
+
tn = fp = fn = tp = 0
|
| 183 |
+
|
| 184 |
+
prec_folds = []
|
| 185 |
+
rec_folds = []
|
| 186 |
+
acc_folds = []
|
| 187 |
+
f1_folds = []
|
| 188 |
+
per_person = []
|
| 189 |
+
for k, (y_f, p_f) in enumerate(zip(y_folds, p_folds)):
|
| 190 |
+
pred_f = (p_f >= opt_t).astype(int)
|
| 191 |
+
prec_f = precision_score(y_f, pred_f, zero_division=0)
|
| 192 |
+
rec_f = recall_score(y_f, pred_f, zero_division=0)
|
| 193 |
+
acc_f = accuracy_score(y_f, pred_f)
|
| 194 |
+
f1_f = f1_score(y_f, pred_f, zero_division=0)
|
| 195 |
+
prec_folds.append(prec_f)
|
| 196 |
+
rec_folds.append(rec_f)
|
| 197 |
+
acc_folds.append(acc_f)
|
| 198 |
+
f1_folds.append(f1_f)
|
| 199 |
+
per_person.append({
|
| 200 |
+
"person": persons[k],
|
| 201 |
+
"accuracy": acc_f,
|
| 202 |
+
"f1": f1_f,
|
| 203 |
+
"precision": prec_f,
|
| 204 |
+
"recall": rec_f,
|
| 205 |
+
})
|
| 206 |
+
|
| 207 |
+
t_mult = _ci_95_t(n_folds)
|
| 208 |
+
mean_acc = np.mean(acc_folds)
|
| 209 |
+
std_acc = np.std(acc_folds, ddof=1) if n_folds > 1 else 0.0
|
| 210 |
+
mean_f1 = np.mean(f1_folds)
|
| 211 |
+
std_f1 = np.std(f1_folds, ddof=1) if n_folds > 1 else 0.0
|
| 212 |
+
mean_prec = np.mean(prec_folds)
|
| 213 |
+
std_prec = np.std(prec_folds, ddof=1) if n_folds > 1 else 0.0
|
| 214 |
+
mean_rec = np.mean(rec_folds)
|
| 215 |
+
std_rec = np.std(rec_folds, ddof=1) if n_folds > 1 else 0.0
|
| 216 |
+
|
| 217 |
+
extended[name] = {
|
| 218 |
+
"label": label,
|
| 219 |
+
"opt_threshold": opt_t,
|
| 220 |
+
"precision_pooled": prec_pooled,
|
| 221 |
+
"recall_pooled": rec_pooled,
|
| 222 |
+
"accuracy_pooled": acc_pooled,
|
| 223 |
+
"confusion_matrix": cm,
|
| 224 |
+
"tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp),
|
| 225 |
+
"per_person": per_person,
|
| 226 |
+
"accuracy_mean": mean_acc, "accuracy_std": std_acc,
|
| 227 |
+
"accuracy_ci_half": t_mult * (std_acc / np.sqrt(n_folds)) if n_folds > 1 else 0.0,
|
| 228 |
+
"f1_mean": mean_f1, "f1_std": std_f1,
|
| 229 |
+
"f1_ci_half": t_mult * (std_f1 / np.sqrt(n_folds)) if n_folds > 1 else 0.0,
|
| 230 |
+
"precision_mean": mean_prec, "precision_std": std_prec,
|
| 231 |
+
"precision_ci_half": t_mult * (std_prec / np.sqrt(n_folds)) if n_folds > 1 else 0.0,
|
| 232 |
+
"recall_mean": mean_rec, "recall_std": std_rec,
|
| 233 |
+
"recall_ci_half": t_mult * (std_rec / np.sqrt(n_folds)) if n_folds > 1 else 0.0,
|
| 234 |
+
"n_folds": n_folds,
|
| 235 |
+
}
|
| 236 |
+
|
| 237 |
+
print(f" {label}: precision={prec_pooled:.4f}, recall={rec_pooled:.4f} | "
|
| 238 |
+
f"per-fold F1 mean={mean_f1:.4f} ± {std_f1:.4f} "
|
| 239 |
+
f"(95% CI [{mean_f1 - extended[name]['f1_ci_half']:.4f}, {mean_f1 + extended[name]['f1_ci_half']:.4f}])")
|
| 240 |
+
|
| 241 |
+
return extended
|
| 242 |
+
|
| 243 |
+
|
| 244 |
+
def plot_confusion_matrices(extended_stats):
|
| 245 |
+
"""Save confusion matrix heatmaps for MLP and XGBoost."""
|
| 246 |
+
for name in ("mlp", "xgb"):
|
| 247 |
+
s = extended_stats[name]
|
| 248 |
+
cm = s["confusion_matrix"]
|
| 249 |
+
fig, ax = plt.subplots(figsize=(4, 3))
|
| 250 |
+
im = ax.imshow(cm, cmap="Blues")
|
| 251 |
+
ax.set_xticks([0, 1])
|
| 252 |
+
ax.set_yticks([0, 1])
|
| 253 |
+
ax.set_xticklabels(["Pred 0", "Pred 1"])
|
| 254 |
+
ax.set_yticklabels(["True 0", "True 1"])
|
| 255 |
+
ax.set_ylabel("True label")
|
| 256 |
+
ax.set_xlabel("Predicted label")
|
| 257 |
+
for i in range(2):
|
| 258 |
+
for j in range(2):
|
| 259 |
+
ax.text(j, i, str(cm[i, j]), ha="center", va="center", color="white" if cm[i, j] > cm.max() / 2 else "black", fontweight="bold")
|
| 260 |
+
ax.set_title(f"LOPO {s['label']} @ t={s['opt_threshold']:.3f}")
|
| 261 |
+
fig.tight_layout()
|
| 262 |
+
path = os.path.join(PLOTS_DIR, f"confusion_matrix_{name}.png")
|
| 263 |
+
fig.savefig(path, dpi=150)
|
| 264 |
+
plt.close(fig)
|
| 265 |
+
print(f" saved {path}")
|
| 266 |
+
|
| 267 |
+
|
| 268 |
def run_geo_weight_search():
|
| 269 |
print("\n=== Geometric weight grid search ===")
|
| 270 |
|
|
|
|
| 391 |
return dict(mean_f1), best_w
|
| 392 |
|
| 393 |
|
| 394 |
+
def run_hybrid_xgb_weight_search(lopo_results):
|
| 395 |
+
"""Grid search: XGBoost prob + geometric. Same structure as MLP hybrid."""
|
| 396 |
+
print("\n=== Hybrid XGBoost weight grid search ===")
|
| 397 |
+
|
| 398 |
+
by_person, _, _ = load_per_person("face_orientation")
|
| 399 |
+
persons = sorted(by_person.keys())
|
| 400 |
+
features = SELECTED_FEATURES["face_orientation"]
|
| 401 |
+
sf_idx = features.index("s_face")
|
| 402 |
+
se_idx = features.index("s_eye")
|
| 403 |
+
|
| 404 |
+
GEO_FACE_W = 0.7
|
| 405 |
+
GEO_EYE_W = 0.3
|
| 406 |
+
|
| 407 |
+
w_xgbs = np.arange(0.3, 0.85, 0.1).round(1)
|
| 408 |
+
wmf1 = {w: [] for w in w_xgbs}
|
| 409 |
+
xgb_p = lopo_results["xgb"]["p"]
|
| 410 |
+
offset = 0
|
| 411 |
+
for held_out in persons:
|
| 412 |
+
X_test, y_test = by_person[held_out]
|
| 413 |
+
n = X_test.shape[0]
|
| 414 |
+
xgb_prob_fold = xgb_p[offset : offset + n]
|
| 415 |
+
offset += n
|
| 416 |
+
|
| 417 |
+
sf = X_test[:, sf_idx]
|
| 418 |
+
se = X_test[:, se_idx]
|
| 419 |
+
geo_score = np.clip(GEO_FACE_W * sf + GEO_EYE_W * se, 0, 1)
|
| 420 |
+
|
| 421 |
+
train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
|
| 422 |
+
train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
|
| 423 |
+
sf_tr = train_X[:, sf_idx]
|
| 424 |
+
se_tr = train_X[:, se_idx]
|
| 425 |
+
geo_tr = np.clip(GEO_FACE_W * sf_tr + GEO_EYE_W * se_tr, 0, 1)
|
| 426 |
+
|
| 427 |
+
scaler = StandardScaler().fit(train_X)
|
| 428 |
+
X_tr_sc = scaler.transform(train_X)
|
| 429 |
+
xgb_tr = XGBClassifier(
|
| 430 |
+
n_estimators=600, max_depth=8, learning_rate=0.05,
|
| 431 |
+
subsample=0.8, colsample_bytree=0.8,
|
| 432 |
+
reg_alpha=0.1, reg_lambda=1.0,
|
| 433 |
+
use_label_encoder=False, eval_metric="logloss",
|
| 434 |
+
random_state=SEED, verbosity=0,
|
| 435 |
+
)
|
| 436 |
+
xgb_tr.fit(X_tr_sc, train_y)
|
| 437 |
+
xgb_prob_tr = xgb_tr.predict_proba(X_tr_sc)[:, 1]
|
| 438 |
+
|
| 439 |
+
for w in w_xgbs:
|
| 440 |
+
combo_tr = w * xgb_prob_tr + (1.0 - w) * geo_tr
|
| 441 |
+
opt_t, *_ = _youdens_j(train_y, combo_tr)
|
| 442 |
+
|
| 443 |
+
combo_te = w * xgb_prob_fold + (1.0 - w) * geo_score
|
| 444 |
+
f1 = _f1_at_threshold(y_test, combo_te, opt_t)
|
| 445 |
+
wmf1[w].append(f1)
|
| 446 |
+
|
| 447 |
+
mean_f1 = {w: np.mean(f1s) for w, f1s in wmf1.items()}
|
| 448 |
+
best_w = max(mean_f1, key=mean_f1.get)
|
| 449 |
+
|
| 450 |
+
fig, ax = plt.subplots(figsize=(7, 4))
|
| 451 |
+
ax.bar([f"{w:.1f}" for w in w_xgbs],
|
| 452 |
+
[mean_f1[w] for w in w_xgbs], color="steelblue")
|
| 453 |
+
ax.set_xlabel("XGBoost weight (w_xgb); geo weight = 1 - w_xgb")
|
| 454 |
+
ax.set_ylabel("Mean LOPO F1")
|
| 455 |
+
ax.set_title("Hybrid Pipeline: XGBoost vs Geometric Weight Search")
|
| 456 |
+
ax.set_ylim(bottom=max(0, min(mean_f1.values()) - 0.05))
|
| 457 |
+
for i, w in enumerate(w_xgbs):
|
| 458 |
+
ax.text(i, mean_f1[w] + 0.003, f"{mean_f1[w]:.3f}",
|
| 459 |
+
ha="center", va="bottom", fontsize=8)
|
| 460 |
+
fig.tight_layout()
|
| 461 |
+
path = os.path.join(PLOTS_DIR, "hybrid_xgb_weight_search.png")
|
| 462 |
+
fig.savefig(path, dpi=150)
|
| 463 |
+
plt.close(fig)
|
| 464 |
+
print(f" saved {path}")
|
| 465 |
+
|
| 466 |
+
print(f" Best w_xgb = {best_w:.1f}, mean LOPO F1 = {mean_f1[best_w]:.4f}")
|
| 467 |
+
return dict(mean_f1), best_w
|
| 468 |
+
|
| 469 |
+
|
| 470 |
+
def run_hybrid_lr_combiner(lopo_results, use_xgb=True):
|
| 471 |
+
"""LR combiner: meta-features = [model_prob, geo_score], learned weights instead of grid search."""
|
| 472 |
+
print("\n=== Hybrid LR combiner (LOPO) ===")
|
| 473 |
+
by_person, _, _ = load_per_person("face_orientation")
|
| 474 |
+
persons = sorted(by_person.keys())
|
| 475 |
+
features = SELECTED_FEATURES["face_orientation"]
|
| 476 |
+
sf_idx = features.index("s_face")
|
| 477 |
+
se_idx = features.index("s_eye")
|
| 478 |
+
GEO_FACE_W = 0.7
|
| 479 |
+
GEO_EYE_W = 0.3
|
| 480 |
+
|
| 481 |
+
key = "xgb" if use_xgb else "mlp"
|
| 482 |
+
model_p = lopo_results[key]["p"]
|
| 483 |
+
offset = 0
|
| 484 |
+
fold_f1s = []
|
| 485 |
+
for held_out in persons:
|
| 486 |
+
X_test, y_test = by_person[held_out]
|
| 487 |
+
n = X_test.shape[0]
|
| 488 |
+
prob_fold = model_p[offset : offset + n]
|
| 489 |
+
offset += n
|
| 490 |
+
sf = X_test[:, sf_idx]
|
| 491 |
+
se = X_test[:, se_idx]
|
| 492 |
+
geo_score = np.clip(GEO_FACE_W * sf + GEO_EYE_W * se, 0, 1)
|
| 493 |
+
meta_te = np.column_stack([prob_fold, geo_score])
|
| 494 |
+
|
| 495 |
+
train_X = np.concatenate([by_person[p][0] for p in persons if p != held_out])
|
| 496 |
+
train_y = np.concatenate([by_person[p][1] for p in persons if p != held_out])
|
| 497 |
+
sf_tr = train_X[:, sf_idx]
|
| 498 |
+
se_tr = train_X[:, se_idx]
|
| 499 |
+
geo_tr = np.clip(GEO_FACE_W * sf_tr + GEO_EYE_W * se_tr, 0, 1)
|
| 500 |
+
scaler = StandardScaler().fit(train_X)
|
| 501 |
+
X_tr_sc = scaler.transform(train_X)
|
| 502 |
+
if use_xgb:
|
| 503 |
+
xgb_tr = XGBClassifier(
|
| 504 |
+
n_estimators=600, max_depth=8, learning_rate=0.05,
|
| 505 |
+
subsample=0.8, colsample_bytree=0.8,
|
| 506 |
+
reg_alpha=0.1, reg_lambda=1.0,
|
| 507 |
+
use_label_encoder=False, eval_metric="logloss",
|
| 508 |
+
random_state=SEED, verbosity=0,
|
| 509 |
+
)
|
| 510 |
+
xgb_tr.fit(X_tr_sc, train_y)
|
| 511 |
+
prob_tr = xgb_tr.predict_proba(X_tr_sc)[:, 1]
|
| 512 |
+
else:
|
| 513 |
+
mlp_tr = MLPClassifier(
|
| 514 |
+
hidden_layer_sizes=(64, 32), activation="relu",
|
| 515 |
+
max_iter=200, early_stopping=True, validation_fraction=0.15,
|
| 516 |
+
random_state=SEED, verbose=False,
|
| 517 |
+
)
|
| 518 |
+
mlp_tr.fit(X_tr_sc, train_y)
|
| 519 |
+
prob_tr = mlp_tr.predict_proba(X_tr_sc)[:, 1]
|
| 520 |
+
meta_tr = np.column_stack([prob_tr, geo_tr])
|
| 521 |
+
|
| 522 |
+
lr = LogisticRegression(C=1.0, max_iter=500, random_state=SEED)
|
| 523 |
+
lr.fit(meta_tr, train_y)
|
| 524 |
+
p_tr = lr.predict_proba(meta_tr)[:, 1]
|
| 525 |
+
opt_t, *_ = _youdens_j(train_y, p_tr)
|
| 526 |
+
p_te = lr.predict_proba(meta_te)[:, 1]
|
| 527 |
+
f1 = _f1_at_threshold(y_test, p_te, opt_t)
|
| 528 |
+
fold_f1s.append(f1)
|
| 529 |
+
print(f" fold {held_out}: F1 = {f1:.4f} (threshold = {opt_t:.3f})")
|
| 530 |
+
|
| 531 |
+
mean_f1 = float(np.mean(fold_f1s))
|
| 532 |
+
print(f" LR combiner mean LOPO F1 = {mean_f1:.4f}")
|
| 533 |
+
return mean_f1
|
| 534 |
+
|
| 535 |
+
|
| 536 |
+
def train_and_save_hybrid_combiner(lopo_results, use_xgb, geo_face_weight=0.7, geo_eye_weight=0.3,
|
| 537 |
+
combiner_path=None):
|
| 538 |
+
"""Build OOS meta-dataset from LOPO predictions, train one LR, save joblib + optimal threshold."""
|
| 539 |
+
by_person, _, _ = load_per_person("face_orientation")
|
| 540 |
+
persons = sorted(by_person.keys())
|
| 541 |
+
features = SELECTED_FEATURES["face_orientation"]
|
| 542 |
+
sf_idx = features.index("s_face")
|
| 543 |
+
se_idx = features.index("s_eye")
|
| 544 |
+
|
| 545 |
+
key = "xgb" if use_xgb else "mlp"
|
| 546 |
+
model_p = lopo_results[key]["p"]
|
| 547 |
+
meta_y = lopo_results[key]["y"]
|
| 548 |
+
geo_list = []
|
| 549 |
+
offset = 0
|
| 550 |
+
for p in persons:
|
| 551 |
+
X, _ = by_person[p]
|
| 552 |
+
n = X.shape[0]
|
| 553 |
+
sf = X[:, sf_idx]
|
| 554 |
+
se = X[:, se_idx]
|
| 555 |
+
geo_list.append(np.clip(geo_face_weight * sf + geo_eye_weight * se, 0, 1))
|
| 556 |
+
offset += n
|
| 557 |
+
geo_all = np.concatenate(geo_list)
|
| 558 |
+
meta_X = np.column_stack([model_p, geo_all])
|
| 559 |
+
|
| 560 |
+
lr = LogisticRegression(C=1.0, max_iter=500, random_state=SEED)
|
| 561 |
+
lr.fit(meta_X, meta_y)
|
| 562 |
+
p = lr.predict_proba(meta_X)[:, 1]
|
| 563 |
+
opt_threshold, *_ = _youdens_j(meta_y, p)
|
| 564 |
+
|
| 565 |
+
if combiner_path is None:
|
| 566 |
+
combiner_path = os.path.join(_PROJECT_ROOT, "checkpoints", "hybrid_combiner.joblib")
|
| 567 |
+
os.makedirs(os.path.dirname(combiner_path), exist_ok=True)
|
| 568 |
+
joblib.dump({
|
| 569 |
+
"combiner": lr,
|
| 570 |
+
"threshold": float(opt_threshold),
|
| 571 |
+
"use_xgb": bool(use_xgb),
|
| 572 |
+
"geo_face_weight": geo_face_weight,
|
| 573 |
+
"geo_eye_weight": geo_eye_weight,
|
| 574 |
+
}, combiner_path)
|
| 575 |
+
print(f" Saved combiner to {combiner_path} (threshold={opt_threshold:.3f})")
|
| 576 |
+
return opt_threshold, combiner_path
|
| 577 |
+
|
| 578 |
+
|
| 579 |
def plot_distributions():
|
| 580 |
print("\n=== EAR / MAR distributions ===")
|
| 581 |
npz_files = sorted(glob.glob(os.path.join(_PROJECT_ROOT, "data", "collected_*", "*.npz")))
|
|
|
|
| 650 |
return stats
|
| 651 |
|
| 652 |
|
| 653 |
+
def write_report(model_stats, extended_stats, geo_f1, best_alpha,
|
| 654 |
+
hybrid_mlp_f1, best_w_mlp,
|
| 655 |
+
hybrid_xgb_f1, best_w_xgb,
|
| 656 |
+
use_xgb_for_hybrid, dist_stats,
|
| 657 |
+
lr_combiner_f1=None):
|
| 658 |
lines = []
|
| 659 |
lines.append("# Threshold Justification Report")
|
| 660 |
lines.append("")
|
|
|
|
| 679 |
lines.append("")
|
| 680 |
lines.append("")
|
| 681 |
|
| 682 |
+
lines.append("## 2. Precision, Recall and Tradeoff")
|
| 683 |
+
lines.append("")
|
| 684 |
+
lines.append("At the optimal threshold (Youden's J), pooled over all LOPO held-out predictions:")
|
| 685 |
+
lines.append("")
|
| 686 |
+
lines.append("| Model | Threshold | Precision | Recall | F1 | Accuracy |")
|
| 687 |
+
lines.append("|-------|----------:|----------:|-------:|---:|---------:|")
|
| 688 |
+
for key in ("mlp", "xgb"):
|
| 689 |
+
s = extended_stats[key]
|
| 690 |
+
lines.append(f"| {s['label']} | {s['opt_threshold']:.3f} | {s['precision_pooled']:.4f} | "
|
| 691 |
+
f"{s['recall_pooled']:.4f} | {model_stats[key]['f1_opt']:.4f} | {s['accuracy_pooled']:.4f} |")
|
| 692 |
+
lines.append("")
|
| 693 |
+
lines.append("Higher threshold → fewer positive predictions → higher precision, lower recall. "
|
| 694 |
+
"Youden's J picks the threshold that balances sensitivity and specificity (recall for the positive class and true negative rate).")
|
| 695 |
+
lines.append("")
|
| 696 |
+
|
| 697 |
+
lines.append("## 3. Confusion Matrix (Pooled LOPO)")
|
| 698 |
+
lines.append("")
|
| 699 |
+
lines.append("At optimal threshold. Rows = true label, columns = predicted label (0 = unfocused, 1 = focused).")
|
| 700 |
+
lines.append("")
|
| 701 |
+
for key in ("mlp", "xgb"):
|
| 702 |
+
s = extended_stats[key]
|
| 703 |
+
lines.append(f"### {s['label']}")
|
| 704 |
+
lines.append("")
|
| 705 |
+
lines.append("| | Pred 0 | Pred 1 |")
|
| 706 |
+
lines.append("|--|-------:|-------:|")
|
| 707 |
+
cm = s["confusion_matrix"]
|
| 708 |
+
if cm.shape == (2, 2):
|
| 709 |
+
lines.append(f"| **True 0** | {cm[0,0]} (TN) | {cm[0,1]} (FP) |")
|
| 710 |
+
lines.append(f"| **True 1** | {cm[1,0]} (FN) | {cm[1,1]} (TP) |")
|
| 711 |
+
lines.append("")
|
| 712 |
+
lines.append(f"TN={s['tn']}, FP={s['fp']}, FN={s['fn']}, TP={s['tp']}. ")
|
| 713 |
+
lines.append("")
|
| 714 |
+
lines.append("")
|
| 715 |
+
lines.append("")
|
| 716 |
+
lines.append("")
|
| 717 |
+
lines.append("")
|
| 718 |
+
|
| 719 |
+
lines.append("## 4. Per-Person Performance Variance (LOPO)")
|
| 720 |
+
lines.append("")
|
| 721 |
+
lines.append("One fold per left-out person; metrics at optimal threshold.")
|
| 722 |
+
lines.append("")
|
| 723 |
+
for key in ("mlp", "xgb"):
|
| 724 |
+
s = extended_stats[key]
|
| 725 |
+
lines.append(f"### {s['label']} — per held-out person")
|
| 726 |
+
lines.append("")
|
| 727 |
+
lines.append("| Person | Accuracy | F1 | Precision | Recall |")
|
| 728 |
+
lines.append("|--------|---------:|---:|----------:|-------:|")
|
| 729 |
+
for row in s["per_person"]:
|
| 730 |
+
lines.append(f"| {row['person']} | {row['accuracy']:.4f} | {row['f1']:.4f} | {row['precision']:.4f} | {row['recall']:.4f} |")
|
| 731 |
+
lines.append("")
|
| 732 |
+
lines.append("### Summary across persons")
|
| 733 |
+
lines.append("")
|
| 734 |
+
lines.append("| Model | Accuracy mean ± std | F1 mean ± std | Precision mean ± std | Recall mean ± std |")
|
| 735 |
+
lines.append("|-------|---------------------|---------------|----------------------|-------------------|")
|
| 736 |
+
for key in ("mlp", "xgb"):
|
| 737 |
+
s = extended_stats[key]
|
| 738 |
+
lines.append(f"| {s['label']} | {s['accuracy_mean']:.4f} ± {s['accuracy_std']:.4f} | "
|
| 739 |
+
f"{s['f1_mean']:.4f} ± {s['f1_std']:.4f} | "
|
| 740 |
+
f"{s['precision_mean']:.4f} ± {s['precision_std']:.4f} | "
|
| 741 |
+
f"{s['recall_mean']:.4f} ± {s['recall_std']:.4f} |")
|
| 742 |
+
lines.append("")
|
| 743 |
+
|
| 744 |
+
lines.append("## 5. Confidence Intervals (95%, LOPO over 9 persons)")
|
| 745 |
+
lines.append("")
|
| 746 |
+
lines.append("Mean ± half-width of 95% t-interval (df=8) for each metric across the 9 left-out persons.")
|
| 747 |
+
lines.append("")
|
| 748 |
+
lines.append("| Model | F1 | Accuracy | Precision | Recall |")
|
| 749 |
+
lines.append("|-------|---:|--------:|----------:|-------:|")
|
| 750 |
+
for key in ("mlp", "xgb"):
|
| 751 |
+
s = extended_stats[key]
|
| 752 |
+
f1_lo = s["f1_mean"] - s["f1_ci_half"]
|
| 753 |
+
f1_hi = s["f1_mean"] + s["f1_ci_half"]
|
| 754 |
+
acc_lo = s["accuracy_mean"] - s["accuracy_ci_half"]
|
| 755 |
+
acc_hi = s["accuracy_mean"] + s["accuracy_ci_half"]
|
| 756 |
+
prec_lo = s["precision_mean"] - s["precision_ci_half"]
|
| 757 |
+
prec_hi = s["precision_mean"] + s["precision_ci_half"]
|
| 758 |
+
rec_lo = s["recall_mean"] - s["recall_ci_half"]
|
| 759 |
+
rec_hi = s["recall_mean"] + s["recall_ci_half"]
|
| 760 |
+
lines.append(f"| {s['label']} | {s['f1_mean']:.4f} [{f1_lo:.4f}, {f1_hi:.4f}] | "
|
| 761 |
+
f"{s['accuracy_mean']:.4f} [{acc_lo:.4f}, {acc_hi:.4f}] | "
|
| 762 |
+
f"{s['precision_mean']:.4f} [{prec_lo:.4f}, {prec_hi:.4f}] | "
|
| 763 |
+
f"{s['recall_mean']:.4f} [{rec_lo:.4f}, {rec_hi:.4f}] |")
|
| 764 |
+
lines.append("")
|
| 765 |
+
|
| 766 |
+
lines.append("## 6. Geometric Pipeline Weights (s_face vs s_eye)")
|
| 767 |
lines.append("")
|
| 768 |
lines.append("Grid search over face weight alpha in {0.2 ... 0.8}. "
|
| 769 |
"Eye weight = 1 - alpha. Threshold per fold via Youden's J.")
|
|
|
|
| 780 |
lines.append("")
|
| 781 |
lines.append("")
|
| 782 |
|
| 783 |
+
lines.append("## 7. Hybrid Pipeline: MLP vs Geometric")
|
| 784 |
lines.append("")
|
| 785 |
lines.append("Grid search over w_mlp in {0.3 ... 0.8}. w_geo = 1 - w_mlp. "
|
| 786 |
+
"Geometric sub-score uses same weights as geometric pipeline (face=0.7, eye=0.3).")
|
|
|
|
| 787 |
lines.append("")
|
| 788 |
lines.append("| MLP Weight (w_mlp) | Mean LOPO F1 |")
|
| 789 |
lines.append("|-------------------:|-------------:|")
|
| 790 |
+
for w in sorted(hybrid_mlp_f1.keys()):
|
| 791 |
+
marker = " **<-- selected**" if w == best_w_mlp else ""
|
| 792 |
+
lines.append(f"| {w:.1f} | {hybrid_mlp_f1[w]:.4f}{marker} |")
|
| 793 |
+
lines.append("")
|
| 794 |
+
lines.append(f"**Best:** w_mlp = {best_w_mlp:.1f} (MLP {best_w_mlp*100:.0f}%, "
|
| 795 |
+
f"geometric {(1-best_w_mlp)*100:.0f}%) → mean LOPO F1 = {hybrid_mlp_f1[best_w_mlp]:.4f}")
|
| 796 |
+
lines.append("")
|
| 797 |
+
lines.append("")
|
| 798 |
+
lines.append("")
|
| 799 |
+
|
| 800 |
+
lines.append("## 8. Hybrid Pipeline: XGBoost vs Geometric")
|
| 801 |
+
lines.append("")
|
| 802 |
+
lines.append("Same grid over w_xgb in {0.3 ... 0.8}. w_geo = 1 - w_xgb.")
|
| 803 |
lines.append("")
|
| 804 |
+
lines.append("| XGBoost Weight (w_xgb) | Mean LOPO F1 |")
|
| 805 |
+
lines.append("|-----------------------:|-------------:|")
|
| 806 |
+
for w in sorted(hybrid_xgb_f1.keys()):
|
| 807 |
+
marker = " **<-- selected**" if w == best_w_xgb else ""
|
| 808 |
+
lines.append(f"| {w:.1f} | {hybrid_xgb_f1[w]:.4f}{marker} |")
|
| 809 |
lines.append("")
|
| 810 |
+
lines.append(f"**Best:** w_xgb = {best_w_xgb:.1f} → mean LOPO F1 = {hybrid_xgb_f1[best_w_xgb]:.4f}")
|
| 811 |
+
lines.append("")
|
| 812 |
+
lines.append("")
|
| 813 |
lines.append("")
|
| 814 |
|
| 815 |
+
f1_mlp = hybrid_mlp_f1[best_w_mlp]
|
| 816 |
+
f1_xgb = hybrid_xgb_f1[best_w_xgb]
|
| 817 |
+
lines.append("### Which hybrid is used in the app?")
|
| 818 |
+
lines.append("")
|
| 819 |
+
if use_xgb_for_hybrid:
|
| 820 |
+
lines.append(f"**XGBoost hybrid is better** (F1 = {f1_xgb:.4f} vs MLP hybrid F1 = {f1_mlp:.4f}).")
|
| 821 |
+
else:
|
| 822 |
+
lines.append(f"**MLP hybrid is better** (F1 = {f1_mlp:.4f} vs XGBoost hybrid F1 = {f1_xgb:.4f}).")
|
| 823 |
+
lines.append("")
|
| 824 |
+
if lr_combiner_f1 is not None:
|
| 825 |
+
lines.append("### Logistic regression combiner (replaces heuristic weights)")
|
| 826 |
+
lines.append("")
|
| 827 |
+
lines.append("Instead of a fixed linear blend (e.g. 0.3·ML + 0.7·geo), a **logistic regression** "
|
| 828 |
+
"combines model probability and geometric score: meta-features = [model_prob, geo_score], "
|
| 829 |
+
"trained on the same LOPO splits. Threshold from Youden's J on combiner output.")
|
| 830 |
+
lines.append("")
|
| 831 |
+
lines.append(f"| Method | Mean LOPO F1 |")
|
| 832 |
+
lines.append("|--------|-------------:|")
|
| 833 |
+
lines.append(f"| Heuristic weight grid (best w) | {(f1_xgb if use_xgb_for_hybrid else f1_mlp):.4f} |")
|
| 834 |
+
lines.append(f"| **LR combiner** | **{lr_combiner_f1:.4f}** |")
|
| 835 |
+
lines.append("")
|
| 836 |
+
lines.append("The app uses the saved LR combiner when `combiner_path` is set in `hybrid_focus_config.json`.")
|
| 837 |
+
lines.append("")
|
| 838 |
+
else:
|
| 839 |
+
if use_xgb_for_hybrid:
|
| 840 |
+
lines.append("The app uses **XGBoost + geometric** with the weights above.")
|
| 841 |
+
else:
|
| 842 |
+
lines.append("The app uses **MLP + geometric** with the weights above.")
|
| 843 |
+
lines.append("")
|
| 844 |
+
lines.append("## 5. Eye and Mouth Aspect Ratio Thresholds")
|
| 845 |
lines.append("")
|
| 846 |
lines.append("### EAR (Eye Aspect Ratio)")
|
| 847 |
lines.append("")
|
|
|
|
| 874 |
lines.append("")
|
| 875 |
lines.append("")
|
| 876 |
|
| 877 |
+
lines.append("## 10. Other Constants")
|
| 878 |
lines.append("")
|
| 879 |
lines.append("| Constant | Value | Rationale |")
|
| 880 |
lines.append("|----------|------:|-----------|")
|
|
|
|
| 901 |
print(f"\nReport written to {REPORT_PATH}")
|
| 902 |
|
| 903 |
|
| 904 |
+
def write_hybrid_config(use_xgb, best_w_mlp, best_w_xgb, config_path,
|
| 905 |
+
combiner_path=None, combiner_threshold=None):
|
| 906 |
+
"""Write hybrid_focus_config.json. If combiner_path set, app uses LR combiner instead of heuristic weights."""
|
| 907 |
+
import json
|
| 908 |
+
if use_xgb:
|
| 909 |
+
w_xgb = round(float(best_w_xgb), 2)
|
| 910 |
+
w_geo = round(1.0 - best_w_xgb, 2)
|
| 911 |
+
w_mlp = 0.3
|
| 912 |
+
else:
|
| 913 |
+
w_mlp = round(float(best_w_mlp), 2)
|
| 914 |
+
w_geo = round(1.0 - best_w_mlp, 2)
|
| 915 |
+
w_xgb = 0.0
|
| 916 |
+
cfg = {
|
| 917 |
+
"use_xgb": bool(use_xgb),
|
| 918 |
+
"w_mlp": w_mlp,
|
| 919 |
+
"w_xgb": w_xgb,
|
| 920 |
+
"w_geo": w_geo,
|
| 921 |
+
"threshold": float(combiner_threshold) if combiner_threshold is not None else 0.35,
|
| 922 |
+
"use_yawn_veto": True,
|
| 923 |
+
"geo_face_weight": 0.7,
|
| 924 |
+
"geo_eye_weight": 0.3,
|
| 925 |
+
"mar_yawn_threshold": 0.55,
|
| 926 |
+
"metric": "f1",
|
| 927 |
+
}
|
| 928 |
+
if combiner_path:
|
| 929 |
+
cfg["combiner"] = "logistic"
|
| 930 |
+
cfg["combiner_path"] = os.path.normpath(combiner_path)
|
| 931 |
+
with open(config_path, "w", encoding="utf-8") as f:
|
| 932 |
+
json.dump(cfg, f, indent=2)
|
| 933 |
+
print(f" Written {config_path} (use_xgb={cfg['use_xgb']}, combiner={cfg.get('combiner', 'heuristic')})")
|
| 934 |
+
|
| 935 |
+
|
| 936 |
def main():
|
| 937 |
os.makedirs(PLOTS_DIR, exist_ok=True)
|
| 938 |
|
| 939 |
lopo_results = run_lopo_models()
|
| 940 |
model_stats = analyse_model_thresholds(lopo_results)
|
| 941 |
+
extended_stats = analyse_precision_recall_confusion(lopo_results, model_stats)
|
| 942 |
+
plot_confusion_matrices(extended_stats)
|
| 943 |
geo_f1, best_alpha = run_geo_weight_search()
|
| 944 |
+
hybrid_mlp_f1, best_w_mlp = run_hybrid_weight_search(lopo_results)
|
| 945 |
+
hybrid_xgb_f1, best_w_xgb = run_hybrid_xgb_weight_search(lopo_results)
|
| 946 |
dist_stats = plot_distributions()
|
| 947 |
|
| 948 |
+
f1_mlp = hybrid_mlp_f1[best_w_mlp]
|
| 949 |
+
f1_xgb = hybrid_xgb_f1[best_w_xgb]
|
| 950 |
+
use_xgb_for_hybrid = f1_xgb > f1_mlp
|
| 951 |
+
print(f"\n Hybrid comparison: MLP F1 = {f1_mlp:.4f}, XGBoost F1 = {f1_xgb:.4f} → "
|
| 952 |
+
f"use {'XGBoost' if use_xgb_for_hybrid else 'MLP'}")
|
| 953 |
+
|
| 954 |
+
lr_combiner_f1 = run_hybrid_lr_combiner(lopo_results, use_xgb=use_xgb_for_hybrid)
|
| 955 |
+
combiner_threshold, combiner_path = train_and_save_hybrid_combiner(
|
| 956 |
+
lopo_results, use_xgb_for_hybrid,
|
| 957 |
+
combiner_path=os.path.join(_PROJECT_ROOT, "checkpoints", "hybrid_combiner.joblib"),
|
| 958 |
+
)
|
| 959 |
+
|
| 960 |
+
config_path = os.path.join(_PROJECT_ROOT, "checkpoints", "hybrid_focus_config.json")
|
| 961 |
+
write_hybrid_config(use_xgb_for_hybrid, best_w_mlp, best_w_xgb, config_path,
|
| 962 |
+
combiner_path=combiner_path, combiner_threshold=combiner_threshold)
|
| 963 |
+
|
| 964 |
+
write_report(model_stats, extended_stats, geo_f1, best_alpha,
|
| 965 |
+
hybrid_mlp_f1, best_w_mlp,
|
| 966 |
+
hybrid_xgb_f1, best_w_xgb,
|
| 967 |
+
use_xgb_for_hybrid, dist_stats,
|
| 968 |
+
lr_combiner_f1=lr_combiner_f1)
|
| 969 |
print("\nDone.")
|
| 970 |
|
| 971 |
|
evaluation/plots/confusion_matrix_mlp.png
ADDED
|
evaluation/plots/confusion_matrix_xgb.png
ADDED
|
evaluation/plots/hybrid_xgb_weight_search.png
ADDED
|
models/mlp/train.py
CHANGED
|
@@ -1,14 +1,15 @@
|
|
| 1 |
import json
|
| 2 |
-
import os
|
| 3 |
import random
|
| 4 |
|
| 5 |
import numpy as np
|
|
|
|
| 6 |
import torch
|
| 7 |
import torch.nn as nn
|
| 8 |
import torch.optim as optim
|
| 9 |
from sklearn.metrics import f1_score, roc_auc_score
|
| 10 |
|
| 11 |
-
from data_preparation.prepare_dataset import get_dataloaders
|
| 12 |
|
| 13 |
USE_CLEARML = False
|
| 14 |
|
|
@@ -227,6 +228,13 @@ def main():
|
|
| 227 |
|
| 228 |
print(f"[LOG] Training history saved to: {log_path}")
|
| 229 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 230 |
|
| 231 |
if __name__ == "__main__":
|
| 232 |
main()
|
|
|
|
| 1 |
import json
|
| 2 |
+
import os
|
| 3 |
import random
|
| 4 |
|
| 5 |
import numpy as np
|
| 6 |
+
import joblib
|
| 7 |
import torch
|
| 8 |
import torch.nn as nn
|
| 9 |
import torch.optim as optim
|
| 10 |
from sklearn.metrics import f1_score, roc_auc_score
|
| 11 |
|
| 12 |
+
from data_preparation.prepare_dataset import get_dataloaders, SELECTED_FEATURES
|
| 13 |
|
| 14 |
USE_CLEARML = False
|
| 15 |
|
|
|
|
| 228 |
|
| 229 |
print(f"[LOG] Training history saved to: {log_path}")
|
| 230 |
|
| 231 |
+
# Save scaler and feature names for inference (ui/pipeline.py)
|
| 232 |
+
scaler_path = os.path.join(ckpt_dir, "scaler_mlp.joblib")
|
| 233 |
+
joblib.dump(scaler, scaler_path)
|
| 234 |
+
meta_path = os.path.join(ckpt_dir, "meta_mlp.npz")
|
| 235 |
+
np.savez(meta_path, feature_names=np.array(SELECTED_FEATURES["face_orientation"]))
|
| 236 |
+
print(f"[LOG] Scaler and meta saved to {ckpt_dir}")
|
| 237 |
+
|
| 238 |
|
| 239 |
if __name__ == "__main__":
|
| 240 |
main()
|
requirements.txt
CHANGED
|
@@ -8,6 +8,7 @@ opencv-contrib-python>=4.8.0
|
|
| 8 |
numpy>=1.24.0
|
| 9 |
scikit-learn>=1.2.0
|
| 10 |
joblib>=1.2.0
|
|
|
|
| 11 |
fastapi>=0.104.0
|
| 12 |
uvicorn[standard]>=0.24.0
|
| 13 |
aiosqlite>=0.19.0
|
|
|
|
| 8 |
numpy>=1.24.0
|
| 9 |
scikit-learn>=1.2.0
|
| 10 |
joblib>=1.2.0
|
| 11 |
+
torch>=2.0.0
|
| 12 |
fastapi>=0.104.0
|
| 13 |
uvicorn[standard]>=0.24.0
|
| 14 |
aiosqlite>=0.19.0
|
ui/README.md
CHANGED
|
@@ -14,7 +14,7 @@ Live camera demo and real-time inference pipeline.
|
|
| 14 |
| Pipeline | Features | Model | Source |
|
| 15 |
|----------|----------|-------|--------|
|
| 16 |
| `FaceMeshPipeline` | Head pose + eye geometry | Rule-based fusion | `models/head_pose.py`, `models/eye_scorer.py` |
|
| 17 |
-
| `MLPPipeline` | 10 selected features | PyTorch MLP | `checkpoints/
|
| 18 |
| `XGBoostPipeline` | 10 selected features | XGBoost | `models/xgboost/checkpoints/face_orientation_best.json` |
|
| 19 |
|
| 20 |
## 3. Running
|
|
|
|
| 14 |
| Pipeline | Features | Model | Source |
|
| 15 |
|----------|----------|-------|--------|
|
| 16 |
| `FaceMeshPipeline` | Head pose + eye geometry | Rule-based fusion | `models/head_pose.py`, `models/eye_scorer.py` |
|
| 17 |
+
| `MLPPipeline` | 10 selected features | PyTorch MLP (10→64→32→2) | `checkpoints/mlp_best.pt` + `scaler_mlp.joblib` |
|
| 18 |
| `XGBoostPipeline` | 10 selected features | XGBoost | `models/xgboost/checkpoints/face_orientation_best.json` |
|
| 19 |
|
| 20 |
## 3. Running
|
ui/live_demo.py
CHANGED
|
@@ -13,7 +13,7 @@ if _PROJECT_ROOT not in sys.path:
|
|
| 13 |
|
| 14 |
from ui.pipeline import (
|
| 15 |
FaceMeshPipeline, MLPPipeline, HybridFocusPipeline,
|
| 16 |
-
XGBoostPipeline,
|
| 17 |
)
|
| 18 |
from models.face_mesh import FaceMeshDetector
|
| 19 |
|
|
@@ -149,16 +149,15 @@ def main():
|
|
| 149 |
)
|
| 150 |
available_modes.append(MODE_GEO)
|
| 151 |
|
| 152 |
-
# 2. MLP & Hybrid
|
| 153 |
-
|
| 154 |
-
if
|
| 155 |
-
# Fallback to MLP/models
|
| 156 |
alt_dir = os.path.join(_PROJECT_ROOT, "MLP", "models")
|
| 157 |
-
|
| 158 |
-
if mlp_path:
|
| 159 |
model_dir = alt_dir
|
|
|
|
| 160 |
|
| 161 |
-
if
|
| 162 |
try:
|
| 163 |
pipelines[MODE_MLP] = MLPPipeline(model_dir=model_dir, detector=detector)
|
| 164 |
available_modes.append(MODE_MLP)
|
|
|
|
| 13 |
|
| 14 |
from ui.pipeline import (
|
| 15 |
FaceMeshPipeline, MLPPipeline, HybridFocusPipeline,
|
| 16 |
+
XGBoostPipeline, _mlp_artifacts_available,
|
| 17 |
)
|
| 18 |
from models.face_mesh import FaceMeshDetector
|
| 19 |
|
|
|
|
| 149 |
)
|
| 150 |
available_modes.append(MODE_GEO)
|
| 151 |
|
| 152 |
+
# 2. MLP & Hybrid (PyTorch MLP from mlp_best.pt + scaler_mlp.joblib)
|
| 153 |
+
mlp_available = _mlp_artifacts_available(model_dir)
|
| 154 |
+
if not mlp_available and not args.mlp_dir:
|
|
|
|
| 155 |
alt_dir = os.path.join(_PROJECT_ROOT, "MLP", "models")
|
| 156 |
+
if _mlp_artifacts_available(alt_dir):
|
|
|
|
| 157 |
model_dir = alt_dir
|
| 158 |
+
mlp_available = True
|
| 159 |
|
| 160 |
+
if mlp_available:
|
| 161 |
try:
|
| 162 |
pipelines[MODE_MLP] = MLPPipeline(model_dir=model_dir, detector=detector)
|
| 163 |
available_modes.append(MODE_MLP)
|
ui/pipeline.py
CHANGED
|
@@ -7,6 +7,8 @@ import sys
|
|
| 7 |
|
| 8 |
import numpy as np
|
| 9 |
import joblib
|
|
|
|
|
|
|
| 10 |
|
| 11 |
_PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
|
| 12 |
if _PROJECT_ROOT not in sys.path:
|
|
@@ -72,13 +74,17 @@ class _OutputSmoother:
|
|
| 72 |
|
| 73 |
|
| 74 |
DEFAULT_HYBRID_CONFIG = {
|
|
|
|
| 75 |
"w_mlp": 0.3,
|
|
|
|
| 76 |
"w_geo": 0.7,
|
| 77 |
"threshold": 0.35,
|
| 78 |
"use_yawn_veto": True,
|
| 79 |
"geo_face_weight": 0.7,
|
| 80 |
"geo_eye_weight": 0.3,
|
| 81 |
"mar_yawn_threshold": float(MAR_YAWN_THRESHOLD),
|
|
|
|
|
|
|
| 82 |
}
|
| 83 |
|
| 84 |
|
|
@@ -237,23 +243,45 @@ class FaceMeshPipeline:
|
|
| 237 |
self.close()
|
| 238 |
|
| 239 |
|
| 240 |
-
|
| 241 |
-
|
| 242 |
-
|
| 243 |
-
|
| 244 |
-
|
| 245 |
-
|
| 246 |
-
|
| 247 |
-
|
| 248 |
-
|
| 249 |
-
|
| 250 |
-
|
| 251 |
-
|
| 252 |
-
|
| 253 |
-
|
| 254 |
-
|
| 255 |
-
|
| 256 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 257 |
|
| 258 |
|
| 259 |
def _load_hybrid_config(model_dir: str, config_path: str | None = None):
|
|
@@ -270,18 +298,29 @@ def _load_hybrid_config(model_dir: str, config_path: str | None = None):
|
|
| 270 |
if key in file_cfg:
|
| 271 |
cfg[key] = file_cfg[key]
|
| 272 |
|
| 273 |
-
cfg["
|
|
|
|
|
|
|
| 274 |
cfg["w_geo"] = float(cfg["w_geo"])
|
| 275 |
-
|
| 276 |
-
|
| 277 |
-
|
| 278 |
-
|
| 279 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 280 |
cfg["threshold"] = float(cfg["threshold"])
|
| 281 |
cfg["use_yawn_veto"] = bool(cfg["use_yawn_veto"])
|
| 282 |
cfg["geo_face_weight"] = float(cfg["geo_face_weight"])
|
| 283 |
cfg["geo_eye_weight"] = float(cfg["geo_eye_weight"])
|
| 284 |
cfg["mar_yawn_threshold"] = float(cfg["mar_yawn_threshold"])
|
|
|
|
|
|
|
| 285 |
|
| 286 |
print(f"[HYBRID] Loaded config: {resolved}")
|
| 287 |
return cfg, resolved
|
|
@@ -290,18 +329,11 @@ def _load_hybrid_config(model_dir: str, config_path: str | None = None):
|
|
| 290 |
class MLPPipeline:
|
| 291 |
def __init__(self, model_dir=None, detector=None, threshold=0.23):
|
| 292 |
if model_dir is None:
|
| 293 |
-
# Check primary location
|
| 294 |
model_dir = os.path.join(_PROJECT_ROOT, "MLP", "models")
|
| 295 |
if not os.path.exists(model_dir):
|
| 296 |
model_dir = os.path.join(_PROJECT_ROOT, "checkpoints")
|
| 297 |
|
| 298 |
-
|
| 299 |
-
if mlp_path is None:
|
| 300 |
-
raise FileNotFoundError(f"No MLP artifacts in {model_dir}")
|
| 301 |
-
self._mlp = joblib.load(mlp_path)
|
| 302 |
-
self._scaler = joblib.load(scaler_path)
|
| 303 |
-
meta = np.load(meta_path, allow_pickle=True)
|
| 304 |
-
self._feature_names = list(meta["feature_names"])
|
| 305 |
self._indices = [FEATURE_NAMES.index(n) for n in self._feature_names]
|
| 306 |
|
| 307 |
self._detector = detector or FaceMeshDetector()
|
|
@@ -312,7 +344,7 @@ class MLPPipeline:
|
|
| 312 |
self._temporal = TemporalTracker()
|
| 313 |
self._smoother = _OutputSmoother()
|
| 314 |
self._threshold = threshold
|
| 315 |
-
print(f"[MLP] Loaded {
|
| 316 |
|
| 317 |
def process_frame(self, bgr_frame):
|
| 318 |
landmarks = self._detector.process(bgr_frame)
|
|
@@ -344,12 +376,13 @@ class MLPPipeline:
|
|
| 344 |
out["s_eye"] = float(vec[_FEAT_IDX["s_eye"]])
|
| 345 |
out["mar"] = float(vec[_FEAT_IDX["mar"]])
|
| 346 |
|
| 347 |
-
X = vec[self._indices].reshape(1, -1).astype(np.
|
| 348 |
X_sc = self._scaler.transform(X)
|
| 349 |
-
|
| 350 |
-
|
| 351 |
-
|
| 352 |
-
|
|
|
|
| 353 |
out["mlp_prob"] = float(np.clip(mlp_prob, 0.0, 1.0))
|
| 354 |
out["raw_score"] = self._smoother.update(out["mlp_prob"], True)
|
| 355 |
out["is_focused"] = out["raw_score"] >= self._threshold
|
|
@@ -370,6 +403,13 @@ class MLPPipeline:
|
|
| 370 |
self.close()
|
| 371 |
|
| 372 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 373 |
class HybridFocusPipeline:
|
| 374 |
def __init__(
|
| 375 |
self,
|
|
@@ -380,17 +420,8 @@ class HybridFocusPipeline:
|
|
| 380 |
):
|
| 381 |
if model_dir is None:
|
| 382 |
model_dir = os.path.join(_PROJECT_ROOT, "checkpoints")
|
| 383 |
-
mlp_path, scaler_path, meta_path = _latest_model_artifacts(model_dir)
|
| 384 |
-
if mlp_path is None:
|
| 385 |
-
raise FileNotFoundError(f"No MLP artifacts in {model_dir}")
|
| 386 |
-
|
| 387 |
-
self._mlp = joblib.load(mlp_path)
|
| 388 |
-
self._scaler = joblib.load(scaler_path)
|
| 389 |
-
meta = np.load(meta_path, allow_pickle=True)
|
| 390 |
-
self._feature_names = list(meta["feature_names"])
|
| 391 |
-
self._indices = [FEATURE_NAMES.index(n) for n in self._feature_names]
|
| 392 |
-
|
| 393 |
self._cfg, self._cfg_path = _load_hybrid_config(model_dir=model_dir, config_path=config_path)
|
|
|
|
| 394 |
|
| 395 |
self._detector = detector or FaceMeshDetector()
|
| 396 |
self._owns_detector = detector is None
|
|
@@ -400,11 +431,41 @@ class HybridFocusPipeline:
|
|
| 400 |
self.head_pose = self._head_pose
|
| 401 |
self._smoother = _OutputSmoother()
|
| 402 |
|
| 403 |
-
|
| 404 |
-
|
| 405 |
-
|
| 406 |
-
|
| 407 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 408 |
|
| 409 |
@property
|
| 410 |
def config(self) -> dict:
|
|
@@ -465,15 +526,32 @@ class HybridFocusPipeline:
|
|
| 465 |
}
|
| 466 |
vec = extract_features(landmarks, w, h, self._head_pose, self._eye_scorer, self._temporal, _pre=pre)
|
| 467 |
vec = _clip_features(vec)
|
| 468 |
-
|
| 469 |
-
|
| 470 |
-
|
| 471 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 472 |
else:
|
| 473 |
-
|
| 474 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 475 |
|
| 476 |
-
focus_score = self._cfg["w_mlp"] * out["mlp_prob"] + self._cfg["w_geo"] * out["geo_score"]
|
| 477 |
out["focus_score"] = self._smoother.update(float(np.clip(focus_score, 0.0, 1.0)), True)
|
| 478 |
out["raw_score"] = out["focus_score"]
|
| 479 |
out["is_focused"] = out["focus_score"] >= self._cfg["threshold"]
|
|
|
|
| 7 |
|
| 8 |
import numpy as np
|
| 9 |
import joblib
|
| 10 |
+
import torch
|
| 11 |
+
import torch.nn as nn
|
| 12 |
|
| 13 |
_PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
|
| 14 |
if _PROJECT_ROOT not in sys.path:
|
|
|
|
| 74 |
|
| 75 |
|
| 76 |
DEFAULT_HYBRID_CONFIG = {
|
| 77 |
+
"use_xgb": False,
|
| 78 |
"w_mlp": 0.3,
|
| 79 |
+
"w_xgb": 0.0,
|
| 80 |
"w_geo": 0.7,
|
| 81 |
"threshold": 0.35,
|
| 82 |
"use_yawn_veto": True,
|
| 83 |
"geo_face_weight": 0.7,
|
| 84 |
"geo_eye_weight": 0.3,
|
| 85 |
"mar_yawn_threshold": float(MAR_YAWN_THRESHOLD),
|
| 86 |
+
"combiner": None,
|
| 87 |
+
"combiner_path": None,
|
| 88 |
}
|
| 89 |
|
| 90 |
|
|
|
|
| 243 |
self.close()
|
| 244 |
|
| 245 |
|
| 246 |
+
# PyTorch MLP matching models/mlp/train.py BaseModel (10 -> 64 -> 32 -> 2)
|
| 247 |
+
class _FocusMLP(nn.Module):
|
| 248 |
+
def __init__(self, num_features: int, num_classes: int = 2):
|
| 249 |
+
super().__init__()
|
| 250 |
+
self.network = nn.Sequential(
|
| 251 |
+
nn.Linear(num_features, 64),
|
| 252 |
+
nn.ReLU(),
|
| 253 |
+
nn.Linear(64, 32),
|
| 254 |
+
nn.ReLU(),
|
| 255 |
+
nn.Linear(32, num_classes),
|
| 256 |
+
)
|
| 257 |
+
|
| 258 |
+
def forward(self, x):
|
| 259 |
+
return self.network(x)
|
| 260 |
+
|
| 261 |
+
|
| 262 |
+
def _mlp_artifacts_available(model_dir: str) -> bool:
|
| 263 |
+
pt_path = os.path.join(model_dir, "mlp_best.pt")
|
| 264 |
+
scaler_path = os.path.join(model_dir, "scaler_mlp.joblib")
|
| 265 |
+
return os.path.isfile(pt_path) and os.path.isfile(scaler_path)
|
| 266 |
+
|
| 267 |
+
|
| 268 |
+
def _load_mlp_artifacts(model_dir: str):
|
| 269 |
+
"""Load PyTorch MLP + scaler from checkpoints. Returns (model, scaler, feature_names)."""
|
| 270 |
+
pt_path = os.path.join(model_dir, "mlp_best.pt")
|
| 271 |
+
scaler_path = os.path.join(model_dir, "scaler_mlp.joblib")
|
| 272 |
+
if not os.path.isfile(pt_path):
|
| 273 |
+
raise FileNotFoundError(f"No MLP checkpoint at {pt_path}")
|
| 274 |
+
if not os.path.isfile(scaler_path):
|
| 275 |
+
raise FileNotFoundError(f"No scaler at {scaler_path}")
|
| 276 |
+
|
| 277 |
+
num_features = len(MLP_FEATURE_NAMES)
|
| 278 |
+
num_classes = 2
|
| 279 |
+
model = _FocusMLP(num_features, num_classes)
|
| 280 |
+
model.load_state_dict(torch.load(pt_path, map_location="cpu", weights_only=True))
|
| 281 |
+
model.eval()
|
| 282 |
+
|
| 283 |
+
scaler = joblib.load(scaler_path)
|
| 284 |
+
return model, scaler, list(MLP_FEATURE_NAMES)
|
| 285 |
|
| 286 |
|
| 287 |
def _load_hybrid_config(model_dir: str, config_path: str | None = None):
|
|
|
|
| 298 |
if key in file_cfg:
|
| 299 |
cfg[key] = file_cfg[key]
|
| 300 |
|
| 301 |
+
cfg["use_xgb"] = bool(cfg.get("use_xgb", False))
|
| 302 |
+
cfg["w_mlp"] = float(cfg.get("w_mlp", 0.3))
|
| 303 |
+
cfg["w_xgb"] = float(cfg.get("w_xgb", 0.0))
|
| 304 |
cfg["w_geo"] = float(cfg["w_geo"])
|
| 305 |
+
if cfg["use_xgb"]:
|
| 306 |
+
weight_sum = cfg["w_xgb"] + cfg["w_geo"]
|
| 307 |
+
if weight_sum <= 0:
|
| 308 |
+
raise ValueError("[HYBRID] Invalid config: w_xgb + w_geo must be > 0")
|
| 309 |
+
cfg["w_xgb"] /= weight_sum
|
| 310 |
+
cfg["w_geo"] /= weight_sum
|
| 311 |
+
else:
|
| 312 |
+
weight_sum = cfg["w_mlp"] + cfg["w_geo"]
|
| 313 |
+
if weight_sum <= 0:
|
| 314 |
+
raise ValueError("[HYBRID] Invalid config: w_mlp + w_geo must be > 0")
|
| 315 |
+
cfg["w_mlp"] /= weight_sum
|
| 316 |
+
cfg["w_geo"] /= weight_sum
|
| 317 |
cfg["threshold"] = float(cfg["threshold"])
|
| 318 |
cfg["use_yawn_veto"] = bool(cfg["use_yawn_veto"])
|
| 319 |
cfg["geo_face_weight"] = float(cfg["geo_face_weight"])
|
| 320 |
cfg["geo_eye_weight"] = float(cfg["geo_eye_weight"])
|
| 321 |
cfg["mar_yawn_threshold"] = float(cfg["mar_yawn_threshold"])
|
| 322 |
+
cfg["combiner"] = cfg.get("combiner") or None
|
| 323 |
+
cfg["combiner_path"] = cfg.get("combiner_path") or None
|
| 324 |
|
| 325 |
print(f"[HYBRID] Loaded config: {resolved}")
|
| 326 |
return cfg, resolved
|
|
|
|
| 329 |
class MLPPipeline:
|
| 330 |
def __init__(self, model_dir=None, detector=None, threshold=0.23):
|
| 331 |
if model_dir is None:
|
|
|
|
| 332 |
model_dir = os.path.join(_PROJECT_ROOT, "MLP", "models")
|
| 333 |
if not os.path.exists(model_dir):
|
| 334 |
model_dir = os.path.join(_PROJECT_ROOT, "checkpoints")
|
| 335 |
|
| 336 |
+
self._mlp, self._scaler, self._feature_names = _load_mlp_artifacts(model_dir)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 337 |
self._indices = [FEATURE_NAMES.index(n) for n in self._feature_names]
|
| 338 |
|
| 339 |
self._detector = detector or FaceMeshDetector()
|
|
|
|
| 344 |
self._temporal = TemporalTracker()
|
| 345 |
self._smoother = _OutputSmoother()
|
| 346 |
self._threshold = threshold
|
| 347 |
+
print(f"[MLP] Loaded PyTorch MLP from {model_dir} | {len(self._feature_names)} features | threshold={threshold}")
|
| 348 |
|
| 349 |
def process_frame(self, bgr_frame):
|
| 350 |
landmarks = self._detector.process(bgr_frame)
|
|
|
|
| 376 |
out["s_eye"] = float(vec[_FEAT_IDX["s_eye"]])
|
| 377 |
out["mar"] = float(vec[_FEAT_IDX["mar"]])
|
| 378 |
|
| 379 |
+
X = vec[self._indices].reshape(1, -1).astype(np.float32)
|
| 380 |
X_sc = self._scaler.transform(X)
|
| 381 |
+
with torch.no_grad():
|
| 382 |
+
x_t = torch.from_numpy(X_sc).float()
|
| 383 |
+
logits = self._mlp(x_t)
|
| 384 |
+
probs = torch.softmax(logits, dim=1)
|
| 385 |
+
mlp_prob = float(probs[0, 1])
|
| 386 |
out["mlp_prob"] = float(np.clip(mlp_prob, 0.0, 1.0))
|
| 387 |
out["raw_score"] = self._smoother.update(out["mlp_prob"], True)
|
| 388 |
out["is_focused"] = out["raw_score"] >= self._threshold
|
|
|
|
| 403 |
self.close()
|
| 404 |
|
| 405 |
|
| 406 |
+
def _resolve_xgb_path():
|
| 407 |
+
p = os.path.join(_PROJECT_ROOT, "models", "xgboost", "checkpoints", "face_orientation_best.json")
|
| 408 |
+
if os.path.isfile(p):
|
| 409 |
+
return p
|
| 410 |
+
return os.path.join(_PROJECT_ROOT, "checkpoints", "xgboost_face_orientation_best.json")
|
| 411 |
+
|
| 412 |
+
|
| 413 |
class HybridFocusPipeline:
|
| 414 |
def __init__(
|
| 415 |
self,
|
|
|
|
| 420 |
):
|
| 421 |
if model_dir is None:
|
| 422 |
model_dir = os.path.join(_PROJECT_ROOT, "checkpoints")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 423 |
self._cfg, self._cfg_path = _load_hybrid_config(model_dir=model_dir, config_path=config_path)
|
| 424 |
+
self._use_xgb = self._cfg["use_xgb"]
|
| 425 |
|
| 426 |
self._detector = detector or FaceMeshDetector()
|
| 427 |
self._owns_detector = detector is None
|
|
|
|
| 431 |
self.head_pose = self._head_pose
|
| 432 |
self._smoother = _OutputSmoother()
|
| 433 |
|
| 434 |
+
self._combiner = None
|
| 435 |
+
combiner_path = self._cfg.get("combiner_path")
|
| 436 |
+
if combiner_path and self._cfg.get("combiner") == "logistic":
|
| 437 |
+
resolved_combiner = combiner_path if os.path.isabs(combiner_path) else os.path.join(model_dir, combiner_path)
|
| 438 |
+
if not os.path.isfile(resolved_combiner):
|
| 439 |
+
resolved_combiner = os.path.join(_PROJECT_ROOT, combiner_path)
|
| 440 |
+
if os.path.isfile(resolved_combiner):
|
| 441 |
+
blob = joblib.load(resolved_combiner)
|
| 442 |
+
self._combiner = blob.get("combiner")
|
| 443 |
+
if self._combiner is None:
|
| 444 |
+
self._combiner = blob
|
| 445 |
+
print(f"[HYBRID] LR combiner loaded from {resolved_combiner}")
|
| 446 |
+
else:
|
| 447 |
+
print(f"[HYBRID] combiner_path not found: {resolved_combiner}, using heuristic weights")
|
| 448 |
+
if self._use_xgb:
|
| 449 |
+
from xgboost import XGBClassifier
|
| 450 |
+
xgb_path = _resolve_xgb_path()
|
| 451 |
+
if not os.path.isfile(xgb_path):
|
| 452 |
+
raise FileNotFoundError(f"No XGBoost checkpoint at {xgb_path}")
|
| 453 |
+
self._xgb_model = XGBClassifier()
|
| 454 |
+
self._xgb_model.load_model(xgb_path)
|
| 455 |
+
self._xgb_indices = [FEATURE_NAMES.index(n) for n in XGBoostPipeline.SELECTED]
|
| 456 |
+
self._mlp = None
|
| 457 |
+
self._scaler = None
|
| 458 |
+
self._indices = None
|
| 459 |
+
self._feature_names = list(XGBoostPipeline.SELECTED)
|
| 460 |
+
mode = "LR combiner" if self._combiner else f"w_xgb={self._cfg['w_xgb']:.2f}, w_geo={self._cfg['w_geo']:.2f}"
|
| 461 |
+
print(f"[HYBRID] XGBoost+geo | {xgb_path} | {mode}, threshold={self._cfg['threshold']:.2f}")
|
| 462 |
+
else:
|
| 463 |
+
self._mlp, self._scaler, self._feature_names = _load_mlp_artifacts(model_dir)
|
| 464 |
+
self._indices = [FEATURE_NAMES.index(n) for n in self._feature_names]
|
| 465 |
+
self._xgb_model = None
|
| 466 |
+
self._xgb_indices = None
|
| 467 |
+
mode = "LR combiner" if self._combiner else f"w_mlp={self._cfg['w_mlp']:.2f}, w_geo={self._cfg['w_geo']:.2f}"
|
| 468 |
+
print(f"[HYBRID] MLP+geo | {len(self._feature_names)} features | {mode}, threshold={self._cfg['threshold']:.2f}")
|
| 469 |
|
| 470 |
@property
|
| 471 |
def config(self) -> dict:
|
|
|
|
| 526 |
}
|
| 527 |
vec = extract_features(landmarks, w, h, self._head_pose, self._eye_scorer, self._temporal, _pre=pre)
|
| 528 |
vec = _clip_features(vec)
|
| 529 |
+
|
| 530 |
+
if self._use_xgb:
|
| 531 |
+
X = vec[self._xgb_indices].reshape(1, -1).astype(np.float32)
|
| 532 |
+
prob = self._xgb_model.predict_proba(X)[0]
|
| 533 |
+
model_prob = float(np.clip(prob[1], 0.0, 1.0))
|
| 534 |
+
out["mlp_prob"] = model_prob
|
| 535 |
+
if self._combiner is not None:
|
| 536 |
+
meta = np.array([[model_prob, out["geo_score"]]], dtype=np.float32)
|
| 537 |
+
focus_score = float(self._combiner.predict_proba(meta)[0, 1])
|
| 538 |
+
else:
|
| 539 |
+
focus_score = self._cfg["w_xgb"] * model_prob + self._cfg["w_geo"] * out["geo_score"]
|
| 540 |
else:
|
| 541 |
+
X = vec[self._indices].reshape(1, -1).astype(np.float32)
|
| 542 |
+
X_sc = self._scaler.transform(X)
|
| 543 |
+
with torch.no_grad():
|
| 544 |
+
x_t = torch.from_numpy(X_sc).float()
|
| 545 |
+
logits = self._mlp(x_t)
|
| 546 |
+
probs = torch.softmax(logits, dim=1)
|
| 547 |
+
mlp_prob = float(probs[0, 1])
|
| 548 |
+
out["mlp_prob"] = float(np.clip(mlp_prob, 0.0, 1.0))
|
| 549 |
+
if self._combiner is not None:
|
| 550 |
+
meta = np.array([[out["mlp_prob"], out["geo_score"]]], dtype=np.float32)
|
| 551 |
+
focus_score = float(self._combiner.predict_proba(meta)[0, 1])
|
| 552 |
+
else:
|
| 553 |
+
focus_score = self._cfg["w_mlp"] * out["mlp_prob"] + self._cfg["w_geo"] * out["geo_score"]
|
| 554 |
|
|
|
|
| 555 |
out["focus_score"] = self._smoother.update(float(np.clip(focus_score, 0.0, 1.0)), True)
|
| 556 |
out["raw_score"] = out["focus_score"]
|
| 557 |
out["is_focused"] = out["focus_score"] >= self._cfg["threshold"]
|