	# Feature selection justification

	The face_orientation model uses 10 of the 17 extracted features. This document summarises the empirical support for that selection.

	## 1. Domain rationale

	The 10 features were chosen to cover three channels:
	- Head pose: head_deviation, s_face, pitch
	- Eye state: ear_left, ear_right, ear_avg, perclos
	- Gaze: h_gaze, gaze_offset, s_eye

	The remaining 7 features are excluded: v_gaze (noisy), mar (rare events), yaw/roll (redundant with head_deviation/s_face), blink_rate/closure_duration/yawn_duration (temporal overlap with perclos).
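
The grouping above can be written down as a small configuration. This is a minimal sketch, not the project's actual code: the names `SELECTED`, `EXCLUDED`, `FEATURE_ORDER` and `select_features` are hypothetical, and the column order is illustrative (the order the trained model expects is not stated in this document).

```python
# Selected features, grouped by the three channels listed above.
SELECTED = {
    "head_pose": ["head_deviation", "s_face", "pitch"],
    "eye_state": ["ear_left", "ear_right", "ear_avg", "perclos"],
    "gaze": ["h_gaze", "gaze_offset", "s_eye"],
}

# Excluded features with the reason given in the text.
EXCLUDED = {
    "v_gaze": "noisy",
    "mar": "rare events",
    "yaw": "redundant with head_deviation/s_face",
    "roll": "redundant with head_deviation/s_face",
    "blink_rate": "temporal overlap with perclos",
    "closure_duration": "temporal overlap with perclos",
    "yawn_duration": "temporal overlap with perclos",
}

# Flat column order (assumption: illustrative only).
FEATURE_ORDER = [name for names in SELECTED.values() for name in names]


def select_features(row):
    """Project a mapping of all extracted features onto the selected columns."""
    missing = [name for name in FEATURE_ORDER if name not in row]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(row[name]) for name in FEATURE_ORDER]


# Sanity checks against the counts stated in the document: 10 of 17.
assert len(FEATURE_ORDER) == 10
assert len(FEATURE_ORDER) + len(EXCLUDED) == 17
```

Keeping the exclusion reasons next to the names makes it easier to revisit a decision later, for example if v_gaze becomes less noisy after a tracker change.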

	## 2. XGBoost feature importance (gain)

	From the trained XGBoost checkpoint (gain on the 10 features):

	| Feature | Gain |
	|---------|------|
	| head_deviation | 8.83 |
	| s_face | 10.27 |
	| s_eye | 2.18 |
	| h_gaze | 4.99 |
	| pitch | 4.64 |
	| ear_left | 3.57 |
	| ear_avg | 6.96 |
	| ear_right | 9.54 |
	| gaze_offset | 1.80 |
	| perclos | 5.68 |

	Top 5 by gain: s_face, ear_right, head_deviation, ear_avg, perclos.
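
The top-5 list can be reproduced from the table with a few lines of Python. This is a sketch under assumptions: the gain values are copied by hand from the table rather than read from the checkpoint, and `top_k` is a hypothetical helper. With the XGBoost Python package, a mapping like `GAIN` is typically obtained from `Booster.get_score(importance_type="gain")`.

```python
# Gain values copied from the table above (not loaded from the checkpoint).
GAIN = {
    "head_deviation": 8.83,
    "s_face": 10.27,
    "s_eye": 2.18,
    "h_gaze": 4.99,
    "pitch": 4.64,
    "ear_left": 3.57,
    "ear_avg": 6.96,
    "ear_right": 9.54,
    "gaze_offset": 1.80,
    "perclos": 5.68,
}


def top_k(scores, k):
    """Return the k feature names with the highest score, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


top5 = top_k(GAIN, 5)
print(top5)  # ['s_face', 'ear_right', 'head_deviation', 'ear_avg', 'perclos']

# Share of the summed gain carried by the top 5 features.
top5_share = sum(GAIN[name] for name in top5) / sum(GAIN.values())
print(f"top-5 share of total gain: {top5_share:.3f}")
```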

	## 3. Leave-one-feature-out ablation (mean LOPO F1)

	Baseline (all 10 features) mean LOPO F1: 0.8327.
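
For orientation, one way such an ablation could be run is sketched below. Several things here are assumptions rather than facts from this document: LOPO is read as leave-one-participant-out cross-validation, F1 is computed as binary F1 on the positive class, and `fit_predict` is a hypothetical callback standing in for the real train-and-predict step. The toy data and toy model exist only to make the sketch runnable.

```python
from statistics import mean


def lopo_splits(rows, group_key="person"):
    """Yield (train_rows, test_rows), holding out one group per fold."""
    for group in sorted({row[group_key] for row in rows}):
        train = [r for r in rows if r[group_key] != group]
        test = [r for r in rows if r[group_key] == group]
        yield train, test


def f1_binary(y_true, y_pred):
    """Binary F1 for the positive class (label 1); 0.0 if undefined."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_lopo_f1(rows, features, fit_predict):
    """Average the per-fold F1 over all held-out groups."""
    scores = []
    for train, test in lopo_splits(rows):
        y_pred = fit_predict(train, test, features)
        scores.append(f1_binary([r["label"] for r in test], y_pred))
    return mean(scores)


def ablate(rows, features, fit_predict):
    """Drop each feature in turn; return baseline and {feature: (f1, delta)}."""
    baseline = mean_lopo_f1(rows, features, fit_predict)
    results = {}
    for dropped in features:
        kept = [f for f in features if f != dropped]
        score = mean_lopo_f1(rows, kept, fit_predict)
        results[dropped] = (score, baseline - score)  # delta = baseline - ablated
    return baseline, results


def toy_fit_predict(train, test, features):
    """Toy stand-in model: thresholds feature 'a' if present, else predicts 0."""
    if "a" in features:
        return [1 if r["a"] > 0.5 else 0 for r in test]
    return [0 for _ in test]


# Toy data: feature 'a' equals the label, feature 'b' carries no information.
toy_rows = [
    {"person": person, "a": float(label), "b": 0.5, "label": label}
    for person in ("p1", "p2", "p3")
    for label in (0, 1)
]
baseline, results = ablate(toy_rows, ["a", "b"], toy_fit_predict)
print(baseline, results)
```

In the toy run, dropping the informative feature collapses the score while dropping the uninformative one changes nothing, which is the pattern the ablation is meant to expose.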

	| Feature dropped | Mean LOPO F1 | Δ (baseline - ablated) |
	|------------------|--------------|------------------------|
	| head_deviation | 0.8395 | -0.0068 |
	| s_face | 0.8390 | -0.0063 |
	| s_eye | 0.8342 | -0.0015 |
	| h_gaze | 0.8244 | +0.0083 |
	| pitch | 0.8250 | +0.0077 |
	| ear_left | 0.8326 | +0.0001 |
	| ear_avg | 0.8350 | -0.0023 |
	| ear_right | 0.8344 | -0.0017 |
	| gaze_offset | 0.8351 | -0.0024 |
	| perclos | 0.8258 | +0.0069 |

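	Δ is computed as baseline F1 minus ablated F1, so a positive Δ means that dropping the feature lowers F1. Dropping h_gaze hurts most (F1 = 0.8244, Δ = +0.0083), followed by pitch (+0.0077) and perclos (+0.0069). Dropping ear_left is effectively neutral (+0.0001), and dropping any of the other six features raises mean F1 slightly, by at most 0.0068 (head_deviation). All differences are below 0.01, so the ablation separates features only weakly. It also does not mirror the gain ranking: s_face and head_deviation have the highest gain but are not individually necessary, possibly because correlated features compensate when one is removed.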
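
The Δ column and the ordering of features by ablation impact can be checked directly from the reported F1 values. A minimal sketch, with the numbers copied by hand from the table and Δ taken as baseline minus ablated F1 (the convention the table's values follow):

```python
BASELINE_F1 = 0.8327

# Mean LOPO F1 after dropping each feature, copied from the table above.
ABLATED_F1 = {
    "head_deviation": 0.8395,
    "s_face": 0.8390,
    "s_eye": 0.8342,
    "h_gaze": 0.8244,
    "pitch": 0.8250,
    "ear_left": 0.8326,
    "ear_avg": 0.8350,
    "ear_right": 0.8344,
    "gaze_offset": 0.8351,
    "perclos": 0.8258,
}

# Positive delta: removing the feature lowers F1 (the feature helps).
delta = {name: round(BASELINE_F1 - f1, 4) for name, f1 in ABLATED_F1.items()}

# Features whose removal lowers F1, largest drop first.
hurts = sorted(
    (name for name, d in delta.items() if d > 0),
    key=lambda name: delta[name],
    reverse=True,
)
# Features whose removal leaves F1 unchanged or higher.
no_loss = [name for name, d in delta.items() if d <= 0]

print(hurts)    # ['h_gaze', 'pitch', 'perclos', 'ear_left']
print(no_loss)
```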

	## 4. Conclusion

	Selection is supported by (1) domain rationale (three attention channels), (2) XGBoost gain importance, and (3) leave-one-feature-out ablation, where the evidence is strongest for h_gaze, pitch, and perclos. SHAP or correlation-based pruning can be added in future work.
