Abdelrahman Almatrooshi

FocusGuard with L2CS-Net gaze estimation

7b53d75 5 days ago

title: FocusGuard
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false

FocusGuard

Webcam-based focus detection. A MediaPipe face mesh yields 17 features (EAR, gaze, head pose, PERCLOS, etc.), which an MLP or XGBoost model classifies as focused or unfocused. The app is a React frontend on a FastAPI backend, with video streamed over WebSocket.

Project layout

├── data/                 collected_<name>/*.npz
├── data_preparation/     loaders, split, scale
├── notebooks/            MLP/XGB training + LOPO
├── models/               face_mesh, head_pose, eye_scorer, train scripts
│   ├── gaze_calibration.py   9-point polynomial gaze calibration
│   ├── gaze_eye_fusion.py    Fuses calibrated gaze with eye openness
│   └── L2CS-Net/              In-tree L2CS-Net repo with Gaze360 weights
├── checkpoints/          mlp_best.pt, xgboost_*_best.json, scalers
├── evaluation/           logs, plots, justify_thresholds
├── ui/                   pipeline.py, live_demo.py
├── src/                  React frontend
│   ├── components/
│   │   ├── FocusPageLocal.jsx      Main focus page (camera, controls, model selector)
│   │   └── CalibrationOverlay.jsx  Fullscreen calibration UI
│   └── utils/
│       └── VideoManagerLocal.js    WebSocket client, frame capture, canvas rendering
├── static/               built frontend (after npm run build)
├── main.py, app.py       FastAPI backend
├── requirements.txt
└── package.json

Setup

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

To rebuild the frontend after changes:

npm install
npm run build
mkdir -p static && cp -r dist/* static/

Run

Web app: activate the venv and start uvicorn with python -m so that it runs under the venv's interpreter and sees the installed dependencies (a uvicorn from outside the venv fails with ModuleNotFoundError: aiosqlite):

source venv/bin/activate
python -m uvicorn main:app --host 0.0.0.0 --port 7860

Then open http://localhost:7860.

Frontend dev server (optional, for React development):

npm run dev

OpenCV demo:

python ui/live_demo.py
python ui/live_demo.py --xgb

Train:

python -m models.mlp.train
python -m models.xgboost.train

Data

The dataset covers 9 participants and 144,793 samples, each with 10 features and a binary label (focused/unfocused). Collect new data with python -m models.collect_features --name <name>. Recordings are stored as .npz files in data/collected_<name>/.
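Because the data is grouped by participant, evaluation can hold out one person at a time (the LOPO mentioned under notebooks/, presumably leave-one-participant-out). The following is a minimal sketch of how such folds could be generated, not the notebooks' actual code; the function name `lopo_splits` and the participant labels are hypothetical.

```python
from typing import Iterator, List, Tuple


def lopo_splits(participants: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (train_participants, held_out_participant) pairs.

    Each participant is held out exactly once, so no person's samples
    appear in both the training and the test set of a fold.
    """
    for held_out in participants:
        train = [p for p in participants if p != held_out]
        yield train, held_out


# Hypothetical labels, one per data/collected_<name>/ folder (9 participants).
names = [f"p{i}" for i in range(1, 10)]
folds = list(lopo_splits(names))

for train, held_out in folds:
    print(f"test on {held_out}, train on {len(train)} participants")
```

Splitting by participant rather than by sample keeps frames from the same person out of both sides of a fold, which a random split would not guarantee.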

Models

| Model | What it uses | Best for |
| --- | --- | --- |
| Geometric | Head pose angles + eye aspect ratio (EAR) | Fast, no ML needed |
| XGBoost | Trained classifier on head/eye features (600 trees, depth 8) | Balanced accuracy/speed |
| MLP | Neural network on same features (64->32) | Higher accuracy |
| Hybrid | Weighted MLP + Geometric ensemble | Best head-pose accuracy |
| L2CS | Deep gaze estimation (ResNet50, Gaze360 weights) | Detects eye-only gaze shifts |

Model results (15% test split)

| Model | Accuracy | F1 | ROC-AUC |
| --- | --- | --- | --- |
| XGBoost (600 trees, depth 8) | 95.87% | 0.959 | 0.991 |
| MLP (64->32) | 92.92% | 0.929 | 0.971 |

L2CS Gaze Tracking

L2CS-Net predicts where your eyes are looking, not just where your head is pointed. This catches the scenario where your head faces the screen but your eyes wander.

Standalone mode

Select L2CS as the model. It then produces the focus decision on its own, with no base model involved.

Boost mode

Select any other model, then click the GAZE toggle. L2CS runs alongside the base model:

Base model handles head pose and eye openness (35% weight)
L2CS handles gaze direction (65% weight)
If L2CS detects gaze is clearly off-screen, it vetoes the base model regardless of score
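The weighting and veto described above can be sketched as follows. This is an illustration only, not the code in gaze_eye_fusion.py: it assumes both scores lie in [0, 1] with higher meaning more focused, and the function name `is_focused`, the `gaze_off_screen` flag, and the 0.5 decision threshold are hypothetical.

```python
def is_focused(base_score: float, gaze_score: float, gaze_off_screen: bool,
               base_weight: float = 0.35, gaze_weight: float = 0.65,
               threshold: float = 0.5) -> bool:
    """Combine the base model and L2CS scores into one focus decision."""
    if gaze_off_screen:
        # Veto: a clearly off-screen gaze overrides the base model entirely.
        return False
    fused = base_weight * base_score + gaze_weight * gaze_score
    return fused >= threshold  # threshold is an assumed value


# Head faces the screen (high base score) but the eyes have drifted.
print(is_focused(0.9, 0.2, gaze_off_screen=False))
# Moderate base score, gaze firmly on screen.
print(is_focused(0.4, 0.8, gaze_off_screen=False))
# Both scores are perfect, but the veto still applies.
print(is_focused(1.0, 1.0, gaze_off_screen=True))
```

With a 65% gaze weight, a high head-pose score alone cannot carry the decision, which matches the eyes-wander scenario that L2CS is meant to catch.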

Calibration

After enabling L2CS or Gaze Boost, click Calibrate while a session is running:

A fullscreen overlay shows 9 target dots (3x3 grid)
Look at each dot as the progress ring fills
The first dot (centre) sets your baseline gaze offset
After all 9 points, a polynomial model maps your gaze angles to screen coordinates
A cyan tracking dot appears on the video showing where you're looking
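To make the last fitting step concrete, here is a minimal sketch of a 9-point polynomial fit from gaze angles to one screen axis. It is not the code in gaze_calibration.py: the second-order feature set, the least-squares solve via normal equations, the synthetic calibration data, and all function names are assumptions, and the baseline offset from the centre dot is left out for brevity.

```python
from typing import List, Sequence, Tuple


def poly_features(yaw: float, pitch: float) -> List[float]:
    """Second-order polynomial terms of the two gaze angles."""
    return [1.0, yaw, pitch, yaw * pitch, yaw * yaw, pitch * pitch]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_axis(angles: Sequence[Tuple[float, float]],
             targets: Sequence[float]) -> List[float]:
    """Least-squares fit of one screen axis via the normal equations."""
    rows = [poly_features(yaw, pitch) for yaw, pitch in angles]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, atb)


def predict(coeffs: Sequence[float], yaw: float, pitch: float) -> float:
    """Map one (yaw, pitch) pair to a screen coordinate."""
    return sum(c * f for c, f in zip(coeffs, poly_features(yaw, pitch)))


# Synthetic 3x3 calibration grid: angles in radians, screen coords in [0, 1].
grid = [(yaw, pitch) for pitch in (-0.2, 0.0, 0.2) for yaw in (-0.3, 0.0, 0.3)]
screen_x = [0.5 + 1.5 * yaw for yaw, _ in grid]
screen_y = [0.5 - 2.0 * pitch for _, pitch in grid]

coeffs_x = fit_axis(grid, screen_x)
coeffs_y = fit_axis(grid, screen_y)
print(predict(coeffs_x, 0.1, -0.05), predict(coeffs_y, 0.1, -0.05))
```

Nine points over-determine the six coefficients per axis, so the fit averages out some noise in the individual fixations instead of interpolating them exactly.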

Pipeline

Face mesh (MediaPipe 478 pts)
Head pose -> yaw, pitch, roll, scores, gaze offset
Eye scorer -> EAR, gaze ratio, MAR
Temporal -> PERCLOS, blink rate, yawn
10-d vector -> MLP or XGBoost -> focused / unfocused
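Two of the features named in the pipeline, EAR and PERCLOS, can be sketched as below. The EAR formula is the standard six-landmark definition. The landmark ordering, the 0.2 closed-eye threshold, the window length, and the `Perclos` class are assumptions for illustration, so the repository's eye_scorer may compute these differently.

```python
import math
from collections import deque
from typing import Deque, Sequence, Tuple

Point = Tuple[float, float]


def eye_aspect_ratio(eye: Sequence[Point]) -> float:
    """EAR from six eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal eye corners; (p2, p6) and (p3, p5)
    are the two upper/lower eyelid pairs.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


class Perclos:
    """Fraction of recent frames in which the eye counts as closed."""

    def __init__(self, window: int = 900, closed_below: float = 0.2) -> None:
        self.closed_below = closed_below  # assumed EAR threshold
        self.frames: Deque[bool] = deque(maxlen=window)  # sliding window

    def update(self, ear: float) -> float:
        self.frames.append(ear < self.closed_below)
        return sum(self.frames) / len(self.frames)


# Toy landmarks: a wide-open eye and a nearly closed one.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]

ear_open = eye_aspect_ratio(open_eye)
tracker = Perclos(window=4)
history = [tracker.update(eye_aspect_ratio(e))
           for e in (open_eye, open_eye, closed_eye, closed_eye)]
print(ear_open, history)
```

EAR is a per-frame measurement, while PERCLOS aggregates it over time, which is why the pipeline lists PERCLOS under the temporal stage.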

Stack: FastAPI, aiosqlite, React/Vite, PyTorch, XGBoost, MediaPipe, OpenCV, L2CS-Net.