---
title: FocusGuard
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---

# FocusGuard - Real-Time Focus Detection

A web app that monitors whether you're focused on your screen using your webcam. It combines head pose estimation, eye behaviour analysis, and deep learning gaze tracking to detect attention in real time.

## How It Works

  1. Open the app and click Start - your webcam feed appears with a face mesh overlay.
  2. Pick a model from the selector bar (Geometric, XGBoost, L2CS, etc.).
  3. The system analyses each frame and shows FOCUSED or NOT FOCUSED with a confidence score.
  4. A timeline tracks your focus over time. Session history is saved for review.
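The per-frame loop can be sketched as follows. This is illustrative only: `classify` stands in for whichever pipeline is selected, and the 0.5 decision threshold is an assumed value, not necessarily what the app uses.

```python
from dataclasses import dataclass

@dataclass
class FocusResult:
    focused: bool
    confidence: float  # 0.0 to 1.0

def classify(score: float, threshold: float = 0.5) -> FocusResult:
    """Map a pipeline score in [0, 1] to a FOCUSED / NOT FOCUSED label.

    Hypothetical sketch: `score` is whatever the selected model emits
    for the current frame; `threshold` is an assumed cut-off.
    """
    focused = score >= threshold
    # Confidence grows with distance from the decision boundary.
    confidence = abs(score - threshold) / max(threshold, 1 - threshold)
    return FocusResult(focused=focused, confidence=confidence)
```

A frame scoring 0.9 would be labelled FOCUSED with high confidence, while one near the threshold would be labelled with low confidence either way.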

## Models

| Model | What it uses | Best for |
|---|---|---|
| Geometric | Head pose angles + eye aspect ratio (EAR) | Fast, no ML needed |
| XGBoost | Trained classifier on head/eye features | Balanced accuracy/speed |
| MLP | Neural network on the same features | Higher accuracy |
| Hybrid | Weighted MLP + Geometric ensemble | Best head-pose accuracy |
| L2CS | Deep gaze estimation (ResNet50) | Detects eye-only gaze shifts |
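The Geometric model's inputs can be sketched as below. The EAR formula is the standard 6-point eye-contour ratio; the yaw/pitch/EAR thresholds are assumptions for illustration, not the app's tuned values.

```python
import math

def eye_aspect_ratio(eye):
    """EAR from the standard 6-point eye contour:
    (|p2-p6| + |p3-p5|) / (2 * |p1-p4|)."""
    p1, p2, p3, p4, p5, p6 = eye
    return (math.dist(p2, p6) + math.dist(p3, p5)) / (2.0 * math.dist(p1, p4))

def geometric_focused(yaw, pitch, ear,
                      yaw_limit=25.0, pitch_limit=20.0, ear_min=0.18):
    """Illustrative rule: head roughly facing the screen and eyes open.
    All three threshold defaults here are assumed, not the app's values."""
    return abs(yaw) <= yaw_limit and abs(pitch) <= pitch_limit and ear >= ear_min
```

Because it is just thresholds on a few scalars, this style of check runs in microseconds per frame, which is why the table lists it as the fast, no-ML option.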

## L2CS Gaze Tracking

L2CS-Net predicts where your eyes are looking, not just where your head is pointed. This catches the scenario where your head faces the screen but your eyes wander.

### Standalone mode

Select L2CS as the model - it handles everything.

### Boost mode

Select any other model, then click the GAZE toggle. L2CS runs alongside the base model:

  • Base model handles head pose and eye openness (35% weight)
  • L2CS handles gaze direction (65% weight)
  • If L2CS detects gaze is clearly off-screen, it vetoes the base model regardless of score
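The 35/65 blend with a veto can be sketched as a minimal fusion function. The weights come from the list above; the veto threshold of 0.2 is an assumed value for illustration.

```python
def fuse_scores(base_score, gaze_score,
                base_weight=0.35, gaze_weight=0.65,
                veto_threshold=0.2):
    """Weighted blend of base-model and L2CS scores, both in [0, 1].

    If the gaze score indicates the eyes are clearly off-screen
    (below veto_threshold, an assumed cut-off), L2CS vetoes the
    base model regardless of how the blend would have scored.
    """
    if gaze_score < veto_threshold:
        return 0.0  # hard veto: gaze is clearly off-screen
    return base_weight * base_score + gaze_weight * gaze_score
```

So a frame with strong head pose (0.9) but gaze far off-screen (0.1) is vetoed to 0.0, while moderate scores blend proportionally.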

## Calibration

After enabling L2CS or Gaze Boost, click Calibrate while a session is running:

  1. A fullscreen overlay shows 9 target dots (3x3 grid)
  2. Look at each dot as the progress ring fills
  3. The first dot (centre) sets your baseline gaze offset
  4. After all 9 points, a polynomial model maps your gaze angles to screen coordinates
  5. A cyan tracking dot appears on the video showing where you're looking
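The 9-point polynomial fit in step 4 can be sketched as a least-squares quadratic from (yaw, pitch) to screen coordinates. The quadratic feature basis and the grid values in the example are assumptions; the app's actual basis may differ.

```python
import numpy as np

def fit_gaze_mapping(angles, screen_xy):
    """Fit a quadratic polynomial mapping (yaw, pitch) -> (x, y) by
    least squares over the nine calibration samples.

    Sketch only: the feature set below is an assumed basis.
    """
    yaw, pitch = np.asarray(angles, dtype=float).T
    # Quadratic design matrix: [1, yaw, pitch, yaw*pitch, yaw^2, pitch^2]
    A = np.column_stack([np.ones_like(yaw), yaw, pitch,
                         yaw * pitch, yaw**2, pitch**2])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(screen_xy, dtype=float),
                                 rcond=None)
    return coeffs  # shape (6, 2): one column of coefficients per axis

def gaze_to_screen(coeffs, yaw, pitch):
    """Map a calibrated gaze angle pair to an (x, y) screen position."""
    feats = np.array([1.0, yaw, pitch, yaw * pitch, yaw**2, pitch**2])
    return feats @ coeffs
```

Nine points on a 3x3 grid are enough to determine the six coefficients per axis, with redundancy left over to average out per-sample gaze noise.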

## Tech Stack

  • Backend: FastAPI + WebSocket, Python 3.10
  • Frontend: React + Vite
  • Face detection: MediaPipe Face Landmarker (478 landmarks)
  • Gaze estimation: L2CS-Net (ResNet50, Gaze360 weights)
  • ML models: XGBoost, PyTorch MLP
  • Deployment: Docker on Hugging Face Spaces

## Running Locally

```bash
# install Python deps
pip install -r requirements.txt

# install frontend deps and build
npm install && npm run build

# start the server
uvicorn main:app --port 8000
```

Open http://localhost:8000 in your browser.

## Project Structure

```
main.py                     # FastAPI app, WebSocket handler, API endpoints
ui/pipeline.py              # All focus detection pipelines (Geometric, MLP, XGBoost, Hybrid, L2CS)
models/
  face_mesh.py              # MediaPipe face landmark detector
  head_pose.py              # Head pose estimation from landmarks
  eye_scorer.py             # EAR/eye behaviour scoring
  gaze_calibration.py       # 9-point polynomial gaze calibration
  gaze_eye_fusion.py        # Fuses calibrated gaze with eye openness
  L2CS-Net/                 # In-tree L2CS-Net repo with Gaze360 weights
src/
  components/
    FocusPageLocal.jsx      # Main focus page (camera, controls, model selector)
    CalibrationOverlay.jsx  # Fullscreen calibration UI
  utils/
    VideoManagerLocal.js    # WebSocket client, frame capture, canvas rendering
Dockerfile                  # Docker build for HF Spaces
```