---
title: FocusGuard
colorFrom: indigo
colorTo: purple
sdk: docker
pinned: false
---
# FocusGuard - Real-Time Focus Detection

A web app that monitors whether you're focused on your screen using your webcam. It combines head pose estimation, eye behaviour analysis, and deep learning gaze tracking to detect attention in real time.

## How It Works

1. **Open the app** and click **Start** - your webcam feed appears with a face mesh overlay.
2. **Pick a model** from the selector bar (Geometric, XGBoost, L2CS, etc.).
3. The system analyses each frame and shows **FOCUSED** or **NOT FOCUSED** with a confidence score.
4. A timeline tracks your focus over time, and session history is saved for review.
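The per-frame decision in step 3 can be sketched as below. This is a minimal illustration, not the repo's actual code: the function name `classify_frame`, the 0.5 threshold, and the confidence formula are all assumptions about how a raw pipeline score might map to the UI's label.

```python
from dataclasses import dataclass

@dataclass
class FocusResult:
    focused: bool
    confidence: float  # 0.0 - 1.0, as shown in the UI

def classify_frame(focus_score: float, threshold: float = 0.5) -> FocusResult:
    """Map a pipeline's raw focus score (0-1) to a FOCUSED / NOT FOCUSED
    label. Threshold and confidence scaling are illustrative choices."""
    focused = focus_score >= threshold
    # Confidence = distance from the decision boundary, rescaled to 0-1.
    confidence = abs(focus_score - threshold) / max(threshold, 1 - threshold)
    return FocusResult(focused=focused, confidence=round(confidence, 2))
```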
## Models

| Model | What it uses | Best for |
|-------|-------------|----------|
| **Geometric** | Head pose angles + eye aspect ratio (EAR) | Fast, no ML needed |
| **XGBoost** | Trained classifier on head/eye features | Balanced accuracy/speed |
| **MLP** | Neural network on the same features | Higher accuracy |
| **Hybrid** | Weighted MLP + Geometric ensemble | Best head-pose accuracy |
| **L2CS** | Deep gaze estimation (ResNet50) | Detects eye-only gaze shifts |
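The eye aspect ratio used by the Geometric model can be sketched with the standard six-landmark formulation (Soukupova & Cech); whether the repo's `eye_scorer.py` uses exactly this point ordering is an assumption.

```python
import math

def ear(eye: list[tuple[float, float]]) -> float:
    """Eye aspect ratio from 6 eye landmarks, ordered
    [outer, top-outer, top-inner, inner, bottom-inner, bottom-outer].
    A low EAR indicates a closed or blinking eye."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    p1, p2, p3, p4, p5, p6 = eye
    # Ratio of vertical eye openings to horizontal eye width.
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))
```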
## L2CS Gaze Tracking

L2CS-Net predicts where your eyes are looking, not just where your head is pointed. This catches the scenario where your head faces the screen but your eyes wander.

### Standalone mode

Select **L2CS** as the model - it handles everything.
### Boost mode

Select any other model, then click the **GAZE** toggle. L2CS runs alongside the base model:

- The base model handles head pose and eye openness (35% weight)
- L2CS handles gaze direction (65% weight)
- If L2CS detects that gaze is clearly off-screen, it **vetoes** the base model regardless of score
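The boost-mode combination above can be sketched as a weighted blend plus a veto. The function name and the representation of the veto (forcing the score to 0) are illustrative assumptions; the 35/65 weights come from the description above.

```python
def fuse_scores(base_score: float, gaze_score: float,
                gaze_off_screen: bool,
                base_w: float = 0.35, gaze_w: float = 0.65) -> float:
    """Blend the base model's score with the L2CS gaze score, with an
    off-screen veto that overrides the blend regardless of the base score."""
    if gaze_off_screen:
        return 0.0  # veto: gaze clearly off-screen
    return base_w * base_score + gaze_w * gaze_score
```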
### Calibration

After enabling L2CS or Gaze Boost, click **Calibrate** while a session is running:

1. A fullscreen overlay shows 9 target dots (3x3 grid)
2. Look at each dot as the progress ring fills
3. The first dot (centre) sets your baseline gaze offset
4. After all 9 points, a polynomial model maps your gaze angles to screen coordinates
5. A cyan tracking dot appears on the video, showing where you're looking
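Step 4's polynomial mapping can be sketched as a least-squares fit over the 9 calibration points. The quadratic basis and the function names here are assumptions for illustration; the repo's `gaze_calibration.py` may use a different basis or solver.

```python
import numpy as np

def fit_gaze_map(angles: np.ndarray, screen: np.ndarray) -> np.ndarray:
    """Fit a quadratic polynomial from gaze angles (yaw, pitch) to one
    screen coordinate, using the 9 calibration samples."""
    yaw, pitch = angles[:, 0], angles[:, 1]
    # Design matrix: [1, yaw, pitch, yaw^2, pitch^2, yaw*pitch]
    A = np.stack([np.ones_like(yaw), yaw, pitch,
                  yaw**2, pitch**2, yaw * pitch], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, screen, rcond=None)
    return coeffs

def apply_gaze_map(coeffs: np.ndarray, yaw: float, pitch: float) -> float:
    """Evaluate the fitted map for one gaze sample (drives the cyan dot)."""
    feats = np.array([1.0, yaw, pitch, yaw**2, pitch**2, yaw * pitch])
    return float(feats @ coeffs)
```

Fitting one model per axis (x and y) gives the on-screen point for the tracking dot.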
## Tech Stack

- **Backend**: FastAPI + WebSocket, Python 3.10
- **Frontend**: React + Vite
- **Face detection**: MediaPipe Face Landmarker (478 landmarks)
- **Gaze estimation**: L2CS-Net (ResNet50, Gaze360 weights)
- **ML models**: XGBoost, PyTorch MLP
- **Deployment**: Docker on Hugging Face Spaces
## Running Locally

```bash
# install Python deps
pip install -r requirements.txt

# install frontend deps and build
npm install && npm run build

# start the server
uvicorn main:app --port 8000
```

Open `http://localhost:8000` in your browser.
## Project Structure

```
main.py                      # FastAPI app, WebSocket handler, API endpoints
ui/pipeline.py               # All focus detection pipelines (Geometric, MLP, XGBoost, Hybrid, L2CS)
models/
  face_mesh.py               # MediaPipe face landmark detector
  head_pose.py               # Head pose estimation from landmarks
  eye_scorer.py              # EAR/eye behaviour scoring
  gaze_calibration.py        # 9-point polynomial gaze calibration
  gaze_eye_fusion.py         # Fuses calibrated gaze with eye openness
L2CS-Net/                    # In-tree L2CS-Net repo with Gaze360 weights
src/
  components/
    FocusPageLocal.jsx       # Main focus page (camera, controls, model selector)
    CalibrationOverlay.jsx   # Fullscreen calibration UI
  utils/
    VideoManagerLocal.js     # WebSocket client, frame capture, canvas rendering
Dockerfile                   # Docker build for HF Spaces
```